A set of scripts to easily download and import US legislative data into a Neo4j database. This is a work in progress; please submit an issue for any errors or feature requests.
The data model incorporates a small amount of the data available from GovTrack. Please submit an issue to request any changes or updates. We're really interested in how the community might want to use this data, so please let us know!
Also, this file has more detailed information about the data model.
This Cypher script will load data from the 114th Congress. You can use the LazyWebCypher tool with this link.
We're currently working to streamline the data loading process, but for now you can follow these steps to load data.
pip3 install -r requirements.txt
Sync a particular congress by its number (for instance, for the 112th congress, replace <num> with 112).
./sync.sh <num>
Use the parse scripts to parse the raw data into CSV files that can be easily loaded into Neo4j.
$ python3 parse_legislators.py
...
$ python3 parse_bills.py
...
$ python3 parse_votes.py
...
$ python3 parse_committees.py
...
$ python3 parse_committee_members.py
...
The scripts require Python 3.
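The sync and parse steps above can also be chained from a single Python script. The following is only a minimal sketch: it assumes it is run from the repository root, that `./sync.sh` and the five parse scripts behave as described above, and that `python3` is on the path. The names `build_commands` and `run_pipeline` are hypothetical helpers introduced here for illustration and are not part of the repository.

```python
import subprocess

# Script names are taken from the parse step above, in the same order.
PARSE_SCRIPTS = [
    "parse_legislators.py",
    "parse_bills.py",
    "parse_votes.py",
    "parse_committees.py",
    "parse_committee_members.py",
]


def build_commands(congress, python="python3"):
    """Return the sync command followed by one command per parse script."""
    commands = [["./sync.sh", str(congress)]]
    commands += [[python, script] for script in PARSE_SCRIPTS]
    return commands


def run_pipeline(congress, dry_run=False):
    """Run each step in order, stopping at the first failure."""
    for command in build_commands(congress):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: only print the commands for the 114th Congress.
    run_pipeline(114, dry_run=True)
```

Calling `run_pipeline(114)` without `dry_run=True` would execute the steps for real. Stopping at the first failure is a deliberate choice here, since the later parse scripts presumably depend on the synced raw data.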
See the steps documented here for configuring neo4j-shell and pointing Neo4j to the CSV files generated in the previous step.
$ path/to/neo4j/bin/neo4j-shell < import.cypher
Once the data is loaded into Neo4j, we can use queries written in Cypher to explore Congress: who sponsored which bills, how legislators voted, and how activity differs between congresses.
Find all Legislators:
MATCH (n:Legislator) RETURN n LIMIT 100
Find Steve Daines:
MATCH (n:Legislator {firstName: "Steve", lastName: "Daines"}) RETURN n
What Bills did Steve Daines sponsor?
MATCH (n:Legislator {firstName: "Steve", lastName: "Daines"})<-[:SPONSORED_BY]-(b:Bill) RETURN b
For how many Bills did Steve Daines vote Yea?
MATCH (n:Legislator {firstName: "Steve", lastName: "Daines"})-[v:VOTED_ON]->(b:Bill)
WHERE v.vote = "Yea"
RETURN count(b) AS numYea
Find the number of bills proposed during each congress in the database.
MATCH (c:Congress)<-[:PROPOSED_DURING]-(b:Bill)
RETURN c.number AS congress, count(b) AS numProposed
Find the number of bills enacted in each congress in the database, along with the average number of sponsors per enacted bill in that congress.
MATCH (c:Congress)<-[:PROPOSED_DURING]-(b:Bill)-[:SPONSORED_BY]->(l:Legislator)
WHERE b.enacted = 'True'
WITH c, b, count(l) AS numSponsors
RETURN c.number AS congress, count(b) AS numPassed, avg(numSponsors) AS avgSponsors
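The query above aggregates in two stages: the WITH clause counts sponsors per bill, and the RETURN clause then groups those per-bill counts by congress. As a rough illustration of that logic outside Neo4j, here is a pure-Python sketch. The rows are made-up sample data, not real GovTrack records, and `summarize` is a hypothetical helper rather than anything in this repository.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (congress, bill_id, legislator) rows, one per SPONSORED_BY
# relationship on an enacted bill. The values are invented for illustration.
rows = [
    (113, "hr1", "A"),
    (113, "hr1", "B"),
    (113, "hr2", "A"),
    (114, "s5", "C"),
    (114, "s5", "D"),
    (114, "s5", "E"),
]


def summarize(rows):
    # Stage 1 (the WITH clause): count sponsors per (congress, bill).
    sponsors_per_bill = defaultdict(int)
    for congress, bill, _legislator in rows:
        sponsors_per_bill[(congress, bill)] += 1

    # Stage 2 (the RETURN clause): group the per-bill counts by congress.
    counts_by_congress = defaultdict(list)
    for (congress, _bill), num_sponsors in sponsors_per_bill.items():
        counts_by_congress[congress].append(num_sponsors)

    return {
        congress: {"numPassed": len(counts), "avgSponsors": mean(counts)}
        for congress, counts in counts_by_congress.items()
    }


print(summarize(rows))
```

Without the intermediate per-bill count, each bill would be counted once per sponsor, which is why the Cypher query carries `b` through the WITH clause before counting bills.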
- Introducing legis-graph - US Congressional Data With Govtrack and Neo4j
- Congressional PageRank - Analyzing US Congress With Neo4j and Apache Spark
The software in this repository is provided "AS-IS" without warranties or guarantees of any kind. Data used by this software is provided by GovTrack.us and should be used under the terms specified by GovTrack.us here.