Last week, the Senate took a small but significant step into the twenty-first century when it started providing voting data publicly via XML.
Short for eXtensible Mark-up Language, the format is a good method for releasing data because it makes it simple for clever programmers to manipulate and display the information in a variety of ways. Each data field in XML—in this instance, think senators’ names, bill titles, corresponding yays and nays, etc.—can be reprocessed, reorganized, and more easily cross-referenced with other information like campaign donations or district spending.
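To make the idea of reprocessing concrete, here is a minimal Python sketch of what a programmer might do with a roll-call record in XML. The sample document and its element names (`roll_call_vote`, `member`, `party`, `vote`) are hypothetical illustrations, not the Senate's actual schema, and the senators are placeholders.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample record. The element names are assumptions made for
# illustration and do not reflect the Senate's real XML schema.
SAMPLE_XML = """
<roll_call_vote>
  <question>On Passage of the Bill</question>
  <members>
    <member><name>Senator A</name><party>D</party><vote>Yea</vote></member>
    <member><name>Senator B</name><party>R</party><vote>Nay</vote></member>
    <member><name>Senator C</name><party>R</party><vote>Yea</vote></member>
  </members>
</roll_call_vote>
"""


def tally_votes(xml_text):
    """Count how many members cast each kind of vote."""
    root = ET.fromstring(xml_text.strip())
    return Counter(m.findtext("vote") for m in root.iter("member"))


def votes_by_party(xml_text):
    """Reorganize the same fields into (party, vote) counts."""
    root = ET.fromstring(xml_text.strip())
    return Counter(
        (m.findtext("party"), m.findtext("vote")) for m in root.iter("member")
    )


if __name__ == "__main__":
    print(tally_votes(SAMPLE_XML))
    print(votes_by_party(SAMPLE_XML))
```

Because each field is labeled, the same record can be regrouped by party, by state, or by any other attribute without guessing at page layout, which is the kind of reuse that scraping a web page makes harder.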
Without XML, anyone seeking to digitally collect certain kinds of Senate voting information had to devise a somewhat complicated workaround to extract the data from unfriendly formats.
“Me and The Washington Post and The New York Times have already figured out ways to get this information by screen scraping,” says Josh Tauberer, a University of Pennsylvania linguistics graduate student who runs GovTrack.us, a site that tracks congressional voting records and the status of bills. “For projects that haven’t started yet, this helps a bit.”
Tech-focused open-government advocates have long encouraged the 100-member club to catch up to the House, which has been releasing vote data by XML since 2003. But the Senate wasn’t so eager to comply, even though it was already compiling the information in XML format for internal use.
“It’s just sort of typical Senate stuff,” says Politico reporter Victoria McGrane. “The House does it, and the Senate has an archaic reason for why they won’t.”
But when McGrane set out to write a piece on the Senate’s missing XML, she had a bit of trouble discerning exactly what that archaic reason was. The Secretary of the Senate’s office, which oversees the recording of votes, declined to comment and referred McGrane to the Rules Committee. Rules Committee staff then said there was no one she could talk to. So McGrane quoted John Wonderlich, policy director at the Sunlight Foundation (which supports CJR’s reporting on transparency), recounting what he’d been told:
“The Secretary of the Senate has cited a general standing policy … that they’re not supposed to present votes in a comparative format, that senators have the right to present their votes however they want to,” Wonderlich said. “It’s pretty bad.”
“The nice thing about writing about this is that the Senate doesn’t have a good argument against doing it, so you can sort of have fun,” says McGrane. “I wanted to do the best job I could explaining what it was without getting too bogged into the technical details.”
McGrane’s piece was published on April 27. Soon after, South Carolina Senator Jim DeMint’s office, which has a record of involvement in transparency issues, sent her story to other senators’ offices, urging staff members to get their bosses to come out in favor of making the data stream public.
“Staffers shared it by e-mail and said ‘Take a look at this. This explains what we’re talking about,’” says Wesley Denton, DeMint’s communications director.
Six senators agreed to add their names to a “dear colleague” letter from DeMint to Chuck Schumer, the chairman of the Rules Committee, and Robert Bennett, the ranking Republican, asking the pair to lift any policy hampering the release of voting data via XML. The April 30 letter closed by noting that the House has been doing so for five years “with no adverse effects.”
“Her reporting highlighting why such a small technical change made a difference in the cause of transparency opened up the eyes of senators and staffers who had been ignoring the issue for years,” says Denton.
DeMint’s office included the letter in a May 1 press release that also excerpted McGrane’s article. And that very day Schumer instructed the Secretary of the Senate to authorize using XML.
That led to another press release from DeMint’s office on May 5, one that thanked the Rules Committee for its “quick response.”
An interesting wrinkle went unmentioned in DeMint’s press release, and in McGrane’s original story (she says she wasn’t able to learn about it since both the Rules Committee and the Secretary of the Senate’s office declined comment): according to a subsequent story by McGrane, it turns out that the Secretary had, on April 20, already written to Schumer asking for permission to start the feeds. (Schumer’s office did not return phone calls requesting comment on the public adoption of XML.)
Whatever the motivations, the XML voting data is now available. And while that’s a big improvement—and a sign that some in the Senate are taking more seriously the effort to make legislative data more readily available—the Senate and the House still don’t provide XML data on the status of bills. Programmers who want to display information on a bill’s standing outside of a roll-call vote—where it is in committee, for instance—must devise their own data-scraping programs, or use an open-source third-party XML feed, like the one that GovTrack.us makes available for free.
“The endgame is for them to have the same database that I have,” says Tauberer. “I want to put myself out of business.”