Sunlight Foundation

OGD: Commerce repackages old data and offers broken links

To comply with the Open Government Directive, the Commerce Department released four high value datasets that require considerable technical sophistication on the part of users--and patience. Some of the files are so large and cumbersome they're very difficult to open and use;  others require a great deal of explanation--and you can currently only find those explanations by digging through the agency's site. Still other entries feature broken links or only contain a fraction of the information described on Data.gov. The Commerce Department says they're working on all of these problems, so hopefully we'll see an improvement in the coming days.

Consider the broadband applications database. The Recovery Act provided $7.2 billion in grants and loans to extend broadband Internet access to underserved communities. The Commerce Department has released a spreadsheet of the applications they have received for those funds, including the names of the organizations, contact information and amounts requested. This is potentially useful information; one can easily see, for example, that a few states submitted a large portion of the applications--the top six submitted as many applications as the bottom forty. 
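For readers who want to check this kind of concentration themselves, here is a minimal sketch of the tally in Python. It assumes you have already pulled the state column out of the spreadsheet into a list; the function name `top_vs_bottom` and the sample values are made up for illustration and are not taken from the Commerce file.

```python
from collections import Counter


def top_vs_bottom(states, top_n, bottom_n):
    """Return (applications from the top_n states, applications from the bottom_n states)."""
    # Count applications per state, then rank the counts from largest to smallest.
    ranked = sorted(Counter(states).values(), reverse=True)
    top_total = sum(ranked[:top_n])
    # Guard against bottom_n == 0, since ranked[-0:] would select the whole list.
    bottom_total = sum(ranked[-bottom_n:]) if bottom_n else 0
    return top_total, bottom_total


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    sample_states = ["TX", "TX", "TX", "CA", "CA", "VT", "WY"]
    print(top_vs_bottom(sample_states, top_n=1, bottom_n=2))
```

With the real spreadsheet, the same comparison would be `top_vs_bottom(states, 6, 40)`.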

However, the spreadsheet was already out of date when it was posted on Data.gov. The Commerce Department awarded four large grants before Jan. 21, 2010, the deadline for releasing high value datasets, but those awards were not included in the spreadsheet. Furthermore, the data isn't new--it was released on September 9--although it is much more user-friendly in its current form to reporters who want to analyze the bulk data (previously, it was posted as a PDF).

What's more, the data has been searchable and up-to-date on the National Telecommunications and Information Administration Web site (here) since last fall. While it's helpful to researchers to have it in Excel form, it's not exactly new.

Another potentially useful database put out by Commerce is the National Technical Information Service database. The dataset lists the titles, categories, and sponsoring agencies of government-funded R&D studies, and includes links that allow you to buy a copy of each report. Unfortunately, the description on data.gov is currently inaccurate: it says that the data is available electronically from 1964, but the attached XML file only goes back to 2005, and parts of the earlier data are only available through third-party paid services.

It would also be nice to have a clearer description of what some of the codes in the document mean. What do category codes like 97P, 97R, 57K and 57B stand for? And the file is so large that it took a programmer here at Sunlight to convert it into a form that a commonly-used program like Excel could handle--after it had crashed browsers on several computers all morning.
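For anyone facing a similarly oversized XML file, one possible approach is to read it as a stream and write out a CSV that Excel can open, rather than loading the whole document at once. The sketch below is not the conversion our programmer actually ran; it is a minimal example using Python's standard library, and the tag names (`report`, `title`, `category`) are placeholders, since the real file's element names may differ.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_records_to_csv(xml_source, csv_out, record_tag, fields):
    """Stream records out of a large XML file and write them as CSV rows.

    xml_source: a filename or binary file object containing the XML.
    csv_out:    a text file object to write to.
    record_tag: element name that wraps one record (assumed, not from the real file).
    fields:     child element names to copy into columns (also assumed).
    Returns the number of records written.
    """
    writer = csv.writer(csv_out)
    writer.writerow(fields)
    count = 0
    # iterparse yields each element as its closing tag is read, so the
    # whole document never has to be parsed into memory up front.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == record_tag:
            writer.writerow([(elem.findtext(name) or "").strip() for name in fields])
            count += 1
            # Drop the record's children and text once written to reduce memory use.
            elem.clear()
    return count


if __name__ == "__main__":
    # Tiny made-up document, for illustration only.
    sample = b"<reports><report><title>A study</title><category>97P</category></report></reports>"
    out = io.StringIO()
    xml_records_to_csv(io.BytesIO(sample), out, "report", ["title", "category"])
    print(out.getvalue())
```

For a file on disk, pass the path as `xml_source` and open the output with `open("out.csv", "w", newline="")` so the CSV module controls line endings.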

The third dataset available is a list of fees paid by holders of U.S. patents. Previously, you had to get a data expert to "scrape" the U.S. Patent Office site if you wanted to get your hands on this data--and it was popular enough that the Patent Office ended up putting out requests for users to stop doing this, because it was using up too much of their bandwidth. It's promising that this is now available to download -- but most of the links to the data on data.gov were broken for the better part of a week, so users had to dig around on the Patent Office site to find out what the columns of numbers and letters mean. And that was no easy task. (We made sure to let them know the links weren't working.)

Finally, the Commerce Department is offering a map of precipitation data gathered by volunteers around the country, in a format ready to import into Google Maps and Google Earth. Maps of this data were available online previously, and a member of the data collection team tells me that they've offered a data export function all along -- and that he believes some agencies had already imported them into Google Earth.

So it looks like all Commerce has done is eliminate a couple of steps and put it up on data.gov. Handy, but not exactly a brand new revelation.
