Data.gov is already broken — just like everything before it
February 7, 2010 9 Comments
Like most people (I assume), I was doing a little GIS project on Super Bowl morning. Needing some data, I thought first of the new [Data.gov] site. After a quick and simple search, I had the [dataset I wanted](http://www.data.gov/details/12) ready to download. But as with every government data repository before it, it is broken. Download links for posted datasets frequently return a 404.

It isn’t just the download, but the [metadata](http://www.fws.gov/data/migflyway.html) as well. I know, some datasets still work and who knows, maybe this one will again one day. But for [Data.gov] to be valuable, it needs to ping the data sources and tell users when they are down (and, for web services, [what percentage](http://registry.fgdc.gov/statuschecker/wmsResultsReport.php?catalog=gos) of the time they are down). It also wouldn’t hurt to let the owners of the data know that their datasets are no longer linked correctly on the Data.gov website. Otherwise we’ll just get [link rot](http://en.wikipedia.org/wiki/Link_rot), and that can kill a project.
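To make the "ping the data sources" idea concrete, here is a rough sketch of what such a check could look like, using only Python's standard library. It is not how Data.gov or the GOS status checker actually works; the function names (`http_status`, `find_broken`, `percent_down`) are made up for illustration, and it assumes the data hosts answer HTTP `HEAD` requests, which some servers refuse.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the host is unreachable."""
    # Assumption: a HEAD request is enough; some servers reject it (e.g. with 405).
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error such as 404
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, and so on


def find_broken(urls, fetch=http_status):
    """Map each broken URL to the status (or None) that marked it as broken."""
    broken = {}
    for url in urls:
        status = fetch(url)
        if status is None or status >= 400:
            broken[url] = status
    return broken


def percent_down(history):
    """Percentage of past checks that failed; history holds True (up) / False (down)."""
    if not history:
        return 0.0
    return 100.0 * history.count(False) / len(history)
```

Run on a schedule over the catalog's download and metadata URLs, something like `find_broken(list_of_urls)` would give the site a list of dead links to flag for users and to send to the data owners, and `percent_down` could turn a stored history of checks into the kind of downtime figure mentioned above.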
If projects are going to be built on data discovered with Data.gov, much more has to be done to ensure that this data is available consistently, not just whenever people get around to updating broken links. If things don’t change, it is another waste of taxpayer money and we’d have been better off sticking with the [previous government data boondoggle](http://gos2.geodata.gov/wps/portal/gos).

Link rot is like foot fungus. Potential is always there, sure, but the cause is poor hygiene. (@sgillies).
Last spring, agencies threw whatever was handy onto Data.gov to answer the mail from the newbie politicals. They hated doing it, that’s why most of the content was/is marginally interesting.
You know what comes next?
“It’s not my job”
Sigh.
BTW – let’s ask OMB how much money has gone into GOS in the past years, and what the hit rate is on that dreadful cobweb. Might be fun to file a FOIA request to the USGS for hit statistics, normalize out the bots and see what’s left.
Betcha it’s tens of hits from ‘real’ users every month and tens of thousands of hits from Google’s bots looking to scrape content.
wow – not getting any action james? – one broken link and you pronounce the end of data.gov – the cloud – soa and the whole internet thing… how about just sending an email letting them know there is a broken link.
Huh.
Wow, apparently you’re not an active user of either data.gov or GOS. Have you ever used either? You also appear not to have understood the post…the point is, Wow, that the site should have been set up so folks in the user community [don't] have to send emails to prod the admin to keep it running right.
Data.gov is just another lipstick-slathered, porcine PR blast that’s another DC-irrelevancy-in-the-making. Honestly, expecting a government agency to post information that would make them accountable and be shared with others…
Methinks thou doth protest too much, Wow.
Now go fix those links.
I was working on a project last month and ran into a couple broken links (csv data, not shp). I can’t find them in my browser cache so I’ll see if I can find them when I get into work tomorrow.
I hadn’t heard of the term link rot before, but I think this is going to be a huge problem with Data.gov moving forward.
Federal agencies need coherent policies, and a consistent means of publishing, organizing, accessing and cataloging data and metadata. These efforts, GOS and Data.gov, are works in progress, presumably to be combined at some point and further improved.
With GOS, the service status checkers are supposed to ping the assets to check them. That capability can and should be bolstered – along with better ways to manage data and metadata lifecycle.
Did you post a suggestion to have data.gov periodically check the status and availability of posted assets to the ideascale site that was set up to solicit comments and suggestions? → http://datagov.ideascale.com/
Dave, other than your link to that ideascale.com site, how would an average user know to go there to give feedback?
I knew about the IdeaScale from working as a federal contractor involved in agencies, with feds who are working on getting their data submitted to Data.gov – I’d agree, it probably wasn’t as broadly communicated as it should be, it’s likely just the metadata wonks like me who have been tracking it – though the feds have used various IdeaScale sites for soliciting various comments and ideas for a year or so now. The IdeaScale site should be posted on their main Data.gov site and communicated via other means as well. That’s another suggestion that needs to be posted…
In checking the Data.gov site, there is in fact a link to the ideascale site – on the right hand of the Data.gov page, it says:
Clicking that link, in turn, takes you to the data.gov ideascale site.