Trying to unlock transit data
August 24, 2009 17 Comments
Something that has been bothering me these last few months is transit data and how we as a community can get access to it. Here in Phoenix, we are unlucky enough not to have access to Google Transit. Valley Metro is locked in to the 1990s with their system, and you should see how bad the mobile version is. So many of us here have been trying to free the Valley Metro data to get it incorporated into OpenStreetMap or even Google Transit. Since I’m not blogging about how successful that effort has been, you can guess that we are still stuck back in the last decade.
But that isn’t the whole story, right? Just getting transit data into Google Transit is great for users of Google services, but not good for the community at large. Google has been really good at getting at data locked up behind government bureaucracy, but they’ve done nothing to help free this data beyond exposing it through their own APIs. But at the same time, is this Google’s responsibility? Google seems to say they try to get people to share it, but it isn’t their job. I think I tend to agree with them. If an organization is willing to share data with Google, why aren’t they willing to share it with the public? That is where these organizations fall down, and it would appear to be where public transit has a huge problem.
Public transit organizations are looking at protecting their data from being used by others. It isn’t everyone, but it does send a message that if you as a member of the riding public want to access public data, you need to do so under an application approved by that transit organization. I find it amazing that the DMCA can be applied to public data, and it should send chills through anyone who wants to use data they as a taxpayer are funding. Google is able to negotiate deals because they have the money and the eyeballs that transit organizations want. I just don’t like the path that we seem headed down where if I want to access public data that I helped create… I need to do so using a commercial API and possibly have to pay a vendor (paying twice is a tax on a tax) for the right to use what should be free.

Keep away from that slope, you won't like where it will lead!

Exactly-odo, quasimodo.
The real story underneath this story is data should be made available (as in DC or SF) as open content.
Publishing something to Google is about as un-open as publishing it in *.e00 format and expecting folks to have ArcINFO to decode it.
The only difference is Google’s front end is more approachable.
Hard to say that publicly without seeming like a malcontent and un-kewl because Google is so fashionable, but it is what it is.
With so many transit systems using vehicle tracking, they should be publishing real time feeds too. I should be able to buy a ticket using SMS and get an alert when the bus is near the stop. I think they’ve been doing this in Finland for several years now.
I am assuming you guys have tried reasoning with Valley Metro about the benefits to everyone of having transit data available to developers. You may want to point them towards Portland’s TriMet, who have gone above and beyond what even the biggest geo-nerd would hope for in terms of making data available.
You also may want to consider putting together a TransitCamp. We had one here in Vancouver a couple years ago, and it got the ball rolling towards improving the situation. Our data is still private, but now the key is left in the door (http://www.mweisman.com/transit.html) for developers to use it.
Insightful post, James. For all the talk about opening up public access to government data, there are still major throwbacks to old ways of thinking. I hear about and deal with them every day.
And I’m glad people are pointing out that making data available to Google is hardly equivalent to public access. It’s certainly great that Google has developed Maps and Transit, etc. but that’s not something for agencies to hide behind when asked to make their info truly public.
And before reading your post, I hadn’t thought of the slippery slope angle (ah, the power of a simple, catchy graphic!). Now it’s transit data, but what if it also becomes GIS data and basically anything else? It’s been hard enough to move local agencies toward more open access; it’d be a shame if they started to say “We’ve provided the data to Google, just get it from them.”
“We’ve provided the data to Microsoft, just get it from them.”
“We’ve provided the data to Oracle, just get it from them.”
“We’ve provided the data to Mapquest, just get it from them.”
“We’ve provided the data to ESRI, just get it from them.”
“We’ve provided the data as XML records with an open schema, just get it from our server.”
I know which one I want to see…
“We’ve provided the data as XML records with an open schema, just get it from our server.”
Unfortunately around here, the moment we open up a server, it gets attacked. We have to captcha everything or it will get taken down within hours. Some people just don’t like government.
I’d love to hear the official explanation of the logic behind not making transit data truly open. Assuming you’re talking about things like bus routes, timetables, usage etc, this is all stuff that they have to make public anyway or the whole “public transport” concept kind of falls down. “Yeah, we run buses and trains, but we’re not telling you where to, or when…”. Actually, on second thoughts, that sounds pretty much like how things work here in the UK, but that’s a whole other story…
The MBTA (Boston) just published their data, hoping that someone will develop an “app”:
http://www.mbta.com/about_the_mbta/news_events/?id=17997&month=&year=
Wasn’t there a lawsuit some time ago in California where a city was keeping, and charging for, property data that the taxpayers had paid for? Seems to me the court ruled that since the data was gathered at the taxpayers’ expense, the data had to be free of charge, with the exception of a reasonable fee for the city to get the data from one system/format to another.
If, in this case, the systems that collect the data and the storage are paid for by the taxpayer, could it not fall under the same ruling?
It was actually Santa Clara County and here is the link:
http://www.opendataconsortium.org/newsbdy.htm
The EOT in Massachusetts actually approached the hacker community here to find out what developers would need to build apps around transit data (check their EOT Developers Page Beta for more info). It’s a very active discussion and process, with feedback and response from both sides. Makes me look forward to the outcome and eventually developed apps on top of transit data.
Most states have had regs on the books for a long time that allow municipalities to recover a nominal fee for the creation and/or distribution of data, maps, etc.
Has anyone actually legally challenged the idea that publicly funded data is covered by the DMCA? As much as the DMCA is abused these days I’d be surprised if that actually passed muster in court.
That is exactly what I posted about. Originally Santa Clara wanted to charge some large sums of money for their data. The courts struck it down saying they could only charge a nominal fee to cover the cost of copying the data and delivering it to the requester.
Having started my career in GIS working in or with local governments, I can say without hesitation that a lot of the problems are just plain turf battles between groups. It’s especially bad in small towns and county seats in largely rural counties where the personnel may not be aware of the state regulations or they just plain don’t want to help out some other person in another office.
I guess some of that happens probably nearly everywhere but it was especially bad in those small towns.
I don’t know if it is still the case, but one of the conditions of acceptance into the Google Transit “program” used to be that the content remain accessible by Google through a non-password-protected URL. Find that URL, and you’ve found the data, with known filenames and schema. (Yes, if the muni’s admins are smart they’ve restricted access to Google IPs, but they probably didn’t bother.)
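For anyone wondering what “known filenames and schema” looks like in practice: a Google Transit feed is a zip of plain CSV text files with fixed names such as stops.txt and routes.txt. Here is a rough sketch of reading one of those tables in Python, assuming you already have the feed's bytes in hand. The function names and the two sample stops are made up for illustration; they are not from any agency's feed, and nothing here fetches a real URL.

```python
import csv
import io
import zipfile


def read_gtfs_table(feed_bytes, filename):
    """Return the rows of one CSV table inside a feed zip as a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(feed_bytes)) as feed:
        with feed.open(filename) as raw:
            # utf-8-sig tolerates the byte-order mark that some feeds include.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))


def build_sample_feed():
    """Build a tiny in-memory feed (hypothetical data) so the sketch runs offline."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as feed:
        feed.writestr(
            "stops.txt",
            "stop_id,stop_name,stop_lat,stop_lon\n"
            "S1,Central Station,33.4484,-112.0740\n"
            "S2,Main St & 1st Ave,33.4500,-112.0700\n",
        )
    return buffer.getvalue()


if __name__ == "__main__":
    for stop in read_gtfs_table(build_sample_feed(), "stops.txt"):
        print(stop["stop_id"], stop["stop_name"], stop["stop_lat"], stop["stop_lon"])
```

The same function should work for the other tables in a feed (routes.txt, trips.txt, stop_times.txt), since they are all header-row CSV files; values come back as strings, so coordinates and times would still need converting.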
For some transit agencies, the problem is that they don’t want people knowing too much about transit. Transit cuts are extremely controversial around here, and are oftentimes implemented with little warning. A pretty significant part of the metro region wants transit shut down for good; probably more than the numbers who want increased transit. Opening up access to the transit information might enable those who can use that information to access transit, but it certainly opens it up to those people who want to use the information to lobby for increased cuts in public transit funding.
If you look at the cities that have opened up their transit data, they have metro populations that are highly supportive of public transit. But when you look at metro transit authorities that are under siege, like St Louis, Detroit, Ann Arbor, Chicago, Twin Cities, Colorado Springs, and Phoenix, you are seeing Google Transit access (if that), but not much more. Google Transit acts as a protective filter around their data.
Transit agencies need to be educated on how open data benefits them. Here’s some ammo:
http://www.trilliumtransit.com/blog/tag/open-data/