GIS Version Control

Update 2: GeoGig is not dead, see here.


Update: It appears that Boundless abandoned GeoGig to the Eclipse Foundation.  Currently it shows no work has been done on GeoGig in the past 12 months.  Time to assume our only hope is GitHub itself.


The crazy thing about GIS is that we never really take version control into consideration with our data.  Well, we have a workflow; it usually entails putting a one at the end of a file name or calling it “temp” or “trash” while it works through analysis.  Whenever I used to take over a project from someone, I’d take a quick look at the project folder and see hundreds of seemingly orphaned GIS datasets just littering up the folder structure.  And of course there was no documentation as to why they were there, how they were created, or what derivative products were created from them.

When I joined WeoGeo many moons ago, that was one thing Paul Bissett always harped on and was something WeoGeo was trying to solve.  WeoGeo was approaching it from the data sales marketplace end, where data providers wanted to know what derivative works were being created off their data.  But users needed the same process.  We tried to pitch it to users but generally they didn’t see the value in keeping track of their datasets.  I think this was shortsighted and I still believe that WeoGeo should have been the choice of every GIS professional to maintain an authoritative data library.  But alas, GIS professionals didn’t care.

We’ve watched GeoGig by Boundless for years, waiting to see if it will succeed.  (It appears that they passed it on to the Eclipse Foundation last year.  I hadn’t heard this, nor does the website show the update.)  It provides the kind of revision control we’re used to with programming.  Boundless has a pretty good graphic below that shows what’s going on from a GIS perspective.

Bingo, right?  Well, wrong… While GeoGig is actually very impressive and takes into consideration all those weird things we GIS folks do, it won’t ever go anywhere.  Unless it is integrated into QGIS and ArcGIS Desktop, users won’t be able to fit it into their workflows.  It is the same problem we ran into with WeoGeo Library: it’s just too hard to integrate into ArcGIS Desktop without Esri doing it themselves.

But the QGIS tie is interesting.  Boundless is behind GeoGig.  They are also a big supporter of QGIS.  GeoGig seems more tied in with GeoServer right now, but editing is the big reason for GeoGig and, let’s be honest, most editing happens on the desktop.  Boundless has been showing that QGIS and GeoGig work together but, as I said above, unless it is natively integrated into QGIS, it won’t have more than a niche uptake.  At this point though, GeoGig is really our only big hope.

Esri has versioning on their geodatabase, but it’s a nightmare.  I’ve never had good luck with it, but I’m willing to chalk that up to me not knowing a thing about what I’m doing.  Geodata is complex; Git works because it is so simple.  It just tracks changes in text files.  GitHub has a hard enough time working with GeoJSON, but you can see that it does work.  I often have to wonder if GitHub might have GIS version control before GIS people get it working.  Honestly, that is what we want, right?  GitHub for GIS?
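
To make the “simple changes in text” point a bit more concrete, here is a rough sketch of what line-based tracking looks like when the data happens to be GeoJSON.  It is not how Git, GitHub, or GeoGig actually work internally; it just uses Python’s standard `difflib` to compare two versions of a feature collection.  The helper names (`feature_lines`, `diff_collections`) and the sample features are made up for illustration, and it assumes each feature is written on its own line with sorted keys so that one changed line means one changed feature.

```python
import difflib
import json


def feature_lines(collection):
    """Serialize each feature on its own line, with sorted keys,
    so that a line-level diff corresponds to a feature-level change."""
    return [json.dumps(f, sort_keys=True) for f in collection["features"]]


def diff_collections(old, new):
    """Return only the removed ('-') and added ('+') feature lines."""
    diff = difflib.unified_diff(
        feature_lines(old), feature_lines(new), lineterm="", n=0
    )
    return [
        line
        for line in diff
        if line[:1] in ("+", "-") and not line.startswith(("+++", "---"))
    ]


def point(fid, name, lon, lat):
    """Build a hypothetical GeoJSON point feature for the example."""
    return {
        "type": "Feature",
        "id": fid,
        "properties": {"name": name},
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
    }


# Two versions of the same (made-up) dataset: feature 1 was moved.
old = {"type": "FeatureCollection",
       "features": [point(1, "well", 0.0, 0.0), point(2, "pump", 1.0, 1.0)]}
new = {"type": "FeatureCollection",
       "features": [point(1, "well", 0.0, 0.5), point(2, "pump", 1.0, 1.0)]}

changes = diff_collections(old, new)
for line in changes:
    print(line)
```

A plain text diff like this tells you that a line changed, but nothing about what the change means spatially (a moved vertex, a reprojection, a reordered file), and it does nothing for binary formats.  That gap is a big part of why geodata is harder to version than code.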