The Fear of Open Source Data and the Power of a Data Network Effect
April 20th, 2007 by Sean Gorman
I spent the last four days out in San Francisco at the Location Intelligence conference, which was a great time, but I wish I could have done the Web 2.0 Expo as well. Lousy scheduling. While there were lots of interesting technologies presented, my biggest takeaway from the conference was the tension between traditional geospatial vendors and emerging Web 2.0 technologies and startups. Actually, Web 2.0 and mashups are the wrong labels, because now every vendor claims they are Web 2.0 and enabling “enterprise mashups”. Of course you are using their proprietary closed application, with proprietary data, and not enabling any kind of open community discourse, but I digress.
The tension that was palpable at the conference was between open source things that are free and enterprise things that cost herds of money. While this is nothing new for software, the issue is now data. The fear of, and sometimes animosity towards, open source data really showed itself in the comments of enterprise people when asked about OpenStreetMap.org. For those of you not familiar, OpenStreetMap (OSM) is a project to provide free street data for web mapping apps. In the US we take free TIGER/Line data from the Census for granted, but everywhere else in the world you pay big bucks for street data, especially in the UK where OSM started. OSM has turned into a thriving community rapidly mapping large chunks of geography with quality street map data.
When the enterprise folks were asked about OSM they were dismissive in a rather hostile way, with statements like “30 inches matter in street maps and can be the difference between life and death. Do you want to trust open source data for this?” or “We spend millions of dollars creating the world’s most accurate data - you can’t duplicate that with a community of volunteers.” The entertaining response from open source folks was “How many times has a proprietary street map led you to the wrong place, not had your house on it, missed a road, etc.?” and “How successful were you in getting that changed in the system?”
The interesting bit was the proprietary enterprise folks singing an O’Reilly mantra, “Data is the Intel Inside”, for Web 2.0 and the future: “All those applications that use publicly available data are gonna be sunk.” While I agree that data and content are going to be huge drivers of value, I do not believe that walled gardens of proprietary data are going to be the future.
Data benefits from Metcalfe’s law just as any application or technology does. The power of one data set pales in comparison to a network of data sets that can be mashed together to create new datasets and relationships. Also, using a community to police data has all the benefits of open source software. The big difference is that open source data creates a much larger community than open source software. The number of people who can work with code versus the number of people who can use and look at data creates a whole new game. Blogs and wikis started the two-way conversation, and there is no reason, with some hard work and innovative technology, that this cannot be extended to data. OSM is a great example of early success, as is Swivel. The acquisition of GapMinder by Google also points to this future. GapMinder is all about extracting public open source statistical data and making it available to the masses.
I think this is really at the heart of why people are struggling to find lasting value in mashups, especially map mashups. You end up with a lot of proprietary data silos that are not connected. I can look at tons of different map mashups with different data, but there is no way to combine them to solve problems. The more data you can interconnect, the more problems you can solve, and the emphasis and value come from the technology and not from walled gardens of overpriced proprietary data.
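To put a rough number on the Metcalfe's law point, here is a back-of-the-envelope sketch in Python. It assumes that every pair of data sets could in principle be mashed together, which is an idealization, and it uses the count of distinct pairs, n(n-1)/2, as a crude stand-in for the "value grows roughly with the square of n" reading of the law. The data set names are made up for illustration.

```python
from itertools import combinations


def pairwise_links(n: int) -> int:
    """Count the distinct pairs among n data sets: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Hypothetical data set names, for illustration only.
datasets = ["streets", "census", "transit", "crime"]

# Enumerate every possible two-way combination of the data sets.
pairs = list(combinations(datasets, 2))
for a, b in pairs:
    print(f"{a} + {b}")

# The closed-form count matches the enumeration.
print(len(pairs) == pairwise_links(len(datasets)))

# Possible pairings grow much faster than the number of data sets.
for n in (1, 2, 10, 100):
    print(n, pairwise_links(n))
```

A single silo has nothing to combine with, while a hundred interconnected data sets allow thousands of pairings under this simple count, before even considering three-way or larger combinations.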