About the Author: Sean Gorman founded FortiusOne in 2005 to bring location-based analytics to the mass market. Sean brings over 10 years of experience at the forefront of the geospatial revolution as a researcher, practitioner, and entrepreneur at FortiusOne. Through both academic and entrepreneurial efforts he has been working to make geographic data more accessible to the public since 1997, culminating in the creation of GeoCommons, a crowd-sourced repository of statistical data and social feeds that can be easily mapped, remixed and reused by non-technical users. Sean has been featured in media such as Wired, Der Spiegel, ABC, Washington Post, Business 2.0, MSNBC, CBS and CNN. He also holds a PhD in Public Policy from George Mason University, where he was the Provost’s High Potential Scholar and the recipient of the Fischer Prize. He has published dozens of articles on geographic data sharing and analysis, and authored the book Networks, Complexity and Security: The Role of Public Policy in Critical Infrastructure Protection.


Dealing with Data Accuracy in the GeoWeb: The Day China Annexed Taiwan

One of the big gremlins in geography that often gets overlooked is data accuracy. On the surface, data accuracy and validation are pretty dull topics, but when data is wrong or controversial the repercussions can be huge. Google Earth has been a lightning rod for much of the controversy, especially when it comes to the definition of borders between countries. A border controversy in Chile over inaccurate data in Google Earth created a large media stir, as did a disagreement between India and Pakistan over the border in the Kashmir region. Interestingly, Google outsources the whole volatile issue of borders to Europa Technologies, who “supply the international borders, national borders, coastlines, airports and places data featured in Google Earth.” With 250 million downloads, it is hard for Google Earth not to be a lightning rod, and the truth is that there are data accuracy issues all across the GeoWeb and traditional GIS.

GIS has a long history of tackling issues of data accuracy and has created standards like the “National Standard for Spatial Data Accuracy” as part of official FGDC metadata. While these standards worked for dealing with data accuracy in traditional GIS, are they sufficient or appropriate for the GeoWeb? Further, are there capabilities emerging in the GeoWeb that could help make the entire world of geographic data more accurate? OpenStreetMap has made a strong case that “crowdsourcing” creates more accurate and timely maps (if not more complete) than traditional GIS approaches. Is there potential in using crowdsourcing to make all geographic data more accurate and timely? Is there a possible symbiotic relationship with existing metadata standards, one that combines the historical diligence of GIS metadata with the real-time responsiveness of crowdsourcing?

I think part of making this leap is recognizing that there is a need for it. Just because geographic data has official metadata and comes from an official source does not mean that it is accurate or current. Take Mikel Maron’s well-documented example of a New Orleans bridge that was destroyed by Hurricane Katrina but remained as a geographic entity in the official NAVTEQ and TeleAtlas datasets (as exposed through Google and Yahoo maps). We recently came across a similar example on GeoCommons. An open source data set of official country boundaries for China, released by a large GIS vendor for 2006, included Taiwan as part of China.

china_boundaries_2006

In the 2002 dataset, Taiwan was not included as a province of China:

china_boundaries_2002

Either it is a mistake or the data provider is following the claim by the People’s Republic of China that Taiwan belongs to China. Google has come under pressure for recognizing Taiwan as a province of China, so there is a case to be made.
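
A quick way to catch this kind of change is to diff the region names between two versions of a dataset before trusting the newer one. Here is a minimal Python sketch of that idea. The function name `diff_regions` and the short name lists are my own illustration, not the actual contents of either file, and a real check would read the names from the datasets' attribute tables.

```python
def diff_regions(old_names, new_names):
    """Return (added, removed) region names between two dataset versions."""
    old_set, new_set = set(old_names), set(new_names)
    added = sorted(new_set - old_set)      # present only in the newer version
    removed = sorted(old_set - new_set)    # present only in the older version
    return added, removed


# Illustrative subsets only (hypothetical), not the real 2002 and 2006 files.
regions_2002 = ["Fujian", "Guangdong", "Hainan"]
regions_2006 = ["Fujian", "Guangdong", "Hainan", "Taiwan"]

added, removed = diff_regions(regions_2002, regions_2006)
print("added:", added)
print("removed:", removed)
```

A diff like this only tells you that something changed between releases. Deciding whether the change is an error or a deliberate position still takes a person.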

Politics aside, data accuracy and timeliness are gremlins that are going to plague all parties in the geospatial space, be they GeoWeb, GIS, or increasingly a fusion of the two. My question is whether, in the fusion of the two, there are opportunities to help both sides out. OpenStreetMap has presented an approach for increasing the timeliness and relevance of geographic data through crowdsourcing, but the commercial data providers often quip about its spatial accuracy and completeness. That said, it is far from rare for me to use those commercial data sets and be sent down a road that does not exist or to a point of interest (POI) that is mislocated.

TomTom has been one of the more proactive vendors along these lines, allowing users to correct mistakes they find in directions and POIs. Their acquisition of TeleAtlas is a strong indicator of the direction this all takes, and is only further reinforced by Nokia’s acquisition of NAVTEQ. As GIS and the GeoWeb further converge, can the crowd help make the world of geospatial data richer, more accurate and more timely? OpenStreetMap and TomTom have shown it can work and reap huge benefits for streets and POIs, but what other varieties of data or analysis will work? What is next?
