Dealing with Data Accuracy in the GeoWeb: The Day China Annexed Taiwan
October 5th, 2007 by Sean Gorman
One of the big gremlins in geography that often gets overlooked is data accuracy. On the surface, data accuracy and validation are pretty dull topics, but when data is wrong or controversial the repercussions can be huge. Google Earth has been a lightning rod for much of the controversy, especially when it comes to the definition of borders between countries. A border controversy in Chile over inaccurate data in Google Earth created a large media stir, as did a disagreement between India and Pakistan over the border in the Kashmir region. Interestingly, Google outsources the whole volatile issue of borders to Europa Technologies, who “supply the international borders, national borders, coastlines, airports and places data featured in Google Earth.” With 250 million downloads, it is hard for Google Earth not to be a lightning rod, and the truth is that there are data accuracy issues all across the GeoWeb and traditional GIS.
GIS has a long history of tackling issues of data accuracy and has created a set of standards, like the “National Standard for Spatial Data Accuracy,” as part of official FGDC metadata. While these standards worked for dealing with data accuracy in traditional GIS, are they sufficient or appropriate for the GeoWeb? Further, are there capabilities emerging in the GeoWeb that could help make the entire world of geographic data more accurate? OpenStreetMap has made a strong case that “crowdsourcing” creates more accurate and timely maps (if not more complete ones) than traditional GIS approaches. Is there potential in using crowdsourcing to make all geographic data more accurate and timely? Is there a possible symbiotic relationship with existing metadata standards that would combine the historical diligence of GIS metadata with the real-time responsiveness of crowdsourcing?
I think part of making this leap is recognizing that there is a need for it. Just because geographic data has official metadata and comes from an official source does not mean that it is accurate or current. Take Mikel Maron’s well-documented example of a New Orleans bridge destroyed by Hurricane Katrina but remaining as a geographic entity in the official NAVTEQ and TeleAtlas datasets (as exposed through Google and Yahoo maps). We recently came across a similar example on GeoCommons: an open source data set of official country boundaries for China, released by a large GIS vendor for 2006, included Taiwan as part of China.
In the 2002 dataset, Taiwan was not included as a province of China.
Either the change is a mistake, or the data provider is following the claim by the People’s Republic of China that Taiwan belongs to China. Google has come under pressure for recognizing Taiwan as a province of China, so there is a case to be made.
Politics aside, data accuracy and timeliness are gremlins that are going to plague all parties in the geospatial space, be they GeoWeb, GIS, or increasingly a fusion of the two. My question is whether, in the fusion of the two, there are opportunities to help both sides out. OpenStreetMap has presented an approach for increasing the timeliness and relevance of geographic data through crowdsourcing, but the commercial data providers often quip about spatial accuracy and completeness. That said, it is far from rare for me to use those commercial data sets and be sent down a road that does not exist or to a point of interest (POI) that is mislocated.
TomTom has been one of the more proactive vendors along these lines, allowing users to correct mistakes they find in directions and POIs. Its acquisition of TeleAtlas is a strong indicator of the direction this is all taking, which is only further reinforced by Nokia’s acquisition of NAVTEQ. As GIS and the GeoWeb further converge, can the crowd help make the world of geospatial data richer, more accurate and more timely? OpenStreetMap and TomTom have shown it can work and reap huge benefits for streets and POIs, but what other varieties of data or analysis will work? What is next?