About the Author: Sean Gorman founded FortiusOne in 2005 to bring location-based analytics to the mass market. Sean brings over 10 years of experience at the forefront of the geospatial revolution as a researcher, practitioner, and entrepreneur at FortiusOne. Through both academic and entrepreneurial efforts he has been working to make geographic data more accessible to the public since 1997, culminating in the creation of GeoCommons, a crowd-sourced repository of statistical data and social feeds that can be easily mapped, remixed and reused by non-technical users. Sean has been featured in media such as Wired, Der Spiegel, ABC, the Washington Post, Business 2.0, MSNBC, CBS and CNN. He also holds a PhD in Public Policy from George Mason University, where he was the Provost’s High Potential Scholar and the recipient of the Fischer Prize. He has published dozens of articles on geographic data sharing and analysis, and authored the book Networks, Complexity and Security: The Role of Public Policy in Critical Infrastructure Protection.


A few weeks ago James Fee wrote a blog post about a debate he had with Steven Citron-Pousty concerning the usefulness of GeoCommons. At a high level the argument came down to the GIS vs. Neogeography debate. There were great quotes on both sides, like “freaking sweet” in support and “pretty worthless” in the bashing category. Overall we were excited to see that GeoCommons had started a debate in the GIS community. The original intent of GeoCommons had not been to provide a resource to the GIS community, but to provide access to GIS data and a few tools to the rest of the world. The GIS community always had access to the data and the tools, so I had figured GeoCommons would not even pop up on the radar. In hindsight I think we should be providing mutual resources to each other, so with that in mind here are a few thoughts on the topic.

Once I had some time to really read through the post and the various comments for and against GeoCommons, it seemed the core of the grumpiness was over metadata, or more accurately the lack of it. Metadata is an important part of the GIS community, with the goal of making data interoperable for projects like the EISs James Fee mentions. Interestingly, it can be argued that metadata standards arose out of criticism of GIS technology by geographers back in the day: “GISs are only tools, perhaps bad tools, because they are ‘simply big black boxes, slick, simple and utterly incapable of dealing with the sort of matters that are truly important’ (Curry 1994: 441-442).” Did metadata assuage Curry’s criticism? Not really, but it does sound awfully familiar to the criticism of web 2.0 mapping technologies we are hearing now, ironically from GIS people.

So what exactly is GIS metadata? I’d paste an example into this blog post, but it is 11 pages long, so I’ll just post the link to the metadata for airport locations in the USA from the USGS National Atlas. From a technical standpoint I understand why each bit of the information in the 11 pages is needed, or more accurately would be dictated by a standards body. While this is all well and good for some users, let’s face it: how many people are going to read it or, worse, fill it out before they share content? So the question in our mind is: what is the minimal amount of metadata needed to make data useful while still opening up the process to as many people as possible? Our current answer for the same data set can be seen here.

We believe in the “less is more” mantra, and thus far that has held true, at least in data standards. The simple and short specification of KML has resulted in over 8 million pieces of data on the web, while the thorough but very long specification for GML has resulted in roughly 800 pieces of data on the web (these are second-hand statistics from a conversation, so the numbers are likely not exact, but you get the idea). We originally wanted to use GML as the standard on our system, but there just was not a market for it on the open web. So now we are working on letting KML support feature attributes through the schema tag.
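To make the idea of typed attributes in KML concrete, here is a minimal Python sketch. It assumes the Schema, SimpleField, ExtendedData and SchemaData elements as they are defined in KML 2.2, which may not match the exact approach we are working on. The schema id, the field names and the sample airport record are hypothetical and chosen only for illustration, and the coordinates are approximate.

```python
import xml.etree.ElementTree as ET

# Assumption: the KML 2.2 namespace and its Schema/SchemaData elements.
KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)

# Hypothetical attribute fields for an airport layer: (name, KML type).
FIELDS = (("iata_code", "string"), ("state", "string"))


def q(tag):
    """Return the namespace-qualified tag name that ElementTree expects."""
    return f"{{{KML_NS}}}{tag}"


def build_airport_kml(airports):
    """Build a KML document whose placemarks carry typed attributes."""
    kml = ET.Element(q("kml"))
    doc = ET.SubElement(kml, q("Document"))

    # Declare the attribute fields once, at the document level.
    schema = ET.SubElement(
        doc, q("Schema"), {"name": "Airport", "id": "AirportSchema"}
    )
    for field_name, field_type in FIELDS:
        ET.SubElement(
            schema, q("SimpleField"), {"name": field_name, "type": field_type}
        )

    for airport in airports:
        placemark = ET.SubElement(doc, q("Placemark"))
        ET.SubElement(placemark, q("name")).text = airport["name"]

        # Attach the attribute values, pointing back at the schema by id.
        extended = ET.SubElement(placemark, q("ExtendedData"))
        schema_data = ET.SubElement(
            extended, q("SchemaData"), {"schemaUrl": "#AirportSchema"}
        )
        for field_name, _ in FIELDS:
            simple = ET.SubElement(
                schema_data, q("SimpleData"), {"name": field_name}
            )
            simple.text = str(airport[field_name])

        # KML coordinates are written as longitude,latitude.
        point = ET.SubElement(placemark, q("Point"))
        ET.SubElement(point, q("coordinates")).text = (
            f"{airport['lon']},{airport['lat']}"
        )

    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    sample = [{
        "name": "Chicago Midway Airport",
        "iata_code": "MDW",
        "state": "IL",
        "lon": -87.752,  # approximate
        "lat": 41.786,   # approximate
    }]
    print(build_airport_kml(sample))
```

The point of the sketch is that the attribute definitions live in one small block at the top of the file, so a reader or a parser can tell what each value means without a separate metadata document.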

In the Web 2.0 world metadata has largely taken the form of “tags” – simple keywords describing the referenced object. An increasing amount of data is being generated through web 2.0 technologies, and with free geocoding, scraping technologies, and geo conversion applications like G-Swivel, to name a few, there is only going to be more of it. There is a solid argument that this geographic data needs more than just tags, but do we need 11 pages of metadata? Let’s look at the two extremes: 1) KML search in Google and 2) GIS data search in the Geospatial One Stop (GOS). To keep it simple and somewhat fair we’ll stick with the “airport” example. So if we do a search in both for airport we get:

Google – KML search “airport”

GOS – search “airport”

Google pros – it is incredibly easy to get the data and immediately view it in Google Maps or Google Earth.

Google cons – the data is not particularly useful (three pushpins of Chicago Midway Airport) and you really do not know what the data is until you download it and view it in Google Earth.

GOS pros – produces useful search results (airports from the National Atlas, airports in Illinois, etc.) and good descriptions, varying in quality, of what the data is before you download it. That said, the value mostly comes from the abstract; reading the attribute definitions in the metadata is rather torturous.

GOS cons – the majority of query results only provide a summary and a list of metadata. The data itself is most often missing, and the few data sets that can be added to their web app usually crash it or time out.

So, the question I pose to both the GIS community and the neogeographers is: how much metadata is enough? Ideally geospatial data should not be any harder to find and consume than a local search. I would argue that the current KML search is not enough, and that what is required in traditional GIS is far too much for the vast majority of people. Our answer to date has been to leave it as open as possible within GeoCommons. If you want to paste all your traditional metadata into the description field, you are free to do so. This could be the wrong approach. Maybe it is better to make the application more welcoming to the GIS community by providing an option to link directly to the source metadata of what you are uploading? In the bigger picture outside of GeoCommons, we’ve been advocating the ability to add structured features to KML, so that more details and attributes about a data set can be specified. Between working on the next release to make the upload process easier and the KML work, we would love to get feedback on how much metadata is enough. If the problem is – not enough metadata – I think the community can solve it. If the problem is – the rest of the world should not be mucking around in geospatial data because you need to be a trained GIS professional to do so – we have a different conversation on our hands. I really believe there is a great opportunity to create a bridge between the GIS and web 2.0 worlds to everyone’s benefit, and hopefully we can start a dialog that moves towards that. If not, bring on the flames.
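To give the feedback something concrete to push against, here is a rough Python sketch of what a “just enough” metadata record could look like: a title, a free-text description, a few tags, and an optional link out to the full traditional metadata. This is not the GeoCommons data model; the class name, the field names and the checks are all assumptions made up for illustration, and the URL in the example is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MinimalMetadata:
    """Hypothetical minimal metadata record for an uploaded data set."""

    title: str
    description: str
    tags: List[str] = field(default_factory=list)
    # Optional pointer to full traditional metadata, if it exists anywhere.
    source_metadata_url: Optional[str] = None

    def problems(self) -> List[str]:
        """List what is missing; an empty list means the record is usable."""
        issues = []
        if not self.title.strip():
            issues.append("title is empty")
        if not self.description.strip():
            issues.append("description is empty")
        if not self.tags:
            issues.append("no tags")
        return issues


if __name__ == "__main__":
    record = MinimalMetadata(
        title="Airports of the United States",
        description="Point locations of airports, one row per airport.",
        tags=["airport", "transportation"],
        # Placeholder URL, not a real metadata record.
        source_metadata_url="https://example.org/airports-metadata.html",
    )
    print(record.problems())
```

The link field is deliberately optional: a casual user can skip it, while a GIS user can point at the full 11 pages without having to retype them.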


3 Responses to “How Much Metadata is Enough: Is there a GIS – Neogeography Middle Ground?”

  1. Bill Thorp Says:

    OSGeo came up with a DCLite4G spec (http://tinyurl.com/kfkyv). It’s hardly a standard, but it’s a decent fit for neo-geo. You could use the dc:source tag to link to some beefy metadata, if that info existed.

    Another option is the OGC_CORE minimum fields for an OGC Catalog Service. Core queryables: subject, title, abstract, anytext, format, identifier, modified, type, boundingbox, crs, and association (target source relation). Core returnables: title, creator, subject, description, publisher, contributor, date, type, format, identifier, source, language, relation, coverage, and rights. See section 6.3 of the spec (http://portal.opengeospatial.org/files/?artifact_id=20555).

    This second option might get you further into “don’t blame me” territory, especially, again, if you allow linking to some beefy metadata record using “source.”

  2. Bill Thorp Says:

    Wow. Where are all the other comments?

    Virtually no one wants the public to be creating ISO or FGDC metadata.
    On the other hand, we’d all like to think that if Joe FortiusOne wants to build something useful, we can offer some interoperability guidance rather than unqualified criticism.

  3. Sean Gorman Says:

    Good suggestions; we will have to check them out. A formal structure that would allow us to link to traditional metadata would be great. Hopefully we will get some more feedback and comments. It will be interesting to see if geoweb metadata standards come out of the standards bodies or out of the community, as with KML and GeoRSS.
