About the Author: Sean Gorman founded FortiusOne in 2005 to bring location-based analytics to the mass market. Sean brings over 10 years of experience at the forefront of the geospatial revolution as a researcher, practitioner, and entrepreneur at FortiusOne. Through both academic and entrepreneurial efforts he has been working to make geographic data more accessible to the public since 1997, culminating in the creation of GeoCommons – a crowd-sourced repository of statistical data and social feeds that can be easily mapped, remixed and reused by non-technical users. Sean has been featured in media such as Wired, Der Spiegel, ABC, Washington Post, Business 2.0, MSNBC, CBS and CNN. He also holds a PhD in Public Policy from George Mason University, where he was the Provost’s High Potential Scholar and the recipient of the Fischer Prize. He has published dozens of articles on geographic data sharing and analysis, and authored the book Networks, Complexity and Security: The Role of Public Policy in Critical Infrastructure Protection.


It has been great to see a dialog emerge on open data surrounding ESRI’s merging of Geodata.gov and Data.gov. I think Marten’s post title sums up my general impression – “confusion”.

Every person I’ve talked to and every post on the topic has a different version of the story. Further, the story changes as the community asks hard questions. It is all a bit murky, but I thought this was a good opportunity to take a step back and encapsulate what the community is looking for with open data. In this case I’m going to lump together the geo-community and the broader data community because the two are converging at a rapid pace.

I’d posit there are some very simple litmus tests any government agency can apply to its projects to determine whether it is truly opening data:

1) Is there a download button?

In order for any data to be open, you need to be able to download it so that you can remix, reuse and share it. Neither the data nor the government agency that supplies it is transparent if you can’t download the raw data. PDFs and web services don’t count. They can be useful additions to the raw data, but they are not replacements.

2) Data should always be linked to the derivative works created with it.

Take Geoplatform.gov – a nice mashup of data from Data.gov, but I have no way to get back to the data to repurpose it in a useful way other than a few links to PDFs – the scourge of open data.

[Image: geoplatform_no_download]

You’ve highlighted great open data, now let the community get to it easily. It is like running a great advertisement and then giving your consumer no easy way to buy the product. All government mashups, map and otherwise, should have a mandatory link back to the raw data for download.

3) Downloading should never be more than two clicks away (ideally one).

Whether it is Geodata.gov or any other government site, you should always be able to get to the raw data download quickly and clearly. It does not matter whether you are a GIS user or Joe Six-pack; finding the content you want should be a simple process. In the recent open data discussions folks have been holding up the features of Geodata.gov as the standard, but I think we can do better:

[Image: geodata.gov.workflow]

Five clicks (after you find the advanced search) and navigating a lot of text is a tough task even for a GIS professional.

4) When you get to the data it should be available in multiple formats and be easy to use (e.g. with data dictionaries).

Users come in a lot of flavors, whether they are geo or non-geo. They should have a selection of data formats to work with. There are lots of great file format transformation tools out there, both open source and licensed. While “easy to use” is subjective, let’s start by demonstrating how our current options are not easy:

[Image: geodata.gov.download]

Following the last example, I did a search for demographics on Geodata.gov and got a result for “200 county demographics”. I was given the option to download a PDF or a zip file. PDF is where data goes to die, so obviously I went for the zip file. I unzipped the file and found a .csv. Cool, but not really geodata. All the same, I opened it up to find lots of data but no attributes for the columns. Annoying, but no big deal; I’ll shoot back to the metadata – the essence of Geodata.gov. There is a list of attributes and definitions but no way at all to map those back to the spreadsheet, rendering both the spreadsheet and the metadata useless.
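To make the data dictionary point concrete, here is a minimal Python sketch of what a usable pairing could look like: a dictionary that lists one entry per column, in column order, so a headerless CSV can be labeled mechanically. The sample rows, the column names and the `label_rows` helper are all made up for illustration; they are not the actual Geodata.gov file or its metadata.

```python
import csv
import io

# Made-up sample standing in for a headerless CSV download (illustrative values only).
RAW_CSV = "00001,Example County,1000\n00002,Sample County,2500\n"

# Hypothetical data dictionary: one entry per column, listed in column order.
DATA_DICTIONARY = [
    {"name": "fips", "description": "County identifier code"},
    {"name": "county", "description": "County name"},
    {"name": "population", "description": "Total population"},
]


def label_rows(raw_csv, dictionary):
    """Attach the dictionary's column names to each row of a headerless CSV."""
    names = [entry["name"] for entry in dictionary]
    rows = []
    for line_no, values in enumerate(csv.reader(io.StringIO(raw_csv)), start=1):
        # A column-count mismatch means the dictionary and the file do not line up.
        if len(values) != len(names):
            raise ValueError(
                f"row {line_no}: expected {len(names)} columns, got {len(values)}"
            )
        rows.append(dict(zip(names, values)))
    return rows


rows = label_rows(RAW_CSV, DATA_DICTIONARY)
print(rows[0])
```

The only design choice that matters here is that the dictionary is ordered the same way as the columns (or keyed by column position). Without that, a list of attribute definitions cannot be joined back to the file.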

5) The data should be searchable and portable.

It was great to get the clarification that ESRI will be implementing the OGC CSW standard. A standard is always better than a proprietary format or approach. Folks may disagree, but I’ll go a step further and say a Web-aligned standard is most often better than a GIS standard. The CSW spec took us 14 days to implement and OpenSearch took a little over a day. The CSW spec is over 200 pages and the OpenSearch spec is a couple of pages. The good news is that the OGC is considering adopting OpenSearch (only 24 pages ;-). In our experience OpenSearch has allowed the community to create a large number of simple interfaces into GeoCommons – including ArcMap, Google Earth, Jive, Windows 7, Chrome, IE8 and Firefox.
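Part of why OpenSearch is quick to implement is that a client mostly needs to fill in a URL template. The sketch below shows that step in Python, assuming the OpenSearch 1.1 template syntax where `{name}` is a required parameter and `{name?}` is optional. The `example.org` template and the `fill_template` helper are hypothetical, and this is not the GeoCommons endpoint or implementation.

```python
import re
from urllib.parse import quote

# Hypothetical template; a real one is published in a site's OpenSearch description document.
TEMPLATE = "https://example.org/search?q={searchTerms}&page={startPage?}&format=atom"

# Matches {name} (required) and {name?} (optional) placeholders.
_PARAM = re.compile(r"\{(\w+)(\?)?\}")


def fill_template(template, params):
    """Substitute search parameters into an OpenSearch-style URL template."""

    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in params:
            # Percent-encode the value so spaces and symbols are URL-safe.
            return quote(str(params[name]), safe="")
        if optional:
            # Optional parameters that were not supplied become empty strings.
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return _PARAM.sub(substitute, template)


url = fill_template(TEMPLATE, {"searchTerms": "county demographics"})
print(url)
```

This simplified pattern only handles unprefixed parameter names; extension parameters with a namespace prefix would need a slightly wider regular expression.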

There are a lot of other great things that can be added on top of these open data basics, but they are not substitutes. Web services and APIs are all great things, but in my mind they come after basic data access. Creating sophisticated mashups and dashboards with data is also very useful, but if the data is not also directly available (a download button), they fail the litmus test.
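For anyone who wants to run these tests against a catalog of listings, the five checks can be written down as a small checklist. This is only a sketch: the `Listing` fields and the example values are hypothetical stand-ins for whatever a real audit would record, and treating "multiple formats" as "at least two non-PDF formats" is an assumption.

```python
from dataclasses import dataclass

# Formats that do not count as raw data under test 1.
NOT_RAW = {"pdf"}


@dataclass
class Listing:
    download_formats: list          # e.g. ["csv", "pdf"]
    clicks_to_download: int         # clicks from the listing to the file
    has_data_dictionary: bool
    links_back_to_raw_data: bool    # for mashups and other derivative works
    supports_standard_search: bool  # e.g. OpenSearch or CSW


def litmus_failures(listing):
    """Return the tests a listing fails, numbered as in the post."""
    raw = [f for f in listing.download_formats if f.lower() not in NOT_RAW]
    failures = []
    if not raw:
        failures.append("1: no raw data download")
    if not listing.links_back_to_raw_data:
        failures.append("2: no link back to the raw data")
    if listing.clicks_to_download > 2:
        failures.append("3: download is more than two clicks away")
    if len(raw) < 2 or not listing.has_data_dictionary:
        failures.append("4: needs multiple formats and a data dictionary")
    if not listing.supports_standard_search:
        failures.append("5: not searchable through a standard interface")
    return failures


# Hypothetical listing: a CSV plus a PDF, five clicks deep, no data dictionary.
example = Listing(["csv", "pdf"], 5, False, True, True)
failures = litmus_failures(example)
print(failures)
```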

There is a ton of potential in opening up government data for better transparency, greater innovation and sustained economic growth. In order to capture that potential, though, we must truly open data, and not let the idea be subverted or hijacked by special interests.

Lastly, after revisiting Geodata.gov today and trying to actually work with it, I can only come to one conclusion: if we truly care about opening up government geo-data, rebuild the whole thing from the ground up. I’m sure it was great in 2004, but the Web and GIS have moved on.


5 Responses to “An Open Data Litmus Test: Is There a Download Button”

  1. James Fee GIS Blog » Blog Archive » Shapes, Intellectual Property Rights and “Where’s My Desktop App?” Says:

    [...] in technology and open standards, I am surprised that the items Sean Gorman discusses in this post are still an issue with respect to federal government [...]

  2. short links: open data, candian census, and merchants of doubt | Resilience Science Says:

    [...] An Open Data Litmus Test: Is There a Download Button from Off the Map In order for any data to be open you need to be able to download the data so that you can remix, [...]

  3. L’opendata dans tous ses états – Juillet II « Says:

    [...] An Open Data Litmus Test: Is There a Download Button [...]

  4. UNdata and GeoJoin | Off the Map - Official Blog of FortiusOne Says:

    [...] Offices of Member Countries to strengthen their data dissemination capabilities.” Clearly meeting “An Open Data Litmus Test” there is a download button. As soon as I discovered UNdata I began searching for data and using it [...]

  5. Dave Smith Says:

    * Download button, great first step. But data changes, and to have to manually download and remix the data with every change is a PIA. So, beyond (but not in place of) the download button, let’s see APIs and feeds to take the drudgery out of keeping apps up to date, along with removing the question of whether the app is even reflecting the most current circumstance.

    * Metadata – and is it actually current? What is the data actually showing, is the data complete, what time span does it cover, what does it really mean? And – if you are downloading and “remixing” it, whaddya mean by “remixing it” – you are now putting up derivative works, and you rightly should also be providing metadata that describes what you started with and what you did to remix it.

    * GeoPlatform.gov – Should be able to get the metadata and data easily – maybe via hotlink to a metadata/description/download page right in its layer tree.
