ETech Day Three – Elephants, Fire Eagles and Disaster Tech
March 5th, 2008 by Sean Gorman
I got a bit wrapped up trying to get a side project finished yesterday, so I’ll just skip to day three of ETech. The morning’s opening speakers were better than Day Two’s, although the sessions thus far have been a bit below Day Two’s. We kicked off the morning with an abbreviated talk by John McCarthy (father of LISP) on a new language he’s been working on for several years called Elephant. The name comes from the fact that it never forgets, and the broad concept is a semantic programming language that can create structured relationships from natural language. Unfortunately he ran out of time before he really got into the guts of it, but there were some fascinating concepts around how natural language can be leveraged in a structured way to do computation. Definitely something worth looking into more, and it reminded me a lot of our thoughts about a context-driven architecture and natural language for data, although we were looking to turn quantitative data into natural language rather than natural language into data.
Following McCarthy’s talk there were some interesting bits on open source personal robots, then an informal launch of Yahoo’s Fire Eagle. Fire Eagle has taken some flak in the blogs for having minimal or “zero” functionality. I think this misses the point of what Fire Eagle is intended to do. My impression was that Fire Eagle is not meant to be a standalone consumer application but a straightforward tool that does a simple thing very well: providing a platform for sharing your location online. The functionality folks are clamoring for is left to the users and developers, and I think there are a good number of fun possibilities here. For instance, with GeoCommons we have a big pile o’ data, and it would be very useful to personalize the delivery of that data to a user’s location, or to let users comment on that data from their location and have the comment geo-located. This creates a dependency on clever users, but from what I’ve heard at ETech there seem to be a good number of clever ideas floating around.
The last session of the day I attended was Mikel and Jesse’s presentation on “Disaster Tech”. I’d seen Mikel’s presentation at the State of the Map conference on open source disaster technology, and it was cool to see how the project has evolved. The whole topic is close to us, especially after getting up close with disaster response following the London bombings and Hurricane Katrina. The presentation had some great examples of OpenStreetMap, Twitter and Google Maps being used in creative ways during disasters. Mikel gave a nice example of combining the USGS GeoRSS earthquake feed, the EU lightweight tsunami propagation model and a feed that republishes the resulting polygons as GeoRSS. With this approach they can churn out a polygon warning area in under a minute. A similar concept is seen in the United Nations GDAC application.
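To make the feed-in, polygon-out idea a bit more concrete, here is a rough Python sketch of the plumbing. It is not Mikel’s code and it does not implement the EU tsunami model: the sample feed is made up, the function names are mine, and a simple fixed-radius circle stands in for the propagation model. The only parts I’d vouch for are the GeoRSS-Simple conventions themselves (a georss:point is “lat lon”, and a georss:polygon is a space-separated list of lat/lon pairs whose last point repeats the first).

```python
import math
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
GEORSS = "http://www.georss.org/georss"

# Hypothetical sample entry in GeoRSS-Simple form (not real USGS output).
SAMPLE_FEED = f"""<feed xmlns="{ATOM}" xmlns:georss="{GEORSS}">
  <entry>
    <title>M 7.1, example offshore event</title>
    <georss:point>-4.5 101.4</georss:point>
  </entry>
</feed>"""


def read_epicenters(feed_xml):
    """Return (title, lat, lon) for every entry that carries a georss:point."""
    root = ET.fromstring(feed_xml)
    events = []
    for entry in root.iter(f"{{{ATOM}}}entry"):
        point = entry.find(f"{{{GEORSS}}}point")
        if point is None or not point.text:
            continue
        lat, lon = (float(v) for v in point.text.split())
        title = entry.findtext(f"{{{ATOM}}}title", default="")
        events.append((title, lat, lon))
    return events


def circle_polygon(lat, lon, radius_km, vertices=12):
    """Approximate a circle around (lat, lon) as a closed ring of (lat, lon).

    Placeholder for a real propagation model. Uses a flat-earth
    approximation, so it is not valid near the poles or the antimeridian.
    """
    dlat = radius_km / 111.32  # roughly 111.32 km per degree of latitude
    dlon = dlat / max(math.cos(math.radians(lat)), 1e-6)
    ring = []
    for i in range(vertices):
        angle = 2 * math.pi * i / vertices
        ring.append((lat + dlat * math.sin(angle), lon + dlon * math.cos(angle)))
    ring.append(ring[0])  # GeoRSS polygons repeat the first point to close
    return ring


def to_georss_entry(title, ring):
    """Serialize a warning area as an Atom entry with a georss:polygon."""
    ET.register_namespace("", ATOM)
    ET.register_namespace("georss", GEORSS)
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}title").text = f"Warning area: {title}"
    coords = " ".join(f"{lat:.4f} {lon:.4f}" for lat, lon in ring)
    ET.SubElement(entry, f"{{{GEORSS}}}polygon").text = coords
    return ET.tostring(entry, encoding="unicode")


for title, lat, lon in read_epicenters(SAMPLE_FEED):
    print(to_georss_entry(title, circle_polygon(lat, lon, radius_km=200)))
```

Swapping circle_polygon for a call into an actual model is the interesting part; everything else is just reading one feed and writing another, which is why the turnaround can be so quick.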
All great stuff for ad hoc implementation that is cost effective and not over-engineered. Lots of good discussion of how to take the information produced by technology and effectively transmit it to non-technical or completely unconnected people. Jesse and Mikel also had a nice bit at the end of the presentation on anti-patterns, i.e. what happens when you don’t have a champion for the technology to create repeatable and successful implementations. Their specific case was the search for Steve Fossett, where the crowdsourced help to find him actually slowed down the search and rescue teams, who had to deal with all the input. That resulted in the emergence of champions like InternetSAR, which creates a structure that can be replicated and made effective for search and rescue. Lots of good thought on an important topic.
GeoServer Map Collaboration Tools – “NYC Street Maps”
December 6th, 2007 by Sean Gorman
We’ve been doing work recently integrating GeoServer with GeoCommons to provide more hooks and capabilities for our platform. I was catching up on the GeoServer blog and saw a new demo they had put up to show off the map annotation tools they have in development.
The map only has a base street map for NYC, but the annotation features and presentation are quite nice. You can add annotations and pictures to the map, and it all works very smoothly. The ability to create annotations and layer them on top of structured data like crime rates or toxic release points is very compelling. Users can then not only see where a statistical phenomenon is happening but also comment on it, adding confirmation or criticism. For instance, you could add a photo of dead fish in green bubbling ooze at a toxic release point.
We had some fun with the concept about a year ago after a trip to NYC, mapping the location of bars and single women and then testing out the hot spots. Less altruistic than the example above, but it again demonstrates the value of adding qualitative comments to quantitative data. For fun I added the heat map we made of the bars and singles to the GeoServer demo. If you go to the Lower East Side, it is the yellow marker on 6th St.
Look forward to seeing if we can make use of the new GeoServer collaboration tools and props to them for all the good work.
Data Should be the Intel “Outside”: The Power of Data Network Effects
October 19th, 2007 by Sean Gorman
The folks at Pushpin posted a great comment on our last blog entry on “free public data”. I thought there was enough interesting content to expand on the comment thread with another blog post. The Pushpin team did a great job providing far more nuanced thoughts on the issues of “for fee” data. At the end of the day my issue is truly with governments for not providing the data in easy-to-use formats, or even in open, standard, non-proprietary formats. In an open market anyone is free to take that government-supplied data, make it easy to use, and charge a price the market is willing to pay. In addition to making the data easy to use, many vendors also add a layer of quality assurance and, many times, value-added data derivatives like forecasts.
There are many instances where vendor-supplied data is truly value added and worth the money an end user pays, but there are also situations where it is not and there is a better alternative. Take for instance the 2000 Census data ESRI provides to Pushpin to resell: the added work there is taking the boundary files provided by Census and joining them to the data tables provided by the Census. I’ll be the first to admit it is tedious to do all the database joins, and it requires having pricey GIS software, but in my opinion the ratio of value add to price is way out of whack.
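For anyone who hasn’t done one of these joins, the core of it is just matching a shared identifier between the boundary records and the attribute table. Here is a toy Python sketch of that step. Everything in it is made up for illustration: the “GEOID” key name, the IDs, the shapes and the numbers are placeholders, not real Census values, and a real workflow would read shapefiles and much larger tables.

```python
import csv
import io

# Hypothetical boundary records; "GEOID" is an assumed key name and the
# rings are placeholder (lat, lon) coordinates, not real Census geometry.
BOUNDARIES = [
    {"GEOID": "00001", "ring": [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.0, 0.0)]},
    {"GEOID": "00002", "ring": [(1.0, 1.0), (1.0, 2.0), (2.0, 2.0), (1.0, 1.0)]},
    {"GEOID": "00003", "ring": [(2.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 2.0)]},
]

# Hypothetical attribute table with invented values.
TABLE_CSV = """GEOID,population,median_age
00001,1200,34.5
00002,860,41.0
00004,15,29.9
"""


def join_attributes(boundaries, table_csv, key="GEOID"):
    """Attach table columns to each boundary that shares the same key.

    Returns (joined, unmatched), where unmatched lists the keys of
    boundaries that had no row in the table.
    """
    rows = {row[key]: row for row in csv.DictReader(io.StringIO(table_csv))}
    joined, unmatched = [], []
    for feature in boundaries:
        attrs = rows.get(feature[key])
        if attrs is None:
            unmatched.append(feature[key])
            continue
        joined.append({**feature, **attrs})
    return joined, unmatched


joined, unmatched = join_attributes(BOUNDARIES, TABLE_CSV)
for feature in joined:
    print(feature["GEOID"], feature["population"], len(feature["ring"]), "points")
print("boundaries with no table row:", unmatched)
```

The tedium in practice comes from doing this across many tables and geographies and chasing down the unmatched keys, not from the join itself.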
That is the philosophical difference with GeoCommons. If you have a community of people willing to put in that little bit of work to extract the data from places like Census and share it with the community, you get a network effect. Since the data goes in under Creative Commons, anyone can take that data and combine it with their own data or anyone else’s contributed data, allowing any user to make something new and innovative with the collective data. Any time you work to create a dataset or database there is value created and work done. Every member of OpenStreetMap GPS-tracing roads has put in solid sweat equity, but they choose to contribute that to the community because the collective value of that data is far greater than its value alone.
In the end I believe this helps the data vendors because there is more data the market can mash up with the vendor data (vendors benefit from the network effect too). There is also a larger market of people who realize the value of the data because the barrier to entry to experience it has been removed. That said, I believe it also means the data providers are really going to have to add true value and not just do a few database joins. The real value comes in the technology and not the raw data itself. The data is what enables the technology to be more valuable.
Tim O’Reilly states that one of the key value drivers for Web 2.0 is “Data is the Intel Inside”. Specifically, O’Reilly cites NAVTEQ’s proprietary database of streets as a big value driver for many GeoWeb applications. I agree that databases (i.e. SQL is the new HTML) are creating new value propositions, but now the value is in having data on the “outside”, not the “inside”. The walled proprietary gardens of “inside” data are being trumped by open source “outside” data that allows a network effect to be created. With data on the “outside” not only can new combinations (data mashups) be created, but the data itself can adapt (like OpenStreetMap and TomTom). In response to Brady’s post on the Nokia acquisition of NAVTEQ, O’Reilly comments, “the real question is going to be whether there’s a web 2.0 answer (i.e. a user-generated content) answer to the expensive data development and curation currently employed by Navteq.” I think the answer is a resounding yes, and as standards like KML 3.0 progress and technologies evolve around them, the power given to users to contribute meaningful data and context is only going to increase. The real value is in the technology that allows the data to be delivered, mashed up, and interconnected.
Crowdsourcing to Create Resilience: Why Security through Obscurity will Never Work
October 15th, 2007 by Sean Gorman
NPR ran a story on Monday’s Morning Edition entitled “Security Officials Seek to Block Some Online Maps”. The story centered on local government officials refusing to release electronic maps of what they call “critical infrastructure,” such as water mains and fire hydrants, and specifically on Steven Whitaker’s futile quest to obtain infrastructure data from the Greenwich, CT local GIS repository. As part of the story NPR came by to ask my opinion on the matter because of our history of raising security concerns with research built on open source data.
The story has a nice quote of me saying it is an impossible task to try to control all the geodata out there and who has access to it. The part that did not air is that no one even knows what data is and is not accessible to the public. While we do have a good index and census of most of the web pages that exist, we have much less understanding of the databases, including geospatial databases, connected to the Web (often called the Deep Web). The indexes run by Google and others do a great job finding web pages, but databases are a different game. A Cal Berkeley study by Bergman found that, “the deep web consists of about 91,000 terabytes. By contrast, the surface web, which is easily reached by search engines, is only about 167 terabytes.” While it is uncertain how much of this data is geospatial in nature, it is fair to assume it is a considerable amount of data that we largely have little clue about. Oftentimes government agencies do not even realize what data they have online and available to the public, and we definitely do not have a comprehensive way to understand the entire universe of geospatial data. What raised so much alarm with our original research was the authorities realizing that the data was available open source. Everyone clamored for the work to be classified, but the source data is all still out there, hidden in myriad local, state, federal and NGO data repositories. This raises the question: how are we going to control a world of data that we have so little comprehension of?
In order to move towards greater security I believe we actually need to open up more, so that the entirety of geospatial data can be indexed. We will have no true idea as to what geospatial data available to the public is potentially dangerous until we know what is out there. The move towards making KML an OGC standard is a great first step towards a standard geospatial data format for the Web, although KML natively is geared towards providing a geographic framework for text, HTML, pictures etc., and not structured information like databases. We’ve been working on changing that by ensuring a mechanism exists to include feature attribute data in the schema tag of KML. Some of this work has carried over into KML 2.2 as “extended data”.
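As a small illustration of what attribute data inside KML can look like, here is a Python sketch that writes a single Placemark carrying name/value pairs in an ExtendedData block, using the untyped Data and value elements from KML 2.2. This shows only that one form, not the schema-based work described above, and the function name, placemark name, attribute names and coordinates are all invented for the example.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_with_attributes(name, lon, lat, attributes):
    """Build a KML document with one Placemark and its attribute data.

    attributes is a plain dict; each item becomes
    <Data name="key"><value>...</value></Data> inside <ExtendedData>.
    """
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name

    # ExtendedData comes before the geometry inside a Placemark.
    extended = ET.SubElement(placemark, f"{{{KML_NS}}}ExtendedData")
    for key, value in attributes.items():
        data = ET.SubElement(extended, f"{{{KML_NS}}}Data", name=key)
        ET.SubElement(data, f"{{{KML_NS}}}value").text = str(value)

    # KML coordinates are written longitude first, then latitude.
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Hypothetical feature with invented attribute values.
print(placemark_with_attributes(
    "Example site", -77.1, 38.9,
    {"incident_count": 3, "last_reported": "2007-10-01"},
))
```

Once attributes travel with the geometry like this, a crawler that understands KML can index what a dataset says, not just where its features are.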
Once you begin to index the geospatial data out there you are in a much better position to have a logical debate about what data is a security threat and what data contributes to the public good. For instance, you may want to know where there have been hazardous pipeline accidents, but not divulge where critical pipeline routing junctures are. By opening up geospatial data, not only do we have a foundation to better ensure dangerous data stays out of the hands of bad guys, but we also have the positive externality of a whole wealth of data being made available to the public to solve a wide range of problems.






