Data Dissemination to the Government of Haiti
February 5th, 2010 by andrew
Historically, these tools have been useful for the public and media to cover an event, but the question often arose whether they were beneficial to responders and citizens on the ground. Tools such as OpenStreetMap have had a clear and positive impact on the response and recovery efforts. Organizations from the World Bank, to MapAction, to the Fairfax Search and Rescue teams have all been using OSM as a primary source of maps and routing. OSM volunteers have even been performing rough damage assessment and IDP camp identification.
While these tools have been useful for responders, there was still a disconnect: the data was not readily available to the Haitian Government to access and assess the situation. The best data available before the earthquake is primarily from 2002, and at the same time most ministries are either non-existent or severely understaffed.
Bringing the Commons
In response, at CrisisCamp DC on Saturday the World Bank initiated an effort to provide the massive amounts of data and tools directly to the government. The goal was to put the following on portable hard drives and onto the desks of the Haitian ministers: a browser like the Haiti Crisis Map, with the imagery and OpenStreetMap roads; data gathering and visualization tools like GeoCommons; public and quality-checked sources of hospital locations, camps, and damage assessment; and the World Bank's own flyover imagery.
A major hurdle, however, was that there were no local copies of the raw imagery data. Thanks to the incredibly hard work of a number of organizations, especially Georgetown, SDSU and Internet2, we were able to move the 1.25 TB of raw data from the Hypercube server in San Diego to Georgetown's servers here in DC in a little over a day. The data was then loaded onto a series of six self-powered USB hard drives. We chose self-powered drives so that they wouldn't depend on external plugs or fluctuating power sources.
In addition, we built an offline version of the Haiti Browser that runs in a browser straight off a hard drive or USB stick: you just double-click an HTML file. You can get the code from here to see how it's done. We also included Delta State's MGRS Atlases for printing map books, and the GPS map images that can be used on Garmin handheld units.
There will also be several engineers deploying with the World Bank to assist in the dissemination of the data, working with the Haitian Ministries and President on using these tools and handling requests for more analysis and data with CNIGS (Centre National de l’Information Géo-Spatiale), Haiti's GIS department.
Enabling Self-Sufficient Government
The goal is to give the Haitian government its own capability to use the available tools for situational awareness and decision making. Beyond the immediate response and recovery efforts, the government is now able to use these tools for long-term reconstruction and infrastructure. The capabilities and tools should be sustainable, and the goal is to close the data loop both within the government and between external data collection and Haitian data needs and collection. We are moving beyond the traditional problem of leaving disaster-stricken governments without their own data or the ability to operate independently of temporary organizations. The entire project is a model of how crowd-sourced data and tool development can have a beneficial impact both on remote support and directly on local citizens and government.
In the US and Europe, we're seeing a growing embrace of Open Government and transparency, with governments sharing data with citizens for engagement and collaboration. In Haiti, the community is performing a reverse-Gov2.0: they're providing data and information to the government so that it can respond and rebuild its society.
A tremendous thank you to the numerous individuals and groups that helped and provided tools or data: World Bank, San Diego State University / Calit2, Internet2, Georgetown University, DigitalGlobe, Delta State University, Sahana, Crisis Mappers, OpenStreetMap, NOAA, Ushahidi, DevelopmentSeed, TelaScience, STAR-TIDES, CrisisCommons, USAID, GeoCommons, OpenSGI, GeoEye.
Dataset of the Day: Holiday Travel Delays
December 23rd, 2009 by margot
Why Geocoding Should be a Commodity
July 13th, 2009 by Sean Gorman
Arguably the largest positive externality to the Web ecosystem that geospatial technologies can provide is creating more linked, geo-enabled data. The beauty is that the externalities work both ways. Not only does the Web get more useful content, we also create more reasons for the public to use geospatial tools and software. Without the ability to georeference data, none of our collective mapping brilliance is terribly useful. Yet we put all sorts of obstacles in the way of the most basic geo-enabling capability, namely geocoding. We treat geocoding as a precious resource that needs to be metered and monetized. In short, we put a stranglehold on the lifeblood of our business: geo-enabled data. Without geo-enabled content our relevance to the larger Web diminishes immensely.
The major providers all put restrictions around geocoding, making it especially difficult to run batch geocoding operations that geo-enable large chunks of data. Google, Yahoo and Microsoft’s geocoders are all geared to single address lookups, not mass data geo-enablement. There are services like batchgeocode.com that get around some of the limitations but are still restricted by the providers’ TOS.
The second big issue with current geocoding is further upstream. All the geocoding APIs depend on street data from NAVTEQ, Tele Atlas and a few other providers to geocode against. So, if the street data companies don’t think a country has a big enough market, you can’t geocode in that country. This especially limits the ability to geocode data in developing countries.
Our thought is that the best solution to this problem is an open source geocoder. There have been other open source geocoder projects, some of which have been criticized as bad business decisions.
We’ve taken a slightly different approach. First, we enlisted the brilliant help of Schuyler to evolve his work from Geocoder.us, to best take advantage of the work and community that already exist. Second, we decided to make the Geocoder street data neutral, meaning that you can plug whatever street data source you want into the geocoder and have it work, sometimes with a bit of tweaking. In the first go we’ve set up the geocoder to work with TIGER data and NAVTEQ. We chose these two mainly because they both use all CAPS for their names.
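To make "street data neutral" a bit more concrete, here is a rough sketch of the idea. It is not the actual geocoder code: the names `Segment`, `StreetSource`, `InMemorySource` and `geocode` are all hypothetical, and the linear address-range interpolation is just a simple stand-in technique chosen for illustration. The point is only that the lookup logic talks to an interface, so a TIGER-backed or NAVTEQ-backed source could be swapped in behind it.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Protocol, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


@dataclass(frozen=True)
class Segment:
    """One stretch of street with an address range (hypothetical schema)."""
    street: str
    from_number: int   # house number at the start point
    to_number: int     # house number at the end point
    start: Point
    end: Point


class StreetSource(Protocol):
    """Anything that can return segments for a street name.

    A TIGER or NAVTEQ adapter would implement this same method;
    the geocoder itself never needs to know which one it is using.
    """
    def segments(self, street: str) -> Iterable[Segment]: ...


class InMemorySource:
    """Toy data source used here in place of a real street database."""
    def __init__(self, segments: Iterable[Segment]) -> None:
        self._segments: List[Segment] = list(segments)

    def segments(self, street: str) -> Iterable[Segment]:
        wanted = street.strip().upper()
        return [s for s in self._segments if s.street.upper() == wanted]


def geocode(source: StreetSource, number: int, street: str) -> Optional[Point]:
    """Interpolate a house number along the first matching segment."""
    for seg in source.segments(street):
        low, high = sorted((seg.from_number, seg.to_number))
        if low <= number <= high:
            span = seg.to_number - seg.from_number
            # Fraction of the way along the segment (0.0 if it is a single number).
            t = 0.0 if span == 0 else (number - seg.from_number) / span
            lon = seg.start[0] + t * (seg.end[0] - seg.start[0])
            lat = seg.start[1] + t * (seg.end[1] - seg.start[1])
            return (lon, lat)
    return None  # no segment covers this address
```

With a made-up segment such as `Segment("MAIN ST", 100, 200, (0.0, 0.0), (1.0, 2.0))` loaded into an `InMemorySource`, calling `geocode(source, 150, "Main St")` lands halfway along the segment, and an out-of-range number returns `None`.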
The hope is that with the community’s help we can extend the geocoder to work with a large number of other data sources. As Andrew mentioned in his post, OpenStreetMap is top of the list. Integrating OSM data will be key to enabling geocoding in developing countries and other areas overlooked by current commercial providers. I think this is one of many areas where the OSM community is really going to show its power. While the geocoder is currently only accessible to developers through github, stay tuned because we’ll be exposing it as a web application in GeoCommons shortly. We want everyone to be able to geo-enable their data and access it in whatever format meets their needs. Data wants to be free and we all win when the gates are unlocked.
Dataset of the Day: Hockey, Getting Fans in the Seats
April 24th, 2009 by Kevin Burke
The 2008-2009 NHL season has been a thrilling one, and it continues to be with the start of the playoffs. The game’s popularity has been growing, and attendance figures have risen as a direct result. The total NHL attendance record was broken this year for the fourth consecutive year. This news made me want to take a closer look at the data.
I first went to espn.com and looked at attendance figures from the 2008-2009 season. After looking over the stats I saw that some teams had regular sellouts and other teams struggled to fill the seats. The map below shows the percentage of seats that were filled throughout the season for each team. (click on the map for a larger view)
Why did some teams sell out every game while others showed poor attendance? I decided to investigate by using Finder! and Maker! to run correlations to determine why a team could or could not get fans in the arena.
The first thing I wanted to correlate was a team’s finishing place in the league and its attendance capacity percentage for the season. This is because a common theme in sports is that fans only go to watch a team if that team is winning. I mean, who wants to go see the last-place team in the league play?
The correlation shows some interesting results. It appears that your team’s finishing place does not always affect the number of fans you put in the seats. The correlation between the two factors was only .48 (high correlations are values close to 1 or -1). For example, the Ottawa Senators were able to fill 105% of their seats during the year, yet they finished 22nd out of thirty teams in the league. Also, the Carolina Hurricanes, who finished 11th out of thirty teams, only filled 88.5% of their seats (rated 10th worst in the league).
Next I ran some other correlations to see whether any other factors helped get people into the seats. Below is what I tried.
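For anyone who wants to repeat this kind of check outside of Maker!, here is a minimal sketch. It assumes the figure being reported is the standard Pearson correlation coefficient (the post does not say which statistic Maker! uses), and the function name `pearson` is just an illustrative choice.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series.

    Returns a value between -1 and 1; values near either end mean a
    strong linear relationship, values near 0 mean a weak one.
    """
    if len(xs) != len(ys):
        raise ValueError("both series must have the same length")
    if len(xs) < 2:
        raise ValueError("need at least two data points")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

To use it, pass one list with each team's finishing place and a second list, in the same team order, with each team's percentage of seats filled. Note that if 1st place is coded as the number 1, a "winning teams draw more fans" relationship would show up as a negative value, so it is the size of the number rather than its sign that indicates strength.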
- Number of Consecutive Playoff or Non-Playoff Seasons (shows if a team has been continuously successful or unsuccessful)
- Unemployment % for February 2009 (If you’re broke and without a job, you probably won’t be spending your money to go to a hockey game)
- Average Temperature During Hockey Season (Hockey is a sport that is heavily followed in colder climates)
None of the correlations fared much better. Surprisingly, Average Temperature During Hockey Season was the closest (-.59). This led me to the conclusion that it is a combination of different factors that determines whether a team is able to get people in the seats for its games. So I took several factors, gave them specific values, and combined these to come up with “The Kev Score”. I am hoping that “The Kev Score” will show how certain factors combined will determine whether an NHL team will achieve its maximum attendance capacity.
Here is how I computed “The Kev Score”:
Factors:
- Finishing Place (if in 1st place = 30 points, 2nd = 29 points, and so on)
- Temperature (Coldest City = 30 points, 2nd Coldest City = 29 points, and so on)
- Canada Factor (if a Canadian team you get 15 points added to your score)
- USA Hockey IQ Factor (whether a USA city is known as a hockey town)
o Good IQ (10 points added)
o Poor IQ (No points)
- City Population (Highest City Population = 30 points, 2nd Highest City Population = 29 points, and so on)
The Formula:
Finishing Place Points + Temperature Points + Canada Factor + Good USA Hockey IQ Factor + City Population Points = “The Kev Score”
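The formula above can be sketched in a few lines of code. This is only an illustration: the `Team` fields and function names are made up for the example, the ranks are assumed to run from 1 to 30 (one per team), and the Hockey IQ bonus is assumed to apply only to USA teams, which is how the factor list reads.

```python
from dataclasses import dataclass


@dataclass
class Team:
    """Inputs for one team (field names are illustrative assumptions)."""
    name: str
    finishing_place: int   # 1 = first place in the league
    cold_rank: int         # 1 = coldest city
    population_rank: int   # 1 = highest city population
    canadian: bool
    good_hockey_iq: bool   # only meaningful for USA cities


def rank_points(rank: int, teams: int = 30) -> int:
    """1st = 30 points, 2nd = 29 points, and so on down to 30th = 1 point."""
    if not 1 <= rank <= teams:
        raise ValueError(f"rank must be between 1 and {teams}")
    return teams + 1 - rank


def kev_score(team: Team) -> int:
    """Finishing Place + Temperature + Canada Factor + USA Hockey IQ + Population."""
    score = (
        rank_points(team.finishing_place)
        + rank_points(team.cold_rank)
        + rank_points(team.population_rank)
    )
    if team.canadian:
        score += 15          # Canada Factor
    elif team.good_hockey_iq:
        score += 10          # Good USA Hockey IQ (Poor IQ adds nothing)
    return score
```

For example, a made-up USA team with good Hockey IQ that finished 11th, sits in the 25th coldest city and has the 20th largest population would get 20 + 6 + 11 + 10 = 47 points.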
The correlation between the Arena Full Capacity Percentage and the “Kev Score” is reasonably high at .81. So is the “Kev Score” a reliable way to predict how to get fans in the seats? I decided to use the formula again, but to test it with statistics from the 2007-2008 season. Here is what happened.
At a much lower correlation of .60, it seems that the “Kev Score” does not prove itself to be a strong indicator of fan attendance for the 07-08 season.
Was “The Kev Score” a reliable way to judge whether a team would or would not have strong attendance? Well, not really, but it worked better than all the other things I tried. See if you are able to discover your own “Kev Score” and help hockey team owners around the NHL discover how to bring more fans to their games.














