A Quick Test Drive of Google Table Fusion
June 12th, 2009 by Sean Gorman
Google Labs launched a cool new app yesterday called Google Table Fusion, although the name is a bit confusing since there is also the Google Earth Fusion Server. Anyway, it is a great step forward into the wide world of structured data for Google. This seems to be becoming a trend, following the launch of Google Public Data a little while back. The future of structured data on the Web is probably a whole discussion in itself, as is what Google’s entrance means, but I’ll keep to just the new app for the sake of brevity.
Google Table allows users to upload a .csv or .xls file of up to 100 MB and then filter, merge and visualize the data. Each of the basic options has lots of nice functionality. Filter lets you apply basic operators such as > (“greater than”), < (“less than”), >= (“greater than or equal to”) and <= (“less than or equal to”) against any single numerical attribute in the table. You can also merge two tables together, either in total or by attribute. Lastly, Google Table provides a variety of data visualization options: map, intensity map (i.e. thematic map), line graph, bar chart, pie chart, scatter plot, motion chart (Gapminder's stuff), and timeline.
With that as a high-level overview, I'll dig into the geographic capabilities a bit, since we tend to like maps. On the plus side, this is the first thing I've seen from Google that directly links structured data with maps. Google Charts has had thematic maps for a while, and Google Table leverages them for its thematic mapping. Google Table also has some georeferencing capabilities that auto-recognize geographic names and georeference them on the fly.
As MapperZ points out, Google Table has "fast batch geocoding" capabilities, except that you can only look at the geocoded results 200 at a time and cannot get the georeferenced data out of the system. It appears that Google is georeferencing a set of place names and addresses; from what I can gather this includes country names and city names (domestic and international), in addition to traditional street address geocoding. The georeferencing and geocoding appear to work only in Google Maps unless the data is country names. If you have country names, there is an option to visualize the data in Google Charts. This can cause some issues. For example, look at this state GDP data I uploaded. The first pic is data geocoded by clicking on the "Map" option:
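To make the filter and merge options concrete, here is a minimal sketch of the same two operations in plain Python. It is not how Google Table works internally; the sample tables, column names and values are made up for illustration, and "merge by attribute" is assumed here to mean an inner join on a shared column.

```python
import csv
import io
import operator

# Hypothetical sample tables standing in for two uploaded .csv files.
# All names and numbers are invented for illustration only.
GDP_CSV = """state,gdp_billions
Virginia,400
Vermont,25
Georgia,390
"""
POP_CSV = """state,population_millions
Virginia,7.8
Georgia,9.7
"""

# The four comparison operators described in the post.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def read_table(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows, column, op, value):
    """Keep rows whose numeric `column` satisfies `op value`."""
    compare = OPERATORS[op]
    return [row for row in rows if compare(float(row[column]), value)]


def merge_on(left, right, key):
    """Inner-join two tables on a shared attribute (an assumed reading of 'merge by attribute')."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


gdp = read_table(GDP_CSV)
pop = read_table(POP_CSV)
large = filter_rows(gdp, "gdp_billions", ">=", 390)
merged = merge_on(gdp, pop, "state")
```

Rows without a match in the second table (Vermont in this made-up data) simply drop out of the merged result; a real tool may well handle that case differently.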
For some reason the geocoder missed Washington state; it looks like it confused the state of Washington with Washington, DC.
The problem becomes even more severe with Google Charts’ georeferencing, which thinks the state of Georgia is the country of Georgia, since country names are the only place names it recognizes:
These are problems all geocoders run into, but I think they are emblematic of where Google Table comes up short. It is a bit of an application Frankenstein: it does many things, but none of them very well. It is several different Google apps glued together, without the cohesive workflow and technology to make it work as a platform. The integration and capabilities for data search, metadata, handling data other than .csv, large data visualization, and ease of use will all be keys for success. Also in the negative category is the big disclaimer: any data you upload into Google Table incurs the dreaded section 11 clause:
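Both mix-ups come down to one name matching more than one place. The sketch below is a toy illustration of that ambiguity, not a description of Google's geocoder: the gazetteer, its type labels and the `resolve` function are all hypothetical.

```python
# Hypothetical gazetteer: each lower-cased name maps to every place sharing it.
GAZETTEER = {
    "georgia": [("country", "Georgia"), ("us_state", "Georgia")],
    "washington": [("us_state", "Washington"), ("us_city", "Washington, DC")],
    "vermont": [("us_state", "Vermont")],
}


def resolve(name, expected_type=None):
    """Return the single matching (type, place) pair, or None if unknown or ambiguous.

    `expected_type` is a hint such as "us_state" that narrows the candidates,
    for example when the whole column is known to hold state names.
    """
    candidates = GAZETTEER.get(name.strip().lower(), [])
    if expected_type is not None:
        candidates = [c for c in candidates if c[0] == expected_type]
    if len(candidates) == 1:
        return candidates[0]
    return None  # no match, or more than one plausible match
```

Without a hint, `resolve("Georgia")` returns `None` rather than guessing, while `resolve("Georgia", "us_state")` returns the state. A geocoder that only knows country names has no such choice to make, which is one plausible reading of the Georgia result above.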
By submitting, posting or displaying the content you give Google a perpetual, irrevocable, worldwide, royalty-free, and non-exclusive licence to reproduce, adapt, modify, translate, publish, publicly perform, publicly display and distribute any Content which you submit, post or display on or through, the Services.
Criticisms aside, it is pre-alpha and a “Labs” project. There are many cool innovations, like commenting within data sets. I also think Google Table is part of a trend we’ll see more of, where the geographic component of data is just another option in a larger system and not a system in and of itself. The overall project points towards the big potential in building a cohesive structured data platform, and it will be fun to see which of the various efforts in the ecosystem wins the day.
Dataset of the Day: The New Digital (TV) Divide
June 12th, 2009 by Bill Greer
Time to switch over to digital TV from the old analog system. This has been coming for a long time and it’s finally here. From the official DTV government website, “The switch from analog to digital broadcast television is referred to as the digital TV (DTV) transition. In 1996, the U.S. Congress authorized the distribution of an additional broadcast channel to each broadcast TV station so that they could start a digital broadcast channel while simultaneously continuing their analog broadcast channel. Later, Congress set June 12, 2009 as the final date that full power television stations can broadcast analog signals. As of June 13, 2009, full power television stations will only broadcast digital, over-the-air signals. Your local broadcasters may make the transition before then, and some already have.”
So who is ready for the switch and who isn’t? We made a few maps based on the Nielsen report showing who was ready and who will be left behind in analog. Check the maps out below.
Click on the Eye Icon to turn layers on and off.
To download the data or view the metadata, visit Finder here!
Creating the National Broadband Map for $3.5 Million Instead of $350 Million
June 9th, 2009 by Sean Gorman
There has been a bit of hubbub about the provision in the stimulus (ARRA) plan to map broadband in the United States and its $350 million price tag. Specifically the statute calls for “$350,000,000 shall establish the State Broadband Data and Development Grant program, as authorized by Public Law 110-385 *and* for the development and maintenance of a national broadband inventory map as authorized by division B of this Act.”
In a response to an All Points Blog podcast on the topic, Archie Belaney said he thought FortiusOne could accomplish the same task for $3.5 million. With the gauntlet on the floor, it got me thinking about how you would actually pull it off in an economical fashion. It has been a few years since I’ve been deep in the weeds of broadband mapping, so some of my thoughts may not be completely aligned with the status quo.
One of the big challenges, often missed, is that many broadband providers do not know where their fiber is. Large amounts of data still reside in paper maps. In a 2003 court case involving the Maine Public Utilities Commission, an expert witness for Verizon stated that the standard operating procedure for restoring a failed line is to locate the lead engineer for the region, consult paper maps, manually identify an alternate route, and send technicians to wire a jump around the outage. The point is that in many cases there is no nice clean digital database of broadband deployments and routes.
So to build a national broadband map, one of the biggest challenges is data collection and aggregation. The tools in GeoCommons for uploading existing digital data are therefore going to be insufficient. To fill this gap, two key capabilities are required: digitizing paper maps and collecting data from the field. To solve both problems I think there is a good bit we can borrow from the OpenStreetMap (OSM) project. Broadband fiber lines all have to follow right-of-ways along streets, railroads, pipelines, etc. The late 1990s saw a boom in companies whose business models were just finding new right-of-ways to lay fiber across. Mapping these right-of-ways is no different from mapping the streets, railroads, cycle ways and footpaths in OSM.
Since the goal here is to map where broadband is deployed so we can better gauge “universal access”, the accuracy of commercial GPS should more than get the job done. Data collection could easily follow the OSM model: a server could be set up inexpensively and leverage any GPS device for data collection. This covers the use case where the data needs to be collected from the field, but that is likely to be only a small percentage of the data that needs to be created. The larger project is likely to be digitizing paper maps, which I believe is yet another task that can be easily crowdsourced (even if the crowd is not the public).
A good deal of the data in OSM does not come from GPS data collected in the field but instead from users tracing roads from satellite imagery. This same approach could be used to easily digitize existing paper maps of broadband deployments. To enable this you could leverage Chippy and Schuyler’s Map Warper to pull paper maps into OSM, where they could be traced and entered into a common system.
I think the combination of OSM and Map Warper would cover the big data creation and conversion issues. You could use GeoCommons to allow the various stakeholders to contribute their existing spatial data on broadband deployments to the system. You could also allow stakeholders to create maps of broadband coverage and identify need-based gaps by overlaying demographic data. Below is an example of cable broadband lines in Vermont overlaid with the count of people with income between $10-20k.
When you filter the data to locations with more than 150 people in the $10-20k range you can begin to see the mismatch between broadband service and low income population.
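As a rough sketch of that filter, the snippet below flags areas that pass the 150-person threshold but have no cable service. The `Area` record, the `has_cable` flag and every value in the sample list are hypothetical; the real maps were built from GeoCommons layers, not from this structure.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    low_income_count: int  # residents with income in the $10-20k range
    has_cable: bool        # assumed flag: does a cable broadband line reach this area?


# Invented sample records, for illustration only.
AREAS = [
    Area("Area A", 320, True),
    Area("Area B", 180, False),
    Area("Area C", 90, False),
    Area("Area D", 151, False),
]


def underserved(areas, threshold=150):
    """Names of areas with more than `threshold` low-income residents and no cable line."""
    return [a.name for a in areas if a.low_income_count > threshold and not a.has_cable]


gaps = underserved(AREAS)
```

With this made-up data, Area B and Area D are the mismatches: enough low-income residents to pass the filter, but no cable line.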
While I’m not sure of the exact cost breakdown for what is proposed above, you could deploy everything to EC2 and probably cover software and hardware for half of the $3.5 million, then cover the engineering labor needed with the other half. Data collection and migration costs would be the biggest variable, with lots of possible options: AmeriCorps, volunteers, offshoring (probably unpopular), or leveraging various jobs programs. The good news is that it would not require a very technical workforce to accomplish the tasks. It would be a cool project, but sadly we’ll probably spend the whole $350 million and still not have a decent product to show for it. On the upside, some of the concepts proposed are already being adopted by a few telcos who are trying to crowdsource their coverage maps. RidgeviewTel created a Web site where the public can drop points for where they can’t receive broadband service.
Regardless of how it happens, once the data is aggregated we could make maps like these for the entire country:
Or do calculations of broadband deployment concentrations like this (also NYC):
Dataset of the Day: North Korea Uncovered
June 8th, 2009 by margot
Curtis Melvin’s North Korea Uncovered gives the public a glimpse into a country typically shrouded in mystery and regulation. The Google Earth project displays information about anything from food and culture to concentrations of mining operations and prisoner encampments, and even shows the locations of prisoners’ barracks and guard dining halls. Thanks to Melvin’s efforts to harness the power of open-source data and mapping utilities, the public can take a look at North Korea that was never possible before.
We’ve broken down the original KMZ file into several KML files and uploaded them into Finder! to showcase just how much information has been collected about the country on various topics. Here are some samples of maps we’ve made from the data provided by Melvin’s project.
Another great way to view the data is to simply download the data as a CSV from a layer in Finder!, sort on any attribute and then make a map of that data. I made a CSV of lighthouses from the transportation file: I sorted on the names, kept the ones with “lighthouse” in them, saved them as a new CSV and uploaded it into Finder!
See what you can come up with using Finder! and Maker!, enjoy the wealth of data available on GeoCommons, and uncover North Korea for yourself.
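The same lighthouse extraction can be scripted instead of done by hand in a spreadsheet. This is a minimal sketch that assumes the downloaded layer has a `name` column; the column names and sample rows below are placeholders, not the real transportation data.

```python
import csv
import io

# Placeholder rows standing in for the downloaded transportation CSV.
# Names and coordinates are invented, not taken from the real layer.
TRANSPORT_CSV = """name,latitude,longitude
Example Lighthouse,0.0,0.0
Example Railway Station,1.0,1.0
example lighthouse (north),2.0,2.0
"""


def select_rows(csv_text, column, keyword):
    """Return (fieldnames, rows) for rows whose `column` contains `keyword`, ignoring case."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [row for row in reader if keyword.lower() in row[column].lower()]
    return reader.fieldnames, rows


def to_csv(fieldnames, rows):
    """Serialize the selected rows back to CSV text, ready to save and upload."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


fields, lighthouses = select_rows(TRANSPORT_CSV, "name", "lighthouse")
lighthouse_csv = to_csv(fields, lighthouses)
```

Writing `lighthouse_csv` to a file gives you a new CSV containing only the lighthouse rows.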
Please visit GeoCommons to view any of the maps and layers discussed above.
Lighthouse map: http://maker.geocommons.com/maps/5784?page=
Lighthouse layer: http://finder.geocommons.com/overlays/13232
Gulag map: http://maker.geocommons.com/maps/5776?page=
Gulag layer: http://finder.geocommons.com/overlays/13208
Agriculture/Horticulture map: http://maker.geocommons.com/maps/5788
Agriculture/Horticulture layer: http://finder.geocommons.com/overlays/13185