Andrew Turner has a great series of blog posts on the future of KML that were the product of meetings at the OGC on the topic a week or so ago. There is lots of interesting content in Andrew’s series, but the part nearest and dearest to us is the discussion on metadata. Chris made it out to the meeting with Andrew to throw our 2 cents into the discussion and convey his thoughts on the schema tag and how attribute data can be embedded into it. As Sean Gillies points out in response to Andrew’s post, we should not confuse adding attribute data to KML with adding metadata to KML. Both are important, but they serve two distinct functions.

We use the schema tag to allow additional data to be added to KML to describe a location on the map. Natively, KML supports adding a description and a Z coordinate to a location. So you can describe a push pin with text, HTML and/or a picture, then add a Z coordinate that provides a metric for that push pin. This allows you to do many things and has created a lot of great KML, but there are limits. Namely, you can only really add two attributes – a description and a metric. Many location descriptions, and data in general, are multidimensional.

Let’s take a simple example: one of the first Google “My Maps” mashups, covering the 2004 US Presidential Election. The election mashup is a nice thematic map of Bush (red states) versus Kerry (blue states) votes, and when you click on a state it shows you the percent of votes for each candidate. The data on the percentage of votes for Bush and Kerry is placed in the description field of the KML, requiring the user to color code each state to create the thematic map. This is quite a bit of work, since you are using a qualitative data field to try to do something quantitative.

This is something we would like to change by making it a lot easier for anyone to create KML that handles quantitative data. The geoweb, to date, has done a great job of opening up mapping by allowing anyone to create a qualitative description (text, HTML, pictures) of a location. This is what KML is currently geared to support, but an increasing number of people would like to expand quantitative data beyond a single Z attribute.
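To make the contrast concrete, here is a minimal Python sketch of what typed attributes could look like. It assumes the `Schema`, `SimpleField`, `ExtendedData` and `SchemaData` elements as they were later standardized in OGC KML 2.2, which may differ from the schema tag usage discussed at the meetings. The schema name `ElectionResult`, the field names `bush_pct` and `kerry_pct`, and the helper functions are all hypothetical, and geometry is omitted to keep it short. The point is that once the percentages are real numeric fields, the red/blue color can be computed instead of assigned by hand.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)


def q(tag):
    """Qualify a tag name with the KML namespace."""
    return f"{{{KML_NS}}}{tag}"


def thematic_color(bush_pct, kerry_pct):
    """Return a KML color (aabbggrr hex): opaque red or opaque blue."""
    return "ff0000ff" if bush_pct > kerry_pct else "ffff0000"


def build_kml(states):
    """states: iterable of (name, bush_pct, kerry_pct) tuples."""
    kml = ET.Element(q("kml"))
    doc = ET.SubElement(kml, q("Document"))

    # Hypothetical schema declaring two typed, quantitative fields.
    schema = ET.SubElement(
        doc, q("Schema"), {"name": "ElectionResult", "id": "ElectionResult"}
    )
    for field in ("bush_pct", "kerry_pct"):
        ET.SubElement(schema, q("SimpleField"), {"type": "double", "name": field})

    for name, bush, kerry in states:
        pm = ET.SubElement(doc, q("Placemark"))
        ET.SubElement(pm, q("name")).text = name

        # The style is derived from the data rather than typed in by hand.
        style = ET.SubElement(pm, q("Style"))
        poly = ET.SubElement(style, q("PolyStyle"))
        ET.SubElement(poly, q("color")).text = thematic_color(bush, kerry)

        ext = ET.SubElement(pm, q("ExtendedData"))
        data = ET.SubElement(ext, q("SchemaData"), {"schemaUrl": "#ElectionResult"})
        ET.SubElement(data, q("SimpleData"), {"name": "bush_pct"}).text = str(bush)
        ET.SubElement(data, q("SimpleData"), {"name": "kerry_pct"}).text = str(kerry)

    return ET.tostring(kml, encoding="unicode")
```

Calling `build_kml([("State A", 60.0, 40.0)])` with placeholder values returns a KML string in which the state carries both numbers as attributes and a color computed from them.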

In his post Andrew pointed to our use of the schema tag to enable thematic mapping, and that is accurate, but it is only the tip of the iceberg of what is possible. Once you have access to multiple data descriptors about a location, it enables a range of decision-making tools. KML currently reflects the “read-write” functionality of Web 2.0, but in order to evolve to a “read-write-execute” web it will need to support quantitative functions that give users decision support.

Since things are always clearer with examples, and our favorite example is finding bars and single men or women, let me give it a shot. Currently we would search for bars and get back KML that describes the bar – name, address, user comments, maybe a user rating. KML and current applications cover this very well – we can “read” and “write” back to the KML – very Web 2.0. What is missing is any analysis of those bars that tells me the best one to go to.

Let’s say the application already knows a few things about me – I am 33 years old, single, male, work in IT, and am a Taurus. This information and much more could easily be picked up from a social network profile like Facebook or MySpace. If I now did a search on bars, and the KML had embedded feature attribute data for the bars and the surrounding contextual data, I could be directed to the bars that had the highest correlation with women who are single, in an adjacent age bracket, and work in IT. If I had a good experience at the bar, I could post my comment back to the bar, further reinforcing that quantitative correlation with user-generated validation. Now my KML has enabled a “read-write-execute” application that is both qualitative and quantitative. That, I believe, is the long-term value proposition for KML 3.0.
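The “execute” step in the bar example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Bar` class, its three attribute fields, and the unweighted average used as a score are all hypothetical stand-ins for whatever attribute data and correlation measure a real application would use.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    name: str
    # Hypothetical quantitative attributes, each a share between 0 and 1.
    single_share: float
    adjacent_age_share: float
    it_worker_share: float


def match_score(bar: Bar) -> float:
    """Unweighted average of the shares; a stand-in for a real correlation measure."""
    return (bar.single_share + bar.adjacent_age_share + bar.it_worker_share) / 3


def rank_bars(bars):
    """Return the bars ordered from best to worst match."""
    return sorted(bars, key=match_score, reverse=True)
```

With attributes like these embedded in the KML, `rank_bars` turns a plain list of search results into a recommendation, and user feedback written back to the KML could later adjust the numbers.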


Imagine a situation where as many as 25 million people in the US could go to jail for possession of an illegal substance.
On average, 10% of the U.S. population aged 12 and above smokes pot. If this many people were affected by an infectious disease such as West Nile or avian flu, it would be a national emergency. And yet we hardly pay attention to this huge but hidden (mental) disease of substance abuse that affects a vast portion of the population.

Smoking pot is a Federal offense and, along with other substance abuse issues, the Feds take it very seriously. A person smoking pot for so-called medical use could land in the slammer even in states (Alaska, California, Colorado, Hawaii, Maine, Nevada, Oregon, New Mexico and Vermont) that allow medical use of marijuana.

The Supreme Court ruling of 2005 made clear that state laws such as California’s Compassionate Use Act provide no defense against legal action by the Drug Enforcement Administration (DEA) and other Federal agencies.
Recent raids on medical marijuana dispensaries have stirred up quite a storm among advocacy groups.

So how serious is this substance abuse problem?

It’s estimated that marijuana use among people age 12 and over ranges from as low as 5.6% in places such as Utah County, UT to as high as 19.7% in Washington County, RI.

According to the Office of Applied Studies (OAS), the analytical arm of the relatively unknown Substance Abuse and Mental Health Services Administration (SAMHSA) of the U.S. Dept. of Health and Human Services (HHS), 193 out of 347 substate treatment planning areas in the U.S. have at least 1 out of 10 people over the age of 12 using marijuana.

Below is a split map of the North-East and Western states. It was generated by FortiusOne Inc’s data team based on the OAS’s tables. Explore other parts of the lower 48 with pan and zoom.

North-East Marijuana use




The 193 substate (treatment planning) areas cover California, Colorado, Arizona, Michigan and large areas of Florida (central Florida), Texas (Austin, Houston), Massachusetts and New York, as well as smaller states such as Connecticut, Vermont and Maine. Only the state of Utah and western parts of Kansas have less than 6% of their population using marijuana. The remaining 150 or so substate regions have between 6 and 9 percent of their population aged 12 and older smoking pot.

Marijuana use west of Mississippi




So, does it matter if a certain celebrity stays in jail or not? Perhaps not. But it matters when the John and Jane Does of the world are involved in substance abuse. It matters a lot more if they happen to be minors.

And now, it certainly matters, since the legalization of medical use of marijuana has become a political issue. So what’s your opinion?


The maps below show individual donor contributions by zipcode area to leading presidential candidates. The data for the maps are derived from the quarterly finance reports filed on July 15, 2007 with the Federal Election Commission (FEC).

Pan and zoom in to investigate the patterns of contributions at subregional levels in each of these maps. The list before each map shows the five zipcode areas that contributed the most to each of these four leading presidential candidates during the last quarter. Just below each map is the list of the five zipcode areas that gave the most to each candidate since January 2007. The figures next to the zipcodes show the approximate dollar amounts raised.
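For readers who want to build similar lists from their own extract of the filings, the ranking step can be sketched in Python as below. This is a minimal sketch, not our production process: it assumes the contribution records have already been reduced to hypothetical `(zipcode, amount)` pairs for a single candidate and period, and the function name `top_zipcodes` is made up for illustration.

```python
from collections import Counter


def top_zipcodes(contributions, n=5):
    """Sum contribution amounts per zipcode and return the n largest totals.

    contributions: iterable of (zipcode, amount) pairs (hypothetical format).
    Returns a list of (zipcode, total) pairs, largest total first.
    """
    totals = Counter()
    for zipcode, amount in contributions:
        totals[zipcode] += amount
    return totals.most_common(n)
```

Zipcodes are kept as strings so that leading zeros, as in Greenwich’s 06830, survive the aggregation.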

We at FortiusOne are in the process of geocoding the addresses of individuals who contributed to the candidates of their choice. Some of these maps will be posted to GeoCommons over the next several days. We hope you will come back and look for a blog post on the new campaign finance maps as well.

Funds raised by Clinton campaign

The top five zipcode areas by amounts: 2nd quarter, 2007

1. New York, NY 10021 (~$550k)

2. New York, NY 10022 (~$335k)

3. New York, NY 10023 (~$300k)

4. Washington, DC 20016 (~$280k)

5. New York, NY 10128 (~$210k)




The top five zipcode areas by amounts: 1st and 2nd quarter combined, 2007

1. New York, NY 10021 (~$950k)

2. New York, NY 10022 (~$615k)

3. New York, NY 10023 (~$588k)

4. Washington, DC 20016 (~$541k)

5. New York, NY 10128 (~$381k)

—————————————————————————————–

Funds raised by Obama campaign

The top five zipcode areas by amounts: 2nd quarter, 2007

1. Chicago, IL 60614 (~$365k)

2. New York, NY 10021 (~$333k)

3. Chicago, IL 60610 (~$246k)

4. Washington, DC 20008 (~$201k)

5. Washington, DC 20016 (~$198k)




The top five zipcode areas by amounts: 1st and 2nd quarter combined, 2007

1. Berkeley, CA 94709 (~$1,765k)

2. Chicago, IL 60614 (~$528k)

3. Cambridge, MA 02238 (~$474,513)

4. New York, NY 10021 (~$405k)

5. Chicago, IL 60610 (~$348k)

—————————————————————————————–

Funds raised by Giuliani campaign

The top five zipcode areas by amounts: 2nd quarter, 2007

1. New York, NY 10021 (~$437k)

2. New York, NY 10021 (~$278k)

3. Greenwich, CT 06830 (~$169k)

4. New York, NY 10128 (~$161k)

5. New York, NY 10028 (~$152k)




The top five zipcode areas by amounts: 1st and 2nd quarter combined, 2007

1. New York, NY 10021 (~$1,050k)

2. New York, NY 10021 (~$625k)

3. Greenwich, CT 06830 (~$351k)

4. Palm Beach, FL 33480 (~$342k)

5. New York, NY 10028 (~$334k)

—————————————————————————————–

Funds raised by Romney campaign

The top five zipcode areas by amounts: 2nd quarter, 2007

1. Greenwich, CT 06830 (~$125k)

2. New York, NY 10021 (~$109k)

3. St. Louis, MO 63124 (~$71k)

4. Provo, UT 84604 (~$65k)

5. Birmingham, MI 48009 (~$59k)




The top five zipcode areas by amounts: 1st and 2nd quarter combined, 2007

1. New York, NY 10021 (~$148k)

2. Greenwich, CT 06830 (~$145k)

3. Provo, UT 84604 (~$127k)

4. Salt Lake City, UT 84108 (~$104k)

5. Bloomfield Hills, MI 48304 (~$96k)


OK – the title is a bit over-the-top sensationalistic, but the metadata debate opens up the larger topic of technology being used to increase participation. There is a long history of technology increasing participation – the PC Revolution with the microcomputer, word processor, spreadsheet, etc. – Web 1.0 with online auctions, web home pages, online communities, etc. – Web 2.0 with blogs, social networks, citizen journalism, etc. If you really wanted to push the argument you could go back to the assembly line, the steam engine, or really stretch it back to crop rotation. I’d argue that the real power of Web 2.0 has been the democratization of participation through technology. Blogs that allow anyone to have a voice, participatory media sites like Digg, Newsvine, StumbleUpon and Furl that allow the public to vote on what is news, self-broadcasting platforms like YouTube, Vimeo and Blip.tv that will put anyone on TV, and participatory office applications like Writely and Google Spreadsheets are all changing the face of how the public interacts with technology and each other.

Mapping has very much been a part of this story, with Google Earth/Maps, Microsoft Virtual Earth, Yahoo! Maps and new projects like OpenStreetMap all playing a role. In fact, it was mapping applications that kicked off the mashup phenomenon with the combination of Google Maps and Craigslist rental listings. Not surprisingly, participatory mapping mashups sprang up in short order with innovative sites like Platial, Tagzania, Frappr and others. In these applications anyone could create a location on a map and tag it with social information like photos or descriptions of why they created it. These efforts were very much in the Web 2.0 model of mass participation where anyone could contribute information. For the most part, though, the data was fun and not what the GIS world would consider substantive. Sometimes this movement is called neogeography, web mapping, or a leading part of the larger geoweb.

The GIS world follows a much different model, where a small number of highly trained professionals have access to data and tools with which they render maps to be distributed to everyone else. As technology advanced, these maps started to be delivered to web browsers and gained some interactivity. The model always remained the same, though – professional gatekeepers that brokered knowledge out to the masses. As the APIs of Google and other mapping applications have proliferated, the worlds of neogeography and traditional GIS have begun to intersect. Now the major GIS vendors are offering APIs to their technologies, and there are new, more dynamic ways for maps and information to be delivered. While the new technologies coming from the GIS vendors all have the right buzzwords, they still work on the very same model: a small group of trained professionals acting as gatekeepers to the masses – whether their maps are delivered to you as a piece of paper or a rich media Ajax application.

This is the crux, I believe, of the metadata debate. Let’s be honest: adding a metadata link to a system like ours or anyone else’s is not really the issue. Adding the link is not so tough, and we’ll figure out an effective way to link to metadata if it is there. The issue is opening up geographic data and analysis tools to the masses. Metadata is a convenient barrier to entry, as is the expense of software, training, and infrastructure to even get your foot in the proverbial geospatial door. The big goal of GeoCommons is to break down those barriers, so that geographic data and analysis can become accessible and participatory for everyone. I think that technology inexorably moves in this direction, but in my mind that is not why it is crucial to open up geographic data and analysis. The vast majority of geographic data is a public good. It is paid for and created by governments and nongovernmental organizations (NGOs). The mission of the data creators is to have the data readily available and consumable by the public, because the public are inherently the ones that have paid for it. Yet we have a huge middleman that has grown up between the public and the data, a middleman that requires you to buy software, take training classes to use it, and support their ecosystem in order to access and consume the data. This ecosystem has in turn created a profession of people who have taken the courses, put in the time, and understand the often complicated world of geographic data and analysis. Neither the ecosystem nor the profession wants to see that cozy arrangement disrupted. Yet that is exactly what we are on the brink of.

Don’t get me wrong, I am not advocating the end of Geographic Information Science or Systems. There is sophistication in the discipline that will never be comprehensible to the masses, and that will always be the case. I spent way too long in grad school trying to sort it all out to have delusions that my Mom is going to be computing Voronoi tessellations. There are great things that the GIS world has contributed and will continue to contribute, but it should not be an all-or-nothing monopoly. I do believe that access to geographic data and simple analysis tools should be made available to everyone, and I should not have to jump over ridiculous barriers to entry to consume the data my tax dollars have already paid for. All that said, there is an incredible amount of work that needs to be done to make this happen. We may or may not figure it all out, but we’ll push the ball forward, and I’d put all the money in my piggy bank on the model changing through one innovation or another.
