Are Push Pins Inescapable?

March 12th, 2008 by Sean Gorman

It is only fitting that the day after I posted “Moving Push Pins Off the Map” I saw the post on Ogle Earth about a new geotagging icon... which is?

[Image: geotag icon]

A GIANT PUSH PIN!

With my interest piqued, we did a little digging and found another geotagging icon:

[Image: geotag icon 2]

ANOTHER GIANT PUSH PIN (actually, when I dug into it, this icon turned out to be a first version that evolved into the red one).

I of course blame this all on the Google monolith for perpetuating push pin mania. Last time I saw Mike Jones he even had a push pin tie tack. Joking aside, the reason for creating a geotagging icon is itself worth discussing.

The stated purpose on the GeoTagIcons.com website is: “The Geotag Icon is intended as a web ‘standard’ icon for identifying geotagged content to humans.” So, if a photo or blog post has been geotagged, there is an icon on it to let you know. The thinking is that geotags are often hidden in microformats or the URL and thus not visible to the user.

This seems like a straightforward approach to the problem, but it also seems to overlap with existing icons such as those for KML and GeoRSS. The tutorial on GeoTagIcons has examples of using it for links to both KML and GeoRSS content. This could lead to some ambiguity and confusion for users.

One of the most interesting parts of the pitch for using the GeoTagIcon is “Reason 4: It encourages development of the semantic web”. At first blush this got me excited, but reading a bit deeper I realized they meant it acts as an advertisement for linked content that could help support an evolving semantic web. This in and of itself is a worthy cause, and advertising has been directed at far less useful goals.

The link between geotagging and the semantic web does bring up a good topic for debate. How will all these geotagged objects (KML, GeoRSS, geo-microformats, GPX, etc.) be tied together in a way that creates semantic meaning? What questions will the semantic technologies answer? The GeoTagIcon site provides examples such as “Show me a plot of other bloggers in my vicinity”, “I’d like to see a map showing which of my friends have also visited Australia”, and “Who else has photographed this location?”

While these are interesting, I think the examples and the direction many folks are taking geotagging miss the real potential of the semantic web. The geotagging premise is based on doing increasingly sophisticated things with geo-coded annotations - 99% of the time taking the form of a pushpin. In each of the examples above, users or (most likely) a screen scraper and geo-coder have added a latitude and longitude to a piece of unstructured data (bloggers, my friends, photos). While this is all useful information, it is often relegated to answering only trivial questions.

There is only so much you can do with a bit of unstructured text or HTML that has geographic coordinates. You can measure vicinity (bloggers nearby), intersection (friends that have visited Australia) and union (show me all photos from a location). There might be a few that I am missing, but it is a fairly small universe of questions that can be answered, and the semantic web is all about answering questions. Hopefully a very large universe of questions.
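To make those three question types concrete, here is a minimal Python sketch of what they might look like against a pile of geotagged items. Everything in it is an assumption for illustration: the `GeoTag` type, the function names, and the rough bounding box used for Australia are hypothetical and do not come from the GeoTagIcons site or any particular geotagging library.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class GeoTag:
    """Hypothetical geotagged item: a label plus coordinates in degrees."""
    label: str
    lat: float
    lon: float


def haversine_km(a: GeoTag, b: GeoTag) -> float:
    """Great-circle distance between two tags, in kilometres."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlam = math.radians(b.lon - a.lon)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearby(origin, tags, radius_km):
    """Vicinity: tags within radius_km of origin."""
    return [t for t in tags if haversine_km(origin, t) <= radius_km]


def in_box(tag, south, west, north, east):
    """Containment in a lat/lon bounding box (no antimeridian handling)."""
    return south <= tag.lat <= north and west <= tag.lon <= east


# Rough, illustrative bounding box for Australia (an assumption, not a survey value).
AUSTRALIA_BOX = (-44.0, 112.0, -10.0, 154.0)


def visitors(tags_by_person, box):
    """People with at least one tag inside the box."""
    return {person for person, tags in tags_by_person.items()
            if any(in_box(t, *box) for t in tags)}


def friends_who_visited(friends, tags_by_person, box):
    """Intersection: my friends AND the people tagged inside the box."""
    return set(friends) & visitors(tags_by_person, box)


def all_photos_at(origin, radius_km, *collections):
    """Union: photos near origin, merged across several collections."""
    merged = set()
    for photos in collections:
        merged |= set(nearby(origin, photos, radius_km))
    return merged
```

Note how little the code needs to know about the items themselves: a label and a coordinate pair are enough for all three questions, which is exactly why the universe of answerable questions stays small.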

From my limited perspective, the semantic web is all about bringing vast data resources to the web in an easy and intuitive way. While turning unstructured text already on the web into geocoded annotations is important, I think the bigger challenge is blending existing structured data (largely in databases and not directly on the web page) with organized unstructured data through the web in a seamless way, like we have for text, pictures and video.

Metaweb has done some compelling work with Freebase. They have even been doing some interesting geo work with their database. To date Freebase has largely been working with conceptual data, but from the look of their GIS app they could be getting into more quantitative data.

As you get into quantitative data the power and tools available for asking sophisticated questions increase exponentially. Unfortunately so do the technical challenges, both computational and creating an intuitive user experience for something not intuitive to most people - numbers, math, statistics, etc. Despite the challenges I think this is where some of the greatest potential awaits for the emerging semantic web. That said I do think the new icons are quite nice and serve a useful function - despite the push pin. ;-)


Fantasy Football Fun: A Look at the Offense Side

January 22nd, 2008 by Laurie Schintler

With the Super Bowl just a few weeks away, I thought it would be interesting to look back on the season leading up to the playoffs and at player performance from a geographical perspective. The first take, summarized in this blog, focuses on selected positions on the offense side: quarterback, running back, tight end and wide receiver. Player performance is measured using total fantasy football points for the regular season (www.fftoday.com), and the hometown of each player is geocoded using latitude and longitude coordinates.

The heat maps that follow show what parts of the country the top performers for each position come from; in more technical terms, the maps are players’ hometowns weighted by fantasy football points. For football trivia buffs, some summary statistics are provided below each map. Top performing team for each position is based on average fantasy football points and top players on total points.
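For readers curious what "hometowns weighted by fantasy football points" means mechanically, here is a minimal Python sketch. It is not the method used to produce the maps in this post; it simply bins hometown coordinates into a latitude/longitude grid and sums the points in each cell, which is one simple way to build the input for a weighted heat map. The function name, the one-degree cell size, and the sample rows in the usage note are all hypothetical.

```python
import math
from collections import defaultdict


def weighted_grid(players, cell_deg=1.0):
    """Sum fantasy points per grid cell.

    players: iterable of (lat, lon, points) tuples, one per player hometown.
    cell_deg: cell size in degrees (an arbitrary illustrative default).
    Returns {(row, col): total_points}, where row and col are the
    floor of lat / cell_deg and lon / cell_deg respectively.
    """
    grid = defaultdict(float)
    for lat, lon, points in players:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        grid[cell] += points  # the weight is the player's point total
    return dict(grid)


def hottest_cells(grid, n=3):
    """Return the n cells with the highest summed points, hottest first."""
    return sorted(grid.items(), key=lambda item: item[1], reverse=True)[:n]
```

As a usage example with made-up numbers, `weighted_grid([(37.2, -121.9, 100.0), (37.8, -121.1, 50.0), (25.7, -80.2, 80.0)])` puts the first two hometowns in the same one-degree cell, so that cell carries the combined weight of both players. An unweighted map would count each hometown once; weighting is what lets a single top performer light up a region.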

The complete dataset, with other player attributes (college/university attended, number of years pro, division and conference, average fantasy football points for the regular season and number of games played), can be found at www.geocommons.com.

Wide Receivers

[Heat map: Wide Receivers]

Top Performers: Randy Moss (New England Patriots); Braylon Edwards (Cleveland Browns); Terrell Owens (Dallas Cowboys)

Top Performing Team: New England Patriots

Most Common Hometown State: California

Average Years Pro: 4.9

Running Backs

[Heat map: Running Backs]

Top Performers: LaDainian Tomlinson (San Diego Chargers); Brian Westbrook (Philadelphia Eagles); Adrian Peterson (Minnesota Vikings)

Top Performing Team: Pittsburgh Steelers

Most Common Hometown State: Florida

Average Years Pro: 4.33

Tight Ends

[Heat map: Tight Ends]

Top Performers: Jason Witten (Dallas Cowboys); Antonio Gates (San Diego Chargers); Tony Gonzalez (Kansas City Chiefs)

Top Performing Team: Dallas Cowboys

Most Common Hometown State: California

Average Years Pro: 4.6

Quarterbacks

[Heat map: Quarterbacks]

Top Performers: Tom Brady (New England Patriots); Tony Romo (Dallas Cowboys); Drew Brees (New Orleans Saints)

Top Performing Team: New England Patriots

Most Common Hometown State: California

Average Years Pro: 6.03

Tony Romo and Jessica Simpson

Tony Romo was clearly a superstar this season but a look at his weekly fantasy football points shows a precipitous drop towards the end of the 17 weeks. Have anything to do with Jessica Simpson? Weigh in below.

[Chart: Points by Week]


Myths of Crowdsourcing

November 3rd, 2007 by Sean Gorman

I figured I would keep the crowdsourced data theme going with some myths I’ve seen crop up in many people’s perception of crowdsourced data and its reliability. First, let’s take a step back and look at a definition of crowdsourcing: “[the] act of taking a job traditionally performed by an employee or contractor, and outsourcing it to an undefined, generally large group of people, in the form of an open call.” The fact that this “group” is not paid or under contract leads many to believe what they produce cannot be trusted. I think this general assumption leads to a number of myths about crowdsourced data.

Crowdsourced Data and Official Data are Mutually Exclusive

There is a common perception (especially among traditional data providers) that either data comes from an official source and is guaranteed accurate, or it is crowdsourced and you have no clue whether it is accurate. Encyclopedia Britannica articles come from an official source and Wikipedia is crowdsourced. NAVTEQ street data comes from an official source and OpenStreetMap is crowdsourced. We can trust Encyclopedia Britannica and NAVTEQ because we pay them to provide us an accurate product, but we are not sure if we can trust Wikipedia and OpenStreetMap because we do not have a contract with them and any willy-nilly crazy person could enter bad data. The issue is seen in black and white - non-trusted and trusted.

In reality, crowdsourcing is a tool to collect data. Sometimes it is an end in and of itself, as with Wikipedia and OpenStreetMap. Other times it is an enabler, like voting on news stories from third-party sources on Digg. Digg does not generate the stories itself but crowdsources the determination of which stories are most worth reading. More recently TomTom has used crowdsourced data to enhance their official base data. Perhaps the greatest potential of the crowdsourcing model is as a hybrid working with traditional/official data sets: not only mixing the two together, but using crowdsourcing to enhance the accuracy and validity of existing official data. For instance, a map of toxic dumping sites from the EPA is interesting by itself, but it is eminently more valuable if you can add your own data on the schools, playgrounds, and friends’ houses your kids play at. Secondly, if you add evidence to the map supporting the damage caused by the dumping site, or evidence showing the dumping site has been cleaned up, then everyone has better context for the original data set and its validity. In both cases crowdsourcing is being used to enhance existing data and does not stand by itself.

Official Data is Automatically Accurate and Crowdsourced Data is Not

There is a pervasive myth that if data comes from an official source or has official metadata then it must be accurate. Conversely, if it is crowdsourced it must be inaccurate. The truth of the matter is that official data and metadata have inaccuracies and crowdsourced data has inaccuracies. In fact the vast majority of data in the world has inaccuracies. To quote Chris (our beloved Heretic Alpaca and CTO), “your data sucks and my data sucks - now that we have that settled we can go do something.” The fact that people think crowdsourced data is inaccurate is truly a good thing, because they think about what they are consuming and look to see if there are problems. The beauty is that when they find problems they can actually go and fix them. The worst thing about official data is that we blindly assume that everything is perfect, and when we find that perfection lacking there is no recourse to fix it.

Metadata is the Panacea

Many a GIS wonk has preached that without metadata, geographic information is just content. Once there is metadata, the professionals have entered the room and all concerns evaporate. When people ask me about metadata in GeoCommons, especially our government customers, I say sure, we can include your metadata. We can even make metadata mandatory before inclusion if that is your preference, but we do not think that just having metadata is sufficient. Metadata can often be anonymous, and there are seldom repercussions if you are sloppy and quick putting in your metadata, or rewards if you are thorough and diligent. When you fuse metadata with a crowdsourcing approach there can now be accountability. I create and contribute the data and that data is attached to me. You can click on the source and you get my profile. If the data rocks - kudos and praise for me; if the data blows - everyone knows I was the slacker who put it in.

Recently I did some digging back into the arguments around FGDC metadata when it came out in the early nineties. The standard was not without criticism and suggestions for improvement (Dutton 1994): “The metadata standard is per force formulated from a producer’s perspective. It is, one assumes, the responsibility of data producers to document published datasets, and there is not much consumers can do other than to offer feedback on the adequacy of the organization, usability and quality of datasets they acquire.” We now have the technological means to address what could not be addressed then, yet we are too ensconced in the status quo and dogma to embrace the opportunity to improve the system.

Crowdsourcing is the Wild West of Data

Crowdsourcing is often conflated with “no rules” or “anything goes”, thus leading to a perception of not being trustworthy. While you can crowdsource with no rules, that does not mean you are not allowed to have rules. Further, those rules can result in highly trusted content. Think of academic publishing, one of the most successful crowdsourced experiments of all time. Anyone can submit an article, have it reviewed anonymously by a group of peers, and see it published in some of the most trusted publications on the planet. No one pays me to publish an article. I have no economic incentive for the data in my paper to be accurate, yet I would trust information from “The New England Journal of Medicine” way before I would trust anything out of the Encyclopedia Britannica. But... you say... academic journals are written by professionals! Not necessarily true - anyone can submit to an academic journal. You need no pedigree, and articles have been published by undergraduates who have no degree at all. The same kid with a Facebook page and 254 friends. Academic journals are trusted because of the peer-driven culture that surrounds them, not because of economic incentive or accuracy standards that must be adhered to. A crowdsourced system can be highly trustworthy depending on how it is structured and the rules that are put in place. I do believe there is a trade-off between the number of rules and requirements and the level of participation and innovation in a crowdsourced system. The more rules and requirements, the higher the level of trust, but the less participation and possible innovation. Those that can maximize both trust and participation in a crowdsourced application will be those who succeed.

Conclusion

In short, I think crowdsourced data and tools often get an undeserved stereotype. People tend to lump it all together instead of looking at opportunities to leverage a new tool to enhance their competitiveness. I think this is often the result of fearful knee-jerk reactions. Crowdsourcing does have the ability to disintermediate marketplaces, but those who figure out how to harness that to their advantage will be the ones who succeed. Defensive criticism is usually a sign you are strategically headed in the wrong direction.


The Utility of Maps in Hazard Forecasting

The recent wildfires in Southern California remind us of just how important hazard forecasting has become in helping to ensure the safety and welfare of the public, and of the role that mapping can play in the process. Short-term forecasts of fire direction and intensity were pivotal in containment and evacuation efforts; mapping played a prominent role in generating forecasts and in disseminating and sharing information about potential risk.

The usefulness of maps in visualizing and generating forecasts extends well beyond the California fire event. In the area of climate prediction, numerous sites provide regularly updated maps of long-term and short-term forecasts of a variety of conditions and, in some cases, valuable watches and warnings to the public based on the forecasts.

Some Points for Discussion

While the information that is currently out there provides great utility, there are some limitations in the way that the information is disseminated and formatted that are worth noting. The points are intended to be food for thought and to get us thinking about how we can increase the value of forecasting even further - particularly in an interactive, web-based mapping environment.

First, forecasts are scattered across multiple websites and even within websites, requiring some effort and time on the part of the consumer to find, extract and process information. The sites and links vary in terms of the information they provide. In terms of fire forecasting, some sites focus on drought conditions, others on smoke generation and yet others on combinations of factors to characterize future fire potential. The forecasting horizons also vary considerably from site to site.

Second, many of the maps provided on the web are in a “hard copy” format and not in an interactive mode where the user can pan, zoom and perform other functions. Some sites do have map viewers; however, they are currently limited in the amount and type of data that can be displayed.

Third, and related to the second point, is that the possibility for “layering” data to create custom maps with richer information relevant to the needs of the user is limited. For example, someone may be interested in seeing if an environmentally sensitive or protected area is in the path of a projected wildfire.

Fourth, there is no mechanism for consumers and providers of the forecasts to interact and share information. Interaction could be very useful not only in understanding forecasts but also in improving current predictive models. In the book Making Climate Forecasts Better, Stern and Easterling write: “The utility of forecasts can be increased by systematic efforts to bring scientific output and users’ needs closer together. These efforts may include both analytic efforts to identify the climatic parameters to which particular sectors or groups are highly sensitive or vulnerable and social processes that foster continual interaction between the producers and the consumers of forecasts.”

Fifth, not all information is publicly available, and perhaps it should be. In climate forecasting, having access to the “best” information is in the national interest: it can save lives. And in some cases, the private sector is the keeper of such information. A recent study by ForecastWatch found that, in terms of recent historical forecasting of next-day rain and snow, government sites had a 21% greater error rate than some of the private companies that do similar projections.
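To illustrate the kind of "layering" question raised in the third point, here is a minimal Python sketch that checks which protected areas overlap a projected fire footprint. It is a deliberate simplification: each feature is reduced to a latitude/longitude bounding box, whereas a real mapping environment would work with actual polygons. The `Box` type, the function names, and any area names or coordinates used with it are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical map feature reduced to a lat/lon bounding box (degrees)."""
    name: str
    south: float
    west: float
    north: float
    east: float


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share any area or touch at an edge."""
    return (a.south <= b.north and b.south <= a.north
            and a.west <= b.east and b.west <= a.east)


def areas_at_risk(fire: Box, protected: list) -> list:
    """Names of features in the protected layer that overlap the fire layer."""
    return [area.name for area in protected if overlaps(fire, area)]
```

With a made-up fire footprint `Box("projected fire", 34.0, -118.8, 34.3, -118.4)` and two made-up reserves, one at `(34.2, -118.6, 34.5, -118.3)` and one at `(35.0, -117.0, 35.2, -116.8)`, only the first reserve is reported. The point is less the geometry than the workflow: once forecasts and other data sets are available as layers in one place, a question like this becomes a single overlay.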

What Could the Future Hold?

The new web is fertile ground for the development of a system by which forecasts can be provided to the public in a more usable, digestible and efficient manner. Sites like Geocommons could be a one-stop location for viewing forecasts, such as those related to hazards and climatic conditions. In such an environment, visitors could interact with each other or the producers of the forecasts, discuss the validity of the forecasts or provide additional information to augment the projections, all through a wiki or blog-style environment. They could also create custom forecast maps with overlays of additional information that is most useful to them for solving a problem, understanding a situation or simply planning ahead.


One of our core missions at FortiusOne is to enable many more people to explore, create, and share maps – to democratize geospatial capabilities. Even in government markets, which have been big users of GIS tools, the expense and technical sophistication required often cause bottlenecks in the preparation of maps. Government customers are also facing critical challenges in making information from the field accessible throughout their organizations – increasing collective intelligence from the edges of the network.

We have teamed with Lockheed Martin to leverage our intelligent mapping services to address these problems in government markets. Lockheed has a long history in the geospatial space and has been very progressive in embracing advanced Web 2.0 technologies such as Intelligent Mapping and Wikis.

I thought it might be helpful to provide an example of the kinds of problems we are addressing with Lockheed. Let’s take a fictitious scenario of a military operation dealing with terrorist attacks in Iraq. Suppose I'm Sergeant Gorman and I've uploaded data on a spree of attacks that my patrol collected over the past week.

[Image: lmc_ge_shia_blog_jim]

A GIS analyst at headquarters, in reviewing my data along with historical data from the last three years, notices a pattern of increasing Shia activity around Samarra and sends an alert to field units. The alert prompts me to scan for data on attacks tagged Shia and Samarra, where I find a photo from a previous attack that shows one of the locals we had suspected of being a Shia ring leader.

[Image: lmc_gc_shia_blog_jim]

I post a geo-blog noting that this individual has been suspected of coordinating attacks in my sector. A flurry of responses from other patrol leaders indicates that the same individual has been seen in proximity of other attacks. A GIS analyst at headquarters validates the findings and generates a command report, which results in the order to apprehend the suspect. On our next patrol into Samarra, we locate him and discover a complex cell of terrorist Shia activity in the area.

While the account above is completely fictional, hopefully it conveys the power of democratizing geospatial information throughout an entire organization. The same principles apply to a variety of other environments, such as disaster response, homeland security, and intelligence, where enabling the entire organization to explore, create, and share geospatial information can enhance mission effectiveness. We are excited about the partnership with Lockheed Martin to bring these capabilities to market.
