It seems like there is a daily dose of semantic web on the tech blogs of late. Today it was Textwise’s Million Dollar Semantic Hacker Challenge, and a few days ago it was Yahoo opening their search platform to support a wide variety of semantic web standards. This has led to a good bit of proselytizing, mostly in the comments, that this heralds the arrival of the Semantic Web, or Web 3.0, or the Next Generation Web. All of which sounds like the circling of the marketing bandwagons.

Unfortunately, when the wagons circle, everything starts picking up the label - in this case, semantic. This is especially dangerous when you have a word like “semantics” that can be defined in so many different ways. Just look at the definition tree created by Wikipedia:

*Semantics is the study of meaning in communication.
*In computer science semantics reflects the meaning of programs or functions.
*The Semantic Web refers to the extension of the World Wide Web through the embedding of additional semantic metadata.

More often I see folks labeling things semantic that are really syntax. “Syntax” is the set of rules used to construct and define something like a sentence or a line of code, and “semantics” is the meaning of what those rules produce. Syntax is fairly easy and semantics are fairly hard, as most folks in artificial intelligence would argue. Some even go so far as to say that all programming languages other than LISP are syntax and not semantics.

This is a bit clearer with an example. Let’s take the Textwise announcement - a technology that parses plain text on a website or elsewhere and categorizes it into predefined topics. One example in the TechCrunch comments was the following:

input text:
Call us crazy, but we think there are some brilliant minds out there that can find some really amazing uses for this incredibly powerful and scalable technology. Think you’re up to the Challenge? We think you are!

categories (ranked from 0 (worst) to 100 (best)):
Shopping/Health/Alternative/Hypnotherapy/Audio_and_Video 43
Business/Telecommunications/Services/Wireless/Software 33
Arts/Music/Bands_and_Artists/311/Tablature 28
Computers/Internet/Consultants/Research 26
Shopping/Health/Alternative/Meditation/Audio_and_Video 25

The output is really not telling me anything about the meaning of the text; it is just applying rules to provide categorization. So I would definitely put this in the syntax and not the semantic category. I would also say what Yahoo! is doing is really more syntax than semantics, although there is the possibility of building truly semantic technologies on top of what they are enabling. They’ve created a set of rules based on rich standards to allow applications to be built. It remains to be seen what will come of it, but in the rush of market buzz I think it is easy to miss that building truly semantic technologies is quite hard. Some folks in AI (see the Chinese room argument) would argue machines are not even capable of semantic meaning or understanding.
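To make the point about rules versus meaning concrete, here is a toy sketch of a rule-based categorizer. It is not how Textwise works (their method is not described here); the category names, keyword lists and scoring rule are all made up for illustration. It simply counts keyword overlaps and reports a 0 to 100 score, which is enough to produce confident-looking categories without any grasp of what the text means.

```python
import re

# Hypothetical keyword lists per category; not taken from Textwise
# or from any real taxonomy.
CATEGORY_KEYWORDS = {
    "Computers/Internet": {"technology", "scalable", "software", "web"},
    "Shopping/Health": {"minds", "powerful", "amazing", "health"},
    "Arts/Music": {"bands", "music", "tablature"},
}


def categorize(text):
    """Score each category by the share of its keywords found in the text (0-100)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {}
    for category, keywords in CATEGORY_KEYWORDS.items():
        hits = len(words & keywords)  # pure string matching, no notion of meaning
        scores[category] = round(100 * hits / len(keywords))
    # Highest score first, mirroring the ranked output quoted above.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = categorize("incredibly powerful and scalable technology")
```

The word "powerful" is enough to pull in a health category here, which is the same kind of odd result as the Hypnotherapy match in the quoted output: the rules fire on the words, not on what the sentence is about.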

From this perspective I think we’ll see a lot of people building applications based on syntax that reorganize and categorize content by giving the “page web” a bit of structure. Oddly, it’s like we’ve gone full circle back to DMOZ. While these technologies may be clever and useful, I do not think they will fundamentally change the Web. In the other category, I think we’ll see a few companies pushing towards something more sophisticated (call it a semantic, implicit, computational web) where new data and services are mixed with existing web content to provide answers to users’ questions.


The Utility of Maps in Hazard Forecasting

The recent wildfires in Southern California remind us of just how important hazard forecasting has become in helping to ensure the safety and welfare of the public, and of the role that mapping can play in the process. Short-term forecasts of fire direction and intensity were pivotal in containment and evacuation efforts. Mapping played a prominent role in generating forecasts and in disseminating and sharing information about potential risk.

The usefulness of maps in visualizing and generating forecasts extends well beyond the California fire event. In the area of climate prediction, numerous sites provide regularly updated maps of long-term and short-term forecasts of a variety of conditions and, in some cases, valuable watches and warnings to the public based on the forecasts.

Some Points for Discussion

While the information that is currently out there provides great utility, there are some limitations in the way that the information is disseminated and formatted that are worth noting. The points below are intended to be food for thought and to get us thinking about how we can increase the value of forecasting even further - particularly in an interactive, web-based mapping environment.

First, forecasts are scattered across multiple websites and even within websites, requiring some effort and time on the part of the consumer to find, extract and process information. The sites and links vary in terms of the information they provide. In terms of fire forecasting, some sites focus on drought conditions, others on smoke generation and yet others on combinations of factors to characterize future fire potential. The forecasting horizons also vary considerably from site to site.

Second, many of the maps provided on the web are in a “hard copy” format and not in an interactive mode where the user can pan, zoom and perform other functions. Some sites do have map viewers; however, they are currently limited in the amount and type of data that can be displayed.

Third, and related to the second point, is that the possibility for “layering” data to create custom maps with richer information relevant to the needs of the user is limited. For example, someone may be interested in seeing if an environmentally sensitive or protected area is in the path of a projected wildfire.
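As a rough illustration of what such layering could look like in code, here is a minimal sketch that checks which features in one layer overlap a projected hazard extent. Everything in it is an assumption made for the example: the `BoundingBox` type, the `features_in_path` helper, the place names and the coordinates are all invented, and a real system would use actual fire perimeters and a proper geometry library rather than simple rectangles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in (longitude, latitude) degrees."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def intersects(self, other):
        # Two boxes overlap when they overlap on both axes.
        return (self.min_lon <= other.max_lon and other.min_lon <= self.max_lon
                and self.min_lat <= other.max_lat and other.min_lat <= self.max_lat)


def features_in_path(hazard_extent, layer):
    """Return names of features in `layer` whose boxes overlap the hazard extent."""
    return [name for name, box in layer.items() if box.intersects(hazard_extent)]


# Made-up coordinates for illustration only.
projected_fire = BoundingBox(-117.3, 33.0, -116.9, 33.4)
protected_areas = {
    "Hypothetical Preserve A": BoundingBox(-117.0, 33.3, -116.7, 33.6),
    "Hypothetical Preserve B": BoundingBox(-118.5, 34.0, -118.2, 34.3),
}

at_risk = features_in_path(projected_fire, protected_areas)
```

The same overlap test could be run against any other layer (schools, roads, evacuation routes), which is the appeal of layering: the forecast is produced once and each user supplies the layer that matters to them.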

Fourth, there is no mechanism for consumers and providers of the forecasts to interact and share information. Interaction could be very useful not only in understanding forecasts but also in improving current predictive models. In the book Making Climate Forecasts Better, Stern and Easterling write: “The utility of forecasts can be increased by systematic efforts to bring scientific output and users’ needs closer together. These efforts may include both analytic efforts to identify the climatic parameters to which particular sectors or groups are highly sensitive or vulnerable and social processes that foster continual interaction between the producers and the consumers of forecasts.”

Fifth, not all information is publicly available, and perhaps it should be. In climate forecasting, having access to the “best” information is in the national interest: it can save lives. And in some cases, the private sector is the keeper of such information. A recent study by ForecastWatch found that, in terms of recent historical forecasting of next-day rain and snow, government sites had a 21% greater error rate than some of the private companies that do similar projections.

What Could the Future Hold?

The new web is fertile ground for the development of a system by which forecasts can be provided to the public in a more usable, digestible and efficient manner. Sites like GeoCommons could be a one-stop location for viewing forecasts, such as those related to hazards and climatic conditions. In such an environment, visitors could interact with each other or the producers of the forecasts, discuss the validity of the forecasts or provide additional information to augment the projections, all through a wiki or blog-style environment. They could also create custom forecast maps with overlays of additional information that is most useful to them for solving a problem, understanding a situation or simply planning ahead.


A few weeks ago James Fee wrote a blog post about a debate he had with Steven Citron-Pousty concerning the usefulness of GeoCommons. From a high level the argument came down to the GIS vs. Neogeography debate. There were great quotes on both sides, like “freaking sweet” in support and “pretty worthless” in the bashing category. Overall we were excited to see that GeoCommons had started a debate in the GIS community. The intent of GeoCommons had originally not been to provide a resource to the GIS community, but to provide access to GIS data and a few tools to the rest of the world. The GIS community always had access to the data and the tools, so I had figured GeoCommons would not even pop up on the radar. In hindsight I think we should be providing mutual resources to each other, so with that in mind here are a few thoughts on the topic.



Every Time you say Web 3.0 a Start-up Dies

March 28th, 2007 by Sean Gorman

A while back Anthony Townsend sent me a funny blog link which had him wearing a t-shirt that said "Every time you say Web 3.0 a start up dies :(" The shirt says it all. This quickly became a running joke in the office since we had a Web 4.0 milestone running in Trac for a while. So we got a big kick out of a call a few days ago where someone referred to what we were doing as Web 3.0.

The last time I'd really read anything on Web 3.0 was when the NYT wrote an article about it that bloggers had a bit of a field day with. I figured I would take another look into it since we'd been labeled. Going to the Web 2.0 well, Wikipedia kicks up: "Web 3.0 is a term that has been coined to describe the evolution of Web usage and interaction that includes transforming the Web into a database, a move towards making content accessible by multiple non-browser applications, the leveraging of artificial intelligence technologies and the Semantic web and three dimensional interaction and collaboration."

Lots of articles wax poetic on the issue and conflate it with the Semantic Web, as in the Wikipedia definition. The Semantic Web has been around since 1999 or so and is most often associated with the thoughts of Tim Berners-Lee. I'd done some research on semantic kind of things back in school and to be honest was put off by the general over-complexity of it. Any time the core words to describe your work include things like semantics, ontology, lexicon etc., you are not exactly dabbling in the world of simplicity. Having spent a good chunk of my life in academia, I can safely say we do an awesome job of taking simple concepts and making it so that 99% of the world has no idea what we are talking about. Yes - post modernists - I'm talking about you. My take is that simplicity forms the roots of what has made Web 2.0 successful. The APIs and de facto standards that have really taken off have the common theme of being mind-numbingly simple.

So there seems to be a bit of a disconnect in conflating Web 3.0 with the 8 years of academic and standards work that have gone into the semantic web, which have created some very complicated white papers and manifestations. The irony with getting labeled Web 3.0 is that what we were describing, at the time, was our attempt to simplify the world of geospatial data so that it could be consumable by non-technical people. To add to the irony, there is a whole science of applying semantic web concepts to geospatial data, and it is definitely not simple.

Traditionally geospatial data comes in a variety of shapes and sizes - points, polygons, polylines, raster formats (satellite imagery, heat maps) etc. Part of the art of geographic science is knowing which geometries to use when - census tracts, census blocks, counties, zip codes etc. While this frame of thought matches up well with data formats, it does not match up well with the way most people think. People think about a location and the attributes or contexts of that location. I live in the Clarendon neighborhood and I associate contexts with that neighborhood like restaurants, parking, crime, housing prices, music, congestion etc. The data that describes those attributes could be a dozen different geometries, but as a user I don't really care. I care about getting an answer to my question in the context of the location I care about - in this case Clarendon.

We've been working on an architecture that will provide such a simplification, and that, along with the various other aspects we've been tying in, is what earned us at least one Web 3.0 label. Whether what we are doing is Web 3.0 or not I really have no clue - we are hoping it solves a problem in a simple way for a user. At the end of the day that is what I think will be successful, whether you label it 2.0, 3.0 or even, ack, 4.0. What is created needs to be easy and simple not only for the users but for the developers implementing it.

While the next evolution will likely solve some of the problems targeted by the semantic web, I think the actual technological path will be something far simpler than what is currently being touted.

***All ideas about the new architecture and contexts came from Mookie - a.k.a. Pramakta Kumar, one of our lead developers. I simply regurgitate them in some semblance of an idea. The F1 platform for it all is a Chris Ingrassia creation TM.
