I just got out of Tim Berners-Lee’s discussion of the Semantic Web at the 2002 MIT eBusiness Conference. As it turns out, Dave’s description of the semantic web concept is closer to the mark than mine, though I think mine is a complementary vision. Fundamentally, the semantic web is about giving meaning to raw data that is already on the web in other formats, such as plain old HTML pages, so that machines can understand what a particular piece of data means and how it relates to other data on the Web. The key pieces of the vision are:
- A common understanding that data in the semantic web can be expressed in a subject, verb, object framework
- A common way of identifying what a given piece of data is, by assigning it a unique URI within a framework called RDF
- A set of ontologies that relate different semantic concepts to one another
Dave’s example is bang on for the basic concept: “TBL” (subject), “will be presenting a keynote at” (verb), “MIT eBusiness Conference” (object). I think the semantic web is complementary to what Google does. Google knows that something is authoritative because of the link relationships it has. I link to and am linked to by a lot of sites, and therefore my articles float higher in the results than a page that isn’t linked to by or doesn’t itself link to anything else.
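To make the subject, verb, object idea a little more concrete, here is a minimal Python sketch of Dave’s example written as a single statement. The `Triple` type, the `to_ntriples` helper, and all three `example.org` URIs are my own placeholders for illustration; they are not identifiers that TBL or anyone else has actually published.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject, a verb, and an object, each named by a URI."""
    subject: str
    verb: str
    obj: str


# Hypothetical URIs, made up for this sketch.
TBL = "http://example.org/people/tim-berners-lee"
KEYNOTES_AT = "http://example.org/terms/presentsKeynoteAt"
CONFERENCE = "http://example.org/events/mit-ebusiness-2002"

statement = Triple(subject=TBL, verb=KEYNOTES_AT, obj=CONFERENCE)


def to_ntriples(t: Triple) -> str:
    """Write the statement as one line in the style of the N-Triples syntax."""
    return f"<{t.subject}> <{t.verb}> <{t.obj}> ."


print(to_ntriples(statement))
```

Because every part of the statement is a URI rather than a bare string like “TBL”, two sites that use the same URI are provably talking about the same thing.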
But that only goes so far. If I’m searching for information on a common noun like jaguar, Google doesn’t know a priori whether the information that it returns to me is about Jaguar the car, jaguar the animal, Jaguar the Atari gaming device, etc. The search engine Manjara at Yale can take a stab at separating the links by clustering the pages based on commonalities in the words on each page. But the semantic web concept gives the author power to identify what he’s writing about by unequivocally expressing the linkage to a semantic definition through RDF.
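Here is a rough sketch of how that disambiguation could work once authors have labeled their pages. Everything in it is assumed for illustration: the page URLs, the three concept URIs, the `ABOUT` verb, and the `pages_about` function are all hypothetical, and a real search engine would do far more than filter a list.

```python
# Hypothetical concept URIs: one per meaning of the word "jaguar".
JAGUAR_CAR = "http://example.org/concepts/jaguar-car"
JAGUAR_CAT = "http://example.org/concepts/jaguar-animal"
JAGUAR_CONSOLE = "http://example.org/concepts/jaguar-atari"

# Hypothetical verb meaning "this page is about that concept".
ABOUT = "http://example.org/terms/about"

# Statements the page authors might publish: (subject, verb, object).
statements = [
    ("http://example.org/pages/xj-review", ABOUT, JAGUAR_CAR),
    ("http://example.org/pages/rainforest-cats", ABOUT, JAGUAR_CAT),
    ("http://example.org/pages/tempest-2000", ABOUT, JAGUAR_CONSOLE),
]


def pages_about(concept, triples):
    """Return the pages whose authors said they are about the given concept."""
    return [s for (s, verb, obj) in triples if verb == ABOUT and obj == concept]


print(pages_about(JAGUAR_CAT, statements))
```

The word “jaguar” never has to be matched at all; the search is on the concept URI the author chose, not on the text of the page.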
What about my example? It extends the idea of all this data being ontologically parseable and imagines that data being passed about by Web Services. So if you can express through a URI linkage what you mean by the title of a weblog entry, or the price of a contract line item, then the system receiving the web service call can interpret your request more reliably, without the two sides having to meet, system by system, and agree on a taxonomy beforehand.
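A small sketch of what I have in mind, under heavy assumptions: the two term URIs, the receiver’s local field names, and the `interpret` function are all invented here, and a real web service call would carry this data in whatever envelope the service defines rather than a bare Python dict.

```python
# Hypothetical URIs that pin down what each value in a request means.
ENTRY_TITLE = "http://example.org/terms/weblogEntryTitle"
LINE_ITEM_PRICE = "http://example.org/terms/contractLineItemPrice"

# The receiving system's own field names, keyed by the URIs it understands.
LOCAL_FIELDS = {
    ENTRY_TITLE: "headline",
    LINE_ITEM_PRICE: "unit_price",
}


def interpret(payload):
    """Map URI-labeled values onto the receiver's own field names.

    Values labeled with a URI the receiver does not recognize are skipped
    rather than guessed at.
    """
    return {
        LOCAL_FIELDS[uri]: value
        for uri, value in payload.items()
        if uri in LOCAL_FIELDS
    }


request = {
    ENTRY_TITLE: "Notes on the Semantic Web",
    LINE_ITEM_PRICE: "19.99",
    "http://example.org/terms/somethingElse": "ignored",
}

print(interpret(request))
```

The sender and receiver never agreed on the names “headline” or “unit_price”; they only had to share the URIs, which is the part an ontology would supply.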
Alas, I didn’t get to talk to TBL before he was hustled out. Maybe I can get a chance to open a dialogue with him before I graduate and see if I’m on track.