11 Things To Know About Semantic Web

henning · on Feb 17, 2008

Alan Kay said the best way to predict the future is to invent it.

I wish he would have gone further and said that if you aren't creating the future you have no business predicting it.

iowahansen · on Feb 17, 2008

One single company could affect a global semantic web adoption within 6 months: Google. Once they decide to give webpages with semantic info a higher relevance, armies of SEO experts would swiftly generate vast amounts of semantic information...

nandan · on Feb 18, 2008

Excellent point. I guess the challenge is in first determining whether adopting the strategy to give pages with semantic info a higher rank, a good one. In other words, will this improve users' satisfaction with search?

I suppose it could start out in Google Labs as an experiment. Although I m quite convinced of the power of the semantic web, I would think Google would tread carefully before changing what's working for them, now.

mtraven · on Feb 18, 2008

Google already recognizes embedded RDF metadata that describe Creative Commons licensing, and you can filter on these on their advanced search page. A long way from full semantic web adoption, but it's a toe in the water.

dualogy · on Feb 17, 2008

[ Remember, you heard it here first (or tell me where you heard it before :) ]

Without having read the article (yet), my theory is that the web will become "semantic" in the sense "semaniacs" hope for ONLY when users can generate structured content (other than just photos, articles, comments, forum posts etc) EASILY ... that is: differently from when HTML started out, site owners cannot get such a ball rolling by painstakingly fleshing out RDF mark-up manually. Rather, those web sites will be "semantic" that provide wiki-like editing for data sets.

As an example, imagine a web-based database of restaurants: - Let everyone provide tags for different categories (taste ie Indian/Italian/Regional, style ie luxury/fast food joint/middle-of-the-road, features ie garden/bar/smoking area) - Let everyone update base facts such as address or name changes - Let everyone add photos, media, reviews, comments - Most importantly: let everyone rate/confirm/deny everyone elses contributions - Reward by credits/trust/rankings but where commerce is involved, also by discounts/"miles"/whatever.

Make it easy for websites to do that, or create successful websites that do that so others follow suit should this be what people really want, and you have a semantic web in no time.

If the semantic web is what people want, this will be the only feasible way to create one. If it's not what they want, we will find out soon enough.

dualogy · on Feb 17, 2008

Okay now I've read the article. The point I was making is their bullet point 11. The only one that I thought mattered in this article. Enterprise/platform/integration win rather than consumer win? Nope, consumers decide, they always do...

anaphoric · on Feb 17, 2008

Sorry, semantic web = semantic networks a la Quillian. It's a step backward.

RDBMS work because they have a solid semantic foundation with n-ary relations. Description logic based formalism can't do n-ary, nor can RDF triple stores. The whole adventure is ill fated and us database weenies knew it from the start.

maxwell · on Feb 17, 2008

http://www.w3.org/TR/swbp-n-aryRelations/ ?

anaphoric · on Feb 18, 2008

If you read their description, you can sense that this simple capability is beyond their grasp. They are falling back on reification and complex syntax. They even promise more notes soon LOL!

Here is an example I would like to see Semantic Web people treat. No references to W3C documents, just show us the solution. It's trivial in SQL.

Supplier(supplierId, name,country)

Part(partId, name,price)

Customer(custId,name,country)

Supplies(supplierId,partId,custId)

"Give the American suppliers who supply parts under 10$ to customers in Japan".

I won't bore you with the SQL solution. Now let's see it in OWL! Good luck!

mtraven · on Feb 18, 2008

It's reasonably trivial in RDF/OWL as well; Supplies would be a class whose instances have statements that point to supplier, part, and customer. In rough SPARQL that would be something like:

select distinct ?supplier where

?supplies hasSupplier ?supplier

?supplies hasPart ?part

?supplies hasCustomer ?customer

?supplier hasCountry "US"

?customer hasCountry "Japan"

?part hasPrice ?price

filter(?price < 10)

Not that bad, maybe more verbose than SQL, but not that difficult to express. You'd probably want type constraints on the variables, too.

anaphoric · on Feb 18, 2008

OK, you are right, that wasn't so bad. Perhaps I will take a closer look at SPARQL one of these days.

BTW how does it do with things like spatial constrains. In relational databases it's easy to express spatial joins over relations. For example given

Structure(id, name, x, y, type)

you can query for things like all the structures of type 'pharmacy' within 1 km of a 'hospital'. It's simple to define functions like distance(x1,y1,x2,y2) that may be predicated on. Anything like that in SPARQL (yet)?

darreld · on Feb 17, 2008

Other than the oft-repeated 'machine readable', I'm having a hard time getting over the conceptual hump of what the Semantic Web is.

trevelyan · on Feb 17, 2008

I work with semantic analysis technology (no, it isn't profitable), and think this is a really good summary.

The useful applications long-term are in areas like machine translation (contextual semantic knowledge is important), document categorization and indexing for search (ie. semantic data-scraping). Indexing for search is the simplest of the bunch and the most likely use of the tech that we'll see soon.

RWW has it right that the major challenge this poses to Google is a proliferation of industry specific applications that monetize better information management by virtue of knowing exactly what their users want and figuring out creative ways to aggregate it. Google is particularly weak with multiple languages: their focus on language-agnostic translation tools makes them vulnerable in foreign markets.

Then again, if Google provides the tools to let these vertical portals manage their own on-site advertising, who cares who is doing the actual document indexing/analysis?

leaf · on Feb 17, 2008

It's been a really long time since I looked at this kind of stuff in any detail. The librarians I worked with at university would go on and on about semantic markup, and there were some very dedicated users that seemed to be willing to slog through the SGML nightmare. But, as soon as HTML came out it swamped the other technologies because it was so easy for people.

Has anything changed since the SGML days? Rhetorically, it all still sounds like a bunch of librarians complaining woefully that they can't do search properly if no one marks up their data properly.

trevelyan · on Feb 17, 2008

Nothing much has changed. Most of the talk about the semantic web is complete hot air. It isn't as if this stuff is counterintuitive or non-obvious -- it's just that no-one wants to go through the work of teaching machines to understand language/text without a way to profit from it.

xirium · on Feb 18, 2008

It would great if client application had understanding of context rather than blindly performing layout or having fragile scripts which scrape content. Today, if you access an ecommerce website and retreive a page which describes one or more products then your browser only obtains HTML for layout. The number of products and the fields are not apparent to the software. If the content was explicit then you could do more. For example, you could have a client which could switch to product thumbnails. You could have clients or servers performing meaningful price comparisons on more products.

For event listings, information could be added to calendar software more easily. For social networking, contacts could be added more easily. The central idea with semantic web is that layout isn't the final use of information.

michaelneale · on Feb 17, 2008

Semweb solution is to add annotations to everything (tags), so its obvious to a machine what the information means (ie, the content providers/authors do more work up front to provide more automation later). Versus the crawl the web and index approach with unstructured data (which google are very very good at, clearly).

I see it a bit like the static/dynamic typing "disagreements" people have (heck I disagree with myself, frequently).