Using the Semantic Web to Enhance Search
RobMcCool writes "At Stanford KSL, we really like the Semantic Web. So we've taken many of our favorite web sites, scraped them, and put together a huge pile of RDF, which we'll let you download. We've used that RDF to create a search application, in the spirit of Google Q & A or Microsoft's recently announced MSN Search extensions. Our search can answer simple factual queries like the previously discussed population of Portugal, but can also answer some more complex ones. We also have a smart autocomplete system: type "tom hanks birth" slowly to see it in action (best with Firefox). We're looking for people to be a part of this search system by running their own search sites, and by putting their data on the Semantic Web. Come check it out!"
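For readers who haven't seen RDF-style data, here is a minimal sketch of how a factual query such as "tom hanks birth" could be answered from scraped triples. It is only an illustration, not the Stanford KSL implementation: the predicate names (`birthDate`, `birthPlace`), the in-memory list, and the `lookup` and `complete` helpers are all assumptions made for this example.

```python
from typing import List, Tuple

# A triple is (subject, predicate, object). Predicate names are hypothetical.
Triple = Tuple[str, str, str]

TRIPLES: List[Triple] = [
    ("Tom Hanks", "birthDate", "1956-07-09"),
    ("Tom Hanks", "birthPlace", "Concord, California"),
]


def lookup(triples: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object stored for (subject, predicate), ignoring case in the subject."""
    wanted = subject.lower()
    return [o for s, p, o in triples if s.lower() == wanted and p == predicate]


def complete(triples: List[Triple], subject: str, prefix: str) -> List[str]:
    """Suggest predicates of `subject` whose names start with `prefix` (a toy autocomplete)."""
    wanted = subject.lower()
    found = {p for s, p, _ in triples
             if s.lower() == wanted and p.lower().startswith(prefix.lower())}
    return sorted(found)


print(lookup(TRIPLES, "tom hanks", "birthDate"))
print(complete(TRIPLES, "tom hanks", "birth"))
```

A real system would sit on a proper RDF store and a query language rather than a Python list, but the shape of the question (subject plus predicate, return the object) is the same.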
Secondly, scraping doesn't always work, and you will surely have low-grade porn and get-rich-quick schemes and scams littering your semantic data.
But let us suppose that the main benefits of a semantic web are (A) access to reference data [which may be falsified, oops], and (B) access to product availability data [which may be falsified, oops, like mail order companies that pretend they have something in stock but don't and yet still charge your credit card].
It just won't work.
It will always be a rough approximation of reality.
It's just a bad way of caching the results of scraping.
The Semantic Web appears to be a budding server-side solution to the paradigm of information glut online. Social bookmarking appears to be a client-side solution to the paradigm of information glut online.
It is refreshing to see exciting new solutions to the problems we have at present of targeted information retrieval on the internet. I can remember years of stagnation in this field (read: early 90's), and any change from today's google-and-pray searching mentality among the majority of end-users will be welcome.
The Crimson Dragon
The best part is that the W3C looks down on the business rules world and openly snubs it. For a long time, the W3C camp snubbed the RETE algorithm, claiming RDF graphs are better. Once people saw how horribly RDF engines perform as rule counts and data volumes increase, they rushed to hack together junk and label it RETE. Sorry, but you have to first understand RETE to implement it. A clueless bunch of impractical daydreamers.
Does it have a countermeasure against 'semantic spam'?
Note to self: the dream of the whole world tagging all its data isn't going to happen. It takes too much damn time. Semantics-driven search using Google's technique works. Producing an RDF graph is crap. Nothing to watch here.
That second link goes to http://www.google.com/url?sa=U&start=1&q=http://www.w3.org/2001/sw/&e=9707
How is that different to linking to http://www.w3.org/2001/sw/?
Is Slashdot trying to improve someone's Google ranking?
(Also, did Slashdot always linkify URLs entered as plaintext? I didn't write any "a href" for those two.)
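To make the difference between the two links concrete, here is a small sketch that pulls the real destination out of a redirect-style URL. It assumes the destination is carried in the `q` query parameter, as in the URL quoted in this comment; the `unwrap_redirect` name is made up for the example, and the behavior of Google's redirector itself is not something this code checks.

```python
from urllib.parse import parse_qs, urlsplit


def unwrap_redirect(url: str, param: str = "q") -> str:
    """Return the target stored in `param` of a redirect URL, or the URL itself if absent."""
    query = parse_qs(urlsplit(url).query)
    targets = query.get(param)
    return targets[0] if targets else url


wrapped = "http://www.google.com/url?sa=U&start=1&q=http://www.w3.org/2001/sw/&e=9707"
print(unwrap_redirect(wrapped))
```

The wrapped form sends the click through the redirector first, which is presumably why the link in the story looks the way it does.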
# cat
Damn, my RAM is full of llamas.
I don't think the evidence on the RDF mailing lists supports that opinion. Look at the literature in the bookstores about the Semantic Web. If anything, it is full of confusion, and the specification is poorly written compared to the HTML and XML specifications.
A triple does not equal (subject, verb, object). What the RDF spec describes is closer to natural language parsing concepts. There are many similarities between what the spec describes as the RDF model graph and dependency grammar techniques: http://w3.msi.vxu.se/~nivre/research/sdg.html
Anyone remotely interested in NLP knows the problem is very hard to solve using dependency grammar techniques. Statistical approaches have been shown to perform much better.
The Semantic Web is essentially repeating the same mistakes already made in the AI world with NLP. The W3C seems blind to these facts, and that's why the Semantic Web is doomed to fail.
This statement is why I was wondering why this was considered such a wonderful thing. For a while now, there's been a research project at IBM called WebFountain that not only does everything that the Semantic Web attempts to do, but doesn't require any special markup either. Its goal is to work with completely unstructured data of any type, including web pages, PowerPoint documents, Word docs, PDFs, etc. Based on the article I linked above (which is 18 months old), it seems the Semantic Web is actually much more primitive.
More to the point, in this blog there was an article on WebFountain. In the comments section there was this mention of WebFountain in an RDF/OWL environment:
To me, that hit the nail on the head as to why a markup-based semantic engine is doomed to failure. While the remark was made in a business context, I think it's just as valid in any context.
The