Slashdot Mirror


Using the Semantic Web to Enhance Search

RobMcCool writes "At Stanford KSL, we really like the Semantic Web. So we've taken many of our favorite web sites, scraped them, and put together a huge pile of RDF, which we'll let you download. We've used that RDF to create a search application, in the spirit of Google Q&A or Microsoft's recently announced MSN Search extensions. Our search can answer simple factual queries like the previously discussed population of Portugal, but can also answer some more complex ones. We also have a smart autocomplete system: type "tom hanks birth" slowly to see it in action (best with Firefox). We're looking for people to be a part of this search system by running their own search sites, and by putting their data on the Semantic Web. Come check it out!"

6 of 150 comments

  1. This won't work by holyshitholyshit · · Score: 2, Interesting
    Firstly, scraping is the same as what Google does, which is fine, but only a fool would trust the scraper not to censor its output.

    Secondly, scraping doesn't always work, and you will surely have low-grade porno and get-rich-quick schemes/scams littering your semantic data.

    But let us suppose that the main benefits of a semantic web are (A) access to reference data [which may be falsified, oops], and (B) access to product availability data [which may be falsified, oops, like mail order companies that pretend they have something in stock but don't and yet still charge your credit card].

    It just won't work.

    It will always be a rough approximation of reality.

    It's just a bad way of caching the results of scraping.

  2. A tale of two technologies.... by Crimson Dragon · · Score: 3, Interesting

    The Semantic Web appears to be a budding server-side solution to the paradigm of information glut online. Social bookmarking appears to be a client-side solution to the paradigm of information glut online.

    It is refreshing to see exciting new solutions to the problems we have at present of targeted information retrieval on the internet. I can remember years of stagnation in this field (read: early 90's), and any change from today's google-and-pray searching mentality among the majority of end-users will be welcome.

    --
    The Crimson Dragon
  3. My question by News for nerds · · Score: 4, Interesting

    Does it have a countermeasure against 'semantic spam'?

    1. Re:My question by smartdreamer · · Score: 2, Interesting

      There is no such thing as semantic spam. What you refer to is disinformation or information junk. Like the current web, the semantic web is about freedom, openness and accessibility. So, everybody can publish (I don't refer to government laws, repression, etc.). But the semantic web has a solution to this wave of information in a thing called the web of trust, which proposes giving trust rankings to information and introducing inference engines to compute which links/sites may interest you and why. But this is not for today. ;)

  4. Slashdotting Google bomb? by bcmm · · Score: 2, Interesting

    That second link goes to http://www.google.com/url?sa=U&start=1&q=http://www.w3.org/2001/sw/&e=9707
    How is that different to linking to http://www.w3.org/2001/sw/?

    Is Slashdot trying to improve someone's Google ranking?

    (Also, did Slashdot always linkify URLs entered as plaintext? I didn't write any "a href" for those two.)

    --
    # cat /dev/mem | strings | grep -i llama
    Damn, my RAM is full of llamas.
  5. Re:Google watch out... by ShinmaWa · · Score: 2, Interesting
    However, it does place a lot of demand on the content provider to provide metadata-rich content

    This statement is why I was wondering why this was considered such a wonderful thing. For a while now, there's been a research project at IBM called WebFountain that not only does everything that the Semantic Web attempts to do, but doesn't require any special markup either. Its goal is to work with completely unstructured data of any type, including web pages, PowerPoint documents, Word docs, PDFs, etc. Based on the article I linked above (which is 18 months old), it seems the Semantic Web is actually much more primitive.

    More to the point, in this blog there was an article on WebFountain. In the comments section there was this mention of WebFountain in an RDF/OWL environment:
    if everyone were to agree on a tag set and apply it consistently, and tag everything of possible business interest, then yes, WebFountain would not be so relevant...and people would also need to tag for things that they don't even know will be businesses in 50 years [...] We'll see if that pans out!
    To me, that hit the nail on the head as to why a markup-based semantic engine is doomed to failure. While the remark was made in a business context, I think it's just as valid in any context.
    --
    The /. Effect: Thousands of users simultaneously accessing a site to not read its content.