Slashdot Mirror


Super-Fast RDF Search Engine Developed

The Register is reporting that Irish researchers have developed a new high-speed RDF search engine capable of answering search queries with more than seven billion RDF statements in mere fractions of a second. "'The importance of this breakthrough cannot be overestimated,' said Professor Stefan Decker, director of DERI. 'These results enable us to create web search engines that really deliver answers instead of links. The technology also allows us to combine information from the web, for example the engine can list all partnerships of a company even if there is no single web page that lists all of them.'"
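
As a rough illustration of the "combine information from the web" claim, here is a minimal sketch of merging statements that come from different pages. It is not the DERI engine's design: the `ex:` names, the `ex:partnerOf` predicate, the example.org pages and both helper functions are made up for the example.

```python
from collections import defaultdict

# Hypothetical statements: (subject, predicate, object, source page).
# No single page lists every partner of ex:AcmeCorp.
STATEMENTS = [
    ("ex:AcmeCorp", "ex:partnerOf", "ex:Globex", "http://example.org/page-a"),
    ("ex:AcmeCorp", "ex:partnerOf", "ex:Initech", "http://example.org/page-b"),
    ("ex:Globex", "ex:partnerOf", "ex:Umbrella", "http://example.org/page-c"),
    ("ex:AcmeCorp", "ex:locatedIn", "ex:Galway", "http://example.org/page-a"),
]


def build_index(statements):
    """Index statements by (subject, predicate) so a lookup avoids a full scan."""
    index = defaultdict(list)
    for subject, predicate, obj, source in statements:
        index[(subject, predicate)].append((obj, source))
    return index


def partners_of(index, company):
    """Return the company's partners, merged across every source page."""
    matches = index.get((company, "ex:partnerOf"), [])
    return sorted({obj for obj, _source in matches})


index = build_index(STATEMENTS)
print(partners_of(index, "ex:AcmeCorp"))  # ['ex:Globex', 'ex:Initech']
```

The source page is kept next to each statement, so a caller could also report which page each answer came from.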

12 of 144 comments

  1. Links! by SolitaryMan · · Score: 3, Insightful

    These results enable us to create web search engines that really deliver answers instead of links.

    I need both: answers *and* links! Many times when I search the web, I don't know for sure what I am searching for, let alone how to ask a specific question...

    --
    May Peace Prevail On Earth
  2. Great!! by Anonymous Coward · · Score: 0, Insightful

    Now all we need to do is get everyone to start using RDF.... wait.. you don't even know what that is??

  3. Hype by gvc · · Score: 4, Insightful

    users should get more relevant results


    Yet another /. article parroting an uncritical popular press account of a press release.
    1. Re:Hype by StefanDecker · · Score: 2, Insightful

      We have a Technical Report available at http://www.deri.ie/fileadmin/documents/DERI-TR-2007-04-20.pdf that should answer most of the technical questions. From the abstract: "We present the architecture of an end-to-end search engine that uses a graph data model to enable interactive query answering over structured and interlinked data collected from many disparate sources on the Web. In particular, we study distributed indexing methods for graph-structured data and parallel query evaluation methods on a cluster of computers. We evaluate the system on a dataset with 430 million statements collected from the Web, and provide scale-up experiments on 7 billion synthetically generated statements."

  4. Next up: Ontology spam by G4from128k · · Score: 5, Insightful

    Yes, creating a consistent ontology is a challenge. But the bigger challenge is the lack of incentive for ontology truthfulness. If this type of search becomes popular, ontology spam and OSEO (Ontology Search Engine Optimization) will become a booming industry.

    --
    Two wrongs don't make a right, but three lefts do.
    1. Re:Next up: Ontology spam by treeves · · Score: 2, Insightful

      Ontology SPAM is OK, but Epistemology Spread is really yummy!

      --
      ...the future crusty old bastards are already drinking the Kool-Aid.
  5. Cannot be overestimated by stevenp · · Score: 4, Insightful

    - "The importance of this breakthrough cannot be overestimated"

    The importance of any event can be overestimated and quite often is overestimated. It is called hype.
    When speaking of XML, XHTML and the semantic Web, the word "overestimated" fits just fine.
    If this were not the case, HTML would long since be dead and the whole Web would be based on pure XML with meaningful tags.

    -- Do not read me, I am a stupid tag

  6. Re:This could be huge by Anonymous Coward · · Score: 1, Insightful

    Except for the minor little problem of getting everyone to agree on the ontologies. Being able to search quickly is important, but until somebody comes up with the Dewey Decimal System for all knowledge, it won't mean much. How about the Dewey Decimal System?
  7. Re:This could be huge by complete+loony · · Score: 3, Insightful

    Ah, but the Dewey Decimal system only works because responsible people are involved in categorizing everything. They let just anyone publish information on the internet these days.

    --
    09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
  8. TMA: Too Many Acronyms by EccentricAnomaly · · Score: 2, Insightful

    Why assume everyone knows your acronyms? To me RDF means "Reality Distortion Field". Zeesh, 7 billion triples or whatever.

    --
    There are 10 types of people in this world, those who can count in binary and those who can't.
  9. Re:Here's the Tech Report by $RANDOMLUSER · · Score: 3, Insightful

    You are too modest. You're the lead author. Congratulations on a first-rate contribution to mankind. And such a young pup, too.

    --
    No folly is more costly than the folly of intolerant idealism. - Winston Churchill
  10. RDF is a bad idea by Zarf · · Score: 1, Insightful

    I just read the basics of RDF and I can see that this could be a really really bad idea. If RDF is intended as an internal data representation for a search engine company to use then this is great. The search engine company or your own company's search engine staff can police and audit your RDF data. However, if I'm reading this right RDF is *supposed* to be populated by *volunteered* data. As such you're going to suffer not just the Wikipedia effect but all the problems seen in MetaData from an internet generation ago.

    You'll see RDF associations linking the president to a crass picture of a donkey or a goat of some kind. You'll see companies set up to deliberately poison RDF data with false links designed to drive traffic to a site... you'll see sock-puppets and all kinds of other attacks.

    This whole effort reminds me of the "this is spam" bit that was proposed to stop spam. You can't expect spammers to say to themselves, "wait, I better flip the this-is-spam bit to true before I send this." You also can't expect people not to abuse the RDF system in similar ways.

    Don't expect that if you RDF search for Stephen King that everything that comes up was actually posted by him. Imagine the pages that would get attributed to the president or Mr. T as a prank... the information would only be useful if you could verify the document as legitimate first.

    The "is part of" feature is the most likely target of abuse, I think. I could say that everything I wrote is part of the New York Times or part of some official document that gets searched for often. The result would be erroneous hits in RDF search and artificial authority for my crackpot theories.

    --
    [signature]