Super-Fast RDF Search Engine Developed

Posted by ScuttleMonkey on Friday May 4, 2007 @02:27AM from the google-to-buy-ireland dept.

The Register is reporting that Irish researchers have developed a new high-speed RDF search engine capable of answering search queries with more than seven billion RDF statements in mere fractions of a second. "'The importance of this breakthrough cannot be overestimated,' said Professor Stefan Decker, director of DERI. 'These results enable us to create web search engines that really deliver answers instead of links. The technology also allows us to combine information from the web, for example the engine can list all partnerships of a company even if there is no single web page that lists all of them.'"

Official DERI Website by achillean · 2007-05-04 02:34 · Score: 3, Informative

Here's the link to the official NUIG: DERI (omgwtfbbq) website in Ireland:

DERI
1. Re:Official DERI Website by PDHoss · 2007-05-04 03:23 · Score: 4, Funny
  
  I tried to access that site, and I got a good look at their DERI Error.
  
  --
  ======================================
  Writers get in shape by pumping irony.
This could be huge by $RANDOMLUSER · 2007-05-04 02:35 · Score: 4, Interesting

Except for the minor little problem of getting everyone to agree on the ontologies. Being able to search quickly is important, but until somebody comes up with the Dewey Decimal System for all knowledge, it won't mean much.

--
No folly is more costly than the folly of intolerant idealism. - Winston Churchill
1. Re:This could be huge by complete+loony · 2007-05-04 03:00 · Score: 3, Insightful
  
  Ah, but the Dewey Decimal system only works because responsible people are involved in categorizing everything. They let just anyone publish information on the internet these days.
  
  --
  09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
2. Re:This could be huge by spemen · 2007-05-04 04:07 · Score: 2, Interesting
  
  Actually there is a lot of research being done to get around the need for a 'Dewey Decimal System'. The idea is to analyze relations between terms (names, datatypes, etc.) in an ontology. One could also compare relationships between terms: A child of B, C child of D, and A=B does B==A ?? Please note that these are examples of how terms and ontologies *could* be matched and not necessarily how someone would match terms. http://www.ontologymatching.org/ Also, http://wordnet.princeton.edu/ is a project I think is in the direction of a 'Dewey Decimal System' for knowledge.
Links! by SolitaryMan · 2007-05-04 02:37 · Score: 3, Insightful

These results enable us to create web search engines that really deliver answers instead of links.

I need both: answers *and* links! Many times when I search the web, I don't know for sure what I am searching for, let alone being able to ask a specific question...

--
May Peace Prevail On Earth
1. Re:Links! by Red+Flayer · 2007-05-04 03:39 · Score: 2, Funny
  
  RDF could do very useful things, like throwing up a disambiguation question at the top of the results page when you've not made it clear what you want
  It looks like you're trying to search for tentacle porn. Would you like help?
  
  No thanks, I don't need Clippy in my search engine.
  
  --
  "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
Search solved. World hunger next. by 140Mandak262Jamuna · 2007-05-04 02:38 · Score: 3, Funny

Having solved the problem of search and provided a breakthrough product that gives consciousness to what was previously a mere series of tubes, the National University of Ireland has now announced that it is going to solve world hunger next, maybe in three months. Other projects in the pipeline include a cure for cancer and solving the full Navier-Stokes equations.

--
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
Hype by gvc · 2007-05-04 02:42 · Score: 4, Insightful

users should get more relevant results

Yet another /. article parroting an uncritical popular press account of a press release.
1. Re:Hype by StefanDecker · 2007-05-04 04:03 · Score: 2, Insightful
  
  We have a Technical Report available at http://www.deri.ie/fileadmin/documents/DERI-TR-2007-04-20.pdf that should answer most of the technical questions. From the abstract: "We present the architecture of an end-to-end search engine that uses a graph data model to enable interactive query answering over structured and interlinked data collected from many disparate sources on the Web. In particular, we study distributed indexing methods for graph-structured data and parallel query evaluation methods on a cluster of computers. We evaluate the system on a dataset with 430 million statements collected from the Web, and provide scale-up experiments on 7 billion synthetically generated statements."
Re:Great!! by $RANDOMLUSER · 2007-05-04 02:43 · Score: 4, Informative

Now all we need to do is get everyone to start using RDF.... wait.. you don't even know what that is??
It's the Resource Description Framework, which RSS is a subset of.

--
No folly is more costly than the folly of intolerant idealism. - Winston Churchill
Next up: Ontology spam by G4from128k · 2007-05-04 02:44 · Score: 5, Insightful

Yes, creating a consistent ontology is a challenge. But the bigger challenge is the lack of incentive for ontology truthfulness. If this type of search becomes popular, ontology spam and OSEO (Ontology Search Engine Optimization) will become a booming industry.

--
Two wrongs don't make a right, but three lefts do.
1. Re:Next up: Ontology spam by treeves · 2007-05-04 06:59 · Score: 2, Insightful
  
  Ontology SPAM is OK, but Epistemology Spread is really yummy!
  
  --
  ...the future crusty old bastards are already drinking the Kool-Aid.
RDF? by lancelotlink · 2007-05-04 02:46 · Score: 4, Funny

I didn't realize Steve Jobs' Reality Distortion Field was able to be harnessed and bottled in a search engine, or any software for that matter. His abilities are boundless!
I'll prove him wrong by Big+Nothing · 2007-05-04 02:57 · Score: 3, Interesting

"'The importance of this breakthrough cannot be overestimated,' said Professor Stefan Decker, director of DERI."

This is without a doubt the greatest invention in the history of time!

There, I just proved the professor wrong. Muahaha.

--
SIG: TAKE OFF EVERY 'CAPTAIN'!!
1. Re:I'll prove him wrong by StefanDecker · 2007-05-04 07:09 · Score: 2, Funny
  
  OK, I concede. You won.
  Some people can overestimate the importance ;-)
Cannot be overestimated by stevenp · 2007-05-04 02:59 · Score: 4, Insightful

- "The importance of this breakthrough cannot be overestimated"

The importance of any event can be overestimated and quite often is overestimated. It is called hype.
When speaking of XML, XHTML and semantic WEB then the word "overestimated" fits just nice.
If this was not the case then HTML should long have been dead and the whole WEB should have been based on pure XML with meaningful tags.

-- Do not read me, I am a stupid tag
TMA: Too Many Acronyms by EccentricAnomaly · 2007-05-04 03:05 · Score: 2, Insightful

Why assume everyone knows your acronyms. To me RDF means "Reality Distortion Field". Zeesh, 7 billion triples or whatever.

--
There are 10 types of people in this world, those who can count in binary and those who can't.
1. Re:TMA: Too Many Acronyms by QuickFox · 2007-05-04 03:17 · Score: 4, Funny
  
  Why assume everyone knows your acronyms.
  
  OMG: Oh my God!
  WTF: What the fuck?
  BBQ: Barbecue.
  
  HTH
  
  --
  Terrorists can't threaten a country's freedom and democracy. Only lawmakers and voters can do that.
Could be interesting, but missing details by Anonymous Coward · 2007-05-04 03:11 · Score: 5, Interesting

What kind of data set did they use? The structure and contents of the graph that is the data in an RDF database have a huge impact on the performance of query execution, and different applications have different structures.

What kind of queries are they running? There are several different RDF query languages (think of SeRQL, RDQL, N3, SPARQL, etcetera) and some of them support quite complex queries. Quickly finding the answers to a simple query like
SELECT ?name WHERE { ?name <http://xmlns.com/foaf/0.1/name> "John Smith" }
is just a matter of an indexed lookup and not very special. But, like in SQL, much more complex expressions can be generated that require complex index operations on the query execution level. Having implemented an RDF database that supports SPARQL queries an order of magnitude faster than the software the W3C uses for their experiments (which, admittedly, doesn't have performance as a prime requirement), I know that it's possible to do simple things fast, but the interesting part is handling RDF queries that don't easily map to efficient database operations.
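
To illustrate why a single-pattern query like the one above reduces to an indexed lookup, here is a minimal Python sketch. It is a toy, not how SWSE or any real RDF store is built: the `TripleIndex` class, its method names, and the example.org subjects are all hypothetical, and the only thing it shows is that keeping a (predicate, object) index makes "find every subject with this name" a single dictionary access.

```python
from collections import defaultdict

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


class TripleIndex:
    """Toy in-memory triple store (hypothetical, for illustration only).

    It keeps one index keyed by (predicate, object), so a pattern of the
    form `?s <predicate> "object"` is answered with one dictionary lookup.
    """

    def __init__(self):
        self._po_index = defaultdict(set)

    def add(self, subject, predicate, obj):
        # Register the subject under its (predicate, object) pair.
        self._po_index[(predicate, obj)].add(subject)

    def subjects(self, predicate, obj):
        # .get() avoids creating empty entries for patterns with no match.
        return sorted(self._po_index.get((predicate, obj), set()))


store = TripleIndex()
store.add("http://example.org/people/1", FOAF_NAME, "John Smith")
store.add("http://example.org/people/2", FOAF_NAME, "Jane Doe")
store.add("http://example.org/people/3", FOAF_NAME, "John Smith")

matches = store.subjects(FOAF_NAME, "John Smith")
print(matches)
```

Queries that join several patterns, or that bind variables in positions this index does not cover, cannot be answered this way, which is the harder case the comment is pointing at.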

Which brings me to the most important point: where is their detailed report? Can I get the software somewhere and perform my own tests? The article is too vague to draw any conclusions about what their RDF database does, and how good it is. I'd love to read up on it, but I can't seem to find the information.
SUPER Speed by phoric · 2007-05-04 03:13 · Score: 2, Funny

Colonel Sandurz: Prepare ship for light speed. Dark Helmet: No, no, no. Light speed is too slow. Colonel Sandurz: Light speed is too slow? Dark Helmet: Yes. We're gonna have to go right to... SUPER speed. [everybody gasps] Colonel Sandurz: SUPER speed? Sir, we've never gone that fast before. I don't know if this ship can take it. Dark Helmet: What's the matter Colonel Sandurz? Chicken? Colonel Sandurz: [Whimpering] Prepare ship! [Calms down] Colonel Sandurz: Prepare ship, for Ludicrous speed. Fasten all seat belts. [everybody fastens in their seat belts and locks all of the doors] Colonel Sandurz: Seal all entrances and exits. Lock all stores in the mall. Cancel the 3-ring circus. Secure all animals in the zoo... Dark Helmet: [Takes the intercom from Sandurz] Gimme that, you petty excuse for an officer! [speaks into the intercom as Sandurz puts on his seat belt] Dark Helmet: Now hear this, Ludicrous speed... Colonel Sandurz: [Interrupts] Sir, you better buckle up. Dark Helmet: [to Sandurz] Ah, buckle this. [Into the intercom] Dark Helmet: SUPER speed, go!
1. Re:SUPER Speed by VWJedi · 2007-05-04 04:59 · Score: 2, Informative
  
  [Whimpering] Prepare ship! [Calms down] Colonel Sandurz: Prepare ship, for Ludicrous speed. Fasten all seat belts.
  
  If you're going to steal a joke, you need to make sure to replace all references to the original. Find / Replace works great for this.
Re:Great!! by jrumney · 2007-05-04 03:18 · Score: 2, Informative

Actually, only RSS 1.0 is based on RDF. The only similarity between RDF and the more popular RSS 2.0 and RSS 0.92 is that they are all based on XML.
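
A quick way to see the difference in practice: an RSS 1.0 document has `rdf:RDF` as its root element, while RSS 2.0 and 0.92 documents have a plain `rss` root with a `version` attribute. The Python sketch below checks only that root element. The `feed_flavor` function and the two tiny sample feeds are made up for illustration, and real feeds would need more careful handling.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def feed_flavor(xml_text):
    """Classify a feed by its root element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # ElementTree writes namespaced tags as "{namespace-uri}localname".
    if root.tag == "{%s}RDF" % RDF_NS:
        return "RSS 1.0 (RDF-based)"
    if root.tag == "rss":
        return "RSS %s (plain XML)" % root.get("version", "unknown")
    return "unknown"


# Minimal sample documents, invented for this example.
rss10 = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns="http://purl.org/rss/1.0/">'
    '<channel rdf:about="http://example.org/feed"><title>Example</title></channel>'
    "</rdf:RDF>"
)
rss20 = '<rss version="2.0"><channel><title>Example</title></channel></rss>'

print(feed_flavor(rss10))
print(feed_flavor(rss20))
```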
Here's the Tech Report by aharth · 2007-05-04 03:19 · Score: 5, Informative

Hello, I am one of the main developers of SWSE. True, the press release is vague, but there is only so much you can say in a press release aimed at the general public.

We have a Technical Report available at http://www.deri.ie/fileadmin/documents/DERI-TR-2007-04-20.pdf that should answer most of the technical questions.

From the abstract:

"We present the architecture of an end-to-end search engine that uses a graph data model to enable interactive query answering over structured and interlinked data collected from many disparate sources on the Web.

In particular, we study distributed indexing methods for graph-structured data and parallel query evaluation methods on a cluster of computers.

We evaluate the system on a dataset with 430 million statements collected from the Web, and provide scale-up experiments on 7 billion synthetically generated statements."
1. Re:Here's the Tech Report by $RANDOMLUSER · 2007-05-04 03:37 · Score: 3, Insightful
  
  You are too modest. You're the lead author. Congratulations on a first-rate contribution to mankind. And such a young pup, too.
  
  --
  No folly is more costly than the folly of intolerant idealism. - Winston Churchill
sounds fishy by vga_init · 2007-05-04 03:50 · Score: 2, Interesting

Of course a search based on meta data is going to be faster and more accurate, but only when the meta data is correct. We've had this since the beginning of the interweb; people would load up their pages with bogus meta data just to generate search traffic. Because of this dishonesty, search engines have had to resort to other methods of evaluating and indexing pages (for example, based on actual content).

I don't see any difference between this new RDF and that old stuff.
Developer on this project by aidhog · 2007-05-04 04:18 · Score: 3, Informative

As one of the developers on the project (along with user aharth), feel free to ask any specific questions you may have here. The article is quite vague and so I refer you to a technical report at http://www.deri.ie/fileadmin/documents/DERI-TR-2007-04-20.pdf/.
1. Re:Developer on this project by aidhog · 2007-05-04 23:55 · Score: 2, Informative
  
  Text search on literals is supported through Lucene. Only vanilla keyword searches are currently supported. The query syntax for the indexing is a subset of SPARQL. The code is Java and has yet to be officially released.
Fixed URL by CaptSolo · 2007-05-04 05:36 · Score: 2, Informative

2 all: remove the ending slash '/' from the URL above, it will work then.

Correct link: http://www.deri.ie/fileadmin/documents/DERI-TR-2007-04-20.pdf