StefanDecker · Slashdot Mirror


User: StefanDecker

StefanDecker's activity in the archive.

Stories: 0
Comments: 6
First seen: 2007-05-04
Last seen: 2007-05-04
Profile: (view on slashdot.org)

Comments · 6

Re:I'll prove him wrong on Super-Fast RDF Search Engine Developed · 2007-05-04 07:09 · Score: 2, Funny

OK, I concede. You won.
Some people can overestimate the importance ;-)
Re:Save the hype on Super-Fast RDF Search Engine Developed · 2007-05-04 06:43 · Score: 1

I guess in the early days of the Web many people said the same thing - why bother if nobody is providing HTML pages and nobody is using HTML browsers (in fact, I remember that time very well).
Of course building a web of data is more demanding - the infrastructure is far more complicated.
But we have made tremendous progress over the last few years, to the point where structured data coming from applications like wikis, mailing lists, and bulletin boards can, should, and will be integrated. And progress is being made, e.g., with things like FOAF or SIOC (see http://sioc-project.org/).
The service http://pingthesemanticweb.com/ provides a good overview. Progress may be slow, but Metcalfe's Law did prevail in the past. Why should it not in the future?
And what is the alternative?
Re:RDF is a bad idea on Super-Fast RDF Search Engine Developed · 2007-05-04 06:26 · Score: 1

Zarf, you are absolutely correct that raw RDF data can be polluted if crawled naively. That is exactly why newer applications use not the simple triple model but quads, where the last argument may represent the source of the data. This data model is called named graphs.
So once the source is recorded, one can do trust computations over the graph and its source, e.g., using PageRank-like algorithms. Some sources can be assigned a low trust value and others a higher one, based on their spam content and their adoption by a web community (just like conventional web pages using PageRank).
Indeed, the implementation that DERI reported on uses named graphs for exactly that reason, and Aidan is working on a ranking algorithm that takes the source of the data into account.
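
To make the quad idea concrete, here is a minimal sketch in Python. It is not the DERI implementation or Aidan's ranking algorithm; the `Quad` type, the `source_trust` and `rank_statements` helpers, the example hosts, and the link map between sources are all assumptions made for illustration. It stores each statement with its source, runs a plain PageRank-style iteration over links between sources, and scores each triple by the summed trust of the sources asserting it.

```python
from collections import defaultdict
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str  # the named graph, i.e. where the triple was crawled from


def source_trust(links, damping=0.85, iterations=50):
    """PageRank-style power iteration over a {source: [linked sources]} map."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node in nodes:
            # A source with no outgoing links spreads its rank evenly.
            targets = links.get(node) or list(nodes)
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def rank_statements(quads, trust):
    """Score each triple by the summed trust of the sources asserting it."""
    scores = defaultdict(float)
    for q in quads:
        scores[(q.subject, q.predicate, q.obj)] += trust.get(q.source, 0.0)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: nobody links to spam.example, so it ends up with low trust.
links = {
    "a.example": ["b.example"],
    "b.example": ["a.example"],
    "spam.example": ["a.example"],
}
quads = [
    Quad("ex:alice", "foaf:knows", "ex:bob", "a.example"),
    Quad("ex:alice", "foaf:knows", "ex:bob", "b.example"),
    Quad("ex:alice", "foaf:knows", "ex:spammer", "spam.example"),
]

trust = source_trust(links)
ranked = rank_statements(quads, trust)
```

With triples alone, the spam statement would be indistinguishable from the other two; keeping the fourth element is what lets the ranking push it to the bottom.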
Re:Two things... on Super-Fast RDF Search Engine Developed · 2007-05-04 04:08 · Score: 1

First: The experiments have been done on an 18-node cluster of cheap servers.
Second: There are other ways to get metadata, e.g., via SIOC (see http://sioc-project.org/). But true, trust is an issue. And some people in DERI Galway are working on ranking algorithms on top of the search engine.
Re:Hype on Super-Fast RDF Search Engine Developed · 2007-05-04 04:03 · Score: 2, Insightful

We have a Technical Report available at http://www.deri.ie/fileadmin/documents/DERI-TR-2007-04-20.pdf that should answer most of the technical questions. From the abstract: "We present the architecture of an end-to-end search engine that uses a graph data model to enable interactive query answering over structured and interlinked data collected from many disparate sources on the Web. In particular, we study distributed indexing methods for graph-structured data and parallel query evaluation methods on a cluster of computers. We evaluate the system on a dataset with 430 million statements collected from the Web, and provide scale-up experiments on 7 billion synthetically generated statements."
Re:This could be huge on Super-Fast RDF Search Engine Developed · 2007-05-04 03:58 · Score: 1

Fully agreed. But it worked for RSS, and it also seems to work for SIOC (see http://sioc-project.org/). Other XML-structured formats are also catching on, e.g., XBRL. All of them can be (quite easily) translated into a graph and integrated. So there is hope. However, Andreas and Aidan's work reported on in the press release enables us to build scalable engines; scalability was a major headache before.
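
As a rough illustration of how an XML format like RSS maps onto a graph, the sketch below turns a tiny RSS 2.0 feed into (subject, predicate, object) triples using only the Python standard library. The sample feed, the `rss_to_triples` helper, and the `rss:title` / `rss:item` predicate labels are made up for this example and are not an official RSS-to-RDF mapping.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, used only for illustration.
RSS_SAMPLE = """<rss version="2.0"><channel>
  <title>Example feed</title>
  <link>http://example.org/</link>
  <item><title>First post</title><link>http://example.org/1</link></item>
  <item><title>Second post</title><link>http://example.org/2</link></item>
</channel></rss>"""


def rss_to_triples(xml_text):
    """Translate an RSS 2.0 document into a list of triples."""
    channel = ET.fromstring(xml_text).find("channel")
    channel_uri = channel.findtext("link")
    triples = [(channel_uri, "rss:title", channel.findtext("title"))]
    for item in channel.findall("item"):
        item_uri = item.findtext("link")
        # One edge from the channel to the item, one for the item's title.
        triples.append((channel_uri, "rss:item", item_uri))
        triples.append((item_uri, "rss:title", item.findtext("title")))
    return triples


triples = rss_to_triples(RSS_SAMPLE)
```

Because the links act as node identifiers, triples produced from different feeds or formats can be merged into one graph simply by concatenating the lists.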