Slashdot Mirror


EU Google Competitor Project Gets Aid Worth $166 Million

mernil wrote with the news that the EU Commission has given the go-ahead to provide funding for Germany's search engine project, called Theseus. Earlier this year we discussed Germany's withdrawal from the French project Quaero. From the outside, it looks like the EU Commission is unwilling to put all its eggs in one basket, funding the German project to the tune of 120 million euros, or US$166 million. Dow Jones reports: "The aim is to develop new search technologies for the next generation Internet, including 'semantic technologies which try to recognize the meaning of content and place it in its proper context.' The semantic Web has been considered the next evolution of the Internet at least since Tim Berners-Lee, widely considered a creator of the current version of the Internet, published an article describing it in 2001. In theory, a semantic Web could receive a user request for information about fishing, for example, and automatically narrow the results according to the user's individual needs rather than blanket the user with pages related to numerous aspects of fishing. The Commission's funding approval Thursday immediately sparked talk of building a potential European challenger to Web search leader Google Inc."

1 of 111 comments

  1. Next generation search technology by the_kanzure · Score: 4, Interesting

    Let the user become the crawler, and do not eliminate the search giants (just don't rely on them completely). I already operate something like a (slow) crawler, with my queues of links to read, my bookmarks (be wary: big load), my indexing of very interesting or important pages, sharing related tidbits, etc. It just feels like the natural extension, though I am sure that many people will want to stick with traditional GUIs and "back/forward" habits. There is also some interesting discussion in ATLAS-L re: future search infrastructures. So, in the spirit of promoting development in this area, linkage:

    * Grub article (now defunct), which was a distributed peer-to-peer crawler (see also)
    * Boitho, another distributed crawler
    * YaCy, another peer-to-peer crawler
    * How to build a web spider
    * C++ web crawler lib
    * LibWWW (perl)
    * W3C's WebBot
    * The Internet Archive's Heritrix crawler
    * WebSPHINX, a customizable crawler

    Somehow, this is like an extension of surfraw. I imagine that soon enough we will start up an open source crawler-browsing hybrid software package, though I have been surprised that nothing like it has popped up yet; it's (usually) the way of the programmer to make sure that he has the ability to do what the giants are doing. Maybe we have all been collectively blinded by graphical web browsers (IE, Firefox, Opera, etc.) and "click-click-click" thinkware?
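
For anyone wondering what "the user becomes the crawler" might look like in practice, here is a rough sketch in Python using only the standard library. It is not taken from any of the projects linked above. The names `ReadingQueueCrawler`, `visit`, `next_to_read` and `search` are made up for illustration, and it assumes the browser (or the user) hands over the HTML of each page as it is read, so nothing here touches the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
import re


class LinkAndTextParser(HTMLParser):
    """Collects href targets and text nodes from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


class ReadingQueueCrawler:
    """Hypothetical user-driven crawler: a to-read queue plus a tiny word index."""

    def __init__(self):
        self.queue = deque()  # links discovered but not yet read
        self.seen = set()     # every URL ever queued or read
        self.read = set()     # URLs the user has actually read
        self.index = {}       # word -> set of URLs containing it

    def enqueue(self, url):
        if url not in self.seen:
            self.seen.add(url)
            self.queue.append(url)

    def visit(self, url, html):
        """Record a page the user just read: index its words, queue its links."""
        self.seen.add(url)
        self.read.add(url)
        parser = LinkAndTextParser()
        parser.feed(html)
        words = re.findall(r"[a-z0-9]+", " ".join(parser.text).lower())
        for word in words:
            self.index.setdefault(word, set()).add(url)
        for href in parser.links:
            self.enqueue(urljoin(url, href))  # resolve relative links

    def next_to_read(self):
        """Pop the next queued URL that has not been read yet, or None."""
        while self.queue:
            url = self.queue.popleft()
            if url not in self.read:
                return url
        return None

    def search(self, query):
        """Return URLs whose text contains every word of the query."""
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return set()
        results = set(self.index.get(words[0], set()))
        for word in words[1:]:
            results &= self.index.get(word, set())
        return results
```

A browser plugin could call `visit()` on every page load and offer `next_to_read()` as a "what should I read next" button. Sharing the index between users, as the peer-to-peer crawlers do, is a separate problem this sketch does not attempt.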