Software is definitely only one piece of the equation, but I'm pretty sure we are talking millions, not billions, for the infrastructure and people to run a large search engine. Is the purpose to compete with Google and all of its various services, or to provide an EU search engine alternative? Google spends tons of cash on projects like AdWords, Gmail, GTalk, GMaps, Finance, GooTube and others that are not directly search engine services.
Why will this project cost billions when someone just needs to download an open source solution like Nutch (http://lucene.apache.org/nutch) and start injecting URLs? While Nutch's ranking algorithm is not on par with PageRank, it has parser plugins for virtually all popular doc types and should scale nicely thanks to the Hadoop distributed file system. Perhaps some European governments could even donate money or code to the project.
Presumably the reason for a European effort is coverage/content quality and privacy concerns. Nutch would address those issues... or is the real reason pride and a case of "not-invented-here"?
Ironic, indeed. There was some discussion that they don't have the personnel to maintain the project search engine. LOL