Extracting Meaning From Millions of Pages
freakshowsam writes "Technology Review has an article on a software engine, developed by researchers at the University of Washington, that pulls together facts by combing through more than 500 million Web pages. TextRunner extracts information from billions of lines of text by analyzing basic relationships between words. 'The significance of TextRunner is that it is scalable because it is unsupervised,' says Peter Norvig, director of research at Google, which donated the database of Web pages that TextRunner analyzes. The prototype still has a fairly simple interface and is not meant for public search so much as to demonstrate the automated extraction of information from 500 million Web pages, says Oren Etzioni, a University of Washington computer scientist leading the project." Try the query "Who has Microsoft acquired?"
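To make "analyzing basic relationships between words" concrete, here is a minimal sketch of pulling (subject, relation, object) triples out of sentences and answering a query like the one above. It is not TextRunner's method: the summary describes an unsupervised system, while this toy uses a single hand-written regular expression. The names `extract_triples` and `query` and the sample sentences are made up for illustration.

```python
import re
from collections import Counter

# Assumption: an "entity" is a run of capitalized words and a "relation"
# is the run of lowercase words sitting between two entities.
ENTITY = r"[A-Z][A-Za-z0-9]*(?: [A-Z][A-Za-z0-9]*)*"
RELATION = r"[a-z]+(?: [a-z]+)*"
TRIPLE = re.compile(rf"({ENTITY}) ({RELATION}) ({ENTITY})")

def extract_triples(sentences):
    """Count every (subject, relation, object) triple found in the text."""
    counts = Counter()
    for sentence in sentences:
        for subj, rel, obj in TRIPLE.findall(sentence):
            counts[(subj, rel, obj)] += 1
    return counts

def query(counts, subject, relation):
    """Return the objects seen with this subject and relation, most frequent first."""
    hits = Counter()
    for (subj, rel, obj), n in counts.items():
        if subj == subject and rel == relation:
            hits[obj] += n
    return [obj for obj, _ in hits.most_common()]

# Toy stand-in for a crawl of Web pages.
pages = [
    "Microsoft acquired Hotmail in 1997.",
    "Microsoft acquired Hotmail, a webmail service.",
    "Microsoft acquired Visio.",
    "Google acquired YouTube.",
]

triples = extract_triples(pages)
print(query(triples, "Microsoft", "acquired"))
```

Note that the ranking is by how often a claim is repeated, not by whether it is true, so a sketch like this simply echoes whatever its input pages say most often.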
I suppose the major problem with this is that it cannot tell the difference between truth and lies or urban legends. It just repeats what other people have said, even if they are conspiracy theorists. The query "Who killed JFK?" suggests the CIA did it.
I tried half a dozen queries of the sort I often use Google for (example: "What is the velocity of sound in hydraulic fluid?"). No answers.
Try "Who paid SCO?" Concise, to the point. Nice.
Allowing a search engine to visit a site and allowing somebody to pass your web page content around are two completely different things.
I would go with...
But meters per second and miles per hour? WHY?!
I think you're missing the point. This is an AI project - it's research. Presumably, the questions you are typing in haven't been processed by a complicated nest of if-thens written by someone who knows English; instead, statistical models of language and meaning were extracted from the internet. Some people claim this is the equivalent of "teaching" a computer.
The first approach, hand-coded if-then rules, is what most search engines use. It leads to impressive search results but is limited by the logic people can code up. This AI, on the other hand, may be a primitive example of the way Google will work 15 years from now.