Slashdot Mirror


Extracting Meaning From Millions of Pages

freakshowsam writes "Technology Review has an article on a software engine, developed by researchers at the University of Washington, that pulls together facts by combing through more than 500 million Web pages. TextRunner extracts information from billions of lines of text by analyzing basic relationships between words. 'The significance of TextRunner is that it is scalable because it is unsupervised,' says Peter Norvig, director of research at Google, which donated the database of Web pages that TextRunner analyzes. The prototype still has a fairly simple interface and is not meant for public search so much as to demonstrate the automated extraction of information from 500 million Web pages, says Oren Etzioni, a University of Washington computer scientist leading the project." Try the query "Who has Microsoft acquired?"

138 comments

  1. Try the query.... by Finallyjoined!!! · · Score: 3, Funny

    "Who has dumped Vista?"

    --
    If I had an Ass, I'd call it Fanny Bottom, then I could slap my Ass; Fanny Bottom, on the Arse.
    1. Re:Try the query.... by Anonymous Coward · · Score: 0

      If I had a fanny I'd be a woman

    2. Re:Try the query.... by Anonymous Coward · · Score: 0

      No, try "Who has VA Linux acquired?"

    3. Re:Try the query.... by fmarkham · · Score: 1

      "How long before TextRunner is slashdotted?"

    4. Re:Try the query.... by drinkypoo · · Score: 1

      Oh man, it's the new sucks-rules-o-meter for sure. Who hates vista: 55 results. Who loves vista: 11 results. Obviously, vista blows hairy goats. It becomes even more clear when you look at the actual results: somehow "Bookmark Islamic Screensaver download-All people (12) love screensaver-Windows Vista Downloads" counts as a hit.. ahh there, a reload with js and the spam disappears leaving 9 :D

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    5. Re:Try the query.... by conspirator57 · · Score: 1

      What happens to TextRunner once it is slashdotted?

      --
      "If still these truths be held to be
      Self evident."
      -Edna St. Vincent Millay
    6. Re:Try the query.... by VulpesFoxnik · · Score: 1

      The singularity begins.

      --
      RES PUBLICA NON DOMINETUR
    7. Re:Try the query.... by Chninkel · · Score: 1
      Why the query

      Who has not been acquired by Microsoft

      doesn't return Yahoo ?
      actually it doesn't return any result ...

    8. Re:Try the query.... by Nutria · · Score: 1

      Better yet: "Why does Windows suck?"

      Retrieved 0 results for Why does Windows suck?.

      Being in Washington, MSFT has obviously paid them off to filter out unpleasant results.

      --
      "I don't know, therefore Aliens" Wafflebox1
    9. Re:Try the query.... by maxume · · Score: 3, Funny

      I tried to read your comment, but I did not attempt to understand it.

      --
      Nerd rage is the funniest rage.
    10. Re:Try the query.... by Anonymous Coward · · Score: 0

      And if your ass was limping, you could slap your bum ass, Fanny, on the Arse. sorry to threadjack with a sig comment, no one else will give this a read anyhow.

  2. Not entirely helpful by CRCulver · · Score: 5, Interesting

    I suppose the major problem with this is that it cannot tell the difference between truth and lies or urban legends, it just repeats what other people have said, even if they are conspiracy theorists. The query "Who killed JFK?" suggests the CIA did it.

    1. Re:Not entirely helpful by Random2 · · Score: 1

      Yeah, it's something you'd have to cross-reference, but the main use I see for it is the initial search for information. You ask a question, it gives some answers, then you type them into yahoo or something to look them up/verify what it said. This could be a huge help for things that one may not know a lot about.

      --
      "Our goal each year should be to increase the number of goals we set for ourselves!"
    2. Re:Not entirely helpful by John+Hasler · · Score: 2, Insightful

      The major problem is that it assumes the presence of meaning in Web pages in the first place.

      --
      Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
    3. Re:Not entirely helpful by morgan_greywolf · · Score: 2, Interesting

      Actually, just like any other search, it just shows ALL of the likely results and you are still responsible for determining for yourself which of the statements is true. It says "CIA killed JFK" but the first result it returns is "Lee Harvey Oswald killed JFK". It also seems to pare down the results somewhat, because I know I've seen conspiracies also suggesting that the KGB killed JFK, or that the Mafia killed JFK. I'm guessing that more people think the CIA killed JFK than the KGB or the Mafia.

    4. Re:Not entirely helpful by owlnation · · Score: 4, Funny

      I suppose the major problem with this is that it cannot tell the difference between truth and lies or urban legends, it just repeats what other people have said, even if they are conspiracy theorists. The query "Who killed JFK?" suggests the CIA did it.

      So much like Wikipedia then?

    5. Re:Not entirely helpful by datapharmer · · Score: 1

      shhhh! we aren't supposed to talk about the grassy knoll or the bad men will come again.

      --
      Get a web developer
    6. Re:Not entirely helpful by L4t3r4lu5 · · Score: 1

      ... and yet "Who was responsible for the World Trade Centre attacks?" returns no results...

      [/tinfoilhat]

      --
      Finally had enough. Come see us over at https://soylentnews.org/
    7. Re:Not entirely helpful by Anonymous Coward · · Score: 0

      I suppose the major problem with this is that it cannot tell the difference between truth and lies or urban legends, it just repeats what other people have said, even if they are conspiracy theorists. The query "Who killed JFK?" suggests the CIA did it.

      For more fun, try "Who blew up the WTC?" , "Who developed the AIDS virus?" , or "Who controls world power?"

      But to call this a problem with TextRunner is a bit unfair. It's still an interesting tool for looking at the content of the web. Yeah, the web is mostly populated by kooks, but that's life, isn't it?

    8. Re:Not entirely helpful by phatsphere · · Score: 1

      You know, one of the main reasons Google is scanning books is to feed them into an AI!

    9. Re:Not entirely helpful by ericrost · · Score: 1

      That would be because "centre" is spelled center. The correct spelling yields plenty of results.

    10. Re:Not entirely helpful by Anonymous Coward · · Score: 0

      CIA hasn't killed JFK?

    11. Re:Not entirely helpful by L4t3r4lu5 · · Score: 1

      Damn my correct spelling of English words! I suppose as a proper noun, I can forgive this slip up.

      --
      Finally had enough. Come see us over at https://soylentnews.org/
    12. Re:Not entirely helpful by jerep · · Score: 2, Insightful

      it just repeats what other people have said

      I don't see anything new here, most people have done this since the beginning of time.

    13. Re:Not entirely helpful by thedonger · · Score: 2, Funny

      it just repeats what other people have said

      I don't see anything new here, most people have done this since the beginning of time.

      Yeah, Textrunner just repeats what other people have said, like most people since the beginning of time.

      --
      Help fight poverty: Punch a poor person.
    14. Re:Not entirely helpful by somersault · · Score: 2, Interesting

      I suppose the major problem with this is that it cannot tell the difference between truth and lies or urban legends

      Most humans can't either, how do you expect a search engine to?

      There will be a lot of false positives and negatives that will be hard to identify as such unless it directly works with something like snopes.com , which kind of defeats the purpose because it means someone has had to research every question anyway.

      If a project like this simply scoured the whole 'net, you wouldn't really be able to verify anything beyond people's opinions or beliefs, which may or may not be 'true'.

      I think something like this would work really well for factual results if it was only allowed to draw conclusions from verified sources, say something like Wikipedia articles that have been verified by experts in the appropriate field (I've not been following all this type of thing recently but perhaps that is what Wolfram Alpha does already). It could perhaps be useful to have it search the general internet for supplementary results for some questions though, especially those of a philosophical nature where it may be impossible to establish definite answers ("is there a god" and the like).

      --
      which is totally what she said
    15. Re:Not entirely helpful by Anonymous Coward · · Score: 0

      Shut the fuck up, idiot

    16. Re:Not entirely helpful by L4t3r4lu5 · · Score: 1

      Your originality is surpassed only by your insight.

      --
      Finally had enough. Come see us over at https://soylentnews.org/
    17. Re:Not entirely helpful by db10 · · Score: 0, Redundant

      Yeah, Textrunner just repeats what other people have said, like most people since the beginning of time.

    18. Re:Not entirely helpful by knewter · · Score: 1

      And you suggest they didn't?

      --
      -knewter
    19. Re:Not entirely helpful by msbmsb · · Score: 2, Informative

      Semantic processing systems like this (it's not something new) aren't usually able to determine correctness. The truth of a statement is assumed and the best these NLP engines can do at the moment is identify conflicts and maybe use some reputation metrics to assign a veracity rating to a particular statement, or notify the user that there are differing conclusions. These systems are just really, like the summary states, "information extraction" systems. Just as a regular search engine will return you the results from the data set, that's what these types of semantic extraction engines usually do, except the data is processed in a semantically-organized way so that you can query with semantics/natural language constraints instead of just keywords and boolean constraints.

      There are some that incorporate some intention or opinion polarity detection, but even those are not capable of sorting "truth" versus "conspiracy".

      Additionally, semantic extraction output, like named entities and semantic relations, is useful for many other applications.
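
As a rough illustration of the conflict detection described above, the sketch below groups extracted (subject, relation, object) assertions by relation and object, counts how often each subject appears, and flags any query with more than one candidate answer. The sample assertions and the names `summarize_assertions` and `has_conflict` are made up for this example; none of this is TextRunner's actual code or data.

```python
from collections import Counter, defaultdict

def summarize_assertions(triples):
    """Group (subject, relation, object) triples by (relation, object).

    Returns a dict mapping each (relation, object) pair to a list of
    (subject, count) pairs sorted by descending count.
    """
    groups = defaultdict(Counter)
    for subject, relation, obj in triples:
        groups[(relation, obj)][subject] += 1
    return {key: counter.most_common() for key, counter in groups.items()}

def has_conflict(ranked_subjects):
    """True when more than one distinct subject fills the same slot."""
    return len(ranked_subjects) > 1

# Hypothetical extracted assertions, one per source sentence.
sample = [
    ("Lee Harvey Oswald", "killed", "JFK"),
    ("CIA", "killed", "JFK"),
    ("CIA", "killed", "JFK"),
    ("chlorine", "kills", "bacteria"),
]

summary = summarize_assertions(sample)
for (relation, obj), ranked in summary.items():
    answers = ", ".join(f"{subject} ({count})" for subject, count in ranked)
    note = " [differing conclusions]" if has_conflict(ranked) else ""
    print(f"{answers} {relation} {obj}{note}")
```

The counts only measure how often an assertion is repeated in the input, which is why a popular conspiracy theory can outrank the accepted answer.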

    20. Re:Not entirely helpful by Brett+Buck · · Score: 1

      Damn those search engines that presume the exact spelling of proper nouns!

    21. Re:Not entirely helpful by Anonymous Coward · · Score: 0

      It's kind of like Wikipedia, then...

    22. Re:Not entirely helpful by Anonymous Coward · · Score: 0

      Or it could just mean that more people wrote about it.

      Many subject areas have vocal minorities. Also, authors often write to stir controversy rather than disseminate factual information (the latter having been deemed too boring).

    23. Re:Not entirely helpful by holmstar · · Score: 1

      Let's hope they stay away from the horror section.

    24. Re:Not entirely helpful by halivar · · Score: 1

      Damn my correct spelling of English words!

      I'll bite. What's the correct English pronunciation for "chauvinism"?

    25. Re:Not entirely helpful by hoooocheymomma · · Score: 1

      Why did the CIA kill JFK?

    26. Re:Not entirely helpful by Anonymous Coward · · Score: 0

      John Sladek figured this one out years (decades) ago:

      Since everyone in the world remembers where they were when JFK was shot, you just need to *ask* everyone - and arrest anyone who remembers being in Dealey Plaza with a rifle in his/her hand...

    27. Re:Not entirely helpful by Walt+Dismal · · Score: 1

      When humans read, we maintain a running context against which we compare each new sentence. We build a background model and fit knowledge into it, and use it to judge validity of possible interpretations of sentences. I have a problem with systems like TextRunner that purport to extract meaning from single and simple Subject-Verb-Object types of structures. The problem is the lack of broader comparison with existing knowledge and any attempt to reconcile truth and meaning with it. I guess another way of putting it is, TextRunner is a 'face value truths' system, and that's so easily corrupted by specious inputs. I believe they're going down an easy path for knowledge extraction but not the right one. Also, they're missing a vital key element, which is that interpretation of sentences is highly dependent on cultural contexts and their system has no provision for that. Being able to make sense of "I put the luggage in the boot" depends on whether you're English or American.

  3. So someone donated a copy of my copyrighted pages? by Anonymous Coward · · Score: 0

    What the heck.

  4. Nascent AI? by Drakkenmensch · · Score: 4, Funny
    I've always viewed intelligence as the ability to take unrelated facts and create new and original ideas from their synthesis. This project may very well lead to new ideas to create the first true AI.

    I'll start stockpiling food and armor piercing rounds for the moment Skynet goes live.

    1. Re:Nascent AI? by thedonger · · Score: 1

      I've always viewed intelligence as the ability to take unrelated facts and create new and original ideas from their synthesis.

      Intelligence, like insanity, is finding links between seemingly unrelated facts. It can also be keen observation and recognition of interactions between things where others see chaos. Either way, truly unrelated things are just that: unrelated.

      --
      Help fight poverty: Punch a poor person.
    2. Re:Nascent AI? by thedonger · · Score: 1

      I should add that the distinction between intelligence and insanity blurs as the relationship between the facts becomes weaker. Well, at least to the observer of the intelligent/insane person.

      --
      Help fight poverty: Punch a poor person.
    3. Re:Nascent AI? by gurps_npc · · Score: 1
      It's easy to create new and original ideas. A random generator can do that. What is difficult is selecting the valuable ones.

      I generally view intelligence as the ability to detect and recognize patterns. If you are good at exact patterns, that is math/logic/science. If you are good at general patterns, then we are talking art/creativity/language.

      Computers have ALWAYS been good at recognizing exact patterns. But they generally need a human to first detect the pattern. They have never been good at recognizing general patterns.

      --
      excitingthingstodo.blogspot.com
    4. Re:Nascent AI? by Anonymous Coward · · Score: 0

      Don't worry. It isn't self-conscious yet!
      Try "what is text runner?"

    5. Re:Nascent AI? by 2obvious4u · · Score: 1

      Skynet is already live and it is us. Google: unmanned military vehicles; if you don't believe me. The way we disassociate killing from a distance with the act itself is scary.

  5. 500 million web pages can't be wrong by Dunbal · · Score: 4, Funny

    Yet strangely, I get a result of:

    TextRunner took 9 seconds.
    Retrieved 0 results for what is the airspeed velocity of an unladen swallow?.

    Meh, call me when this stuff can answer the really USEFUL questions in life.

    --
    Seven puppies were harmed during the making of this post.
    1. Re:500 million web pages can't be wrong by eulernet · · Score: 1

      Simply because grepping 500 million pages is slow.

    2. Re:500 million web pages can't be wrong by Anonymous Coward · · Score: 0

      Well, what did you expect... Did you mean an African or a European swallow?

    3. Re:500 million web pages can't be wrong by JDHannan · · Score: 3, Funny

      And even worse:

      Retrieved 0 results for what is the answer to life, the universe and everything?.

    4. Re:500 million web pages can't be wrong by datapharmer · · Score: 1

      Yep. I concur - it couldn't even tell me what the number 42 was used for!

      --
      Get a web developer
    5. Re:500 million web pages can't be wrong by sukotto · · Score: 4, Funny

      Obviously it's not indexing http://www.style.org/unladenswallow/

      estimate that the average cruising airspeed velocity of an unladen European Swallow is roughly 11 meters per second, or 24 miles an hour.

      --
      Come play free flash games on Kongregate!
    6. Re:500 million web pages can't be wrong by maxwell+demon · · Score: 1

      Well, at least you got 0 results. With "where is New York" I got -1 result!

      --
      The Tao of math: The numbers you can count are not the real numbers.
    7. Re:500 million web pages can't be wrong by maxwell+demon · · Score: 1

      Just found out: If you just type "airspeed velocity", you'll get as first two results:

      airspeed velocity of an unladen swallow is roughly 11 meters (10), 24 miles (9), 10 meters (2)
      average cruising airspeed velocity of an unladen European Swallow is 24 mph (2)

      It seems to have trouble understanding units, but otherwise the information is found.

      --
      The Tao of math: The numbers you can count are not the real numbers.
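
The "11 meters" and "24 miles" figures in those results are the same speed expressed in different units, which a quick conversion shows. This is plain arithmetic using the standard definition of the international mile (1609.344 m); the helper name `mps_to_mph` is made up for this example.

```python
METERS_PER_MILE = 1609.344  # international mile, exact by definition
SECONDS_PER_HOUR = 3600

def mps_to_mph(meters_per_second):
    """Convert a speed from meters per second to miles per hour."""
    return meters_per_second * SECONDS_PER_HOUR / METERS_PER_MILE

print(round(mps_to_mph(11), 1))  # the "roughly 11 meters" per second figure
print(round(mps_to_mph(10), 1))  # the less common "10 meters" figure
```

11 m/s works out to about 24.6 mph, so the first two answers agree once the units are normalized.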
    8. Re:500 million web pages can't be wrong by smallfries · · Score: 1

      Try: "where is the colleseum"

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    9. Re:500 million web pages can't be wrong by bertoelcon · · Score: 1

      Does that mean it sucked the knowledge out of your head? Rather than give a 0 or a positive that would add more knowledge?

      --
      Anything can be found funny, from a certain point of view.
    10. Re:500 million web pages can't be wrong by bertoelcon · · Score: 1

      Given the number of pages on the web, it is highly plausible that 500 million pages can be wrong. It is much harder to believe that 500 million incorrect pages are the entire database it runs.

      --
      Anything can be found funny, from a certain point of view.
  6. Textrunner confirms it by Anonymous Coward · · Score: 0

    "Retrieved 0 results for Is Linux ready for the desktop?."

    1. Re:Textrunner confirms it by maxwell+demon · · Score: 1

      Even worse: I asked "What is Slashdot?" and the first result was "Digg is Slashdot" ...

      --
      The Tao of math: The numbers you can count are not the real numbers.
  7. Zero results by John+Hasler · · Score: 2, Interesting

    I tried half a dozen queries of the sort I often use Google for (example: "What is the velocity of sound in hydraulic fluid?"). No answers.

    --
    Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
    1. Re:Zero results by n30na · · Score: 1

      You're confusing this with wolfram alpha, methinks.

    2. Re:Zero results by John+Hasler · · Score: 1

      I didn't find Wolfram Alpha much help with such queries either. Besides, I just followed their advice on how to use it.

      --
      Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
  8. Concise by moogsynth · · Score: 2, Interesting

    Try "Who paid SCO?" Concise, to the point. Nice.

    1. Re:Concise by Anonymous Coward · · Score: 0
      I'm stealing your karma points for when the site gets slashdotted. The answer of course is

      Linux customers paid SCO $ 10 million (2)
      plans to pay SCO fees (2)
      Microsoft paid SCO approximately $ 16 million (2)

  9. Towards a web with only one page: Google by Anonymous Coward · · Score: 1, Insightful

    Are we moving towards a web in which Google centralises everything on their own pages? These new engines present content without the need to visit pages it originates from. Is Google basically mooching off other people's websites with hardly anything - if anything at all - in return?

    It could be dangerous if the only visitor a web site can expect is the Google bot.

  10. Re:So someone donated a copy of my copyrighted pag by morgan_greywolf · · Score: 1

    The same copyrighted pages that you allowed Google to crawl since you obviously didn't protect it with a robots.txt?
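
For reference, a robots.txt rule and the check a well-behaved crawler performs can be sketched with Python's standard `urllib.robotparser` module. The rules, the user agent string and the example.com URLs below are invented for illustration, and robots.txt is only a request that a crawler may or may not honor.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content asking all crawlers to skip /private/.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A polite crawler asks before fetching each URL.
blocked = parser.can_fetch("Googlebot", "http://example.com/private/page.html")
allowed = parser.can_fetch("Googlebot", "http://example.com/index.html")
print(blocked, allowed)
```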

  11. Re:So someone donated a copy of my copyrighted pag by Anonymous Coward · · Score: 0

    What? You've found a search engine that honors robots.txt?

  12. A million monkeys on keyboards... by Gothmolly · · Score: 1

    But AOL is nothing like Shakespeare.

    --
    I want to delete my account but Slashdot doesn't allow it.
    1. Re:A million monkeys on keyboards... by Phoenixlol · · Score: 1

      it's a lot like the porter

  13. Re:So someone donated a copy of my copyrighted pag by Anonymous Coward · · Score: 2, Interesting

    Allowing a search engine to visit a site and allowing somebody to pass your web page content around are two completely different things.

  14. Not entirely helpful predicting the future? by thijsh · · Score: 1

    Why deal with uncertainties about who-killed-who in the past, when you can have a lot more fun with what could be in the future.
    "Who killed obama?" ... seems an inside job by Hillary is most probable just below a vicious murder by Ted Nugent. Scary!

  15. what causes cancer? by umundane · · Score: 5, Funny

    I learned that

    > smoking (387) causes cancer.

    I was also surprised to learn that

    > girls and women (11) cause most cases of cervical cancer

    This is a great resource if you need to cite a reference for a Wikipedia article.

  16. TextRunner confirms it: by guruevi · · Score: 4, Funny

    Who is at Area 51
    aliens (3), Carter (2), Colonel Sanders (2), Hi Group (2) is at Area 51

    Who bombed WTC
    Al Qaeda (5), Bush (5), Clinton (2), 4 more... bombed the WTC

    Who built the pyramids (example on site):
    Egyptians (298), aliens (73), Pharaohs (40), 77 more... built the pyramids

    What contains antioxidants (example on site):
    Coffee (17), Recent scientific research (15), food (6), 5 more... contain significant amounts of antioxidants

    -- man, I gotta get me some more recent scientific research.

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
    1. Re:TextRunner confirms it: by houghi · · Score: 1

      Retrieved 0 results for what is the answer to life, the universe and everything?.

      --
      Don't fight for your country, if your country does not fight for you.
    2. Re:TextRunner confirms it: by Anonymous Coward · · Score: 0

      That one's too easy :-)

    3. Re:TextRunner confirms it: by mpthompson · · Score: 1

      Who bombed WTC
      Al Qaeda (5), Bush (5), Clinton (2), 4 more... bombed the WTC

      So, what I really want to know is who those "4 more" were...

  17. Wikipedia tried and failed by Anonymous Coward · · Score: 1

    That is how Wikipedia was meant to be. A group of statements about subjects, all of which can be referenced to some original source. So that people can look up something quickly and then look at the sources for more definite information....

    Seeing how many people cite Wikipedia directly, use it as the main source for their research and the number of newspapers that have been reported to directly quote inaccurate facts from Wikipedia... I don't think it is working properly. It requires a lot of optimism to believe "People will use that as an initial source and then verify the information"

    1. Re:Wikipedia tried and failed by Colonel+Korn · · Score: 3, Insightful

      That is how Wikipedia was meant to be. A group of statements about subjects, all of which can be referenced to some original source. So that people can look up something quickly and then look at the sources for more definite information....

      Seeing how many people cite Wikipedia directly, use it as the main source for their research and the number of newspapers that have been reported to directly quote inaccurate facts from Wikipedia... I don't think it is working properly. It requires a lot of optimism to believe "People will use that as an initial source and then verify the information"

      That's not wikipedia's failure. Those same people would just be referencing nothing or a web site with zero public review and commenting without it.

      --
      "I zero-index my hamsters" - Willtor (147206)
  18. Slashdot is not ... by Xyberu · · Score: 2

    Slashdot isn't
            a professional news site
            a normal news site
            a social news site
            a News Site
            a valid source
            a reputable source
            the right source
            a healthy online community
            a goddamn online community
            a Terrorist Organization

    1. Re:Slashdot is not ... by unfasten · · Score: 1
      Ah, but look at what Slashdot is:

      Slashdot is the single most important english site (8), another extremely sophisticated example (4), another online community (3), 15 more...

  19. I'd like to see it extracting Millions of Meanings by Klistvud · · Score: 0

    ...from me being completely silent, mouth shut and all, like my wife does! And she never had a single reboot in 43 years! Then again ... maybe that's precisely the problem?

    --
    Intellectual Property: an immaterial non-entity, most fiercely contended by those with no proper intellect to speak of.
  20. Bah useless by Veretax · · Score: 1

    I tried asking the real name of Doctor Who, and the site basically crapped out LOL, totally useless.

  21. meters per second or miles per hour? what? by Anonymous Coward · · Score: 1, Interesting

    I would go with...

    • ...meters per second and kilometers per hour
    • ...feet per second and miles per hour
    • ...feet per second and meters per second
    • ...miles per hour and kilometers per hour

    But meters per second and miles per hour? WHY?!

    1. Re:meters per second or miles per hour? what? by Evanisincontrol · · Score: 1

      They wanted a metric measure and a standard measure. Meters per second is a reasonable metric measure for something slow(er than a car), and miles per hour... is basically the only speed measure that Americans understand. (No flaming, I'm American).

    2. Re:meters per second or miles per hour? what? by sukotto · · Score: 1

      It's in reference to Monty Python... you're lucky it was comprehensible at all.

      and now for something completely different...

      --
      Come play free flash games on Kongregate!
  22. User invalid, deleting user by uncanny · · Score: 1

    I typed in "how does a computer become self aware?" it just said something about it being busy because it's currently controlling california!

    1. Re:User invalid, deleting user by John+Hasler · · Score: 1

      So the Governator isn't responsible for the financial mess there!

      --
      Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
    2. Re:User invalid, deleting user by PPH · · Score: 1

      Come on now, be specific. What it actually said was, "I'll be back".

      --
      Have gnu, will travel.
  23. online as in real life by xerxesVII · · Score: 1

    "Who is your daddy?" got 0 results.

    --
    "We shall grapple with the ineffable, and see if we may not eff it after all." - Douglas Adams
  24. 'How do I shot web?' by John+Guilt · · Score: 1

    No answers.

  25. Oops by danwesnor · · Score: 1

    The significance of TextRunner is that it is scalable because it is unsupervised.

    That's what they said about SkyNet.

    1. Re:Oops by maxwell+demon · · Score: 1
      --
      The Tao of math: The numbers you can count are not the real numbers.
  26. source code ? implementation details? by Anonymous Coward · · Score: 0

    is this closed source ? Any idea what language this is implemented in ?

    1. Re:source code ? implementation details? by Zappy · · Score: 1

      #
      # An unexpected error has been detected by HotSpot Virtual Machine:
      #
      #  SIGSEGV (0xb) at pc=0xb77acafa, pid=21855, tid=1833073568
      #
      # Java VM: Java HotSpot(TM) Server VM (1.5.0_14-b03 mixed mode)
      # Problematic frame:
      # V  [libjvm.so+0x23dafa]
      #
      # An error report file with more information is saved as hs_err_pid21855.log
      #
      # If you would like to submit a bug report, please visit:
      #   http://java.sun.com/webapps/bugreport/crash.jsp
      #
      Abort

  27. Human cities torture? by YourExperiment · · Score: 1

    Apparently Mount Marcy, Mount Elbrus, Mount Kilimanjaro and Mount Etna are all the highest mountain. Then again, I was also informed that "high mountains are the hum of human cities torture", so I think I'll just steer clear of mountains altogether.

  28. Screw that by Anonymous Coward · · Score: 0

    Why is my TV suddenly not working anymore?

  29. "What is Slasdot?" by Arthur+B. · · Score: 1

    Try
    "What is Slasdot?"

    Answer
    Digg is Slashdot

    --
    \u262D = \u5350
  30. I asked the obvious.... by smfreegard · · Score: 1

    Retrieved 0 results for what is the answer to life, the universe and everything.

    FAIL!

    1. Re:I asked the obvious.... by nigham · · Score: 1

      That needs a much bigger computer, IIRC. Roughly the size and complexity of Earth, its ecosystem and its organisms.

      --
      I don't want to read /. I want to go home and re-think my life.
  31. something i've never heard of by n30na · · Score: 1

    Turns out to be way cooler than Wolfram Alpha. Now just think if it has the whole web. Wait, scratch that, I bet wikipedia's already in there. Also, skynet.

  32. Correction.... by wowbagger · · Score: 4, Insightful

    "...that pulls together facts by combing through more than 500 million Web pages."

    Correction:

    "...that pulls together assertions by combing through more than 500 million Web pages."

    Whether those assertions are correct or even reasonable is a completely different issue.

    It might be interesting to then take those assertions and have some means to validate or invalidate them, but currently that's going to require meat, not metal.

    Now, if you could come up with some form of AI^Walgorithm to do that automatically, then you would have something.

    1. Re:Correction.... by ignavus · · Score: 1

      Correction:

      "...that pulls together assertions by combing through more than 500 million Web pages."

      I suspect it just pulls together *sentences*.

      --
      I am anarch of all I survey.
  33. Re:Exactly by bxbaser · · Score: 2, Funny

    "The query "Who killed JFK?" suggests the CIA did it"

    Hmmm....And now it's not responding because it's "slashdotted"

  34. Re:So someone donated a copy of my copyrighted pag by John+Hasler · · Score: 1

    You did.

    --
    Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
  35. "Who has Microsoft acquired?" by glwtta · · Score: 1

    So I take it this thing also hates grammar?

    --
    sic transit gloria mundi
  36. What is the meaning of life by soundhack · · Score: 1

    love (53), song (19), Life (16), 81 more... is the meaning of life

    1) of the 81 more, 42 doesn't show up anywhere
    2) the stupid javascript hiding makes copy and paste a pain

  37. Who Framed Roger Rabbit? by Mork29 · · Score: 1

    same fuckers(2) that Framed Roger Rabbit

  38. Who killed Kennedy? by aaaaaaargh! · · Score: 1
    ==> CIA (26), Lee Harvey Oswald (18), Castro (13)

    I knew it...

  39. Found the cause of global warming.... by Anonymous Coward · · Score: 0

    Retrieved 8 results for What causes global warming?

    Human(10) vommitting.BUTT PLUGS (2).

    Sorry everyone... I'll take it out.

  40. Wow, impressive, but prior art... by Pedrito · · Score: 1

    TextRunner gets rid of that manual labor. A user can enter, for example, "kills bacteria," and the engine will come up with pages that offer the insights that "chlorine kills bacteria" or "ultraviolet light kills bacteria" or "heat kills bacteria"--results called "triples"--and provide ways to preview the text and then visit the Web page that it comes from.

    Wow, incredible. Because doing a search of "kills bacteria" with the quotes on Google won't get you those kinds of results. Oh wait, yeah it will. In fact, it too will return "chlorine kills bacteria" and "ultraviolet light kills bacteria" and "heat kills bacteria". And Google also provides a way to preview the text and then visit the web page that it comes from.

    Yeah, I know, I know, they just put a bad example in the article, but it's a ridiculously bad example.

    1. Re:Wow, impressive, but prior art... by dvice_null · · Score: 1

      Query: What kills Microsoft:

      First in the list: Linux, Sony, Apple
      Other notable: Steve Jobs

    2. Re:Wow, impressive, but prior art... by dvice_null · · Score: 1

      Query: What kills Linux

      On the list, Microsoft, Dell, Apple
      And... Steve Jobs.

    3. Re:Wow, impressive, but prior art... by dvice_null · · Score: 1

      This Steve Jobs sounded like a pretty good killer, so I did a query:
      What kills Steve Jobs, the result was:

      Retrieved 0 results for what kills Steve Jobs.

    4. Re:Wow, impressive, but prior art... by rm999 · · Score: 2, Interesting

      I think you're missing the point. This is an AI project - it's research. Presumably, the questions you are typing in haven't been processed by a complicated nest of if-thens written by someone who knows English; instead, statistical models of language and meaning were extracted from the internet. Some people claim this is the equivalent of "teaching" a computer.

      The first approach, which is what most search engines do, leads to impressive search results but is limited by the logic people can code up. This AI, on the other hand, may be a primitive example of the way Google will work 15 years from now.

  41. I'm impressed by thethibs · · Score: 1

    This has to be played with to be appreciated. On request, it delivered a set of interesting papers about US-EPA misrepresentation of science. And, it returned a null result for "Has any climate model been validated?"

    This is going to be fun

    --
    I'm a Programmer. That's one level above Software Engineer and one level below Engineer.
  42. Carmen San Diego? by tech_fixer · · Score: 1

    I asked "Where in the world is Carmen San Diego?". The page threw up a Java error.

    I guess nobody really knows.

  43. In Soviet Russia... by Hurricane78 · · Score: 1

    ... you extract millions from the meaning of pages! ;)

    Sorry, couldn't resist.

    --
    Any sufficiently advanced intelligence is indistinguishable from stupidity.
    1. Re:In Soviet Russia... by DriedClexler · · Score: 1

      No, the correct Russian reversal would be,

      "In Soviet Russia, millions of pages extract meaning from YOU!"

      --
      Information theory is life. The rest is just the KL divergence.
  44. What makes grass grow? by Anonymous Coward · · Score: 0

    What makes grass grow?

    Answer - (1 thing)

    blood...

  45. Who performs warrantless wiretapping? by Anonymous Coward · · Score: 0

    TextRunner took 2 seconds.
    Retrieved 0 results for who performs warrantless wiretapping.

  46. Who is John Galt? by Anonymous Coward · · Score: 0

    0 results.

  47. Why WTC name is spelled in American by tepples · · Score: 2, Informative

    Damn my correct spelling of English words!

    Because the World Trade Center was located on American soil, its name is spelled in American dialect.

    1. Re:Why WTC name is spelled in American by Anonymous Coward · · Score: 0

      Because the World Trade Center was located on American soil, its name is spelled in American dialect.

      That's why Moscow is actually called Moskva in English-speaking countries, right?

    2. Re:Why WTC name is spelled in American by tepples · · Score: 1

      That's why Moscow is actually called Moskva in English-speaking countries, right?

      I was talking about names within a language, not names across languages. For instance, in YTMND-land, it's spelled Moskau in German.

    3. Re:Why WTC name is spelled in American by Anonymous Coward · · Score: 0

      No, it is because Americans are now the standard for the language, along with everything else that is good in the world.

  48. Not too Smart: "What is TextRunner" by Phrogman · · Score: 1

    produces 0 results :P

    --
    "The first time I got drunk, I got married. The second time I bought a chimpanzee, after that I stayed sober" Arian Seid
  49. Retrieved 1 result for does god exist by ebertx · · Score: 2, Funny

    Retrieved 1 result for does god exist. God DOES exist last night (2).

    Well, that answers that question.

  50. Slashdotted by jvkjvk · · Score: 1

    'The significance of TextRunner is that it is scalable because it is unsupervised,' says Peter Norvig, director of research at Google,

    I really wondered what he was getting at with this. It seems almost nonsensical, like something someone in marketing would come up with.

    Now that the site is slashdotted I know that he means if only a few people use it, it's very scalable, but if a bunch of people are directed to use it (say, through Slashdot) then it doesn't scale very well.

    1. Re:Slashdotted by Virak · · Score: 1

      There's nothing nonsensical about it. Just because you don't know what an unsupervised learning algorithm is doesn't mean it's just a random string of words he threw together to sound fancy.

  51. Re:I'd like to see it extracting Millions of Meani by Faerunner · · Score: 1

    If I had points, parent would be modded funny. This is an interesting resource... but it doesn't answer the real question: Coke, or Pepsi?

  52. Why is marijuana illegal? by Timtimes · · Score: 1

    No answer provided. Enjoy.

    --
    This ain't no upwardly mobile freeway This is the road to hell
  53. Who invented the internet? by Anonymous Coward · · Score: 0

    "Who invented the internet?"

    Gore (396), Americans (24), US (10) 47 more... invented the Internet

  54. Flawed. by Anonymous Coward · · Score: 0

    "Bush (34) is the best president of the US"

  55. I must admit it's rather entertaining... by Kazoo+the+Clown · · Score: 1

    Entering the query "Who is George Bush?" returned the following tidbits among other things:

    General Draper was George Bush's guru
    Hurricane Katrina is George Bush's Monica Lewinsky
    Tony Blair is George Bush's poodle
    democratic Iraq is George Bush's formidable legacy
    Iraq is George Bush's waterloo
    Hillary is the democratic version of good old George W. Bush
    blue socks are Critics of George W. Bush
    Bruce Bartlett is George W. Bush Bankrupted America
    biggest terrorist is George W. Bush

  56. Re:So someone donated a copy of my copyrighted pag by Philip_the_physicist · · Score: 1

    ITYM "parse", but spelling Nazism aside, they are extracting the ideas from the pages (or at least trying to), not the expression of ideas, so copyright doesn't come into play (IANAL, etc.). This is just an attempt at automating the collation of existing research, and indeed similar ideas have been attempted in the past with smaller data sources, particularly in combination with other work in machine learning.

  57. Tried "How do I enlarge my penis?" by spiralofhope · · Score: 1

    I got zero results. In 500 million pages, this should have been answered 500 million and one times.