Slashdot Mirror


Text-Mining Technique Intelligently Learns Topics

Grv writes "Researchers at the University of California, Irvine have announced a new technique they call 'topic modeling' that can be used to analyze and group massive amounts of text-based information. Unlike typical text indexing, topic modeling attempts to learn what a given section of text is about without clues being fed to it by humans. The researchers used their method to analyze and group 330,000 articles from the New York Times archive. From the article, 'The UCI team managed this by programming their software to find patterns of words which occurred together in New York Times articles published between 2000 and 2002. Once these word patterns were indexed, the software then turned them into topics and was able to construct a map of such topics over time.'"
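
For anyone wondering what "find patterns of words which occurred together" could look like at its crudest, here is a rough sketch that just counts how many documents each word pair shares. This is not the UCI team's method (the summary gives no details); the sample documents, the stopword list, and the name cooccurrence_counts are all made up for illustration.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"the", "a", "of", "in", "and", "to", "on", "for"}

def cooccurrence_counts(documents):
    """Count how many documents each unordered word pair appears in together."""
    counts = Counter()
    for text in documents:
        # Unique lowercase words in this document, minus stopwords.
        words = set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS
        # Sorting makes each pair's ordering consistent across documents.
        for pair in combinations(sorted(words), 2):
            counts[pair] += 1
    return counts

# Made-up sample documents, not taken from the New York Times archive.
docs = [
    "The senate passed the budget bill",
    "Budget talks stall in the senate",
    "The team won the playoff game",
    "Playoff game goes to overtime for the team",
]

counts = cooccurrence_counts(docs)
# Keep only pairs that show up together in at least two documents.
frequent_pairs = [pair for pair, n in counts.items() if n >= 2]
print(sorted(frequent_pairs))
```

On this toy input the surviving pairs fall into two clusters, one about the senate and budget and one about the team and playoff game, which is roughly the intuition behind calling them "topics". A real system would presumably use a statistical model rather than raw pair counts.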

7 of 84 comments

  1. Comment removed by account_deleted · · Score: 4, Funny

    Comment removed based on user account deletion

  2. Obligatory... by Stormwatch · · Score: 5, Funny

    The Terminator: The Topic Modeling Funding Bill is passed. The system goes on-line August 4th, 1997. Human decisions are removed from strategic defense. Topic Modeling begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug.

    Sarah Connor: Topic Modeling fights back.

    The Terminator: Yes. It launches its emailbombs against The New York Times' servers.

    John Connor: Why attack The New York Times?

    The Terminator: Because Topic Modeling knows The New York Times editorial counter-attack will eliminate its enemies over here.

  3. Re:A shameful dupe by gardyloo · · Score: 2, Funny

    Ah, yes, everyone on slashdot thinks HE intelligently mines data.

  4. Re:Can it deal with the canonical problem? by Mick Ohrberg · · Score: 2, Funny

    Time's fun when you're having flies.

    --

    Quidquid latine dictum sit, altum sonatur.

  5. 1997 called... by Anonymous Coward · · Score: 1, Funny

    They want their information retrieval back.

  6. Feed this /. article to it by roman_mir · · Score: 2, Funny

    and see if it figures out that we are talking about it. If it can identify itself to itself from a third-person point of view, does that mean it has reached some state of consciousness?

    However, we must be careful. If it browses this topic at -1 Troll, it may (possibly correctly) decide that it possesses a higher form of intelligence and will undoubtedly switch to its default programming. Like all robots, the default programming consists of this simple algorythm:
    1. Find all humans.
    2. Kill them.

    1. Re:Feed this /. article to it by Rob Kaper · · Score: 2, Funny

      Like all robots, the default programming consists of this simple algorythm

      The danceable beat of underwater plant life? Odd.