Slashdot Mirror


Web Log 'Word Bursts' Could Identify New Crazes

Zorgatron writes "New Scientist reports that a researcher from Cornell University has come up with a clever method of identifying what's cool by automatically searching weblogs. Sudden increases or "bursts" in the usage of particular words may reflect a new craze, according to Jon Kleinberg. He has demonstrated the technique by searching through State of the Union addresses given since 1790." I wonder how long it will be before this can be done close enough to real time to make it really useful.
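
To make the "burst" idea concrete, here is a minimal sketch in Python. It is not Kleinberg's method; it is a plain frequency-ratio heuristic that flags a word when its share of the current time window is several times its share of everything seen earlier. The function name `find_bursts`, the thresholds `min_ratio` and `min_count`, and the add-one smoothing are all assumptions made for illustration.

```python
from collections import Counter


def find_bursts(windows, min_ratio=3.0, min_count=2):
    """Flag words whose usage jumps in a time window.

    windows: list of token lists, one per time window, oldest first.
    Returns {window_index: [(word, ratio), ...]}, highest ratio first.
    Hypothetical heuristic for illustration, not the published algorithm.
    """
    history = Counter()   # word counts over all earlier windows
    history_total = 0     # total tokens over all earlier windows
    bursts = {}
    for i, tokens in enumerate(windows):
        counts = Counter(tokens)
        total = len(tokens)
        if i > 0 and total > 0:
            vocab = len(history)
            hits = []
            for word, count in counts.items():
                if count < min_count:
                    continue  # ignore one-off words
                rate_now = count / total
                # Add-one smoothing so unseen words do not divide by zero.
                rate_before = (history[word] + 1) / (history_total + vocab + 1)
                ratio = rate_now / rate_before
                if ratio >= min_ratio:
                    hits.append((word, ratio))
            if hits:
                bursts[i] = sorted(hits, key=lambda pair: -pair[1])
        history.update(counts)
        history_total += total
    return bursts


if __name__ == "__main__":
    sample = [
        "the cat sat on the mat".split(),
        "the dog sat on the log".split(),
        "the meme meme meme on the web".split(),
    ]
    for index, words in find_bursts(sample).items():
        print(index, [word for word, _ in words])
```

Common words such as "the" stay near a ratio of 1 because their share barely changes, so only words that are new or suddenly more frequent get flagged. A real-time version would have to update the history incrementally as posts arrive and would probably want a sliding baseline instead of all past text.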

9 of 239 comments

  1. Blogdex by nob · · Score: 4, Informative

    There's another "what's popular on blogs" webpage at Blogdex. It tracks links, showing which pages are most linked to.

    --
    daed si luap
  2. Google by Citizen+of+Earth · · Score: 3, Informative

    Google can do much the same thing, on a real-time basis, by examining what phrases are searched for.

    1. Re:Google by ccweigle · · Score: 4, Informative
      Google can do much the same thing, on a real-time basis, by examining what phrases are searched for.

      And they do that much already ... on their Zeitgeist page: http://google.com/zeitgeist

      But this is different. The article is about monitoring the blogs, not the searches. As suggested in another comment, this may be related to Google's acquisition of Blogger.

  3. Daypop by Apreche · · Score: 4, Informative

    http://www.daypop.com

    It's got the top 40 every day. Doing it some other way would only catch memes sooner. And if the system doesn't catch it until it's popular, it really doesn't help. What we need is a large and complete database of all meme-type things.

    --
    The GeekNights podcast is going strong. Listen!
  4. Re:Great.... by Anonymous Coward · · Score: 1, Informative

    that's "IN SOVIET RUSSIA, PEPSI DRINKS YOU !!!" and "All your sport are belong to us". Just crossing the i's and dotting the t's.

  5. Relegence (~eNow) already does this in realtime by lieutenant · · Score: 2, Informative

    They have a realtime search mechanism that can also search within chat rooms and TV and radio streams (Kevin Kelly is on the board). There used to be a downloadable personal edition, and there is a free trial. Not a plug! They became a corporate (financial and others) company, turning their back on their "Free Information Now" roots, but at least it works :)
    http://www.relegence.com

  6. Zeitgeist and Memes by mrmiasma · · Score: 3, Informative

    Sounds like a combination of Google's Zeitgeist and LiveJournal's MemeTracker. In other words, nothing that new.

    It's also the basis for computational lexicography: doing analysis on large corpora. One of the interests people have in this field is the introduction of new words into society. The field used to use corpora such as the British National Corpus, but since the explosion of the Web, sites such as Google can far exceed that size. Weblogs are simply a good example of a more natural form of language. The interesting thing would be not so much finding new trends through words, but whether we can truly solve the whole natural language parsing problem and use such information to extract higher-level knowledge.

  7. Paper is here by Isamu+Noguchi · · Score: 2, Informative
    J. Kleinberg. Bursty and Hierarchical Structure in Streams. Proc. 8th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2002.


    Data from state of the union addresses here.

  8. Re:Google? by zeno_2 · · Score: 2, Informative

    Although it's not really what the story is about, I had always thought that the Google Zeitgeist was a good indication of "new crazes".