Web Log 'Word Bursts' Could Identify New Crazes
Zorgatron writes "New Scientist reports that a researcher from Cornell University has come up with a clever method of identifying what's cool by automatically searching weblogs. Sudden increases or "bursts" in the usage of particular words may reflect a new craze, according to Jon Kleinberg. He has demonstrated the technique by searching through State of the Union addresses given since 1790." I wonder how long it will be before this can be done close enough to real time to make it really useful.
There's another "what's popular on blogs" webpage at Blogdex. It tracks links, showing which pages are most linked to.
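For anyone curious what a "burst" looks like in code, here is a rough sketch. It is not Kleinberg's actual method, which the summary doesn't describe in detail; it is a much simpler stand-in that flags a word when its share of the text in one period jumps well above its share in all earlier periods. The function name `find_bursts` and the `min_count` and `ratio` thresholds are made up for illustration.

```python
from collections import Counter

def find_bursts(docs_by_period, min_count=3, ratio=3.0):
    """Return {period_index: [words]} for words whose relative frequency
    in a period is at least `ratio` times their rate in earlier periods.

    Simplified illustration only; thresholds are arbitrary assumptions.
    """
    bursts = {}
    history = Counter()   # cumulative word counts from earlier periods
    history_total = 0     # cumulative number of words from earlier periods
    for i, text in enumerate(docs_by_period):
        words = text.lower().split()
        counts = Counter(words)
        total = len(words)
        if i > 0 and total > 0:
            hits = []
            for word, c in counts.items():
                if c < min_count:
                    continue  # ignore words too rare to call a burst
                rate_now = c / total
                # add-one smoothing so never-seen words don't divide by zero
                rate_before = (history[word] + 1) / (history_total + 1)
                if rate_now / rate_before >= ratio:
                    hits.append(word)
            if hits:
                bursts[i] = sorted(hits)
        history.update(counts)
        history_total += total
    return bursts

periods = [
    "the cat sat on the mat the end",
    "the dog sat on the log the end",
    "blog blog blog the blog is the new thing",
]
print(find_bursts(periods))
```

Each string stands for all the text from one time slice (a day of weblog posts, or one address). Common words like "the" stay quiet because their rate doesn't change much, while a word that suddenly shows up a lot gets flagged.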
daed si luap
Google can do much the same thing, on a real-time basis, by examining what phrases are searched for.
http://www.daypop.com
It's got the top 40 every day. Doing it some other way would only catch memes sooner. And if the system doesn't catch it until it's popular, it really doesn't help. What we need is a large and complete database of all meme-type things.
The GeekNights podcast is going strong. Listen!
that's "IN SOVIET RUSSIA, PEPSI DRINKS YOU !!!" and "All your sport are belong to us". Just crossing the i's and dotting the t's.
They have a real-time search mechanism that can also search within chat rooms, and TV and radio streams. (Kevin Kelly is on the board.) There used to be a downloadable personal edition; there is a free trial. Not a plug!!! They became a corporate (financial and others) company, turning their back on their "Free Information Now" roots, but at least it works :)
http://www.relegence.com
Sounds like a combination of Google's Zeitgeist and LiveJournal's MemeTracker. In other words, nothing that new.
It's also the basis for computational lexicography: doing analysis on large corpora. One of the interests people have in this field is the introduction of new words into society. The field used to use corpora such as the British National Corpus, but since the explosion of the Web, what sites such as Google index can far exceed that size. Weblogs are simply a good example of a more natural form of language. The interesting thing would be not so much finding new trends through words, but whether we can truly solve the whole natural language parsing problem and use such information to extract higher-level knowledge.
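As a toy illustration of the "introduction of new words" idea, here is a minimal sketch that reports which words first show up in each year of a dated corpus. The function name `first_appearances` and the tiny sample corpus are invented for the example; real corpus work would need proper tokenization and far more text.

```python
def first_appearances(corpus_by_year):
    """Map each year to the sorted list of words not seen in any earlier year.

    `corpus_by_year` is a dict of {year: text}; splitting on whitespace is a
    deliberate simplification.
    """
    seen = set()
    new_words = {}
    for year in sorted(corpus_by_year):
        words = set(corpus_by_year[year].lower().split())
        new_words[year] = sorted(words - seen)  # words making their debut
        seen |= words
    return new_words

sample = {
    2001: "the web is big",
    2002: "the weblog is big",
    2003: "the blog is big",
}
print(first_appearances(sample))
```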
Data from state of the union addresses here.
Although it's not really what the story is about, I had always thought that the Google Zeitgeist was a good indication of "new crazes".