Slashdot Mirror


Computers Summarize the News

oily_ants writes "I get sick and tired of reading the same story on different web sites. That's why I like Slashdot so much: good (??) summaries of all of the stuff out there on the net. Now there is a project at Columbia University by the NLP group that attempts to generate computer summaries of all of those news articles on different web sites. The project is called Newsblaster and the summaries are excellent. You can read about the project on regular news sites like Online Journalism Review or USA Today."

12 of 175 comments

  1. Also try... by SlashChick · · Score: 5, Informative

    news.google.com. Just released yesterday. I haven't yet played around with it enough to say whether it's cool or not, but it does look promising.

  2. Impressive by Reality Master 101 · · Score: 5, Informative

    To tell you the truth, at first I thought the summaries were TOO good; I was suspicious that it wasn't really automated.

    But after looking at a few more stories, it looks like it just pulls sentences out of the stories that seem to have a different point to make, and strings them together.

    Sometimes you see some redundancy and some non-sequiturs, but I have to admit the illusion is pretty good.
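
A minimal sketch of the kind of extraction guessed at above: pull sentences from several stories, keeping one only if it does not overlap too much with what has already been picked. This is an illustration under stated assumptions, not Newsblaster's code; the function names, the naive sentence splitter, and the `max_overlap` threshold are all hypothetical.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(sentence):
    # Lowercased set of word tokens in the sentence.
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def jaccard(a, b):
    # Word-set overlap: size of intersection over size of union.
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def extract_summary(articles, max_sentences=3, max_overlap=0.5):
    # Greedily keep sentences that differ enough from every sentence kept so far.
    chosen = []
    for article in articles:
        for sentence in split_sentences(article):
            w = words(sentence)
            if all(jaccard(w, words(c)) < max_overlap for c in chosen):
                chosen.append(sentence)
            if len(chosen) == max_sentences:
                return " ".join(chosen)
    return " ".join(chosen)


articles = [
    "The senate passed the bill on Tuesday. The vote was close.",
    "The senate passed the bill Tuesday. Critics say the bill is flawed.",
]
print(extract_summary(articles))
```

Because sentences are strung together in the order they are found, this kind of approach would produce exactly the occasional redundancy and non-sequiturs mentioned above.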

    --
    Sometimes it's best to just let stupid people be stupid.
  3. Re:Now to ask... by oily_ants · · Score: 3, Informative

    It's just a rehash of all of those other stories. But the nice part is that it comes in a Reader's Digest condensed version. I only have to read one small paragraph to get the major points of the event instead of sifting through a long article that doesn't include much actual information. It is meant as a summary, so the information is NOT the obscure stuff (which is interesting) but quick and dirty summaries of important events.

  4. Direct NewsBlaster link by Alien54 · · Score: 3, Informative

    The direct link is here:

    www.cs.columbia.edu/nlp/newsblaster/

    Although I found some of the summaries slightly shallow, they are not bad.

    The problem is that it becomes an average of opinion, when you sometimes need that longer insightful article. This easily could become the news of sheep everywhere.

    This could be bad when facts come in to contradict initial impressions.

    oops

    --
    "It is a greater offense to steal men's labor, than their clothes"
  5. Re:We've been doing that for ages. by yasth · · Score: 2, Informative

    No, you provide a basic news grouping and ordering service; this summarizes the articles based on many different sources. This is sort of like Slate's Today's Papers feature, except for articles and not just the day's news.

    --
    I'd do something interesting, but my server can't handle a slashdotting.
  6. Here are some papers by Anonymous Coward · · Score: 2, Informative

    Here are some papers about Newsblaster and computer text summarization in general.

  7. Read the papers by mizhi · · Score: 3, Informative

    Research Papers

    I'm not sure if they've done anything really novel. I skimmed through one of the more recent papers, on sentence ordering, but that seems to operate only on the same event. There's research like this going on at a lot of major universities like CMU and MIT. That said, it does look impressive.

    --
    Humorless sig goes here.
    1. Re:Read the papers by DavidKirkEvans · · Score: 5, Informative

      We have a summarization strategy that selects from three summarizers: one that works over documents describing a "single event" which is novel, one that works over documents describing a person (so-called biography events) using sentence extraction, and one that is a general sentence extractor based on the biographical summarizer which does use more than just TFIDF weighting for the extraction. (It has a notion of semantic classes, and some other stuff.)
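
For readers unfamiliar with the TFIDF weighting mentioned above, here is a minimal sketch of plain TF-IDF sentence scoring, the baseline that the extractor described here is said to go beyond. It treats each sentence as its own document, which is an assumption made for brevity; the function names are hypothetical and none of this is the Columbia system's code.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens.
    return re.findall(r"[a-z0-9']+", text.lower())


def tfidf_scores(sentences):
    # Score each sentence by the sum of tf * idf over its distinct terms,
    # where idf = log(N / number of sentences containing the term).
    docs = [Counter(tokenize(s)) for s in sentences]
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(d.keys())
    scores = []
    for d in docs:
        total = sum(d.values())
        if total == 0:
            scores.append(0.0)
            continue
        scores.append(sum((c / total) * math.log(n / df[t]) for t, c in d.items()))
    return scores


def top_sentences(sentences, k=1):
    # Return the k highest-scoring sentences in their original order.
    scores = tfidf_scores(sentences)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]


sentences = ["the cat sat", "the cat sat", "the dog barked loudly"]
print(top_sentences(sentences, k=1))
```

Terms that appear in every sentence get an idf of zero, so sentences built from rarer words rise to the top.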

      The "single event" summarizer is novel though. It uses a clustering component to cluster the sentences, then for each cluster it takes the intersection of the sentences (yes, we need to parse the text to do this, and we do) and RE-GENERATES (does not extract) a sentence that synthesizes the information from the cluster.
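
A rough sketch of the cluster-then-intersect idea, heavily simplified: the system described above parses the text and regenerates a sentence, whereas this stand-in only groups sentences by word overlap and reports the words every sentence in a cluster shares. The function names and the `threshold` value are hypothetical, and bag-of-words overlap is a swapped-in substitute for real parsing.

```python
import re


def word_set(sentence):
    # Lowercased set of word tokens in the sentence.
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def cluster_sentences(sentences, threshold=0.4):
    # Single pass: put each sentence in the first cluster whose first member
    # is similar enough (Jaccard overlap), otherwise start a new cluster.
    clusters = []
    for s in sentences:
        w = word_set(s)
        for cluster in clusters:
            rep = word_set(cluster[0])
            union = w | rep
            if union and len(w & rep) / len(union) >= threshold:
                cluster.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def shared_words(cluster):
    # Words common to every sentence in the cluster (a crude "intersection").
    return set.intersection(*(word_set(s) for s in cluster))


sentences = [
    "Police arrested two suspects on Monday",
    "Two suspects were arrested by police",
    "Stocks fell sharply in early trading",
]
for cluster in cluster_sentences(sentences):
    print(sorted(shared_words(cluster)), cluster)
```

Turning the shared content of a cluster back into a fluent sentence is the hard part, and this sketch does not attempt it.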

      There's a lot of other stuff going on as well: we're using a text categorization system that we developed here, a text clustering system, our own system for categorizing the images that come with the articles (you'll be able to browse by image categories soon as well), and some other stuff.

  8. Another good news site by uglomera · · Score: 2, Informative

    Check out newsseer. It was written by the same people who wrote citeseer, the great research index.

  9. Re:What Will Google Do Next? by Jason Levine · · Score: 4, Informative

    And don't forget http://catalogs.google.com/ for online searching of mail-order catalogs. (They scan 'em, OCR 'em, and make 'em searchable.)

    --
    My sci-fi novel, Ghost Thief, is now available from Amazon.com.
  10. Already done - Newshub by PeterMiller · · Score: 2, Informative

    I've been using Newshub for 2 years now; it does essentially the same thing.

    newshub.com

  11. Even better project by matt_king · · Score: 1, Informative

    Check out the Center for Intelligent Information Retrieval (CIIR) at UMass for their project on Topic Detection and Tracking (TDT). Not only does this categorize (assign topics to) news stories as they break, but it also attempts to automatically group related stories together. I worked for them this summer (on a different project), and these are some really brilliant guys and girls!