Slashdot Mirror


Computers Summarize the News

oily_ants writes "I get sick and tired of reading the same story on different web sites. That's why I like slashdot so much. Good (??) summaries of all of the stuff out there on the net. Now there is a project at Columbia University by the nlp group that attempts to generate computer summaries of all of those news articles on different web sites. The project is called Newsblaster and the summaries are excellent. You can read about the project on regular news sites like Online Journalism Review or USA Today."

175 comments

  1. Also try... by SlashChick · · Score: 5, Informative

    news.google.com. Just released yesterday. I haven't yet played around with it enough to say whether it's cool or not, but it does look promising.

    1. Re:Also try... by FinnishFlash · · Score: 2, Interesting

      Yesterday?

      Okay, the time zone is different (I'm in Europe). But two weeks...

      There you can also easily see how differently the news agencies report the same story. For example, a few days ago there was this story about depleted uranium being dangerous:

      News agency #1: New study finds connection between depleted uranium and heightened risk of cancer.

      News agency #2: Soldiers exposed to depleted uranium bullets at risk of cancer.

      News agency #3: Children in Yugoslavia might get cancer because of NATO's depleted uranium bullets

      Talk about reporting the facts objectively...

      It would have been interesting to see the summary generated from these three... Might have been a bit schizophrenic...

      --
      please proff read !
    2. Re:Also try... by elfkicker · · Score: 4, Interesting

      I've been using that for a couple of months now. (It has been available buried at http://www.google.com/news/newsheadlines.html) I find it very useful. I wish they'd explain exactly how it worked though. Is it all machine parsed? Are the articles listed in order of relevance, time posted, etc?

      I've seen a couple of occasions where it's had an article on a completely different subject under the header, but it's not the norm. It's always up to date. My only gripe is that it doesn't have an "Older Stories" link. I've gone back to try and find something I've seen before only to find that it had been pushed off.

      They also keep a list of links to news sources and current relevant resources at http://www.google.com/news/.

    3. Re:Also try... by squaretorus · · Score: 2

      The Google service seems to work pretty well. It certainly gets around the main weakness of Google: that it's CRAP for finding information that's less than a week old.

      As they add new sources to the service I reckon it will start to truly rock.

      If I have a criticism, it's that I have to go to the original site to view the story. All I get at Google is the headline. This means I have to wait about a MONTH for a CNN page to load when I want to know some more...

    4. Re:Also try... by Mr+Windows · · Score: 1

      There is a little information available; the headlines are selected automatically, though that's all the info that they're willing to give out at the moment.

    5. Re:Also try... by Eldrik · · Score: 1

      news.altavista.com is a similar service. Been around for a while.

    6. Re:Also try... by Eldrik · · Score: 1
      Blah. I meant news.altavista.com

      That's what I get for forgetting the http part.

    7. Re:Also try... by Masem · · Score: 2
      I don't know how they work exactly, but it's certainly not hard to guess at some of the workings. Assuming they can determine which page hits are news articles and that they can strip just the article text from the body of the page, then one can simply grab words that aren't part of the common language, including proper nouns, legal terms, etc. Possibly denote some longer phrases, such as "Space Shuttle" or "Artificial Intelligence", as phrases to watch for. This gives each page a set of keywords that make it 'unique' from other news articles on the same site. Then one can argue that articles that are written within the same timeframe (48hrs) and have similar sets of keywords are the same news story from different sources, and they can group those together as they display now.

      Of course, they might have a completely different formula, or they may have a large body of optimizations they can do, but that's one possibility.
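
      A minimal sketch of the guess above, not a description of what Google actually does. The stop-word list, the 0.3 overlap threshold, and the names `keywords`, `jaccard` and `group_articles` are all made up for illustration; only the 48-hour window comes from the guess itself.

```python
import re
from typing import Dict, List, Set

# Hypothetical "common language" list; a real system would use a far larger one.
COMMON_WORDS = {
    "the", "a", "an", "of", "to", "in", "and", "on", "for", "is", "are",
    "was", "at", "by", "with", "from", "that", "as", "it",
}


def keywords(text: str) -> Set[str]:
    """Return the lowercased words of `text` that are not common words."""
    words = re.findall(r"[a-z']+", text.lower())
    return {w for w in words if w not in COMMON_WORDS and len(w) > 2}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of keywords two sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_articles(articles: List[Dict], max_hours: float = 48.0,
                   min_overlap: float = 0.3) -> List[List[Dict]]:
    """Greedy grouping: an article joins the first group whose first
    article is close enough in time and shares enough keywords."""
    groups: List[List[Dict]] = []
    for art in articles:
        kw = keywords(art["text"])
        for group in groups:
            seed = group[0]
            close_in_time = abs(art["hours"] - seed["hours"]) <= max_hours
            if close_in_time and jaccard(kw, keywords(seed["text"])) >= min_overlap:
                group.append(art)
                break
        else:
            # No existing group matched, so this article starts a new story.
            groups.append([art])
    return groups
```

      Each article here is just a dict with `id`, `hours` (publication time in hours) and `text`. Comparing only against the first article of each group keeps it simple; a real service would presumably do something smarter.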

      --
      "Pinky, you've left the lens cap of your mind on again." - P&TB
      "I can see my house from here!" - ST:
    8. Re:Also try... by the+Man+in+Black · · Score: 1

      It would have been interesting to see the summary generated from these three... Might have been a bit schizophrenic...

      Hmmm....

      Soldiers expose children in Yugoslavia to depleted uranium bullets; risk cancer

      Now THAT'S journalism!

    9. Re:Also try... by Anonymous Coward · · Score: 0

      I've also found http://www.daypop.com to be a good news source.

    10. Re:Also try... by Anonymous Coward · · Score: 0

      You don't know the first thing of unscrupulous journalism! That would have to be changed to "Soldiers expose children in Yugoslavia..."

    11. Re:Also try... by steeef · · Score: 1

      damnit. another reason why i really want to work for google.

  2. One More Time! by Anonymous Coward · · Score: 2, Funny

    "I get sick and tired of reading the same story on different web sites"

    So you read Slashdot, where they are happy to post the same story over and over, on different days?

  3. Say what? by turbine216 · · Score: 5, Funny

    I get sick and tired of reading the same story on different web sites. That's why I like slashdot so much.

    I'm sure most will agree with me when I say that this makes ABSOLUTELY NO SENSE.

    1. Re:Say what? by elmegil · · Score: 4, Funny

      Hey, maybe they like reading the same story over and over on the same web site.

      --
      7 November 2006: The day Americans realized corruption and incompetence weren't addressing 11 September 2001
    2. Re:Say what? by quantaman · · Score: 1

      Maybe he's a masochist.

      --
      I stole this Sig
    3. Re:Say what? by Anonymous Coward · · Score: 0

      Reminds me of a commercial for a dating service that I saw a couple years back:

      "My friends thought I would meet losers. I met Tom."

  4. Google kind of does the same thing by mattvd · · Score: 1

    Try Google News Headlines. It automatically finds news stories on the different news sites and groups them together. From the page you can see the most popular news topics and read several different views of the same story.

    1. Re:Google kind of does the same thing by tomblackwell · · Score: 2

      Wouldn't that kind of go against the part where he said "I get sick and tired of reading the same story on different web sites"?

    2. Re:Google kind of does the same thing by Anonymous Coward · · Score: 1

      My take on it is that the author doesn't like reading all the headlines on cnn.com, then going to abcnews.com and wading through the same headlines to try and find a new story. Then the same thing at the BBC and so on.

      I could be wrong though...

    3. Re:Google kind of does the same thing by ethereal · · Score: 1

      If it's the AP feed that they're all picking up, then sure, it's the same story.

      What's interesting is when the story is different - if CNN says one thing, the BBC says something a little different, and the Times of India has yet a third viewpoint, then it's interesting to speculate on the editorial biases that are leading to such divergent viewpoints.

      --

      Your right to not believe: Americans United for Separation of Church and

  5. Well, there you go by Otter · · Score: 3, Interesting
    Now there is a project at Columbia University by the nlp group that attempts to generate computer summaries of all of those news articles on different web sites.

    Well, there's the answer to the Ask Slashdot from a couple of days ago.

    1. Re:Well, there you go by TrebleJunkie · · Score: 1

      Only problem is, no source code. :(

      Oh well. The search continues.....

      --

      Ed R.Zahurak

      You know, oblivion keeps looking better every day.

  6. Google already does this by CProgrammer98 · · Score: 1
    From google's main page, click "news and resources" to see stories from all the world's major papers.

    As others have pointed out, they've also just launched a beta news summary service.

    --
    And the people shall be oppressed, every one by another, and every one by his neighbour Isaiah 3:5
  7. submitter must not have been here long by RN · · Score: 0
    I get sick and tired of reading the same story on different web sites. That's why I like slashdot so much.

    HA! this guy must be a newbie. you could generate the same slashdot headlines with the bbsport slashdot story generator that provides the typos, misspellings and bad links that you know and love, and you wouldn't know the difference

  8. Now to ask... by Sorthum · · Score: 3, Insightful

    ...whether this will include the obscure stories that are actually interesting, or whether it'll be just a rehash of the major stories that we can find in ten or twelve other places.

    1. Re:Now to ask... by oily_ants · · Score: 3, Informative

      It's just a rehash of all of those other stories. But the nice part about it is that it's in Reader's Digest condensed form. I only have to read one small paragraph to get the major points of the event instead of sifting through a long article that doesn't include much actual information. It is meant as a summary, so the information is NOT the obscure stuff (which is interesting) but quick and dirty summaries of important events.

  9. We've been doing that for ages. by almaw · · Score: 2, Interesting

    Our company has been running a similar service for a very long time. It's free, and you can get it here. It's called NewsScape.

    1. Re:We've been doing that for ages. by yasth · · Score: 2, Informative

      No, you provide a basic news grouping and ordering service; this summarizes the articles based on many different sources. This is sort of like Slate's Today's Papers feature, except for articles and not just the day's news.

      --
      I'd do something interesting, but my server can't handle a slashdotting.
  10. Newsbot by TheGreenLantern · · Score: 4, Funny

    Sounds like a good idea, but I'm worried about the "Newsbot's" objectivity. If I wanted to read a bunch of stories about the latest NVidia GeForce 4 release, 10 reasons more RAM is better, and why you should upgrade your hard drive, I'd just watch TechTV.

    --

    It hurts when I pee.
    1. Re:Newsbot by micromoog · · Score: 2

      So you think getting your news from a single source is somehow more objective than from a dozen competing sources?

    2. Re:Newsbot by Stonehand · · Score: 2

      I think that you two agree, actually -- I read the thread starter's last sentence as a criticism of TechTV given that what he listed (news meant to make you BUY BUY BUY) might instead be referred to as "marketing".

      --
      Only the dead have seen the end of war.
    3. Re:Newsbot by AndyChrist · · Score: 2

      I'm not worried about that any more than I am about a meaty editor's objectivity.

      I'm more interested in whether the thing makes any humorous errors. That and whether it can eventually out-rotten Daily Rotten.

  11. Slashdot's financial problems are SOLVED! by 1nt3lx · · Score: 2, Funny

    Looks like subscriptions can be axed, Slashdot won't need editors anymore!

    Although, it will only be possible to replace slashdot's editors with the newsblaster program if they can implement some sort of misspelling and false information algorithm.

    1. Re:Slashdot's financial problems are SOLVED! by aardvarkjoe · · Score: 2

      I think that this is what you want. A little work interfacing the two so it picks some words and adds links based on a real story, and you'll never need to visit slashdot again.

      --

      How can we continue to believe in a just universe and freedom to eat crackers if we have no ale?
  12. Alternative to my.yahoo.com? by markfive · · Score: 1

    I've been using my.yahoo.com for years as my start page. It gives me a good, concise and highly configurable view of many news feeds and financial information (stock quotes, etc). Recently, however, it has become overwhelmed by annoying ads.

    Does anyone know of an alternative? The newsfeed in the article looks promising, but it's not exactly what I'm looking for.

    1. Re:Alternative to my.yahoo.com? by FireballFreddy · · Score: 1

      I'm a big fan of Yahoo Messenger for this. You can display the top X news items in any categories you want to follow, and present the categories in any order you like. I have top 8 stories by Reuters and AP, followed by science from Reuters and AP, tech by Reuters, AP, CNET, ZDnet, and Reuters Internet Report, entertainment from E! Online and oddly enough from Reuters.

      Also follow stocks, check the weather, see if I have Yahoo e-mail, and chat with the same application. And there's only one minor ad at the top, which I don't notice anymore. I'm not saying you should change from ICQ or AIM or whatever, but Yahoo Messenger is a pretty cool app for keeping up with the world outside your cube. ;)

      -FF

      --
      SQUEAK, the Death of Rats explained.
  13. Do I want one? by Paul+the+Bold · · Score: 1

    Does anybody know how I can get access to Newsblaster so I can read a synopsis of the article to determine whether Newsblaster is something that would interest me?

  14. I think Newshub is better by brunnock · · Score: 3, Interesting

    http://newshub.com/

    1. Re:I think Newshub is better by Anonymous Coward · · Score: 0

      Aside from my personal distaste for anything that puts entertainment news before world news, it doesn't seem to have computer-generated summaries of the articles, and certainly not multi-source summaries.

    2. Re:I think Newshub is better by Anonymous Coward · · Score: 0

      New Shubnigaroth (sp?) ?

      HP Lovecraft would be pleased, but it doesn't make much sense...

  15. Impressive by Reality+Master+101 · · Score: 5, Informative

    To tell you the truth, at first I thought the summaries were TOO good; I was suspicious that it wasn't really automated.

    But after looking at a few more stories, it looks like it just pulls sentences out of the stories that seem to have a different point to make, and strings them together.

    Sometimes you see some redundancy and some non-sequiturs, but I have to admit the illusion is pretty good.

    --
    Sometimes it's best to just let stupid people be stupid.
    1. Re:Impressive by clarkgoble · · Score: 1

      News documents seem to summarize better than most sorts of documents. I wrote a general summarizer for the company I work for and was frankly amazed at the output for news stories. (Pretty much like this) When you combine a summarizer with a categorizer (either rule based or a more general statistical categorizer) then you can get some really fantastic results.

      Check out the web page www.lextek.com, where I wrote a simple little demo that lets you paste in any text to summarize. It's a toolkit that pretty much lets you do what these guys do. The demo doesn't use a categorizer for any pre-processing, but it's kind of interesting for some projects.

    2. Re:Impressive by rikki_t · · Score: 1

      Heh.

      Here's a story on ... well, I'm not sure. It was listed under "Will Hollywood's 'Tomb' be a box office 'Raider?'" which was news about: Lara Croft, Angelina Jolie, video game, Tomb Raider Chronicles Web, Max Payne.

      Here's the summary:

      After the success of last year ' s Lara Croft: Tomb Raider and high expectations for Resident Evil, out Friday, studios are booting up for more. The game: Alice in Wonderland gets a twisted remake in American McGee ' s Alice, a gothic horror version of the classic tale; based on the game by Electronic Arts Studio: Dimension Status: Horror master Wes Craven directs. The game: A sunglasses-wearing all-American hero blows away bad guys with machine guns; 3D Realms Studio: Dimension Status: In limbo. Star Angelina Jolie was attached to the sequel before the original Tomb Raider opened. The game: Amateur taxi drivers take to the sidewalks and crowded streets, picking up customers and delivering them to their destinations unscathed; Sega Studio: No distributor yet. Brothers Jon and Erich Hoeber(Montana) currently are writing the screenplay.

      And the four articles it used:

      * Jolie, Thornton Say They Are Grateful for New Son (Lycos 03/14/02)
      * Video games get the reel treatment (USA Today 03/14/02)
      * Will Hollywood's 'Tomb' be a box office 'Raider?' (USA Today 03/14/02)
      * Coming soon to a bigger screen (USA Today 03/14/02)

      Hmmmm.

      --
      Any technology which is distinguishable from magic is insufficiently advanced.
  16. breadth vs depth by Ubergrendle · · Score: 5, Insightful

    This is a somewhat dangerous trend, IMHO. CNN Headline news gives us blurbs...soundbites...with no substance. "Israelis shot Palestinians" or vice versa on a daily basis. Little reporting on the substance of negotiations, or on why there was a conflict in that location at that time. The great thing about the internet is that there is great reporting in depth. I like to check out the Drudge report, BBC, disinfo.com, etc on a regular basis to get a good blend of various points of view so that I can form my OWN opinion. I don't want to be served watered down sentence fragments by a corporate AOL/TimeWarner behemoth. Slashdot is one of a few exceptions to this rule, since they typically link to articles of substance and allow for dialogue and debate by (usually) intelligent users. The moderation system isn't perfect, but it helps dodge the trolls. My guess is that automated summaries will lose the flavour of good journalism/writing, and by taking an "average" will end up with a C+ "factual comprehension" review as opposed to multiple A+ "theory" and "synthesis" editorials.

    --
    John Maynard Keynes: "When the facts change, I change my mind. What do you do?"
    1. Re:breadth vs depth by Cloud+9 · · Score: 1
      This is a somewhat dangerous trend, IMHO. CNN Headline news gives us blurbs...soundbites...with no substance. "Israelis shot Palestinians" or vice versa on a daily basis.

      Newsblaster doesn't do that. What it does is grab news stories from a selection of different sites, search through the stories for certain words, phrases, or sentences, create a summary of the story, put it under a heading based on the "hits" it made, and provide the link.

      This actually unseated /. from its throne as my home page. =]

      --
      Karma: Dyn-o-mite!(mostly affected by Jimmy Walker reading your comments)
    2. Re:breadth vs depth by Phrogz · · Score: 2
      I don't want to be served watered down sentence fragments by a corporate AOL/TimeWarner behemoth.

      Sure you do, when you're trying to figure out the gist of the story in overview mode. The problem you mention with CNN HN is because TV is a non-interactive medium, and you can't find out/they can't provide you with more information about the story in question. With this, you can.

      The summarizer has to get its information from somewhere, from the full news story...this is just a way of giving you the executive version (akin to browsing slashdot with a comment threshold to 5), so you can find out the basics before delving into the details.

    3. Re:breadth vs depth by Mr+Windows · · Score: 1
      Fair point, though Newsblaster does link to the original stories so you can read them and check what they actually say if, having read the summary, you're interested.

      Having said that, my preferred source for real-world news is Radio 4 (especially Today and PM), which is especially good for listening to as I'm waking up or going to sleep, apart from the moments when a politician says something so outrageous that my blood boils...

  17. Still Some Work To Be Done... by po8 · · Score: 4, Funny

    Check out this odd story about incarcerated Browns. The summarizer could apparently still use some manual supervision.

  18. Seems nice enough... by Davorama · · Score: 4, Insightful

    So where's the slashbox for it?

    --

    Davo -- Free speech, free software, AND free beer.

  19. Filtered news by moankey · · Score: 1

    The news. or newsblaster sites remind me of the days when the web was just starting to get going. No banners, pics, ads, frames, etc... Just lines and lines of text for one to sift through.

  20. Copyright Infringement - Fair Use Doctrine -NOT! by Anonymous Coward · · Score: 3, Interesting

    This is usually Copyright Infringement - Fair Use Doctrine is not applicable.

    Every one of these paraphrasers lift large chunks of syntax.

    I would maintain that this is still plagiarism or a copyright violation unless it is done really well.

    And it never will be done really well unless NeuralNetwork chips are common and mankind has advances in Artificial Intelligence research. Five years away at best.

    I dare the commercial services to hit Encyclopedia Britannica. Or I dare them to routinely slurp the New York Times and boast that they digest the New York Times.

    A massive civil suit is awaiting some of these early adopters planning on creating a business out of this.

    And they deserve it.

    It is just "Word Twiddling", however useful.

    If the twiddling is done live, once, per user client, then maybe it's OK, but none of these business models are set up THAT way.

  21. copyright/legality? by ddeboer · · Score: 5, Interesting

    What are the copyright or other legal issues to republishing news stories collected from web sites? The Newsblaster site clearly states where the information comes from - like every good college student is taught to cite information sources. On the other hand, on the bottom of many of the stories is the notice: "Copyright 2002 Associated Press. All rights reserved. This material may not be published, broadcast, rewritten, or redistributed." Is collecting and condensing news stories "republishing" - does this violate copyright stuff?

    1. Re:copyright/legality? by Cloud+9 · · Score: 1
      What are the copyright or other legal issues to republishing news stories collected from web sites?

      Paraphrasing is legal under copyright law, so long as the sources are cited and it's not just a cut-and-paste of the entire selection.

      --
      Karma: Dyn-o-mite!(mostly affected by Jimmy Walker reading your comments)
    2. Re:copyright/legality? by Anonymous Coward · · Score: 0

      Good question, but goes beyond legality, also a question of functionality. I noticed on the content of one page it had grabbed that it had reformatted in a way that made it unclear what was quoted speech (perhaps the original site used blockquote to set off quotes). There was also an invitation to post your comments. Huh? On the original site this was no doubt a link to some sort of reader forum.

      An important part of the web is linkage, in fact there is no web without linkage. This tool strips links, ignores original formatting, and basically does a major disservice to the original content, regardless of whether or not it's actually legal.

    3. Re:copyright/legality? by pjrc · · Score: 2

      They copy the entire article text and redistribute it from their own server, without the ads or even the "branding" from the original site. That's a very different animal than Slashdot, which just links to the original material at the original site.

      Sure, their main page just briefly quotes, which is probably ok, but all the links point to local copies of the copyrighted news articles.

      (In the USA) there are four criteria for judging what is a fair use of copyrighted material.

      The purpose and character of their use isn't academic or educational, it's a news service just like the original sites they got the text from. The fact that it's hosted from a .edu domain doesn't change anything.

      The amount and substantiality of the portion used in relation to the copyrighted work as a whole is darn close to 100% of the copyrighted material.

      The effect of the use upon the potential market for or the value of the copyrighted work is particularly bad... if people can easily get the news from this convenient summary site, why would they bother to visit the original site (and thus be an audience for their advertising, become "loyal" readers, etc)?

      Now the nature of the copyrighted work is informational news, and not really expressive (like songs, movies, etc), so at least they've sort-of got one of the four criteria for fair-use.

  22. Direct NewsBlaster link by Alien54 · · Score: 3, Informative
    The direct link is here:

    www.cs.columbia.edu/nlp/newsblaster/

    Although I found some of the summaries slightly shallow, they are not bad.

    The problem is that it becomes an average of opinion, when you sometimes need that longer insightful article. This easily could become the news of sheep everywhere.

    This could be bad when facts come in to contradict initial impressions.

    oops

    --
    "It is a greater offense to steal men's labor, than their clothes"
  23. my God Man by Jonny+Ringo · · Score: 1

    Don't post the competition!!!!

  24. Let computers be computers, humans be humans by HawaiianMayan · · Score: 1

    I think it does an OK job for a computer, but nobody's going to accuse those summaries of being overly coherent (or well-typed).

    Occasionally, the summary will juxtapose two sentences (it's just ripping exemplar sentences from different stories) that, when put together, screw up the meaning:

    "Now that David Letterman is staying at CBS, ABC s corporate bosses took steps to mend fences with"Nightline"host Ted Koppel on Tuesday. And that ' s appealing to beer companies..."

    Doh!

    I think a more fruitful avenue of research is new methods of presenting information so that humans can decide what to read. Instead of using tricks to simulate a computer understanding the meaning of an article, this uses the same tricks to simply assist reading the article.

    Apple's research group did some interesting work in that area in the 90's.

  25. Used by U.S.A. Today? by Yoda2 · · Score: 2, Funny

    I suspect that U.S.A. Today has been using a similar technology for years now to generate their "McNews".

  26. This is old 'news'... by DickPhallus · · Score: 1

    I already get this in my email from other people:

    Hi, I send you this news to ask for your advice!

    Along with all kinds of pertinent documents...

    --
    Some weasel took the cork out of my lunch.
  27. Slashdot has something they don't have by digitalpeer · · Score: 1

    ...500 different opinions on every headline with a little insight here and there. It's already like getting news from different sources.

    1. Re:Slashdot has something they don't have by Anonymous Coward · · Score: 0

      500 different opinions

      Funny, I always thought that the posters on Slashdot always had one opinion, and everyone always agreed on it. Unless you count the opinions of the guys you call "Trolls" which are simply people who didn't subscribe to the opinion of the collective.

    2. Re:Slashdot has something they don't have by EricKrout.com · · Score: 0, Offtopic

      What are you talking about? It's well-known that the comments aren't worth shit according to the people who run this site.

      (Go ahead, fanboys, mod me down. You know it's true.)

      m o n o l i n u x :: If You Don't Click Here, The Terrorists Have Already Won

  28. Dangerous potential in Newsblaster by Anonymous Coward · · Score: 0

    As intriguing a tool as Newsblaster might appear to some, it could prove to be a dangerous one, resulting in even more sloppy reporting than we see today.

    Far too many reporters and editors could be tempted to use it recklessly instead of doing true investigative work themselves.

    We already have some excellent news monitoring services to gather news stories from around the world. Most of them have proven themselves trustworthy and also provide us with a clear picture of the sources of these stories so things can quickly be investigated when someone is in doubt about sources and facts.

    The electronic age has been a boon to all types of writers including investigative reporters but let's not go overboard with every new "toy" that is thrown at us as many of us are prone to do.

    There is too much sloppy, inferior reporting being done already.

    Newsblaster does have potential, but if it is used at all should be approached with the same caution a soldier uses picking his way across a mine field -- or a grenade that somebody has already pulled the pin from and handed to us.

  29. Grammar, even by micromoog · · Score: 1, Flamebait

    Wow, this bot's grammar is far better than the human editors at Slashdot!

  30. Princess Margaret == Wallis Simpson ? by Anonymous Coward · · Score: 0

    This is what it said about Princess Margaret dying. For some reason, it decides halfway through that Wallis Simpson is the same person!!

    "Princess Margaret led a life largely on the sidelines of the royal family but she also attracted national sympathy when she surrendered love for duty. But as a twice - divorced American, she was considered unsuitable as a wife for a king, and the British government advised Edward not to marry her. Princess Margaret, 71, the younger sister of Britain ' s Queen Elizabeth II who once was adored as a fun - loving child in a fairy - tale castle but who led a romantically troubled and turbulent adult life, died Feb. 9 at a hospital in London after a stroke. Margaret died just three days after Elizabeth marked her 50th year on the throne. Princess Margaret ' s life contained one great contradiction. Her tender touch betrayed their secret, as Margaret attentively brushed fluff from the dashing officer ' s jacket. Until then, Group Captain Peter Townsend and the Princess had been forced to hide their feelings."

  31. Re:Microsoft bashers bah by Anonymous Coward · · Score: 0

    You're not the only one defending Linux. You're the only one defending Linux to a bunch of brainwashed lost causes.

  32. Does it actually generate original content? by Anonymous Coward · · Score: 0

    When I compared the summarization to the original articles, I found all the original sentences (verbatim) in the articles. In other words, Newsblaster doesn't write anything, per se, it decides which sentences are most representative of the set of original articles.

    I mention this only to point out that Newsblaster represents no threat to the livelihood of reporters, whatsoever. Without the reporters, there would *be* no original story.

    Generating brand new sentences is much more difficult than deciding which sentences represent content common to several documents, which Newsblaster does very well.
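
    A rough sketch of that kind of sentence selection, assuming nothing about how Newsblaster really scores things. The scoring rule (average number of documents each word appears in), the 0.5 redundancy cutoff, and the names `split_sentences`, `words` and `summarize` are invented for illustration.

```python
import re
from collections import Counter
from typing import List, Set


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(sentence: str) -> Set[str]:
    """Set of lowercased alphabetic words in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def summarize(documents: List[str], max_sentences: int = 2) -> List[str]:
    """Pick verbatim sentences whose words are shared by many documents."""
    # Count, for every word, how many documents contain it.
    doc_freq: Counter = Counter()
    for doc in documents:
        doc_freq.update(words(doc))

    # Score each sentence by the average document frequency of its words.
    candidates = []
    for doc in documents:
        for sent in split_sentences(doc):
            w = words(sent)
            if w:
                score = sum(doc_freq[x] for x in w) / len(w)
                candidates.append((score, sent))
    candidates.sort(key=lambda pair: -pair[0])

    chosen: List[str] = []
    for _, sent in candidates:
        w = words(sent)
        # Skip sentences that mostly repeat one we already picked.
        if any(len(w & words(c)) / len(w | words(c)) > 0.5 for c in chosen):
            continue
        chosen.append(sent)
        if len(chosen) == max_sentences:
            break
    return chosen
```

    Nothing here writes a new sentence; every line of the output is copied straight from one of the inputs. It also makes no attempt to check that neighbouring sentences refer to the same person, which is how you end up with Margaret and Mrs. Simpson in one paragraph.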

  33. Oops! by Captain+Large+Face · · Score: 1

    So they grab news from the Washington Post, Reuters and the BBC (amongst others), but leave out the National Enquirer? Why can't I have all my "Space Aliens Abducted Britney Spears" stories in one place?

  34. How does it work? by AndyChrist · · Score: 3, Interesting

    I don't see any concrete information on what it does to summarize stories...is it using something like Cyc? Does it just have some heuristics for picking out the important parts of paragraphs?

    Also, who else thought "neuro-linguistic programming" for at least a moment when they saw "nlp"?

    1. Re:How does it work? by mattbelcher · · Score: 1
      Also, who else thought "neuro-linguistic programming" for at least a moment when they saw "nlp"?

      Not me. I'm more familiar with the phrase "neuro-linguistic hacking." Neal Stephenson uses it extensively in his novel, Snow Crash.

      --

      Shockwave Flash movies are the greatest thing to happen to non-sequitur humor since Japan.

  35. Is that really possible? by Spez · · Score: 1

    ... that attempts to generate computer summaries of all of those news articles on different web sites. The project is called Newsblaster and the summaries are excellent

    Is it really possible that a computer created an excellent summary of a news article? All the computer-generated articles I've read were almost unreadable... Just look at Babelfish translation. Is that really English? Hope that Newsblaster will be much better!

    --
    I wouldn't mind you in my head, if you weren't so clearly mad -Lews Therin Telamon
    1. Re:Is that really possible? by f00zbll · · Score: 1
      I read a few of the summaries and they aren't bad at all. Much better than other attempts in the past. Here is a description of one of their projects. Now I don't know if TIDES is used in newsblaster, but it's still interesting.

      In the TIDES project, we will develop a practical, multilingual and multidocument information tracking and summarization system. Our design features the integration of robust, statistical techniques, shallow linguistic approaches and machine learning to achieve scalability within languages and portability across languages. To realize these goals, we will develop methods for summarization across documents using information fusion and identification of key differences, summarization across languages relying on identification and translation of terms, and new methods for identification, expansion and translation of terms. Unlike most other approaches, rather than relying on sentence extraction, our work uses information fusion of similar information, merging together repetitive phrases into a single phrase allowing dramatic reduction of information across many articles. Our work will focus on characterizing types of differences to include in a summary, which is an unexplored direction in multi-document summarization. We will develop difference operators to identify new information, contradictions, trends, multiple perspectives, and different topics. Our approach will minimize reliance on full machine translation, instead using identification, expansion and translation of terms where possible. We will begin work with a language such as Spanish, but quickly expand to include Asian languages and other non Indo-European languages.

      From the look of it, NLP (natural language processing) seems to be evolving nicely. It used to be that NLP required processing the entire document and understanding the sentences by mapping hierarchies of valence/word order.

  36. Bold Article by FortKnox · · Score: 0, Offtopic

    It's bold of Slashdot to have an article about technology that may put them out of business. Why not go ahead and put up other articles that are just as bold? ;-)

    --
    Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
  37. Still a few bugs in the summarizer by Allen+Varney · · Score: 1

    Here is one borderline-incoherent Newsblaster summary:

    Will Hollywood's 'Tomb' be a box office 'Raider?'

    Summary:

    After the success of last year ' s Lara Croft: Tomb Raider and high expectations for Resident Evil, out Friday, studios are booting up for more. The game: Alice in Wonderland gets a twisted remake in American McGee ' s Alice, a gothic horror version of the classic tale; based on the game by Electronic Arts Studio: Dimension Status: Horror master Wes Craven directs. The game: A sunglasses-wearing all-American hero blows away bad guys with machine guns; 3D Realms Studio: Dimension Status: In limbo. Star Angelina Jolie was attached to the sequel before the original Tomb Raider opened. The game: Amateur taxi drivers take to the sidewalks and crowded streets, picking up customers and delivering them to their destinations unscathed; Sega Studio: No distributor yet. Brothers Jon and Erich Hoeber(Montana) currently are writing the screenplay.

  38. Dude. by Anonymous Coward · · Score: 0

    Last night. Slashback. Jeez.

    -- SlashChick

    1. Re:Dude. by FortKnox · · Score: 1
      • 2002-03-13 17:04:55 Internet Ad-Free Subscription Reviewed (articles,news) (rejected)


      Heh, it wasn't in the initial summary of the slashback (and nothing in the summary interested me).
      Oh well, I'm on a bad roll lately. It's just karma, I guess...
      --
      Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
  39. Evolution by apsio · · Score: 1

    Pretty soon summaries will be one word 'catch-phrases' which convey enough meaning for us to get the gist of the story. We can then 'think' to the computer let me learn more about this. And then we will.

    the future. god bless it.

  40. Sounds like the "DJ 3000" by poot_rootbeer · · Score: 2

    Those. clowns. in. Congress. did. it. again.
    What. a. bunch. of. clowns.

    1. Re:Sounds like the "DJ 3000" by Cloud+9 · · Score: 1
      Those. clowns. in. Congress. did. it. again.
      What. a. bunch. of. clowns.

      Heh heh heh... How does it keep up with the news like that?

      --
      Karma: Dyn-o-mite!(mostly affected by Jimmy Walker reading your comments)
    2. Re:Sounds like the "DJ 3000" by bje2 · · Score: 1

      "Hot Dog! We have a weiner!"

      --

      "Facts are meaningless. You could use facts to prove anything that's even remotely true." - Homer Simpson
  41. newslinx by zykem · · Score: 1

    i check www.newslinx.com on a daily basis. very good source of all-round news.

  42. Irony....on a Friday?? by revery · · Score: 1

    So I need to go to several regular news sites to read the stories about a site that is supposed to make such requirements unnecessary......ok

  43. Where's today's news? by T1girl · · Score: 2

    I want to read today's news today. I can pick yesterday's paper out of the trash can. They're still speculating on how Holy Cross will make out against Kansas. (Hello, the game was last night. Kansas won.)

  44. O'Reilly Network's Meerkat by primal39 · · Score: 1

    The O'Reilly network already has a pretty good resource for news from multiple sites in the Meerkat Wire Service

    --
    Eschew Obfuscation
  45. My newsfeed by British · · Score: 2

    1. yahoo's my start page. I get to see up to 4 possibly-relevant news articles.

    2. Fark. I get my good share of the weird news, and of course NewsFlash articles, which are just links to other news sites. I'm happy.

  46. One benefit... by dallen · · Score: 2

    The article in the Online Journalism Review says: "Newsblaster seems to make things somewhat generic or more conservative, especially when summarizing reports over several days. This can take away the editorial edge or nuance that a reporter or editor might use to make a lead or report powerful. Summarizing news over several days in this approach results in a certain staleness."

    I noticed the "blandness" of the summaries too, but I think that's a benefit-- reading CNN stories can get really tiring after a few minutes since everything has to have as much punch as possible.

  47. Here are some papers by Anonymous Coward · · Score: 2, Informative

    Here are some papers about Newsblaster and computer text summarization in general.

  48. Read the papers by mizhi · · Score: 3, Informative

    Research Papers

    I'm not sure if they've done anything really novel. I skimmed through one of the more recent papers, on sentence ordering, but that seems to operate only on articles about the same event. There's research like this going on at a lot of major universities like CMU and MIT. That said, it does look impressive.

    --
    Humorless sig goes here.
    1. Re:Read the papers by DavidKirkEvans · · Score: 5, Informative

      We have a summarization strategy that selects from three summarizers: one that works over documents describing a "single event" which is novel, one that works over documents describing a person (so-called biography events) using sentence extraction, and one that is a general sentence extractor based on the biographical summarizer which does use more than just TFIDF weighting for the extraction. (It has a notion of semantic classes, and some other stuff.)

      The "single event" summarizer is novel though. It uses a clustering component to cluster the sentences, then for each cluster it takes the intersection of the sentences (yes, we need to parse the text to do this, and we do) and RE-GENERATES (does not extract) a sentence that synthesizes the information from the cluster.

      There's a lot of other stuff going on as well, we're using a text categorization system that we developed here, a text clustering system, our own system for categorizing the images that come with the articles (you'll be able to browse by image categories soon as well) and some other stuff.
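
      For readers wondering what the clustering step might look like, here is a rough Python sketch that groups similar sentences using TF-IDF weights and cosine similarity. The greedy single-pass clustering, the threshold value, and all function names are assumptions made for illustration; the actual system uses more than TFIDF, and this sketch covers only the grouping, not the parsing or the regeneration of a new sentence from each cluster.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercased alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(sentences):
    """One sparse TF-IDF vector (a dict) per sentence."""
    docs = [Counter(tokenize(s)) for s in sentences]
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    # Smoothed IDF, so a word found in every sentence still gets some weight.
    return [
        {w: tf * (math.log((1 + n) / (1 + df[w])) + 1.0) for w, tf in doc.items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(w, 0.0) for w, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_sentences(sentences, threshold=0.3):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    vectors = tfidf_vectors(sentences)
    clusters = []  # each cluster is a list of sentence indices
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            seed = vectors[cluster[0]]  # compare against the cluster's first member
            if cosine(vec, seed) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return [[sentences[i] for i in cluster] for cluster in clusters]
```

      Each resulting cluster would then be the input to whatever step intersects the sentences and produces a single synthesized one.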

    2. Re:Read the papers by mizhi · · Score: 1

      (It has a notion of semantic classes, and some other stuff.)

      This, I have not seen before, but seems to be the next logical step in automagically classifying documents.

      --
      Humorless sig goes here.
  49. If you don't like the same story over and over... by Pvt_Waldo · · Score: 1
    I get sick and tired of reading the same story on different web sites. That's why I like slashdot so much.


    I assume that means you only read the stories here, and not the posts. :^P
  50. Summarizing... by bob_clippy · · Score: 1
    or just interleaving?

    Compare the summary with the opening sentences of the first and last articles. Maybe we should wait a bit before speculating on the business impact of this "technology".

    --

    -- Nobody should take away Microsoft's freedom to innovate, particularly since they haven't used it yet

  51. hmm. by raindog151 · · Score: 1

    i wonder how long until they have a subscription plan.

    someone should make line graphs.

    --
    your jesus is another mans xebu. chew on that hypocrites.
  52. Re:Copyright Infringement - Fair Use Doctrine -NOT by yasth · · Score: 1
    Umm, you seem to have a few problems:
    • Neural Nets need not be on chips to be used, indeed one can (as I have) do them on paper or mentally. Indeed I am unclear on what a NN chip would be, a very fast matrix solver???
    • The legality of summaries is always a vexing question. Since this system generally will take similar lines of text, a lot of what it takes are quotes to begin with. Also it does cite the articles. It doesn't seem to take long enough stretches to be in violation, though it could.
    • As for a business plan, well it is a university project, and probably not one taken too seriously. They aren't making money off this any way I can see

    I do think this service would be a lot more useful if it were done so that one didn't have to go to a separate page for the summary.
    --
    I'd do something interesting, but my server can't handle a slashdotting.
  53. Postmodern, postliterate Americans by Infonaut · · Score: 2
    don't care about "good journalism/writing". In fact, most of us don't care about most of what goes on in the world. Hell, most of us don't even know that there *IS* a world outside the United States, except when we go somewhere to kick some ass.

    This new averaged, filtered, genericized "news" is exactly the kind of crap suited to a society that spawned "Judge Judy" and "A Current Affair". Sure, it's a nice piece of technical wizardry, but all things clever are not useful or worthwhile.

    --
    Read the EFF's Fair Use FAQ
    1. Re:Postmodern, postliterate Americans by TrollBridge · · Score: 0
      "This new averaged, filtered, genericized "news" is exactly the kind of crap suited to a society that spawned "Judge Judy" and "A Current Affair". Sure, it's a nice piece of technical wizardry, but all things clever are not useful or worthwhile."

      You use the words "judge Judy" and "technical wizardry" in the same post? Way to prove your own point, sir.

      --
      There's a Mercedes gap too. I want one and can't afford one, but it's not government's job to do anything about it.
  54. Re:Microsoft bashers bah by Anonymous Coward · · Score: 0

    local Linux users group with their own irc server. Someone help me badmouth these guys, they Windows bash constantly. I'm sick of being the only one to defend Windows.

  55. Our product does the same thing, and copyright iss by MarkWatson · · Score: 1
    Our product (plug: www.knowledgebooks.com) does much the same thing, except it finds short relevant phrases with some postprocessing performed (e.g., pronoun resolution).

    Our product works by first categorizing text articles, then identifying which phrases most effectively support the categorization of the article.

    On the subject of copyright infringement: several people have pointed out that the linked site may go beyond fair use of text on the original news sites.

    I bet that the university obtained permission like I did. I sent about 10 news web sites copious documentation on what my system does, and three gave me permission to use their sites. As is usual in life, it helps to ask politely!

    -Mark Watson www.markwatson.com

  56. Wanna watch it crash? by gosand · · Score: 2
    Try and have it summarize a Jon Katz story. :-)

    I have yet to figure out exactly what his point is in a story, so it would be really interesting to see software try and handle it.

    Of course, it did say it summarizes NEWS stories.

    --

    My beliefs do not require that you agree with them.

    1. Re:Wanna watch it crash? by sulli · · Score: 1

      It would achieve the magical 100:1 compression ratio we have all been reading about!

      --

      sulli
      RTFJ.
  57. Let the users vote on the articles by comp.sci · · Score: 1

    There is an easy way to sort out badly-formulated articles. Just let the readers decide if the article is:
    a) interesting
    b) well-written
    c) in the right column

    or similar.
    So the majority of people who visit these sites won't have to worry about "bad-news" :)
    Those articles that get downvoted get off the front page. This process repeats whenever an article is added and ensures high quality.

    comp.sci

  58. related post a few days back by Anonymous Coward · · Score: 0
  59. Oh its good by Carnivore24 · · Score: 0

    I like Slashdot because it keenly categorizes highly interesting tech news as well as news on movies and T.V. It's also updated casually all day long so you can keep refreshing and something new comes up.

    The most important thing is user comments like these. It's an outlet for tons of intelligent people to post additional insights on the various topics that come up.

    Oh and dont forget CowboyNeal, he is teh leet of them all.

  60. BIas a good thing? by Rocko+Bonaparte · · Score: 1
    Human journalists make connections between facts and between events or stories that can add context to a current report. This kind of contextualization is something that Newsblaster cannot do.
    Now is this necessarily bad? I often have to hack and slash through bias and opinion to get to the facts. It's times like that I wish I had a little robot to monotonously spit the facts to me. This is a great tool because it carries little opinion about the events, and forces the reader to do all the real thinking. Sure, it only cranks out blurbs, but I find these blurbs better than "3 Israelis Killed" [to 15 Palestinians]* headlines I keep seeing.

    *I'm not trying to take a side in that, but I've recently found American news sources *seem* to skew that kind of news a little.
    --
    No I'm not trolling.
    1. Re:BIas a good thing? by currand60 · · Score: 1

      It was obvious in the Online Journalism Review that the author was scrambling to maintain his profession's relevance. In this case the word "context" could be replaced with the word "opinion". Of course that's what makes good journalists great, their opinion.

      Show me a reporter that writes without bias and I'll show you an algorithm that can replace him.

      --
      -dave
  61. A Few Kinks, A few Comments by Irvu · · Score: 3, Interesting

    As the summary here shows there are still a few kinks in the system.

    While I have to agree with some people that this isn't in-depth reporting I do think that it is pretty interesting AI. When it comes down to it the problem is not that a computer might be summarizing our news. The problem is twofold.

    Firstly people are not always inclined to look beyond summaries. When faced with typical time constraints people prefer to look at summaries because they do not have time to search across a dozen sources and articles. This is why USA Today became big in the first place. Nothing there is more than 1 column long. (Incidentally did anybody else find it hilarious that this system "summarizes" USA Today, who themselves summarize other news sources?)

    Secondly much of the news is the same. News is big business and most major news media tell the stories that sell. Because they are all targeting the same markets they tell the same stories and in the same ways. Therefore there is little difference between CNN, the NY Times, etc in terms of tone and "facts". Especially since much of "their news" comes from the same wire services such as Reuters. Fox News is different but that is because they have abandoned the mantle of impartiality and become all conservative all the time.

    In essence this system is perfect for the internet news style. Brief summaries of facts followed by more "in-depth" leads that we may peruse as we wish. The real question is, when will this begin drawing on sites like Indymedia, The Register and /.

  62. I'm afraid to Slashdot a great site, but... by babbage · · Score: 4, Funny
    www.headlinehaikus.com

    Basically, it looks at the headlines on Yahoo/Reuters, finds sentences that scan as 5/7/5, and uses Perl cleverness to present them as little news haikus (or senryu, if you wanna be picky). It's great stuff:

    Today:
    Commonwealth Group Blasts Zimbabwe Poll

    but defended by
    separate observer groups
    from South Africa

    Also today:
    Amnesty Charges U.S. Violated Rights of Detainees

    possible suspects
    connected to the attacks
    including their right

    My last birthday (Feb 4):
    Saudi Proposed Saddam Overthrow to US - Prince:

    we agree upon
    the various issues that
    we agree upon

    Christmas, 2001:
    Deep emotion, little joy in Bethlehem Christmas:

    Palestinians
    without the special permits
    very bad this year

    Sept 10 2001:
    Belarus Opposition Demands New Vote, Plans Rally:

    We do not agree
    with the official result
    RE RUN UNLIKELY

    June 1 2001:
    Bridgestone says some Ford Explorers defective:

    I am just here
    to say what needs to be said
    I am not here

    I'm hooked :)

    They have archives going back to the beginning of 2001, with only a few holes (e.g. the days after September 11), and they talk about how they are doing everything. Bonus points: you can have the haiku headlines mailed to you automagically every day. I just hope they have the bandwidth (etc) to withstand Slashdot....
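
    For the curious, the 5/7/5 scan can be approximated in a few lines. The site itself reportedly uses Perl; the sketch below is Python, and the syllable counter is a crude vowel-group heuristic of my own (an assumption, and it will miscount plenty of English words), so treat it as an illustration of the idea and not as the site's method.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count vowel groups, discount a silent final 'e'."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return 0
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def take_line(words, start, target):
    """Return the end index of a run of words totalling exactly target syllables."""
    total = 0
    for end in range(start, len(words)):
        total += count_syllables(words[end])
        if total == target:
            return end + 1
        if total > target:
            return None
    return None


def find_haiku(text):
    """Return the first three consecutive word runs scanning 5/7/5, or None."""
    words = text.split()
    for start in range(len(words)):
        pos = start
        lines = []
        for target in (5, 7, 5):
            end = take_line(words, pos, target)
            if end is None:
                break
            lines.append(" ".join(words[pos:end]))
            pos = end
        else:
            # All three lines were found without a break.
            return lines
    return None


if __name__ == "__main__":
    for line in find_haiku("but defended by separate observer groups from South Africa") or []:
        print(line)
```

    A real implementation would want a pronunciation dictionary instead of the vowel-group guess, and would probably respect sentence boundaries.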

    1. Re:I'm afraid to Slashdot a great site, but... by ntk · · Score: 1

      Ach, I can't resist. If you're playing around with Python 2.2, you can find 5-7-5 "summaries" in plaintext for yourself using haiku. Great for summarising spam.

    2. Re:I'm afraid to Slashdot a great site, but... by babbage · · Score: 1

      /me is amazed that "the" ntk replied to his post -- so will this show up in next week's Need To Know? :)

    3. Re:I'm afraid to Slashdot a great site, but... by ntk · · Score: 1

      Sadly, its time has already passed: I coded haiku last week, guffawing evilly that no-one would have *anything* like that. And yesterday three people sent in www.headlinehaikus.com as a meme.

      It'll all be over by Tuesday.

    4. Re:I'm afraid to Slashdot a great site, but... by sweet+reason · · Score: 1

      headline haikus are clever, but for a real challenge try doing acrostic sonnet headlines!

      --
      Everything should be made as simple as possible, but not simpler. -- A.E.
  63. hogdex by Anonymous Coward · · Score: 0

    For Toronto news - try hogdex

  64. How wrong can you get? by TheAwfulTruth · · Score: 3, Insightful

    Wrong on every count.

    Besides the fact that /. gets its news from other places and is always hours or days late with it, the worst thing you can do is get all your news from one source.

    Every news site has some kind of slant to it. CNN, NPR, /. (And my favorite of your list "USA Today") Sometimes you get more information than contained in a story merely by seeing how different people report the story! Reading one paragraph summaries of the days news will tell you nothing at all. Maybe worse, mislead you due to there not being enough information.

    I read news from about 10 sources a day and if I see multiple articles that I'm not interested in they're easy to skip. If I am interested in them I read them on all sites. You get much much more information that way.

    Though you do need to pick your sites. If you look at CNN, MSNBC and Salon and all three are merely parroting Reuters then you know you're not doing yourself any good.

    --
    Contrary to popular belief, coding is not all free blow-jobs and beer. Those things cost MONEY!
  65. One problem... by Nate+Fox · · Score: 1

    I hope they've done some serious testing on all these news stories. My guess is the summarization of nothing is still nothing :)

  66. What Will Google Do Next? by lazarus · · Score: 0, Offtopic
    WOW! Web, Images, Groups, Directory, and now News. What's next on the web that needs to be categorized? How about:
    • warez.google.com
    • porn.google.com
    • ogg.google.com
    • divx.google.com
    We should have a poll! Ah crap, I guess they probably won't do any of that. Instead we'll probably get:
    • recipes.google.com
    --
    I am not interested in articles about life extension advancements.
    1. Re:What Will Google Do Next? by Jason+Levine · · Score: 4, Informative

      And don't forget http://catalogs.google.com/ for online searching of mail-order catalogs. (They scan 'em, OCR 'em, and make 'em searchable.)

      --
      My sci-fi novel, Ghost Thief, is now available from Amazon.com.
    2. Re:What Will Google Do Next? by butch812 · · Score: 0

      There is also catalogue.google.com

    3. Re:What Will Google Do Next? by ichimunki · · Score: 1

      Yeah, but will they make 'em somehow work with all those CueCats everyone has laying around?

      --
      I do not have a signature
  67. LOL by Kappelmeister · · Score: 1
    Heh. I'm at Columbia now and taking a class in NLP taught by one of the people behind Newsblaster. We had a lecture about summarization (yes, it is just careful selection of source sentences) and got a tour of Newsblaster as a case study. Now it's on Slashdot. oily_ants, what did you think of yesterday's midterm? Happy spring break!

    This is doubly funny because last spring I took a class in computer graphics by the man behind augmented reality, which is again on the front page today because of the street sign article.

    This means:
    • A lot of my classmates read Slashdot,
    • Sooner or later the whole university will be too slashdotted for me to be able to finish my classes and graduate, and
    • I've got to start submitting all of my professors' projects, no matter how obscure. I hear my Databases professor got X working under Linux.


  68. And we`ll be ready by MrFredBloggs · · Score: 1

    for when super-intelligent machines take over the world. Then they`ll be making the news AND reporting it. Perhaps we should build in such an interface now - perhaps they`ll be easier on us then...

  69. another site... by martin · · Score: 2



    www.newsnow.co.uk does similar stuff I guess, but is the summary builder the thing here?

  70. Drop "News On" by GeekLife.com · · Score: 1

    Really, is there a need to preface every subject with "News on"? It reminds me of the Friends episode descriptions that all start "The one about"...

  71. Newsblaster still has problems... by Anonymous Coward · · Score: 0

    This example, is particularly amusing, as it conflates stories on Jim Brown and H. Rap Brown.

  72. Another good news site by uglomera · · Score: 2, Informative

    Check out newsseer It was written by the same people who wrote citeseer, the great research index.

  73. Orders of Magnitude faster!!! by Anonymous Coward · · Score: 0

    What the hell does simulating a NeuralNet chip on a standard CPU have to do with your boast "do them on paper or mentally" ?!?!?!

    That's insane talk.

    One cluster of Neural Chips would outperform a pentium by so much speed that the comparison is almost moot if productive work (scanning hundreds of articles an hour) is needed.

    Neural Nets do indeed need to be on chips for sentient AI to paraphrase the material in a way 100% guaranteed not to infringe upon Copyright.

    Mankind has no such AI programming talent at this time without a NeuralNet AI.

    I am talking about sentient thought here. To prevent any "Word Twiddling" naysayers from pestering a news digesting company into court day after day until bankrupt.

    You think you are so smart? Write a program that paraphrases the Copyrighted scores to an NBA basketball game while in play and rebroadcasts them onto the internet. NBA claims the score itself is copyrighted user experience. One solution is to hint that a team just gained a couple of points or is behind a certain amount since the last score was announced.

    I know this is stretching it, but 8 measures of musical notes can be copyrighted, or a sound clip can be trademark-copyprotected such as a Harley engine.

    Paraphrasing CORRECTLY to prevent lawsuits is tricky and needs intelligence to avoid legal repercussions.

    So I repeat.... What the hell does a pen-and-paper simulation of a NeuralNet have to do with the fact that REAL Neural chips would be needed to simulate the human mind in near real time?

  74. US centric by Anonymous Coward · · Score: 0, Insightful

    As expected, the content it's presenting is predominantly US-centric. I'll be giving it a miss until they start scraping from a more globally representative pool of media sources...

    1. Re:US centric by 0xbaadf00d · · Score: 0
      Assuming that you're not one of the Ultra-Rich who would profit from this culling, I guess you're just an ignorant tool...

      War Is Peace, Ignorance Is Strength, Freedom Is Slavery

  75. Thanks by PW2 · · Score: 1

    Now I have more info to pipe to my Betabrite sign.

  76. Re:Copyright Infringement - Fair Use Doctrine -NOT by LinuxParanoid · · Score: 2

    Why would fair use not be applicable?

    Slurping a sentence or two from an 5-25 paragraph article and quoting it with attribution is considered fair use, right?

    I'm not clear on if they're quoting and attributing it sufficiently to meet a legal challenge however. IANAL. But it's not the open and shut case you make it out to be as far as I can tell.

    --LP

  77. uh... this could be bad.. by Cinnibar+CP · · Score: 2, Insightful

    What if we have two such automated news services and they scan each other? Wouldn't they get stuck in some sort of infinite loop where they repeatedly pass the same story back and forth, summarizing it over and over again?

    1. Re:uh... this could be bad.. by 0xbaadf00d · · Score: 0
      Isn't that what The Media already does?

      War Is Peace, Ignorance Is Strength, Freedom Is Slavery

  78. Tried the new google news search service yet? by skunkeh · · Score: 2, Interesting
    It's still in beta but it's already pretty impressive:

    http://news.google.com/

    It indexes a huge array of news sites several times a day for fresh stories - enter a search term and it will bring up all the headlines it can find for that subject. Best of all, it uses an algorithm to identify alternative coverage of any one story and lists these links in a block beneath the main search results. That way you get links to several different accounts of the same story (although in practice they end up being pretty similar due to using the same news agencies) without having to hunt around for them yourself.

    They're still working on the algorithm and are requesting as much feedback as possible - read more here.

  79. But What About... by dscottj · · Score: 1
    All the cogent, well-reasoned arguments the slashdot community gives us? All the first-class debate? All the goatsex?!?


    And don't forget, sometimes the slash editors like a story so much they'll post it twice so we don't miss it! Can't beat that service with a stick!

    --
    AMCGLTD.COM. Where cats, science fictio
  80. Hello -- copyright issues? by drDugan · · Score: 2

    Their source pages look like verbatim copies from the other sites -- clearly a copyright issue, no?

  81. Now it just needs SOAP by jfsather · · Score: 3, Interesting

    I think this would be pretty cool if they could add some sort of a SOAP/XML-RPC type interface where you could query on sections, stories, whatever. It would be nice to allow content syncing like this.

    I was writing about this the other day in response to a post in a user's journal: even better would be to make a story content P2P system where you could allow story distribution. You might place a limit and only allow the summary to drive people to your site, but it could still help with bandwidth issues. This would basically be like an enhanced RDF/RSS type system, but over a P2P type network you wouldn't even really have to host your own feeds for people. Add in some sort of DB persistence and you could just say "get new headlines and summaries from site x" and the system would bring in all the new content. Anyway, that is just a dream I have and probably will never happen the way some people feel about their content.
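
    The "get new headlines and summaries from site x" part is easy to sketch on top of a plain RSS feed. The Python below is only an illustration under stated assumptions: the feed is RSS 2.0-style XML without namespaces, the sample feed and its example.com URLs are made up, and the in-memory set stands in for the DB persistence. The P2P distribution layer is not attempted.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed for the sketch; a real client would fetch this from the site.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>Computers Summarize the News</title>
      <link>http://example.com/stories/1</link>
      <description>A university project generates news summaries.</description>
    </item>
    <item>
      <title>Kansas Wins</title>
      <link>http://example.com/stories/2</link>
      <description>Kansas beat Holy Cross last night.</description>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_links):
    """Return items whose link has not been seen before, and remember them.

    seen_links is a set that the caller keeps between polls (a stand-in for
    a database table in this sketch).
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link or link in seen_links:
            continue
        seen_links.add(link)
        fresh.append({
            "title": (item.findtext("title") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
            "link": link,
        })
    return fresh


if __name__ == "__main__":
    seen = set()
    for story in new_items(SAMPLE_FEED, seen):
        print(story["title"], "-", story["link"])
```

    Polling the same feed again with the same set returns nothing until the site publishes a new item, which is the behavior a syncing client would want.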

  82. Biased computer? by Anonymous Coward · · Score: 0

    A summary of the latest developments in the Israel-Palestine conflict:

    Israeli forces penetrated the heart of the Palestinians ' most important city Wednesday as Prime Minister Ariel Sharon rejected myriad calls for restraint and pressed ahead with his nation ' s largest military operation in three decades. Facing international criticism, Israeli officials said the army was waging a war of self-defense against a Palestinian terrorist onslaught that has killed 340 Israelis since September 2000, including about 60 this month. Israeli military sources confirmed the attack in Gaza as Israeli forces pressed on with a sweeping offensive against the Palestinians in which they have occupied the West Bank city of Ramallah, Palestinian President Yasser Arafat ' s main power base. Israeli troops took up positions in schools and homes and conducted house-to-house searches in the Al Amari refugee camp for a second day.


    Does it annoy anyone else that it mentions 340 murdered Israelis but not 1180 murdered Palestinians since the start of the new intifada?

    Maybe this is representative of US media?

    1. Re:Biased computer? by 0xbaadf00d · · Score: 0
      Don't you know what time it is boy-eeee!

      It's 1984!

      War Is Peace, Ignorance Is Strength, Freedom Is Slavery

  83. Re:Copyright Infringement - Fair Use Doctrine -NOT by gblues · · Score: 2

    The original poster is correct. This is essentially automated plagiarism. Here's why.

    The service claims to be a computerized summary. However, in terms of copyright, a summary is something that expresses the same idea using different words. Therefore, using exact quotes and labelling them as a summary is a textbook case of plagiarism.

    Nathan

  84. Re:Copyright Infringement - Fair Use Doctrine -NOT by LinuxParanoid · · Score: 3, Insightful

    I think you are confusing plagiarism with copyright violation. I am primarily concerned with the legal issue of copyright violation raised by the previous poster, not an amorphous ethical one.

    As Bitlaw points out, under the Copyright Act, four factors are to be considered in order to determine whether a specific action is to be considered a "fair use." These factors are as follows:

    1) the purpose and character of the use, including whether such use is of commercial nature or is for nonprofit educational purposes;
    2) the nature of the copyrighted work;
    3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
    4) the effect of the use upon the potential market for or value of the copyrighted work.

  85. Re:Copyright Infringement - Fair Use Doctrine -NOT by LinuxParanoid · · Score: 3, Insightful

    Dang, clicked the submit button by mistake.

    Attempting to apply the four factors there, while some could be argued either way, I can see that on balance, you both might be right. I could probably make a stronger case that it doesn't qualify as fair use than that it does, based on those four factors. I think I was focusing over-much on the "amount taken" criterion and overlooking the others.

    --LP

  86. customized, fresh news: fyuze by Anonymous Coward · · Score: 0

    if you're looking to compile your own customized view of the day's news, you might want to check out fyuze.

    it will let you select sites that you want to see news from, organize them, and filter them by time and keyword. you can also rate stories so that in the future, it will be possible to do story recommendations. it's like an uber-my.yahoo

  87. Already done - Newshub by PeterMiller · · Score: 2, Informative

    I've been using Newshub for 2 years now; it does essentially the same thing.

    newshub.com

  88. Re:Copyright Infringement - Fair Use Doctrine -NOT by gblues · · Score: 2

    I am pretty sure that plagiarism is a de facto copyright violation. You are using another author's words without proper credit. Even if the author is cited in the "summary," the definition of a summary is that the words are the summarizer's own, not the cited person's. If they are going to use the author's exact words, it needs to be labeled as an "auto-quote" generator rather than a summary.

    Nathan

  89. The real magic isn't in the summaries... by tibbetts · · Score: 2, Insightful

    ...but rather in identifying multiple documents that appear to be talking about the same thing. Summarization is a well-researched (but not well-perfected) NLP topic, but finding inter-document similarities is quite a bit more challenging. This is easy for me and you to do when we read something, but think about what it takes to get a machine to do this. Take a look at some of the examples--you'll find that although large chunks may be verbatim from document to document (especially ones that rehash standard news feeds like Reuters and AP), most articles have a different wording or spin on each idea.
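
    To make the grouping problem concrete, here is a rough sketch of about the simplest approach I can think of: turn each article into a bag of word counts and group articles whose cosine similarity clears a threshold. This is not how Newsblaster does it (I have no idea what they use); the function names and the 0.5 threshold are arbitrary choices for illustration, and a serious system would need far more than word overlap to cope with different wording or spin.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its words (a crude bag-of-words vector)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors; 0.0 if either is empty."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_articles(articles, threshold=0.5):
    """Greedily group article indices by similarity to each group's first member.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    vectors = [word_counts(a) for a in articles]
    groups = []
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])  # no existing group was close enough
    return groups
```

    Two rewordings of the same wire story share most of their vocabulary and land in one group, while an unrelated story starts its own. Anything with genuinely different wording defeats it, which is exactly why the real problem is hard.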

    --
    :wq
  90. Even better project by matt_king · · Score: 1, Informative

    Check out the Center for Intelligent Information Retrieval (CIIR) at UMass for their project on Topic Detection and Tracking (TDT). Not only does this categorize (assign topics to) news stories as they break, but it also attempts to automatically group related stories together. I worked for them this summer (on a different project), and these are some really brilliant guys and girls!

  91. Some corps. do this manually by Stultsinator · · Score: 1

    Actually, a lot of large corporations do this so that their employees' time is spent efficiently. They employ a few cut-n-pasters who create repositories of stories that can be browsed from either their intranet or (at IBM) VMS accounts.

    It sounds like this project could bring that service to the masses.

  92. News: NewsBlaster Blasted by Anonymous Coward · · Score: 0

    NewsBlaster today got blasted by something which is summarized as the 0.0 divided by ... DIVISION BY NULL, core dumped.

  93. Computers stink at this kind of thing by oakbox · · Score: 1

    Way back in the stone-ages (1994), IBM was trying to build a news-reader called 'infoSage'. After waffling and not doing a very good job for many months, they finally threw up their hands and said, "Can't do it".
    Or rather, "Can't do it well enough to charge for it." Even now, 6 years later, I can't see this happening. The net is just too big, and natural language parsing is too obfuscated, for a computer system to do what it needs to do in this area. XML (and self-describing data in general) looked like a step in the right direction, but it ultimately relies on a human being properly defining just what the hell the data IS.
    I think that in the short term (and I'm not going to put a date on this, because I'm not THAT smart) our best connection with news content on the web is going to be Google (which would mean that you would have to know what you are looking for in the first place) and topic-specific sites like Slashdot, Meerkat, etc.
    Just my two centavos.
    FIRST perfect language parsing, THEN have computers try to sift through the universe for the stories. Until then, too much noise to trust a machine.

    - oakbox

    --
    Not just answers, the correct questions.
  94. Re:ATTENTION FBI by Anonymous Coward · · Score: 0

    while we're at it, i've got plans to take over the world, also, and i'm sure that violates all sorts of laws. Yay. Isn't the internet fun?

  95. What you talking 'bout Willis by Anonymous Coward · · Score: 0

    What do you mean when you say "OCR 'em"?

    1. Re:What you talking 'bout Willis by 3141 · · Score: 1

      OCR = Optical Character Recognition. This usually refers to scanning in a document, and automatically converting the resulting image file into text.

  96. Slashdot isn't news by mlg9000 · · Score: 1

    There may be a few news articles discussed here, but this is no news site. This is more of a rumor mill / science forum / freak show site for the technically inclined. If you are looking for a really good NEWS ONLY site, I'd take a look at the Drudge Report, www.drudgereport.com. It's probably the simplest and most no-frills site out there, but you can always get the latest news before anywhere else (even TV), and news you don't see anywhere else. I've made it one of my regulars.

  97. Another website by Jack9 · · Score: 1

    Another website that pre-dated this attempt was Jack9.org - too bad @home went down or I could prove it with a link :(

    Throughout its 2-year lifetime it was parsing Slashdot.org, Fark.com, Clinko.com, Technocrat.net, isonews.com, techdirt.com, and kuro5hin.org.

    Since the site is down (probably for a loooong time) this is not a promotion, just a fact.

    --

    Often wrong but never in doubt.
    I am Jack9.
    Everyone knows me.
  98. Infobreakfast by 3trunk · · Score: 1

    Another site to look at is infobreakfast. This site summarizes the news in 10 words or less, with a nice clean interface to present it. News content is much like Wired - techie stuff with a little general and world news thrown in. Updated every hour.

  99. Re:klerck doesn't summarize PWP by Anonymous Coward · · Score: 0

    Klerck says he doesn't live there anymore: http://slashdot.org/comments.pl?sid=29531&cid=3176529

  100. Re:ATTENTION FBI by Anonymous Coward · · Score: 0

    You, sir, are an idiot.