Slashdot Mirror


Google News to Host Wire Service Stories

knhasan writes to tell us that Google has just announced a new program in which they will host wire news stories directly on their site. This is widely believed to be the first concrete fallout from recent troubles with Agence France Presse (who sued Google for alleged copyright infringement) among other wire services. "The new feature unveiled Friday is called 'duplicate detection,' which lets Google News identify the original source of a story that may appear in tens or hundreds of news outlet Web sites. If the source story is from one of the four news service agencies that Google has licensing agreements with, Google will display the story on a page that it hosts."

5 of 63 comments

  1. It's a good thing by Evets · · Score: 3, Interesting

    From a consumer standpoint, I really like this move.

    It seems to be completely random which site a given story will point to and there are times when I click through to a news item and I'm immediately skeptical of the source site. If a news vendor isn't doing any sort of value-add, I don't see why I should get sent to bob's scraped wire site versus a trusted major news source.

    1. Re:It's a good thing by martin-boundary · · Score: 2, Interesting
      I'd mod you up if I had points, that's spot on. The "upstream sources" are syndicated wire news feeds and columns, and the downstream media (that includes "trusted major sources" like the NY Times) pick and choose the bits of a given story that they want to show, and rewrite it for effect and desired story size.

      It's actually very easy for Google engineers to identify the sources, because they _have_ all the possible source texts at their disposal: anybody who subscribes to AFP/Reuters/AP/etc. obtains the raw sources. Then it's just a matter of writing a program which computes the percentage of sentence overlap between a downstream story and each of the possible raw stories. With a precomputed index it's very simple: anybody can do this with open source indexing tools and open source text analysis tools on freshmeat.
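
      A minimal sketch of that sentence-overlap idea, not anything Google has described: the function names, the naive sentence splitter, and the in-memory dictionary standing in for a "precomputed index" are all assumptions made for illustration. A real system would need fuzzier matching to cope with rewritten sentences.

```python
import re
from collections import Counter, defaultdict


def normalize(sentence):
    """Lowercase and keep only word characters so trivial edits do not hide a match."""
    return " ".join(re.findall(r"[a-z0-9']+", sentence.lower()))


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    normalized = (normalize(p) for p in parts)
    return [s for s in normalized if s]


def build_index(raw_stories):
    """Precompute sentence -> set of raw story names that contain it."""
    index = defaultdict(set)
    for name, text in raw_stories.items():
        for sentence in split_sentences(text):
            index[sentence].add(name)
    return index


def overlap_by_source(downstream, index):
    """Percentage of the downstream story's sentences found verbatim in each raw story."""
    sentences = split_sentences(downstream)
    hits = Counter()
    for sentence in sentences:
        for name in index.get(sentence, ()):
            hits[name] += 1
    return {name: 100.0 * count / len(sentences) for name, count in hits.items()}


def best_source(downstream, index):
    """Return (name, percent) for the raw story with the highest overlap, or None."""
    scores = overlap_by_source(downstream, index)
    if not scores:
        return None
    name = max(scores, key=scores.get)
    return name, scores[name]
```

      Build the index once from the raw wire texts, then call best_source() on each downstream story; a story with no sentence in common with any raw text yields None.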

      The case of images is both easier and more difficult: in most cases, the images on a news story are syndicated and used "as is", so a simple pixel comparison gives a match to the syndicated source, but sometimes the image is cropped or modified, and then it's difficult to identify, but not necessarily impossible.
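
      A rough sketch of the "simple pixel comparison", assuming images are already decoded into rows of grayscale values (a hypothetical representation chosen to keep the example self-contained). The crop search only handles the easiest case, an exact crop with no rescaling or recompression; anything beyond that is the difficult part.

```python
def identical(image_a, image_b):
    """True when both images have the same dimensions and the same pixel values."""
    return image_a == image_b  # nested lists compare element by element


def find_crop(full, crop):
    """Return (top, left) where `crop` appears unmodified inside `full`, else None.

    Brute force: try every offset and compare row slices.
    """
    if not full or not crop or not full[0] or not crop[0]:
        return None
    full_h, full_w = len(full), len(full[0])
    crop_h, crop_w = len(crop), len(crop[0])
    for top in range(full_h - crop_h + 1):
        for left in range(full_w - crop_w + 1):
            if all(
                full[top + row][left:left + crop_w] == crop[row]
                for row in range(crop_h)
            ):
                return top, left
    return None
```

      An unmodified syndicated image passes identical(); a cleanly cropped one is located by find_crop(), while a resized or re-encoded one will not match either check.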

      Unfortunately, it's not possible to infer the bias of the nationwide (or worldwide) reading population with your method, as the article collection system used by Google itself introduces a publication bias, which swamps the natural proportions. All you get is a non-uniform sampling of news sources, which at best tells you the possible extremes of viewpoint, assuming they reflect their reader base, but that's still interesting.

  2. I'm hoping for better reporting. by khasim · · Score: 2, Interesting

    So, there won't be duplicates.

    Which means that in order to attract people to YOUR news site, you'll have to ADD something. Either background research, interviews, commentary, etc.

    Sure, the commentary might not be "better". It will probably still be biased. But the facts should appear more consistently now.

  3. Re:All your newsbase are belong to us by westlake · · Score: 4, Interesting
    "Checkmate. Google owns the news. Game over for your local paper. I predicted this many years ago."

    The game begins for your local paper.

    The Niagara Falls Reporter is a free tabloid that efficiently - and hilariously - extinguished the career of the most corrupt and incompetent mayor this border town has known in living memory.

    It succeeds by relying on a minimal staff and on reporting and opinion with strong local roots - in John Hanchette, for example, it has a founding editor of USA Today, a former editor of the Niagara Gazette and a man with a Pulitzer to his credit and a national reputation as a journalist and teacher.

  4. Whoever modded this down is an idiot by symbolset · · Score: 2, Interesting

    The newspaper I was working for when I predicted this is still available at its vestigial domain name here where I helped set it up.

    At the end of a meeting to review a very expensive (>$100K) demographic survey in 1992, I spoke my mind. I told him a number of things, including that the toxic ink on dead tree business model wouldn't last forever, that communities were more important than forums, that the Internet wouldn't be male dominated forever, and that user generated content was more important than expert generated content. He thought I was a flake. It cost me my job to tell him what I really thought, and I was right. It cost him >$100K to hear what the demographer thought he wanted to hear.

    I don't regret it at all. He was an idiot too and he deserved to miss out on the .boom billions he could have had.

    --
    Help stamp out iliturcy.