Google Abandons Plan To Archive World's Newspapers
An anonymous reader writes "Throughout the past few years, Google's newspaper-scanning project has digitized more than 60 million pages from newspapers spanning 250 years, including such gems as the moon landing. But according to the Boston Phoenix, this ambitious effort is slated to soon be axed in favor of Google One Pass, a platform for publishers to monetize content from their own sites."
Google wants to help others make money (and make a little themselves) with one of their projects?
Forget Google search, I'm going with BING instead. Microsoft would never do this!
So what happens when some of those publications inevitably go out of business? We lose all of their works forever? I would hope that Google could come up with some sort of middle ground. Why not continue the archival process, but allow companies still in business to choose what content is free, and what requires a fee or ads? There has to be a way that the companies can profit while still protecting us from losing the information permanently...
Some people will complain, but this was inevitable. Business-wise, it's silly to throw this much opportunity into the "free" sphere. Rupert Murdoch was right about this: "Content isn't King, it's Emperor." If content is your business, then giving it all away is a great way to go broke fast. Ad revenue will only go so far. If it's good enough, people will pay for content.
Life is hard, and the world is cruel
Google won't exist forever. What happens to the data then?
They should be taking lessons from the Stanford LOCKSS project.
Is it because there are just too many?
-- It is the mark of an educated mind to be able to entertain a thought without accepting it. -- Aristotle
I think people may misread Google's intent. It sounds more like they are giving up an advertising-based model that would allow people to view the archived newsprint for free, and instead opting to allow the newspapers themselves to set up subscription charges to read back copies of their newspapers, even if those newspapers are now considered to be public domain. "Don't be evil" may be their motto, but they are a big corporation now, and the bottom line determines their actions.
If it works.
The real problem is that those in charge of making decisions at the newspapers don't have the desire to go along with this sort of change. And let's not be coy about it. They need to change and we need them to as well. All of us need good, solid reporting of all sorts.
So far, these sorts of changes have been happening right along without them. This is yet another stop on the grand train to the digital future which most of them are willing to ignore in hopes of something else happening.
Let's see if this time they'll get their collective heads out of the sand.
"including such gems as the moon landing. "
Wrong link, it's the politically correct one, this is the real one.
http://store.theonion.com/product/holy-shit-man-walks-on-fucking-moon-1969,158/
In addition to being archived by the newspaper company, most local newspapers are already archived by local libraries as microfiche/microfilm. This is often required by law, as public notices are required to be placed in newspapers, and a record of this must be kept. Important national newspapers are archived by the Library of Congress, as well as multiple other libraries, where they are also digitized.
This is where Google got their source data to scan/upload in the first place.
I liked it when part of Google's plan was to organize the world's information and make it accessible and free, just as a way to drive more ad traffic. I hope OnePass isn't like the NYTimes firewall, which was basically no fun. - www.awkwardengineer.com
I've always found that, reading through older articles, one retains more of the facts as they unfolded, versus today's news reporting, which seems heavily biased towards sensationalism and offers fewer facts.
But then again maybe that's how reporting always took place, and I only remember the good ole days because the actual fact is my memory is selective in how I remember those days.
Life takes interesting turns, but the most interest is when you're off the beaten path.
In addition to being archived by the newspaper company, most local newspapers are already archived by local libraries as microfiche/microfilm.
The value added by Google was that the text has been OCRed and indexed.
People seem to make the mistake in thinking Google is about something other than leveraging search to make money. But Google has been about the BUSINESS of search at least since its IPO. Holding some illusion that Google is altruistic is just fantasy.
If you want news from today, you have to come back tomorrow.
Information sponsored by companies who have a genuine interest in adding to the historical record is preserved
History is written by the victors
Given that attitude, no software would survive from companies that have gone out of business, since there is nothing of commercial value there.
How does Google back up its data? Or do they even? Or is it just a massive farm of machines with the data spread across redundantly?
If Google collected, OCR'd and indexed everything they can get their hands on.
If the price was set at the original issue price.
If I owned/had access forever.
If they had some workable payment system.
I thought it was really neat when old magazines and newspapers started showing up in some of my searches. It would often be worth buying a copy of a paper when researching things. Or at least having the option of doing so.
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff
Short-sighted? No. Not when you remember what Google's products are and who Google's customers are. Google is listening to its customers to deliver the product the customers demand.
Namely, the advertisers are Google's customers. The product that Google delivers is the consumer of its services, whether it be through Google search, Google mail, or Chrome. All of its software products are geared towards delivering its product (you) to its customer (the advertiser). In this case, the customers have proposed a different approach that delivers a faster return on Google's investment dollars.
I'm glad they got what they did. Reading some of the 1860s newspapers is really cool.
She was like chocolate when she drank... semi-sweet at first and then increasingly bitter.
Indeed, some parts of the then freshly reported past (or perhaps some lies, too) are not quite compatible with recent 'insights' about history. So making it harder to consult, by decentralizing it, is very much in accord with present practices (read Al Jazeera as for the movies, Le Monde and the like for the pamphlets, and last but not least Wikipedia: the name says it all). I should also add Wikileaks, because not everything is quite clear there. One should dig for oneself: 'social interactions tend to lessen the wisdom of crowd effect'.
See my comment to the parent. Half a dozen global vendors have already digitized and full-text indexed decades if not centuries of newspapers (depending on the importance and history of the paper). No matter how good OCR gets, it won't surpass the work already done by real humans doing data entry, transcription, and correcting OCR by hand.
The value added by Google is potentially offering this incredibly expensive effort of preserving information for free.
...for scanning all the issues of Popular Science. I love reading those old issues, following the development of science through these magazines. If it weren't for Google, I could have never gotten access to them.
Perhaps if I lived in the US I could find them in some way, but impossible from Europe. Google made it possible for me, and for this I am truly grateful.
"The agriculture ministry is not in charge of Gundam" - Japanese ministry official.
If they wanted to, they could keep all the archived material and eventually offer it back to the source of the articles, for a price. That would save, say, the New York Times from having to go through all the already scanned documents on its own time, since Google has already done so. This could give Google some revenue back from all the man-hours invested in this failure, and also save money for companies that want to be part of this new technology for digitizing their columns, but without all the hassle.
The Sentinel was the morning paper. It merged with the Journal evening paper in 1995 and became the "Journal-Sentinel".
So will Recaptcha finally start using English words again instead of the garbage we've had ever since they started scanning this stuff?