Web Caching: Google vs. The New York Times
An anonymous reader writes "The Google cache is a popular feature among karma fetishists. Many stories with links to the NY Times attract comments pointing to Google's copy of the article. This gives readers access to the content without registering. C|Net reports that Google is in talks with the NY Times to close this backdoor. The article raises some general concerns regarding the caching of webcontent. Shouldn't the NY Times simply tell Google not to cache their site?"
I'd love to see their user database, just to count the number of Mickey Mice and Elmer Fudds on there. Apart from giving the NYT your e-mail addy for spam purposes, what real point is there to free registration?
When I am king, you will be first against the wall.
You are the new editor of the New York Times, the "Newspaper of Record" for the United States, if not the world. You are, of course, the new editor because the previous editor had to resign, taking the blame for an individual reporter's flagrant disregard for the awe-inspiring credibility of your institution. In the process of rebuilding your credibility, should you:
A) Insist that unaffiliated digital libraries restrict access to or simply eliminate all records of your "Newspaper of Record", or
B) Realize that maybe right about now is not particularly the best time to be saying to the world, "Please forget what we published last week."
That's the thing - it's not free depending on your definition. By my own definition, you're giving them valuable information, and they get to keep it and use it as they will, including spamming if they feel like it (or spam from any company which buys them out, they sell it to if they're feeling bankrupt, etc). It's practically misadvertising of a service, but it's accepted now, so everyone gets away with it.
If it really were free, why would you need to register in the first place?
Since when is content published in the WWW about privacy?
It's just like a government that wants to control which newspapers maybe archivied for history research.
--
Karma 50, and all I got was this lousy T-Shirt.
I was thinking the same thing. I cann't recall seeing a NYT article linked from here with the google cache banner across the top, what I do see alot are the partner links. Google already provides for register-only news sites (financial times?) by putting a [reg only] tag beside the article. Why the NYT has chosen not to use this up until now is a tad strange, and it looks like someone has picked up the wrong end of the stick.
Brand recognition is not always a good thing. When I think NY times I think "that annoying registration website". They are free to do what they want, but it leaves me cold.
My Karma: ran over your Dogma
StrawberryFrog
Here we have the NYT, one of the premier news organizations in the world, offering its articles for free on the same day that they are published. Yet a large number of people, of this online community at least, refuses to provide even a minimal amount of information (and no money) so that the newspaper can try to make its online presence profitable.
I think the spam fears are a red herring, I've been registered with the times for over 2 years. I've never gotten spam that I think is traceable from them. I get a daily email of the day's headlines (and with the click of a box I could discontinue this).
Why should the RIAA change its business model to a pennies per song method when there is such a blatant example of the online community refusing to go directly to the source for even free material?
The New York Times wants Google to continue ranking their stories but they want Google to do them the special favor of only pointing to their registration page:
"We are working with Google to fix that problem--we're going to close it so when you click on a link it will take you to a registration page," said Christine Mohan, a spokeswoman at New York Times Digital,
If I were Google, I'd tell them such advertising services would cost them a great deal of money. That or simply drop the New York Times right into the bit bucket. It will cost Google programing time to make it happen and computing time to keep it going. If every site on the web required this kind of custom treatment, Google's task would be much more difficult and it might be easier for them to drop it.
Droping the NYT from Google is fine by me. People who don't understand the implications of digital publishing don't deserve readership. If they won't let librarians make digital coppies, libraries should drop them too. What's next, the New York Times sends cease and dissist orders to everyone who runs a proxy? It's like the NYT is trying to make their digital publication harder to share than their paper one was. A paper copy can be shared by an entire office and that's what a proxy does. A paper copy can be indexed and archived by a librarian, and Google did not even do that much. One day the paper version won't be available. If librarians can't keep their own coppies of the digital version for verification, the publication will have no credibility. If the New York Times wants to continue charging advertisers for eyballs, they had better remember that their credibility is bassed in part on widespread availability.
Friends don't help friends install M$ junk.
The technology has changed the way that things work but the law has not kept up with it. To start with, we continue to talk about "copyright". Controlling copying of information makes sense when the distribution mechanism is trucks moving bales of paper around. Once you start sending bits around, everything is copied. From the article:
And technically, any time a Web surfer visits a site, that visit could be interpreted as a copyright violation, because the page is temporarily cached in the user's computer memory.
When you have the newspaper delivered to your door, the content basically comes for free (the cost of a newspaper doesn't pay for much more than printing and handling). However, you get to keep the content as long as you like, chop it into bits and what not. Libraries have archives of newspapers going back years and you get to see them for free. What's the right mechanism as we move forward? The "pay per view" model that content providers want to shove down our throats courtesy of the DMCA is not pretty and when it starts to affect the average Joe I suspect it will be booed out of favor pretty quickly. But what is the right mechanism to make sure content providers get paid something and that we, the citizens, get something for our money?
As many others have emphasised, it is easy to turn of the Google cache for whatever pages you wish. But, in the case of the NYT, there is a further factor. They must have special code within their system to recognise the google spider and allow it access without registration. Either that, or there is some other prior agreement allowing access. Given that, they can scarcely claim extra work to support Google. I believe the whole thing is mainly to get some free publicity for their site. I suppose the other possibility is that they want the page accessible from Google News but not the regular search engine cache.
The NYT needs to call off the lawyers and seriously think about how they brought this on themselves.
There are so many models for running a news site that avoid this problem (Salon) that calling out the lawyers is just childish and inapropriate. If a site wants to be indexed by a search engine, then they should be aware of what that means, and if they don't like how a particular search engine functions, then they should take measures to change thier own site to prevent what they don't want indexed, or cached, from being accessed.
I know that finding pages on google that I cannot access would be infuriating, and I hope that Google realizes that many of thier users would agree.
Read, L
[Set Cain on fire and steal his lute.]