Web Caching: Google vs. The New York Times

Free registration by Zog The Undeniable · 2003-07-13 22:00 · Score: 5, Insightful

I'd love to see their user database, just to count the number of Mickey Mice and Elmer Fudds on there. Apart from giving the NYT your e-mail addy for spam purposes, what real point is there to free registration?

--
When I am king, you will be first against the wall.

Re:Free registration by presroi · 2003-07-13 22:06 · Score: 5, Insightful

Maybe we can agree that the NYT is a well-written, serious and interesting newspaper. Not just for New Yorkers but also for people from Sweden, Japan or New Jersey.

Where would the limit be? How would you feel if you had to register for every web page which is linked to at /. (I confess, I usually click on every /.-story link)?

hmm, to answer your question:
maybe the point in registration is the signing of a contract on how to use this content. Dunno.
Re:Free registration by Anonymous Coward · 2003-07-13 22:41 · Score: 5, Insightful

And on top of everything else, it annoys users more than just about anything else aside from spam. Can't recall exactly how many other people I know who go to see a NYT article, find the rego page, and ignore it to go find a better news source without the hassle.

If they're tracking what their users are doing, they're affecting their user pool in a pretty negative way just by using this method.
Re:Free registration by MartinB · 2003-07-13 22:53 · Score: 2, Insightful

Their database has got a 98 year old woman who lives in Albania with a PhD, no job and an income of less than a thousand bucks a year.

And you wonder why you get ads that have absolutely no interest for you? And why advertisers have to shout louder and louder to get through a mass of untargeted ads?

Advertisers would far rather spend less by buying fewer, smaller ad slots that are targeted accurately. Much better return on their spend. Like the guru said: "I know half my advertising is wasted. I wish I knew which half."

--
The only thing you can accurately describe as "Scotch" is a sticky tape made by 3M. And it's
Re:Free registration by Anonymous Coward · 2003-07-14 00:54 · Score: 2, Insightful

Maybe we can agree that the NYT is a well-written, serious

Bullshit. Were you paying attention last month when they had to correct all the Jayson Blair stories, or acknowledge that he copied from other sources? Don't say it was one bad apple either. They haven't apologized for front-page stories claiming Iraq was a quagmire after 1 week of invading.
Re:Free registration by pcmills · 2003-07-14 01:06 · Score: 1, Insightful

More of a cya than anything else.

--
Ask Slashdot - google for stupid people.
Re:Free registration by mrd_yaddayadda · 2003-07-14 01:25 · Score: 3, Insightful

You can count me as one of the people that ignore the NYT unless I can get a cached page. I get enough spam as it is...
Re:Free registration by hesiod · 2003-07-14 01:27 · Score: 2, Insightful

> They can then target ads to users much more effectively

How about they advertise according to the content of the article. If it's a tech article they show tech adverts. That's pretty simple, and something they generally don't do (and it wouldn't require logging in)
Re:Free registration by Anonymous Coward · 2003-07-14 01:30 · Score: 1, Insightful

So, regarding 1, 2, and 3 - the advantage to me as a consumer is what? Absolutely fsck all. Reading cached webpages is no different to picking up a used copy of the newspaper off a park bench. The 'paper can get free exposure to a potential new paying customer due to this, so they should actually be pretty grateful to Google (and the park bench).
Re:Free registration by FatAlb3rt · 2003-07-14 01:32 · Score: 4, Insightful

I disagree. Let's imagine for a minute that everyone provides an accurate profile, targeted marketing works, sales increase, and the advertiser gets rich.

You really think that the money they spend on advertising will level off?
Re:Free registration by NexusTw1n · 2003-07-14 01:33 · Score: 5, Insightful

I always find it ironic when people on slashdot complain about being "tracked" on NYTimes webpages or other sites that require registration.

Most people have registered to use /., and have therefore provided a valid email address. So you can't have a moral objection to giving your email addy to websites you frequent.

Even if you don't register, your IP address is logged and monitored via the sophisticated anti-troll system. Try to post more than 10 times in one day as an AC, or post as an AC in reply to a post you modded, and slashcode will react.

So even as an AC you aren't really totally anonymous on slashdot, yet I don't see anyone who complains about NY Times links complaining about that. The only people who complain are the trolls that forced these features to be added to the code.

So why do we have this tedious bitching about the NY times every time a link is posted?

I registered a couple of years ago. I've never received a single spam to NYTimes@mydomain.com, which was the email addy I used. I've never had to log in because the login cookie has remained in Opera since I registered. How hard is it to log in and then forget about it forever more?

The only reason I haven't forgotten I've registered is the continual complaints on slashdot from people who are obsessed with privacy on the net unless karma is involved. NY Times doesn't spam registered users, and any user tracking is less sophisticated than slashcode's vital anti-troll features. So bear that in mind when tomorrow's NY Times story appears and the same old complaints are dragged out yet again.

--
It has become appallingly obvious that our technology has exceeded our humanity. --Albert Einstein
Re:Free registration by LilMikey · 2003-07-14 01:36 · Score: 3, Insightful

NYT does not let you access their content without logging in. That's nothing like Slashdot's system.

--
LilMikey.com... I'll stop doing it when you sto
Re:Free registration by endoboy · 2003-07-14 01:42 · Score: 3, Insightful

NYT does not let you access their content without logging in
and why should they? NYT spent real $$$ to develop that content, and are under no obligation to give it away.
Don't like it? Go someplace else.
Re:Free registration by fucksl4shd0t · 2003-07-14 03:03 · Score: 3, Insightful

So you can't have a moral objection to giving your email addy to websites you frequent.
It's about trust, actually. Morality has nothing to do with it. I don't trust NYT not to sell my email address or anything like that. I *do* trust slashdot not to, but if I ever catch them doing it, well, I just won't tell them it's changed recently. :)
There are quite a few sites that I frequent that I don't trust with personal information. Visiting a site frequently != trust.

--
Like what I said? You might like my music
Re:Free registration by Becquerel · 2003-07-14 04:51 · Score: 1, Insightful

So why do we have this tedious bitching about the NY times every time a link is posted?

Because let's face it, in-jokes (cowboyneal poll option, in soviet russia..., 1.xyz 2.????? 3.profit, etc) are funny.

--
My spelling isn't bad, I'm evolving the language

Google - more useless everyday by jkrise · 2003-07-13 22:05 · Score: 2, Insightful

IANTrolling here, but I find Google more and more useless by the day. Sometime back, I pointed out how Google seems to have a soft corner for articles and sites that affect big firms such as Microsoft.

In fact, several of Slashdot's own articles on Microsoft aren't available from Google news, although Slashdot is listed as a 'news' source. Couple of MS related Slashdot articles (on the Oregon bill - March 6th and May) have been removed, but much pro-MS content pre-dating March is still referenced.

Google seems to be aping the other Gorilla, despite all the posturing, and Microsoft's so-called attempts to categorise it as a competitor, when in fact, Google appears to be an ally!

--
If you keep throwing chairs, one day you'll break windows....

The reason by Apreche · 2003-07-13 22:11 · Score: 3, Insightful

The reason that the NYT just doesn't tell google not to cache them is visitors. Let's face it, even though the registration is a bitch, the content on the NYT website is fairly decent. They have good articles often enough that geeks went through the effort of finding out how to read without registering. If they have google not cache them, and they close the google news loophole, then they won't appear on google news any longer. And google news is used by many more people than you think.

Hey, we get quite a few visitors from this google news. Let's change it so we get 0 visitors from it.

Duh.

--
The GeekNights podcast is going strong. Listen!

hmm by jaemark · 2003-07-13 22:12 · Score: 3, Insightful

the nytimes website needs google for the traffic google brings into their pages, so they can't turn away their spiders. but then, they don't want the spiders either because of copyright violations. why should this be google's problem anyway?

Re:Yes. by Zocalo · 2003-07-13 22:15 · Score: 2, Insightful

Actually, the link to "robots.txt" raises an interesting point. Why is NY Times even in "discussions" for this, other than to gain some column inches? It's entirely up to the NYT whether to let Google's robots index their site, isn't it? I would have thought that Google's robots would be well behaved in this respect and simply move on to the next site if they were told to go away by robots.txt.

--
UNIX? They're not even circumcised! Savages!

Test Question by Effugas · 2003-07-13 22:17 · Score: 4, Insightful

You are the new editor of the New York Times, the "Newspaper of Record" for the United States, if not the world. You are, of course, the new editor because the previous editor had to resign, taking the blame for an individual reporter's flagrant disregard for the awe-inspiring credibility of your institution. In the process of rebuilding your credibility, should you:

A) Insist that unaffiliated digital libraries restrict access to or simply eliminate all records of your "Newspaper of Record", or
B) Realize that maybe right about now is not particularly the best time to be saying to the world, "Please forget what we published last week."

Ask questions first, shoot later. (if needed) by daBass · 2003-07-13 22:22 · Score: 2, Insightful

Well, I guess that NYT (and many others) allowing Google News to log in and index their content means that they like the traffic it brings. For whatever reason, NYT wants you to register, and they have a right to. They also hold the copyright, which allows Google to show the snippet but not the whole article without their consent.

And that is the reason for an index, to find the original.

It is good to see they are working this out together, though, without NYT going to court as the first step. This is a far better way than the popular shoot-first-ask-questions-later attitude most media companies have...

There's no such thing as free registration by pslam · 2003-07-13 22:24 · Score: 5, Insightful

Apart from giving the NYT your e-mail addy for spam purposes, what real point is there to free registration?

That's the thing - it's not free depending on your definition. By my own definition, you're giving them valuable information, and they get to keep it and use it as they will, including spamming if they feel like it (or spam from any company which buys them out, they sell it to if they're feeling bankrupt, etc). It's practically misadvertising of a service, but it's accepted now, so everyone gets away with it.

If it really were free, why would you need to register in the first place?

Re:NY Times likes accuracy by anshil · 2003-07-13 22:28 · Score: 4, Insightful

Since when is content published in the WWW about privacy?

It's just like a government that wants to control which newspapers may be archived for history research.

--
Karma 50, and all I got was this lousy T-Shirt.

Re:Erm...cache? by Neophytus · 2003-07-13 22:32 · Score: 5, Insightful

I was thinking the same thing. I can't recall seeing a NYT article linked from here with the google cache banner across the top; what I do see a lot are the partner links. Google already provides for register-only news sites (financial times?) by putting a [reg only] tag beside the article. Why the NYT has chosen not to use this up until now is a tad strange, and it looks like someone has picked up the wrong end of the stick.

Shouldn't someone simply tell the NY Times: no reg by StrawberryFrog · 2003-07-13 22:35 · Score: 4, Insightful

Brand recognition is not always a good thing. When I think NY times I think "that annoying registration website". They are free to do what they want, but it leaves me cold.

--

My Karma: ran over your Dogma
StrawberryFrog

Free registration and the RIAA by mike_mgo · 2003-07-13 22:36 · Score: 5, Insightful

It's articles like this that make me think that the recording and movie industries are right to go after online piracy with everything they've got.

Here we have the NYT, one of the premier news organizations in the world, offering its articles for free on the same day that they are published. Yet a large number of people, of this online community at least, refuses to provide even a minimal amount of information (and no money) so that the newspaper can try to make its online presence profitable.

I think the spam fears are a red herring, I've been registered with the times for over 2 years. I've never gotten spam that I think is traceable from them. I get a daily email of the day's headlines (and with the click of a box I could discontinue this).

Why should the RIAA change its business model to a pennies per song method when there is such a blatant example of the online community refusing to go directly to the source for even free material?

Re:Free registration and the RIAA by swilver · 2003-07-13 23:46 · Score: 2, Insightful

The problem I really have with even a free registration is that it is yet another hoop I have to jump through for content that is also available (albeit in a maybe slightly different form) at other sites which do not have these policies. A thousand other news sites are willing to serve me their news without the registration hoop -- I really don't see why the NYT is any more special. As for their image: I think of them as the News site that is too damn stubborn to drop the registration and just display the articles like all the other sites do.

Registration imho is just silly. Since nobody fills out such registrations with any real information anyway (it gets tiring after the first dozen forms or so), the information is probably so wrong you might as well be anonymous. If you are assuming the information is bogus anyway, why not put a cookie on their machine with a unique number (I have no problem with that (yet), as it doesn't annoy me or cost me any extra time) and use that to track that user's actions? You can find out quite a lot that way (seeing what articles he/she likes, how they navigate the site, approximately where they come from, etc.). This would be MORE information than they are getting from me now, which is none -- I'm sure the other sites are doing this already, just looking at the huge lists of cookies on my machine.
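
A minimal Python sketch of the cookie idea, using only the standard library. The cookie name `visitor_id`, the one-year lifetime and the function name are hypothetical choices for illustration; nothing here describes what the NYT or any other site actually runs.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Optional, Tuple

COOKIE_NAME = "visitor_id"  # hypothetical cookie name, not taken from any real site


def identify_visitor(cookie_header: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (visitor_id, set_cookie_value).

    set_cookie_value is None when the browser already sent an ID;
    otherwise it is the value to send back in a Set-Cookie header.
    """
    cookie = SimpleCookie()
    if cookie_header:
        cookie.load(cookie_header)
    if COOKIE_NAME in cookie:
        # Returning visitor: reuse the number we handed out earlier.
        return cookie[COOKIE_NAME].value, None

    # First visit: mint a random, otherwise meaningless identifier.
    visitor_id = uuid.uuid4().hex
    fresh = SimpleCookie()
    fresh[COOKIE_NAME] = visitor_id
    fresh[COOKIE_NAME]["path"] = "/"
    fresh[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365  # assumed one-year lifetime
    return visitor_id, fresh[COOKIE_NAME].OutputString()
```

The server would log the returned ID next to each page request. That gives the navigation data described above without a registration form, at the price of losing the visitor whenever the cookie is cleared.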

--Swilver
Re:Free registration and the RIAA by BigBadBri · 2003-07-14 02:00 · Score: 2, Insightful

Why should the NYT make a profit from its online presence?
By posting their stories online, they are able to attract paid advertising, gain public recognition for their dead-tree product, garner goodwill (intangible, but still added in whenever a business is valued) and generally build a brand.
My point is that the online NYT should be regarded as a marketing expense, not as a moneyspinner. We've all seen the grandiose dreams and foolish business plans of the dot-commers fade to dust, so perhaps it's time to reevaluate what an online presence for a newspaper actually amounts to.
Registration is a pain, and it does stop people from reading NYT online content and being exposed to the advertising embedded in that content.
What would be interesting would be an analysis of the registration details so far provided, split into 'valid' and 'ludicrous' categories. This might give a measure of the true value of registration, which I am sure is lower than the NYT believe it to be.
Maybe your query about the RIAA's business model is a valid one - I would prefer to pay a fair price for CDs and see Robbie Williams et al less enriched. Without the CD price cartels, online copyright infringement would be much less significant, and more importantly, lower priced CDs would be more attractive as items of discretionary spending in times of economic anxiety, making the dramatic collapses in sales that we have seen much less likely to occur in future.

--
oh brave new world, that has such people in it!

Tech savvy at CNet? by nacturation · 2003-07-13 22:37 · Score: 3, Insightful

From the article:

Practically speaking, Web sites can "opt out," or include code in their pages that bars Google from caching the page. A tag to exclude "robots" such as "www.nytimes.com/robots.txt" or "NOARCHIVE" typically does the job.

First of all, robots.txt is not a "tag", it's a file. NOARCHIVE is a tag, but it exists within robots.txt, not instead of it as the "or" conjunction would have the unwashed masses believe. Granted, journalists aren't all that tech savvy and are just likely regurgitating a bastardized version based on sketchy notes. But for a supposed tech-oriented site, this kind of reporting is deplorable.
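
For reference, a small Python sketch of what a robots.txt file does, using the standard library's `urllib.robotparser`. The rules and the example.com URLs are made up for illustration. Note that NOARCHIVE is normally written as a per-page robots meta tag (like the `<meta name="ROBOTS" content="NOARCHIVE">` tag quoted elsewhere in this thread) rather than as a line in robots.txt, so it is not shown here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of /archive/, leave everything else open.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /archive/

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("Googlebot", "http://example.com/archive/story.html"))
    print(allowed("Googlebot", "http://example.com/index.html"))
```

A well-behaved crawler performs this check before requesting a page; robots.txt is advisory and nothing in the protocol enforces it.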

--
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.

Re:I like it by nacturation · 2003-07-13 22:42 · Score: 3, Insightful

the facts are a commercial company A (google) are making a profit from unauthorised copying of other peoples content without permission , meaning company B (you) has to spend money (webmaster) or take proactive steps to remove your content from their databases, google are not an ISP or a goverment agency so really they have no buisness in taking without asking other peoples content.

I don't know what planet you're on, but I profit when my site is listed in Google. People spend an inordinate amount of time and money to make sure their site is listed in the best way possible. Are you going to exclude what could possibly be a huge source of revenue for you? But maybe you have some obscure site you don't want anybody to be able to search for. So, given the amount of time it takes to build even the simplest site, is it really that much trouble to upload a robots.txt file with noindex, noarchive, nofollow in it?

--
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.

Re:Might not be all bad... by joeykiller · 2003-07-13 22:42 · Score: 2, Insightful

As the poster mentioned, Google already has a way to opt out of caching

Yes, Google has this. But for a couple of years I've had the opinion that it actually should be the reverse: Sites should be able to opt in, not out. The default should be no cached versions.

Lately there have been discussions here on Slashdot about fair use: about 30-second clips of music on the net, and thumbnails of images, being fair use. I can agree that that's fair use of content.

But think about Google's cache: A page in Google's cache isn't a part of - or a summary of content - but it is the entire content of a page. If this isn't breach of copyright, I don't know what is.

Google's cache gives more food for thought as well: Let's say I wrote something about someone on my web site, and this person sued me. A judge decides against me and gives me a fine, and orders me to remove the content. But even if I do so, the inflammatory words would still be accessible through Google's cache.

Now, some of you may argue that I could just write Google and ask them to remove the page. But the point is that if this is legal, just about anyone can cache my site. If enough search engines cache content, I most probably would never be able to find every site that provided cached versions of my site.

I'm not sure as to what constitutes fair use of content in the US, but in my country at least (Norway) I'm almost certain that Google's cache mechanism would be judged a breach of copyright laws.

It is all ignored anyway by Anonymous Coward · 2003-07-13 22:59 · Score: 1, Insightful

"And you wonder why you get ads that have absolutely no interest for you? And why advertisers have to shout louder and louder to get through a mass of untargeted ads?"

What ads? I ignore or block such ads out of principle. Maybe if they provided ads for something worthwhile (instead of "shock the monkey" deceptive scam links), they would not be ignored. Maybe instead of shouting gibberish louder and louder, they should provide good ads for worthwhile products and services.

Re:Yes. by gibodean · 2003-07-13 23:23 · Score: 2, Insightful

Actually, from the text of the article, they say that they want it so that when you click on a link in google, you get the registration page of the NYT.

A robots.txt would stop google from indexing the site altogether. They don't want that to happen. They want a google search to show NYT web pages, but they just want to make sure that when the user tries to view it, they have to register with NYT first. That means that google must still index the page, but not allow access through the cache. Plus, it must direct to a sign-on page rather than the page itself, but that is something that I'm sure the NYT itself could handle, like I think it does now anyway.

That's not what they want. by twitter · 2003-07-13 23:36 · Score: 4, Insightful

Sure, that robots.txt should keep robots out of the entire NYT site. That's not how Google works, though. Google gets their rankings for the NYT from other sites that point to the NYT. I imagine they only archive a page when it reaches sufficient rank. This way, Google would never have to crawl through the NYT site. We can be sure that Google would be happy to drop NYT points and caches if they were asked to do that.

The New York Times wants Google to continue ranking their stories but they want Google to do them the special favor of only pointing to their registration page:

"We are working with Google to fix that problem--we're going to close it so when you click on a link it will take you to a registration page," said Christine Mohan, a spokeswoman at New York Times Digital,

If I were Google, I'd tell them such advertising services would cost them a great deal of money. That or simply drop the New York Times right into the bit bucket. It will cost Google programming time to make it happen and computing time to keep it going. If every site on the web required this kind of custom treatment, Google's task would be much more difficult and it might be easier for them to drop it.

Dropping the NYT from Google is fine by me. People who don't understand the implications of digital publishing don't deserve readership. If they won't let librarians make digital copies, libraries should drop them too. What's next, the New York Times sends cease and desist orders to everyone who runs a proxy? It's like the NYT is trying to make their digital publication harder to share than their paper one was. A paper copy can be shared by an entire office and that's what a proxy does. A paper copy can be indexed and archived by a librarian, and Google did not even do that much. One day the paper version won't be available. If librarians can't keep their own copies of the digital version for verification, the publication will have no credibility. If the New York Times wants to continue charging advertisers for eyeballs, they had better remember that their credibility is based in part on widespread availability.

--

Friends don't help friends install M$ junk.

Our basic copyright assumptions are wrong by putaro · 2003-07-13 23:46 · Score: 4, Insightful

The technology has changed the way that things work but the law has not kept up with it. To start with, we continue to talk about "copyright". Controlling copying of information makes sense when the distribution mechanism is trucks moving bales of paper around. Once you start sending bits around, everything is copied. From the article:

And technically, any time a Web surfer visits a site, that visit could be interpreted as a copyright violation, because the page is temporarily cached in the user's computer memory.

When you have the newspaper delivered to your door, the content basically comes for free (the cost of a newspaper doesn't pay for much more than printing and handling). However, you get to keep the content as long as you like, chop it into bits and what not. Libraries have archives of newspapers going back years and you get to see them for free. What's the right mechanism as we move forward? The "pay per view" model that content providers want to shove down our throats courtesy of the DMCA is not pretty and when it starts to affect the average Joe I suspect it will be booed out of favor pretty quickly. But what is the right mechanism to make sure content providers get paid something and that we, the citizens, get something for our money?

Re:Anyone above this post hasn't read the article. by Anonymous Brave Guy · 2003-07-13 23:58 · Score: 2, Insightful

At any rate, cache-ing is an important force on the internet, ...

It is? In this sense? We managed without it being mainstream quite happily until a year or two back.

...and isn't one that should be limited in any legal way, including litigation.

In your opinion. Others have different opinions. We have a legal system to resolve differences of opinion. Go figure. :-)

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Surely just to increase exposure by Mostly a lurker · 2003-07-14 00:00 · Score: 4, Insightful

As many others have emphasised, it is easy to turn off the Google cache for whatever pages you wish. But, in the case of the NYT, there is a further factor. They must have special code within their system to recognise the google spider and allow it access without registration. Either that, or there is some other prior agreement allowing access. Given that, they can scarcely claim extra work to support Google. I believe the whole thing is mainly to get some free publicity for their site. I suppose the other possibility is that they want the page accessible from Google News but not the regular search engine cache.
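
The "special code" guessed at above could be as small as a User-Agent check. A rough Python sketch, under loud assumptions: the token list, the function name and the login-cookie flag are all hypothetical, and nothing in the thread confirms how the NYT actually lets the spider in.

```python
# Assumed crawler tokens; a real site would maintain its own list.
CRAWLER_TOKENS = ("googlebot",)


def needs_registration(user_agent: str, has_login_cookie: bool) -> bool:
    """Decide whether to send this request to the registration page."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in CRAWLER_TOKENS):
        # Recognised crawler: serve the article so it can be indexed.
        return False
    # Everyone else gets the article only if they have already logged in.
    return not has_login_cookie
```

The User-Agent header is trivially spoofed, so a check like this also lets in anyone who claims to be the spider. A stricter setup would have to verify the requesting address as well.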

No pity for the NYT... by qtp · 2003-07-14 00:01 · Score: 5, Insightful

The NYT needs to call off the lawyers and seriously think about how they brought this on themselves.

There are so many models for running a news site that avoid this problem (Salon) that calling out the lawyers is just childish and inappropriate. If a site wants to be indexed by a search engine, then they should be aware of what that means, and if they don't like how a particular search engine functions, then they should take measures to change their own site to prevent what they don't want indexed, or cached, from being accessed.

I know that finding pages on google that I cannot access would be infuriating, and I hope that Google realizes that many of their users would agree.

--
Read, L

The Web and the Internet - Sad, sad, sad by miu · 2003-07-14 01:11 · Score: 4, Insightful

I had to laugh seeing this little gem attached to the story:

Special Report
The Google gods
Does the search engine's power threaten the Web's independence?

The Web's independence? The fucking web is a sad little microcosm of the real world. Google is one of the few reasons I can still stand the web, and silly statements like "Google is making copies of all the Web sites they index and they're not asking permission" are the reason the web sucks so bad. When everyone is deathly afraid of being sued or prosecuted for something it's no wonder that the web is such a clown town of worthless crap.

--

[Set Cain on fire and steal his lute.]

"My eyes are open." by StarFace · 2003-07-14 02:17 · Score: 2, Insightful

Or, you can just step out of the consumer-corporation mind jerk entirely and live your life the way you wish to live it, and not the way the banner/side-of-bus/television/et cetera tells you how to live it. Me, I live without all of these things and I seem to be doing just fine, quite happy, actually. And I could care less about most of these companies you are referring to; the ones that I do care about, they get my financial support in return for services, with or without their million dollar advertising campaigns (which I never see, anyway.)

So which is the real real world? The one where you spend the afternoon on your porch reading a book to your mate, or the one where you sit in front of a television and "reap the rewards" of advertising, so you can buy more stuff, presumably?

I am not saying my world is universally better than your world, but it is just as real.

--
V

Easy way to solve this problem... by ninejaguar · 2003-07-14 02:29 · Score: 3, Insightful

Every time a cached link is clicked, pay sites like the New York Times can receive notice from Google (easy to automate this) that one of their pages (which is cached in Google) has been accessed, and all advertisements in the cache have been displayed (Google caches Ads in the page as well as the contents). This allows the website to "offload" traffic and at the same time keeping the books on the number of times their Ads have been viewed so that they can send the accounting record to their paid Advertisers.

Google would find this very simple to implement, and paid sites would find this very beneficial (borrowing Google's enormous bandwidth and server capabilities for free) and at the same time it should solve most of their concerns. After all, Google's cache isn't sufficient for proper access to ALL the paid content at the New York Times, as the cache is temporary in nature. Also, it's too spotty in coverage to be considered reliable enough for really digging into a paid site's entire content.

Using Google like this is akin to using Google as a window into the pay-site's house of content. You can see part of a room, but not the whole interior. Now, every time someone peeks, the House gets notified and can get paid for it. The more windows Google adds to the House, the more chances the House gets paid.

Re:actually... by babbage · 2003-07-14 02:31 · Score: 2, Insightful

ADX is the ad server used by NYTimes.com, it has nothing to do with page content.

If what you're posting comes from an article page's <head> section, you seem to be pasting more than you intended. Directives to ban archiving of ads aren't an editorial issue, but a business decision -- cached ads screw up the bookkeeping and, by extension, the bottom line on the balance sheet.

The practice of restricting caching of ad content is, presumably, common across the industry -- it's not just NYT that has an interest in forcing this.

The (apparent) <meta name="ROBOTS" content="NOARCHIVE"> tag you cite should be wholly separate from the ad server code.
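
To make the tag concrete, here is a short Python sketch of how a crawler might read that directive from a page before deciding whether to keep a cached copy. It uses the standard `html.parser` module; the class and function names are made up for illustration, and this is a guess at the general technique, not Google's actual code.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the lowercased robots directives declared in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

A crawler that honours the tag would index the page as usual but skip storing a cached copy whenever `"noarchive"` is in the returned set.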

(Signed, a former employee of NYT digital...)

--
DO NOT LEAVE IT IS NOT REAL

Come on, it's not a SPAM question... Get real. by jbottero · 2003-07-14 03:05 · Score: 1, Insightful

But you know, you DON'T have to give a real name or email for NYT or JPost, or most of the others, they don't send you your pass and uid, you know.

It's not the spam that's the problem, if you use your head, you get no spam. It's the hassle of logging on

Quit your fucking whining. by Wakko Warner · 2003-07-14 03:57 · Score: 1, Insightful

I have an NYT account. Do I care if they know what I read on their site? About as much as I care when the next "American Idol" rerun is on (which is to say, not at all.) Why on earth are you fuckers so paranoid about this? I see absolutely nothing wrong with tracking as long as it's limited to the originating site.

Get over it, for God's sake.

- A.P.

--
"Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"

Get the lawyers -- by jtalkington · 2003-07-14 04:24 · Score: 2, Insightful

people are using technology for what it was intended to do!

At the heart of Google's caching dilemma lies a thorny legal problem involving a core Web technology: When is it acceptable to copy someone else's Web page, even temporarily?

When your server and pages say it's alright (or don't say that it's not alright.) The standards for the web are very clear on this, but non techie companies (and some judges) don't seem to get this.

This reminds me of the issues of "deep linking" that everybody was suing over a couple of years ago. That's exactly what the web was designed to do, but these johnny-come-lately companies put sites up, and expect people to stop using the technology for what it was designed for.

If only the EFF was as well funded as the ACLU...

Hey NY Times! I don't want to register! by mnemotronic · 2003-07-14 05:26 · Score: 3, Insightful

"Shouldn't the NY Times simply tell Google not to cache their site?"

How about if the Times got over their registration fetish?

From the Times Subscriber Agreement:

You may not ... in any way exploit, any of the Content or the Service (including software) in whole or in part.

What is meant by "exploit"???

From the "Forums and Discussions" section:

You shall not upload to, or distribute or otherwise publish on the message boards (the "Forums") any ... abusive ... material.

What is meant by "abusive"???

And how about this:

3.5 You acknowledge that any submissions you make to the Service (e.g. Letter to the Editor, Review or Commentary) may be edited, removed, modified, published, transmitted, and displayed by NYTD and you waive any moral rights you may have in having the material altered or changed in a manner not agreeable to you.

Interpretation: The user/poster is entirely responsible for the content of their post, which the Times may alter in any way. Yikes!!! Granted, this applies only to content submitted to the Times, but the wording seems pretty scary.

--
The Russians have won. They have made the world a cesspool of distrust, greed, fear and hate.

It shouldn't be up to google.... by dentar · 2003-07-14 05:36 · Score: 2, Insightful

..to censor their cache. Those that don't want their content cached should fix their web servers and firewalls first. My web site prohibits known web crawler bots, and google doesn't cache it. No problem! I didn't have to harass google about it and they don't have to break their own promise to not be evil.

--
-- I am. Therefore, I think!

Re:Google's cache copy - the larger issue by ahhhmytoes · 2003-07-14 06:23 · Score: 2, Insightful

On some file types, such as .txt files, there's no place to insert a "noarchive" and Google goes ahead and caches it anyway.

Try the Pragma: no-cache and/or Cache-control HTTP headers.
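
A minimal sketch of sending those headers with a plain-text file, written as a Python WSGI app using only the standard library. The body text and port are placeholders. The headers are defined for HTTP caches such as browsers and proxies; whether Google's crawler honours them for its own cached copies is an assumption the thread does not confirm.

```python
from wsgiref.simple_server import make_server

# Headers asking HTTP caches not to store or reuse the response.
NO_CACHE_HEADERS = [
    ("Cache-Control", "no-cache, no-store, must-revalidate"),  # HTTP/1.1
    ("Pragma", "no-cache"),                                    # HTTP/1.0 fallback
]


def app(environ, start_response):
    """Serve a text file, which has no place for a meta tag, with no-cache headers."""
    body = b"plain text that carries no meta tags\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ] + NO_CACHE_HEADERS
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    # Placeholder address and port for local testing only.
    with make_server("127.0.0.1", 8000, app) as server:
        server.serve_forever()
```

On a production web server the same effect usually comes from a configuration directive rather than application code.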

Re:Free registration..some implications by bobbozzo · 2003-07-14 10:59 · Score: 2, Insightful

Yeah, I always like to try abuse@domain for sites that require registration. Kinda mean to the postmaster, but if I "opt-out" and they still send something then they're spammers anyways.

--
Nothing to see here; Move along.

Slashdot Mirror
