NY Times To Data-Mine Its Visitors
pilsner.urquell points out a story in the Village Voice from a stockholders' meeting at the New York Times. It seems that the media giant is now eager to data-mine visitors to its Web properties. Of course anybody with a site who profits from advertising is likely to be doing something of the sort. It's just a bit surprising that the Times would use the words "data mining" out loud in public. From the article: "Barely a year after their reporters won a Pulitzer prize for exposing data mining of ordinary citizens by a government spy agency, New York Times officials had some exciting news for stockholders last week: The Times company plans to do its own data mining of ordinary citizens, in the name of online profits... [T]he problem with reading papers electronically is that they can also read you."
[T]he problem with reading papers electronically is that they can also read you.
So, how are we supposed to make Soviet Union jokes after this??
OR some other similar service. When are sites going to learn that we CAN protect our privacy if they force us to? You catch more flies with honey...
Nothing great was ever achieved without enthusiasm
"[T]he problem with reading papers electronically is that they can also read you."
Wow, a Soviet Russia joke directly in the summary!
The Tao of math: The numbers you can count are not the real numbers.
I'm not sure why there is such a concern over data mining. As long as the mining is done from public sources, I see no problem. If the mining is from medical records, government records that are sealed or presumed to be private, or some other protected database, then it becomes an issue.
Use your head, can't you, use your head,
You're on earth, there's no cure for that - S. Beckett
I have a login for the NYT. According to the information I provided, I'm a female born in 1901, living in ZIP code 90210.
(For the record, at least one of those data points is incorrect).
Slightly disreputable, albeit gregarious
Almost all websites do it!
This is one reason why cookies are used, and why almost all browsers provide mechanisms to filter them out!
Maybe Computers will never be as intelligent as Humans.
For sure they won't ever become so stupid. [VR-1988]
1. Reveal data mining 2. Win Pulitzer prize 3. Start data mining 4. ??? 5. Profit!
If girls liked guys that were interested in them for their brains, they'd date zombies.
Pretty much every site does data mining- I'm sure /. keeps track of how many people click on ads, read the article (only 2 so far), etc. /. probably even ties all this information to your account, so they have a better idea of what ads to display. I don't even have a problem with any of that. Once they start selling my information to other people is where I have a problem. I don't mind /. targeting me with ads, but I do mind my email address being targeted with spam.
You are reading a copy of my copyrighted post.
Even if you disable cookies, it's trivial to pass a session id through the URL to maintain a user's authenticated session. They'd still be able to determine which article you were reading and provide 'similar' links etc. Not to mention that most cookies are used to track and maintain user logins and server sessions, not to data mine... NYT is saying that they're explicitly going "to determine hidden patterns of uses to our website." using data mining. This isn't about cookies, it's about the tracking and monitoring of browsing habits.
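A minimal sketch of the cookie-free session idea, assuming the session id rides in a query parameter. The parameter name `sid`, the function names, and the example.com URLs are made up for illustration; nothing here is taken from the NYT's actual site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def add_session_id(url: str, session_id: str, param: str = "sid") -> str:
    """Return url with the session id carried in the query string (hypothetical 'sid' parameter)."""
    parts = urlsplit(url)
    # Drop any existing copy of the parameter so it is not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != param]
    query.append((param, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_session_id(url: str, param: str = "sid"):
    """Server side: recover the session id from a requested URL, or None if absent."""
    for key, value in parse_qsl(urlsplit(url).query):
        if key == param:
            return value
    return None


if __name__ == "__main__":
    link = add_session_id("https://example.com/article?page=2", "abc123")
    print(link)
    print(read_session_id(link))
```

If every link on a page is rewritten this way, each click hands the same id back to the server, so the articles a visitor reads can be tied together with no cookie involved.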
If they said, "We'll be tailoring our site to the visitors' interests, thereby enhancing their experience", no one would care, but once they say "data-mining", suddenly everyone is screaming "OMFG, the NYT is like the NSA! WTF? Remember the constitution dude!"
Life needs more saving throws.
I can assure you that "average" people *do* give out accurate information; when I tell my relatives that I generally just give random info, they tend to be shocked and say "But, but, that would be LYING".
Recently there was this big debate on Slashdot about Google's purchase of DoubleClick. Why would you care if a company tracks your usage patterns, without attaching them to your personal identity, and delivers targeted advertisements? There is no free lunch. You are paying for the free content by selling your usage patterns. They don't want to do it any other way. You can take it or leave it. Perhaps at some point in the future there will be ad-free, subscription-based content. I doubt it, though.
I run a company and I face the same problem: how to reach the set of people who are most likely to be my customers. The more successfully I can do that, the lower my marketing cost will be, and the cheaper the product will be in the long run. Ultimately, if we have a system where each person sees only those ads that he needs to see, we would have a highly efficient marketing system with the lowest marketing costs. A reasonably big percentage of the cost of most products you buy is marketing cost. So if you would like them to be cheaper, stop complaining and start selling your usage data.
There is only one issue here: privacy advocates have to ensure that there is no real breach of privacy in the process. If Googlebot sees the mails I see, there is no problem, but if Googlebot reads my mail, checks it against some preset filter, and requests Mr X to take a look at my mail, then it is a breach of privacy. As long as the identity is kept separate from the patterns, there shouldn't be any problem.
"Be the change you wish to see in the world" - M. K. Gandhi
If you do this, then you only need to change some part of the string to a random value, then hit enter.
When the page refreshes, then click on the link you want to read.
Wash, rinse, repeat.
Sure they are tracking something, but it will not be you.
There are lots of ways to monkey with this sort of thing.
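One way to picture the trick above, as a rough sketch only: it assumes "the string" is a session id carried in a URL query parameter, which the post does not actually say. The parameter name `sid`, the function name, and the example.com URL are hypothetical.

```python
import secrets
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def randomize_param(url: str, param: str = "sid") -> str:
    """Replace the value of one query parameter (hypothetical 'sid') with random hex."""
    parts = urlsplit(url)
    query = [
        # token_hex(8) yields 16 random hex characters; other parameters are kept as-is.
        (key, secrets.token_hex(8) if key == param else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(randomize_param("https://example.com/article?id=42&sid=abc123"))
```

Whether a site accepts an id it never issued is up to the site; some will simply start a fresh session, which is the effect being described.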
- - - - - - - - - - -
I am a programmer. I am paid to produce syntax not grammar. Deal with it.
Well, he may be right. I don't know the official definition of "data mining" (and really don't care). By itself, it's not necessarily a bad thing. But like GM foods, I want to see a label. For the moment, I assume that everybody is data mining, and will block it where I think it's appropriate. It's really nothing more than typical top-down 19th century business practice and its attempt to stay alive. By making this approach as unworkable and expensive as possible, I would like to show that it is in their interests to look for another way to conduct business that doesn't use personal information. The old methods no longer apply.
What?