NY Times To Data-Mine Its Visitors
pilsner.urquell points out a story in the Village Voice from a stockholders' meeting at the New York Times. It seems that the media giant is now eager to data-mine visitors to its Web properties. Of course anybody with a site who profits from advertising is likely to be doing something of the sort. It's just a bit surprising that the Times would use the words "data mining" out loud in public. From the article: "Barely a year after their reporters won a Pulitzer prize for exposing data mining of ordinary citizens by a government spy agency, New York Times officials had some exciting news for stockholders last week: The Times company plans to do its own data mining of ordinary citizens, in the name of online profits... [T]he problem with reading papers electronically is that they can also read you."
[T]he problem with reading papers electronically is that they can also read you.
So, how are we supposed to make Soviet Union jokes after this??
OR some other similar service. When are sites going to learn that we CAN protect our privacy if they force us to? You catch more flies with honey...
Nothing great was ever achieved without enthusiasm
"[T]he problem with reading papers electronically is that they can also read you."
Wow, a Soviet Russia joke directly in the summary!
The Tao of math: The numbers you can count are not the real numbers.
I'm not sure why there is such a concern over data mining. As long as the mining is done from public sources, I see no problem. If the mining is from medical records, government records that are sealed or presumed to be private, or some other protected database, then it becomes an issue.
Use your head, can't you, use your head,
You're on earth, there's no cure for that - S. Beckett
In Soviet America, the paper reads you!
I have a login for the NYT. According to the information I provided, I'm a female born in 1901, living in ZIP code 90210.
(For the record, at least one of those data points is incorrect).
Slightly disreputable, albeit gregarious
That's fine, I pretend to be the Googlebot... Thus getting in without having to register. When I have to register, I of course fake my information.
Doesn't everyone do that?
I wank in the shower.
Almost all websites do it!
This is one reason why cookies are used, and why almost all browsers provide mechanisms to filter them out!
Maybe Computers will never be as intelligent as Humans.
For sure they won't ever become so stupid. [VR-1988]
You mean like: "In Soviet Russia newspapers read you!!"
[Insert pithy quote here]
I see only two such posts in the discussion so far and this one is older by three minutes. It is also the better of the two. I'm surprised there are only two...
This is one of the interesting catches of life online. In order to purchase full access (as opposed to open registration) to content on NYT online, you must supply financial data in the form of a credit card. (PayPal not accepted.) This means that NYT Online is able to match your browsing habits to all of the financial data on file for you.
Although PayPal does provide some anonymity, it only officially guarantees goods sent to a real world address, thus losing full anonymity for purchases. Purchased credit cards also require personal data.
If you really want to prevent data mining of your personal habits, pay cash, in person. For now, there is no true anonymity online.
There is nothing nefarious or evil about the phrase "data mining". It's just trolls who try to attach some sort of negative connotation to it.
I don't need no instructions to know how to rock!!!!
1. Reveal data mining 2. Win Pulitzer prize 3. Start data mining 4. ??? 5. Profit!
If girls liked guys that were interested in them for their brains, they'd date zombies.
Pretty much every site does data mining- I'm sure /. keeps track of how many people click on ads, read the article (only 2 so far), etc. /. probably even ties all this information to your account, so they have a better idea of what ads to display. I don't even have a problem with any of that. Once they start selling my information to other people is where I have a problem. I don't mind /. targeting me with ads, but I do mind my email address being targeted with spam.
You are reading a copy of my copyrighted post.
I can't believe the NYT still requires people to make up random personal data to read their newspaper. Seriously, has anyone here ever given out real information when registering an account with the NYT? Even without services like bugmenot, the information they'll get from datamining their visitors will be too full of noise to be of any use.
I just read an article in The Economist, which was mostly about Murdoch trying to buy Dow Jones, which owns the Wall Street Journal. But The Economist implied that the same could happen to the NYT, because that newspaper is also badly run. I can believe that. While I enjoy reading their newspaper if I can get my hands on a copy, they still have a very poor understanding of how the internet works.
Even if you disable cookies, it's trivial to pass a session ID through the URL to maintain a user's authenticated session. They'd still be able to determine which article you were reading and provide 'similar' links, etc. Not to mention that most cookies are used to track and maintain user logins and server sessions, not to data mine... The NYT is saying explicitly that they're going "to determine hidden patterns of uses to our website" using data mining. This isn't about cookies; it's about the tracking and monitoring of browsing habits.
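For anyone wondering what "passing a session id through the url" looks like in practice, here is a minimal sketch in Python. The parameter name `sid` and the example.com URLs are hypothetical, chosen only for illustration; nothing here is taken from the NYT's actual site.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameter name; real sites pick their own.
SESSION_PARAM = "sid"


def new_session_id():
    """Generate a random, hard-to-guess session identifier."""
    return secrets.token_hex(16)


def add_session_id(url, session_id):
    """Return `url` with the session id carried in its query string."""
    parts = urlsplit(url)
    # Drop any stale session id, keep every other parameter untouched.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != SESSION_PARAM]
    query.append((SESSION_PARAM, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_session_id(url):
    """Recover the session id from a URL, or None if it carries none."""
    for key, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if key == SESSION_PARAM:
            return value
    return None
```

A server that rewrites every link on a page with something like `add_session_id` can follow a visitor from article to article with cookies switched off, which is the point being made above.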
Between the government, with its vast powers, and a commercial endeavor making a buck from readers reading at no cost. Particularly since opting out of the NYT's data mining is as easy as not visiting the site, while staying out of the government's data warehouse would probably take something like being unborn.
They don't let you take your guns onto the website.
Pussy ass liberals.
Almost everyone uses cookies. And on the NYT website I don't see such IDs in the URLs!
Maybe Computers will never be as intelligent as Humans.
For sure they won't ever become so stupid. [VR-1988]
FROM: IT Data Mining Project
TO: Marketing
RE: VIP!
Just a quick preliminary result that is too important to wait for the official report.
One of our readers, 'Anonymous7' is virtually a demographic by him/herself.
Uses the internet from all over the world, on thousands of machines, reading our paper hundreds of times a day!
Surely this person must have a major impact on data processing purchases worldwide.
Surprising though, the person seems naive, computer security wise, because their password is the same as their user name.
Leverage our connection to this VIP with our advertisers at once!
-- 3 events that reshaped the world in the 20th century: WW1, WW2, and WWW
In Soviet Russia, papers read you? Na
Really? It'd be nice if you read the article and checked your 'facts'. The article states that NYT is GOING to be data mining, not that they've begun already. And just to check on the session handling I just disabled cookies and tried to view an article on the site, and sure enough the query string contains OQ=_rQ3D5Q26hpQ26orefQ3DsloginQ26orefQ3DsloginQ26orefQ3DsloginQ26orefQ3Dslogin&REFUSE_COOKIE_ERROR=SHOW_ERROR
That is indicative of a requirement of cookies (REFUSE_COOKIE_ERROR=SHOW_ERROR) for the use of session handling, and '_rQ3D5Q26hpQ26orefQ3DsloginQ26orefQ3DsloginQ26orefQ3DsloginQ26orefQ3Dslogin' appears to be a session hash to tell them where you came from, and where you're going.
Just because the url doesn't have a session_id key in it doesn't mean there's no session or server data being passed through the URL.
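As a rough illustration of that last point, the query string quoted above can be split into its parameters with Python's standard library. This is only a sketch: the string is copied from the comment with the stray spaces removed (assumed to be line-wrap artifacts), and what the server actually does with each value is not something the code can tell you.

```python
from urllib.parse import parse_qsl

# Query string as quoted in the comment above, with stray spaces removed
# (assumption: the spaces were introduced by line wrapping in the post).
QUERY = (
    "OQ=_rQ3D5Q26hpQ26orefQ3DsloginQ26orefQ3DsloginQ26orefQ3DsloginQ26orefQ3Dslogin"
    "&REFUSE_COOKIE_ERROR=SHOW_ERROR"
)


def state_parameters(query):
    """Split a raw query string into a dict of parameter name -> value."""
    return dict(parse_qsl(query, keep_blank_values=True))


params = state_parameters(QUERY)

# No parameter is literally named "session_id", yet the URL still
# carries state from one request to the next.
for name, value in params.items():
    print(f"{name} = {value}")
```

The listing shows two parameters, `OQ` and `REFUSE_COOKIE_ERROR`, neither of which is called a session ID, so looking for that name alone is not a reliable test.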
If they said, "We'll be tailoring our site to the visitors' interests, thereby enhancing their experience", no one would care, but once they say "data-mining", suddenly everyone is screaming "OMFG, the NYT is like the NSA! WTF? Remember the constitution dude!"
Life needs more saving throws.
Dear Former Subscriber:
Hey -- Mister "I Don't Need The Newspaper Any More, I've Got A Computer" -- remember us? Yep, it's your old pal, the New York Times. The one that you used to welcome to your house every morning before you bought a goddamn modem from goddamn Best Buy. The one you threw in the trash after you got your goddamn flashy high speed connection. Does that refresh your memory?
Well guess what. We still have something that you can't get at your precious internet: investigative reporting. Did you know our reporters recently obtained a secret database containing a list of over 20,000 clients of the notorious "Quint State Madame" Destinee Hills?
And did you know what phone number we found on that list? 746-555-7314. Does it ring any bells? It should. It's your cell phone.
So may I suggest renewing with the New York Times today? For the low, low price of $499.95 per month you'll get money saving coupons and our promise that this information doesn't fall into the wrong hands.
Yours in News,
Muck Raker, Acting Publisher
668: Neighbour of the Beast
Recently there was this big debate on Slashdot about Google's purchase of DoubleClick. Why would you care if a company tracks your usage patterns, without attaching them to your personal identity, and delivers targeted advertisements? There is no free lunch. You are paying for the free content by selling your usage patterns. They don't want to do it any other way. You can take it or leave it. Perhaps at some point in the future there will be ad-free, subscription-based content. I doubt it, though.
I run a company and I face the same problem: how to reach the set of people who are most likely to be my customers. The more successfully I can do that, the lower my marketing cost will be, and the cheaper the product will be in the long run. Ultimately, if we have a system where each person sees only those ads that he needs to see, we would have a highly efficient marketing system with the lowest marketing costs. A reasonably big percentage of the cost of most products you buy is marketing costs. So if you would like them to be cheaper, stop complaining and start selling your usage data.
There is only one issue here: privacy advocates have to ensure that there is no real breach of privacy in the process. If googlebot sees the mails I see, there is no problem, but if googlebot reads my mail, checks it against some preset filter and requests Mr X to take a look at my mail, then it is a breach of privacy. As long as the identity is kept separate from the patterns, there shouldn't be any problem.
"Be the change you wish to see in the world" - M. K. Gandhi
Doesn't seem too bad, but if it gets ugly I guess we will only start to use browser side scripting to create some random behavior in the background while we are reading. Some random noise would easily drown out our real browsing behavior (oh, he opened all these 10 articles at the same time, wonder which one he actually read?).
If you do this, then you only need to change some part of the string to a random value, then hit enter.
When the page refreshes, then click on the link you want to read.
Wash, rinse, repeat.
Sure they are tracking something, but it will not be you.
There are lots of ways to monkey with this sort of thing.
- - - - - - - - - - -
I am a programmer. I am paid to produce syntax not grammar. Deal with it.
For sites like the NY Times, I use BugMeNot.com and use someone else's login. After all, isn't recycling a perfectly good login better than getting a new one?
In other news, Microsoft Windows users are now covered under the Americans with Disabilities Act...
Definitely true. Just trying to make sure that people realize that simply disabling cookies makes you untrackable, and in many cases cookies are required.
Edit: The above comment should read: "simply disabling cookies *DOES NOT* make you untrackable"
Sweet reporting justice for SWIFT.......
I told them everything they need to know about myself:
My name is Willie Horton.
I was born 12 August 1951.
My address is:
Maryland House of Corrections Annex
PO Box 534
Jessup, MD 20794-0534
E-mail: willie.horton@mail.com
They may even realize that I'm the same Willie Horton whose image was used to defeat Mike Dukakis in the 1988 Presidential election... but I guess that's the price of fame.
Being British, 90210 was for many years the only valid ZIP code I could reliably remember.
For UK sites, I would use MI6's Vauxhall Cross postal address and telephone number.
Just another reason to use Bug Me Not.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
I guess there is no need to point out the hypocrisy on this one. I'm not going to hold my breath to wait for the liberal left's excuses as to why it's okay for "US", but not "THEM". This is why I'm so sick of politics.
There are 10 types of people in the world: Those that know Binary and those who don't.
Just don't read the New York Times, or use a secure web browser that doesn't leave any cookies or electronic paper trails. Simple as that.
Correction: they announced to their stockholders that they planned to lose less money. The NYT has been bleeding red ink for years. This move on their part reeks of desperation to me.
Just use one of these bugmenot accounts.
Bottom line would seem to be that if it were not for techs doing this, the marketers would never be able to do this kind of spying.
So let's put a lot of the blame for these unscrupulous actions right on our heads, the techs who both make it possible and do the actual spying. If we didn't tell marketing that we could do this for them and we didn't do the actual mining work, this would not be as likely to happen.
Additionally, why do we, tech-savvy critters, even add the potentially invasive features to browsers in the first place??
It's just plain as day that the "features" we see being integrated lead to both more invasive tracking and also to more system security problems. Do we really need a PDF reader to activate programs and execute embedded code? Certainly not. But that is part of what we are getting these days.
So why not recruit the tech world to stop the creation of such invasive technology uses? If we don't code it for the marketers, they certainly can't do it themselves. Accept responsibility and refuse to write bad, uncalled-for invasive code!!
Stand up, take responsibility, write responsible code....
I use the NYTimes.com login for Ken, one of my best friends ever. Not only were his details accurate (Ken was a marketing guy), but his username and password were the same for like... everything.
Ken also passed away over six years ago. His login still works. Someday I'd like to see what happens when they finally put marketing data together for him.
"Registration Required" pages displayed: 4,345,543,938
Other pages displayed: 1,117
This is what I overheard at that meeting: "Who knew that dead guys clicked on more Toyota SUV ads than anything else? Let's try posting some of those above-the-urinal ads in the casket lids, see if we can get the count up for the Subaru market, too."
John