Slashdot Mirror


NY Times To Data-Mine Its Visitors

pilsner.urquell points out a story in the Village Voice from a stockholders' meeting at the New York Times. It seems that the media giant is now eager to data-mine visitors to its Web properties. Of course anybody with a site who profits from advertising is likely to be doing something of the sort. It's just a bit surprising that the Times would use the words "data mining" out loud in public. From the article: "Barely a year after their reporters won a Pulitzer prize for exposing data mining of ordinary citizens by a government spy agency, New York Times officials had some exciting news for stockholders last week: The Times company plans to do its own data mining of ordinary citizens, in the name of online profits... [T]he problem with reading papers electronically is that they can also read you."

24 of 98 comments

  1. Obligatory?? by Anonymous Coward · · Score: 5, Funny

    [T]he problem with reading papers electronically is that they can also read you.

    So, how are we supposed to make Soviet Union jokes after this??

    1. Re:Obligatory?? by goombah99 · · Score: 4, Interesting

      Wow, an insightful, pithy first post. Since I assume all commercial sites, especially free ones, are data mining me and selling me out in any way they can, I'm not worried by this. In fact, I think it shows a lot of integrity by the NY Times to announce their intentions ahead of time, as it can only be bad PR.

      --
      Some drink at the fountain of knowledge. Others just gargle.
    2. Re:Obligatory?? by MrNaz · · Score: 2, Funny

      I never knew Nietzsche was Russian!

      --
      I hate printers.
    3. Re:Obligatory?? by Intron · · Score: 3, Funny

      "shows a lot of integrity"

      Except they didn't put it in an announcement to the website visitors, they announced to their stockholders that they planned to make more money.

      Anyway, I think everyone visiting the NYT site from now on should do a search for "elephant porn" and we'll see how that affects their advertising budget.

      --
      Intron: the portion of DNA which expresses nothing useful.
  2. Hello Bug Me Not by fishdan · · Score: 4, Insightful

    OR some other similar service. When are sites going to learn that we CAN protect our privacy if they force us to? You catch more flies with honey...

    --
    Nothing great was ever achieved without enthusiasm
    1. Re:Hello Bug Me Not by NoTheory · · Score: 4, Insightful

      The problem is that this is a functional analysis. Even if they don't have your legitimate contact details, they know what you've been browsing, and if at some point they can attach it to your legitimate contact details, then boom, they've got the whole shebang. This is a privacy-unfriendly move. It makes it more difficult for you to maintain your anonymity. Services like bug-me-not are insufficient because they require you to try out multiple contact details and maintain a list of valid ones (which can be made all the more difficult if the organization is active in closing these accounts).

      Even if you think people should be more privacy conscious, this is a bad move that makes everyone less private. The irony of the situation is really the only thing that makes it notable. Stupid NYT.

      --
      There are lives at stake here!
  3. Papers read you! by maxwell+demon · · Score: 4, Funny

    "[T]he problem with reading papers electronically is that they can also read you."
    Wow, a Soviet Russia joke directly in the summary!

    --
    The Tao of math: The numbers you can count are not the real numbers.
  4. Data Mining and issue? by rodney+dill · · Score: 3, Interesting

    I'm not sure why there is such a concern over data mining. As long as the mining is done from public sources, I see no problem. If the mining is from medical records, government records that are sealed or presumed to be private, or some other protected database, then it becomes an issue.

    --

    Use your head, can't you, use your head,
    You're on earth, there's no cure for that
    - S. Beckett
    1. Re:Data Mining and issue? by superbus1929 · · Score: 4, Insightful

      Where does it stop? Once you get comfortable with data mining, will you also have to get comfortable with more than just your IP attached? Will you be comfortable with someone having a full consumer database of John Doe, instead of just 10.10.10.220? Will you be comfortable with your profile being viewable to everyone that wants it? Will you be comfortable being positively unable to get away from Capitalism even for a second?

      I'm not trying to put on a tin foil hat by any means; if it was just "hey, so many people like Coke over Pepsi!", I'd be cool. But anything further than that, and I view it as a slippery slope.

      --
      Let's stop dilly-dallying and just change "-1: Overrated" to "-1: Disagree" or "-1: Doesn't Subscribe to Groupthink".
    2. Re:Data Mining and issue? by Anonymous Coward · · Score: 2, Informative

      Stealing cookies?

      Web bugs?

      Script injection/invisible framing?

      And the easiest - seeing which site you came from before you hit their server.

      If you think "data mining" is going to stop at "this IP read these pages at this date," you're a sucker.

    3. Re:Data Mining and issue? by noidentity · · Score: 3, Insightful

      There's a fundamental difference between a company doing demographics, and the government spying on citizens. The company doesn't care about any person in particular, just common trends, and simply changes how they design/market their products. At worst, it means they can more effectively sell you junk you don't need. The government's use of data is pretty much the opposite.

    4. Re:Data Mining and issue? by krbvroc1 · · Score: 2, Interesting

      To quote the RIAA, think of it as stealing. Basically, in addition to already viewing advertisements, websites want to steal my Intellectual Property. See, there is a value placed on my data by the market, and that data is being collected and securitized without compensating me and in many cases without my permission.

      Each of us owns the Intellectual Property in our heads. Like the RIAA, we need to stick together and demand either payment or permission for this information.

  5. Garbage In, Garbage Out by LMacG · · Score: 4, Interesting

    I have a login for the NYT. According to the information I provided, I'm a female born in 1901, living in ZIP code 90210.

    (For the record, at least one of those data points is incorrect).

    --
    Slightly disreputable, albeit gregarious
    1. Re:Garbage In, Garbage Out by truthsearch · · Score: 3, Funny

      Gabrielle Carteris, is that you?

  6. So they used a scary phrase. by harks · · Score: 4, Insightful

    Data mining, she told the crowd, would be used "to determine hidden patterns of uses to our website."
    So they used a scary phrase, but there isn't anything nefarious about noticing that people who read articles on subject X might want to see a link to article Y.
  7. No news by VincenzoRomano · · Score: 3, Insightful

    Almost all websites do it!
    This is one reason why cookies are used and why almost all browsers provide mechanisms to filter them out!

    --
    Maybe Computers will never be as intelligent as Humans.
    For sure they won't ever become so stupid. [VR-1988]
  8. oblig. by CrowbarKing · · Score: 2, Insightful

    1. Reveal data mining 2. Win Pulitzer prize 3. Start data mining 4. ??? 5. Profit!

    --
    If girls liked guys that were interested in them for their brains, they'd date zombies.
  9. At least they are being honest by paladinwannabe2 · · Score: 2, Insightful

    Pretty much every site does data mining; I'm sure /. keeps track of how many people click on ads, read the article (only 2 so far), etc. /. probably even ties all this information to your account, so they have a better idea of what ads to display. I don't even have a problem with any of that. Where I have a problem is once they start selling my information to other people. I don't mind /. targeting me with ads, but I do mind my email address being targeted with spam.

    --
    You are reading a copy of my copyrighted post.
  10. Re:No news - Still news... by haibijon · · Score: 2, Informative

    Even if you disable cookies, it's trivial to pass a session ID through the URL to maintain a user's authenticated session. They'd still be able to determine which article you were reading and provide 'similar' links, etc. Not to mention that most cookies are used to track and maintain user logins and server sessions, not to data mine... NYT is saying that they're explicitly going "to determine hidden patterns of uses to our website" using data mining. This isn't about cookies; it's about the tracking and monitoring of browsing habits.
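
    A minimal sketch of the point above, assuming a hypothetical query parameter named `sid` and made-up `news.example` URLs (neither comes from the NYT or any real site). It only illustrates that a server-side log of requested URLs is enough to tie page views together without any cookie:

```python
from collections import defaultdict
from urllib.parse import urlparse, parse_qs


def group_views_by_session(request_urls, param="sid"):
    """Group requested paths by the session id carried in each URL's query string.

    `param` is a hypothetical parameter name chosen for this sketch.
    Requests without the parameter are grouped under the key None.
    """
    views = defaultdict(list)
    for url in request_urls:
        parts = urlparse(url)
        # parse_qs maps each parameter name to a list of values; take the first.
        sid = parse_qs(parts.query).get(param, [None])[0]
        views[sid].append(parts.path)
    return dict(views)


# Made-up request log: no cookies involved, only the URLs the server saw.
log = [
    "https://news.example/politics/story-1?sid=abc123",
    "https://news.example/tech/story-7?sid=abc123",
    "https://news.example/sports/story-3?sid=zzz999",
]
history = group_views_by_session(log)
for sid, paths in history.items():
    print(sid, paths)
```

    Each session id ends up with its own reading history, which is all the "people who read X might want Y" kind of analysis needs as input.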

  11. It's all in the terminology by Grashnak · · Score: 2, Interesting

    If they said, "We'll be tailoring our site to the visitors' interests, thereby enhancing their experience", no one would care, but once they say "data-mining", suddenly everyone is screaming "OMFG, the NYT is like the NSA! WTF? Remember the constitution dude!"

    --
    Life needs more saving throws.
  12. Re:Garbage in garbage out by Jonathan · · Score: 2, Informative

    I can assure you that "average" people *do* give out accurate information; when I tell my relatives that I generally just give random info, they tend to be shocked and say "But, but, that would be LYING".

  13. There is no free lunch by anoopjohn · · Score: 3, Insightful

    Recently there was a big debate on Slashdot about Google's purchase of DoubleClick. Why would you care if a company tracks your usage patterns, without attaching them to your personal identity, and delivers targeted advertisements? There is no free lunch. You are paying for the free content by selling your usage patterns. They don't want to do it any other way. You can take it or leave it. Perhaps at some point in the future there will be ad-free, subscription-based content. I doubt it, though.

    I run a company and I face the same problem: how to reach the set of people who are most likely to be my customers. The more successfully I can do that, the lower my marketing cost, and the cheaper the product will be in the long run. Ultimately, if we have a system where each person sees only those ads that he needs to see, we would have a highly efficient marketing system with the lowest marketing costs. A reasonably big percentage of the cost of most products you buy is marketing cost. So if you would like them to be cheaper, stop complaining and start selling your usage data.

    There is only one issue here: privacy advocates have to ensure that there is no real breach of privacy in the process. If Googlebot sees the mails I see, there is no problem, but if Googlebot reads my mail, checks it against some preset filter, and requests Mr. X to take a look at my mail, then it is a breach of privacy. As long as the identity is kept separate from the patterns, there shouldn't be any problem.

    --
    "Be the change you wish to see in the world" - M. K. Gandhi
  14. Re:No news - Still news... by RetroGeek · · Score: 2, Informative

    If you do this, then you only need to change some part of the string to a random value, then hit enter.

    When the page refreshes, then click on the link you want to read.

    Wash, rinse, repeat.

    Sure they are tracking something, but it will not be you.

    There are lots of ways to monkey with this sort of thing.

    --

    - - - - - - - - - - -
    I am a programmer. I am paid to produce syntax not grammar. Deal with it.
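
    The "change some part of the string to a random value" trick above can be sketched as follows. This assumes the session id travels in a hypothetical query parameter named `sid`; the URL is made up, and nothing here says whether a given site would accept an id it did not issue:

```python
import secrets
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse


def randomize_session_id(url, param="sid"):
    """Return `url` with the value of `param` replaced by a fresh random hex string.

    `param` is a hypothetical parameter name; all other query
    parameters are kept in their original order.
    """
    parts = urlparse(url)
    query = [
        (key, secrets.token_hex(8) if key == param else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunparse(parts._replace(query=urlencode(query)))


original = "https://news.example/tech/story-7?sid=abc123&page=2"
new_url = randomize_session_id(original)
print(new_url)
```

    Whatever gets logged against the random id no longer lines up with the earlier page views, though other signals such as the IP address are untouched by this.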
  15. Re:f you and your buzzwords by iminplaya · · Score: 2, Informative

    Well, he may be right. I don't know the official definition of "data mining" (and really don't care). By itself, it's not necessarily a bad thing. But like GM foods, I want to see a label. For the moment, I assume that everybody is data mining, and will block it where I think it's appropriate. It's really nothing more than typical top-down 19th-century business practice and its attempt to stay alive. I would like to make a point of showing that it is in their interests to look for another way to conduct business that doesn't use personal information, by making this one as unworkable and expensive as possible. The old methods no longer apply.

    --
    What?