Slashdot Mirror


'Anonymized' Credit Card Data Not So Anonymous, MIT Study Shows

schwit1 writes Scientists showed they can identify you with more than 90 percent accuracy by looking at just four purchases, three if the price is included — and this is after companies "anonymized" the transaction records, saying they wiped away names and other personal details. The study out of MIT, published Thursday in the journal Science, examined three months of credit card records for 1.1 million people. "We are showing that the privacy we are told that we have isn't real," study co-author Alex "Sandy" Pentland of the Massachusetts Institute of Technology, said in an email.

8 of 96 comments

  1. "the privacy we are told that we have isn't real." by turkeydance · · Score: 4, Funny

    Staff Sergeant Obvious reporting for duty.

  2. Re:Study by Anonymous Coward · · Score: 4, Informative

    http://www.sciencemag.org/content/347/6221/468.full?intcmp=collection-privacy

    The published article the clickbait was based on has much better information. For instance: the transactions for a person all still shared a unique ID#. "All that remained were the metadata: amounts spent, shop type—restaurant, gym, or grocery store, for example—and a code representing each person."

    If you don't cycle the code per person regularly, of course correlation attacks will always work.
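
    A minimal sketch of why a stable per-person code matters, using made-up records. The codes, shops, days, and the `candidates` helper are all hypothetical and are not taken from the study; the point is only that each extra known purchase shrinks the set of matching codes.

```python
from collections import defaultdict

# Hypothetical "anonymized" records: (person_code, shop, day).
RECORDS = [
    ("u1", "bakery", 1), ("u1", "gym", 2), ("u1", "grocery", 3),
    ("u2", "bakery", 1), ("u2", "restaurant", 2), ("u2", "grocery", 3),
    ("u3", "bakery", 1), ("u3", "gym", 2), ("u3", "restaurant", 4),
]


def candidates(records, known_points):
    """Return the person codes whose history contains every known (shop, day)."""
    history = defaultdict(set)
    for code, shop, day in records:
        history[code].add((shop, day))
    wanted = set(known_points)
    return sorted(code for code, points in history.items() if wanted <= points)


# Each additional outside observation narrows the list of matching codes.
print(candidates(RECORDS, [("bakery", 1)]))
print(candidates(RECORDS, [("bakery", 1), ("gym", 2)]))
print(candidates(RECORDS, [("bakery", 1), ("gym", 2), ("grocery", 3)]))
```

    If every transaction carried a fresh code instead, no single code would hold more than one point, so any lookup with two or more known purchases would come back empty.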

  3. Re:Regular users only by jbgroup1 · · Score: 3, Insightful

    If you don't count my student loans, I'm well off and have plenty of money.

    Of course, "Outside of the killings, DC has one of the lowest crime rates in the country"--Marion S. Barry Jr., 1989

  4. Re:Why even 3? by Courageous · · Score: 3, Insightful

    This article isn't scary. What should be scary is that cell companies sell anonymized _geolocation_ data. That data can be used to determine: A) who you are, B) where you live, C) where you work, and D) who your friends are. Step #1: look where the phone is, regularly, at midnight. Step #2: cross-reference with public records databases on property ownership. That gets 65% of Americans right there. Now check where it parks every day at noon. Place of work found. And so forth.
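
    The midnight and noon steps above can be sketched roughly as follows. The pings, the grid-cell names, and the `most_common_cell` helper are invented for illustration; the property-records lookup is left out because it depends on whatever public database is available.

```python
from collections import Counter

# Hypothetical pings for one "anonymous" phone: (hour_of_day, grid_cell).
PINGS = [
    (0, "cell_A"), (0, "cell_A"), (0, "cell_B"), (0, "cell_A"),
    (12, "cell_C"), (12, "cell_C"), (12, "cell_D"),
    (18, "cell_E"),
]


def most_common_cell(pings, hour):
    """Return the grid cell seen most often at the given hour, or None."""
    counts = Counter(cell for h, cell in pings if h == hour)
    return counts.most_common(1)[0][0] if counts else None


home = most_common_cell(PINGS, 0)   # where the phone usually is at midnight
work = most_common_cell(PINGS, 12)  # where it usually is at noon
print(home, work)
```

    Real location traces are noisier than this, so treat the hour filter as a stand-in for whatever time window actually fits the data.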

  5. Re:"the privacy we are told that we have isn't rea by ShanghaiBill · · Score: 3, Funny

    I always thought it was anonymized through aggregation.

    Aggregation is not very useful. Much more useful is being able to look for relationships between purchases by the same user. Years ago department stores would have an "accessories" section. Then Wal-Mart crunched their data, and figured out that people don't shop for accessories randomly. They buy a belt when they are buying pants. They buy a necktie when they are buying shirts. So today, the belts are placed by the pants, and the neckties are placed by the shirts. This seems kind of obvious in hindsight, but it took data analysis to make it happen.
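
    As a rough illustration of the belt-and-pants point, here is a sketch that counts which items show up in the same basket. The baskets and the `pair_counts` helper are made up; nothing here is claimed to be how any retailer actually does it.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, one set of items per checkout.
BASKETS = [
    {"pants", "belt"},
    {"pants", "belt", "socks"},
    {"shirt", "necktie"},
    {"shirt", "necktie", "belt"},
    {"pants", "shirt"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sorting makes each pair a stable (a, b) key with a < b.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(BASKETS)
print(counts.most_common(3))
```

    This only works when purchases can be tied to the same basket or the same shopper, which is exactly what aggregation throws away.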

    If a woman stops buying condoms and starts buying vitamin supplements, that means you should be showing her popup ads for maternity clothes. Nine months later, you can show her a different brand of condom, with ads that emphasize reliability.

  6. Re:Why even 3? by Not_Wiggins · · Score: 4, Informative

    The article is misleading. It talks about how it can be used to "identify someone." And with all the talk about privacy, it implies the identification of an individual.

    But, reading through it closely, they aren't talking about identifying a specific someone; the information isn't enough to say Not_Wiggins made these purchases.
    Instead, it focuses on identifying characteristics of purchasers and then extending it to see what other behavior purchasers in those groups would make.

    In the article's example, they talked about someone making a purchase at both a bakery and a restaurant within a short time period. Finding one such instance, they named him Scott, then looked to see what other behaviors "Scott" had. By extending that logic, they are saying "look at the group of people who typically shop at a bakery and a restaurant... then you know those people are typically also interested in shoes."

    The example is a bit silly, but that's what they're saying.

    They're talking about documenting patterns of behavior on purchasing decisions.
    This article really isn't about loss of anonymity. It is about using anonymized credit card transactions to develop definitions of "user groups" and predicting their shared behavior pattern.

    To me, it seems more like the equivalent of last.fm... tell us what music you like, we'll compare it against what others who also have the same "likes" have said, and give you options for things that might fit your tastes.

    In this instance, it is: tell us what purchases you've made, we'll compare it against similar purchases that others have made, and we can predict what other purchases you might want/like that you haven't made yet.
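
    A minimal sketch of that kind of "people like you also bought" prediction, assuming a simple Jaccard overlap between purchase sets. The purchase data and the `jaccard` and `recommend` helpers are hypothetical, and this is one common approach, not anything the article describes.

```python
# Hypothetical purchase histories keyed by anonymized code.
PURCHASES = {
    "u1": {"bakery", "restaurant", "shoes"},
    "u2": {"bakery", "restaurant", "shoes", "gym"},
    "u3": {"grocery", "gym"},
    "me": {"bakery", "restaurant"},
}


def jaccard(a, b):
    """Overlap between two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(purchases, user):
    """Rank items the user lacks by the summed similarity of people who have them."""
    mine = purchases[user]
    scores = {}
    for other, theirs in purchases.items():
        if other == user:
            continue
        sim = jaccard(mine, theirs)
        if sim == 0:
            continue  # people with nothing in common contribute nothing
        for item in theirs - mine:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))


print(recommend(PURCHASES, "me"))
```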

    --
    Diplomacy is the art of saying, "Nice doggie!" until you can find a rock.

  7. Re:Regular users only by mjwx · · Score: 4, Insightful

    Not sure what you're talking about. My credit card has no fees

    It has no fees you know about... And banks want to keep it that way. When you pay for something by credit card, the merchant pays 3% or more for accepting the card. This means they have to pass the cost on to you in the form of higher prices.

    You didn't think the bank gave you free money did you?

    It's Machiavellian in its brilliance: you're robbing yourself of 3% in order to give yourself 1%, and you're so enamoured with it that you're trying to do this as much as possible.
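
    The arithmetic can be worked through in a few lines, under two loud assumptions: the merchant marks shelf prices up by exactly the 3% fee, and the 1% reward is paid on the shelf price. The `net_cost_of_card` helper and its parameter names are made up for this sketch.

```python
def net_cost_of_card(shelf_price, markup_rate=0.03, reward_rate=0.01):
    """Return what the cardholder loses on one purchase under the stated assumptions."""
    # Assumption: shelf price = fee-free price * (1 + markup_rate).
    base_price = shelf_price / (1 + markup_rate)
    markup_paid = shelf_price - base_price
    # Assumption: rewards are a flat percentage of the shelf price.
    reward = shelf_price * reward_rate
    return round(markup_paid - reward, 2)


# A 103.00 shelf price corresponds to a 100.00 fee-free price.
print(net_cost_of_card(103.00))
```

    Whether merchants really pass the whole fee through varies, so the result is only as good as those two assumptions.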

    --
    Calling someone a "hater" only means you can not rationally rebut their argument.

  8. Re:Why even 3? by NicBenjamin · · Score: 3, Informative

    And this only works if you have a lot of other data in your data set. If you don't know who Scott is, then you can't figure out he's the only person who could go to the bakery on that one exact day and that particular restaurant the next.

    I don't think anyone is particularly sanguine about the future of privacy if big companies manage to figure out a way to profit from combining their multiple massive databases. This is particularly true in the US, where it would be virtually impossible to stop the police from using said databases without warrants. Or worse, using info that the big companies forwarded them as the basis for warrants.

    If Apple or Google can silence one of its critics by figuring out he was paying a hooker with his supposedly anonymous Mastercard gift card, that is a really fucking bad thing.