'Anonymized' Credit Card Data Not So Anonymous, MIT Study Shows
schwit1 writes Scientists showed they can identify you with more than 90 percent accuracy by looking at just four purchases, three if the price is included — and this is after companies "anonymized" the transaction records, saying they wiped away names and other personal details. The study out of MIT, published Thursday in the journal Science, examined three months of credit card records for 1.1 million people. "We are showing that the privacy we are told that we have isn't real," study co-author Alex "Sandy" Pentland of the Massachusetts Institute of Technology, said in an email.
As one who got tired of high fees, I dropped the use of credit/debit cards. I used a gift card for an online purchase. Nothing anon about it. Has my name and address on the order.
The truth shall set you free!
...but the research couldn't explain why."
Afraid of the feminazis, Mr. Scientist?
When NSA collects 'metadata', it's disturbing but also difficult to see how they benefit from corrupt use of the data. But corporate 'big data' just has many ways to make money off of it. Where is the Snowden of Citibank?
Gently reply
Where is the link to the actual study?
Staff Sergeant Obvious reporting for duty.
...using a fingerprint database to show that cash isn't anonymous.
Any sufficiently unpopular but cohesive argument is indistinguishable from trolling.
Staff Sergeant Obvious reporting for duty.
Not so obvious actually. But then, I never realized before that anonymizing my data just means "replacing 'John Smith' with 'User 12345'". I always thought it was anonymized through aggregation.
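To make that distinction concrete, here is a minimal sketch of the two approaches. The records, names, and the "User N" label format are invented for illustration and are not taken from the study. Pseudonymizing keeps one row per purchase tied to a stable label, while aggregating keeps only totals.

```python
from collections import Counter

# Hypothetical raw records: (cardholder name, shop, day). Invented for illustration.
raw = [
    ("John Smith", "bakery", "2015-01-05"),
    ("John Smith", "gas station", "2015-01-05"),
    ("Jane Doe", "bakery", "2015-01-06"),
]

def pseudonymize(records):
    """Swap each name for a stable label like 'User 1'; rows stay per-person."""
    labels = {}
    out = []
    for name, shop, day in records:
        # setdefault returns the existing label, or stores and returns a new one.
        label = labels.setdefault(name, f"User {len(labels) + 1}")
        out.append((label, shop, day))
    return out

def aggregate(records):
    """Keep only per-shop purchase counts; no row refers to a single person."""
    return Counter(shop for _, shop, _ in records)

pseudonymized = pseudonymize(raw)
totals = aggregate(raw)
```

In the pseudonymized output, both of John Smith's purchases still sit under the same label, so the pattern of one person's shopping survives. The aggregated output only says how many purchases each shop saw.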
When i make purchases with my credit card, i'm not worried about someone knowing it was me, Shadowrat, who made the purchase. When did people claim that you could anonymously buy anything with a credit card? Obviously that's stored in lots of places. I buy something online, the vendor needs to know where to ship it, my credit card company knows who to bill, amazon knows because they are passing the info on.
What i worry about is someone stealing my number. Honestly, i don't even worry about that so much anymore since it's happened enough and i've come away completely unharmed, i'm just kind of numb to it.
It's easier to identify women, but the research couldn't explain why, de Montjoye said.
Could it be that men tend to shop a lot less than women!?
The article says it can identify someone in as few as 3 transactions.
But they aren't really identifying them, they are just showing that no other person hit the same exact set of shops.
Well, they also mention that they get a datestamp with the transaction, so assuming that datestamp has minutes or seconds, then it should only take 1 transaction, or 2 at the most. That being said, you really haven't identified this person, as you don't know who they are in the real world, just that they have a unique shopping pattern, as everyone does.
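A toy sketch of what "no other person hit the same exact set of shops" means in practice. The traces below are invented, and the `unicity` function is an assumed, simplified measure rather than the study's actual method. It checks every k-point subset of every user's trace and reports the share that matches exactly one user.

```python
from itertools import combinations

# Hypothetical pseudonymized traces: user label -> set of (shop, day) points.
traces = {
    "u1": {("bakery", "mon"), ("gas", "mon"), ("cafe", "tue")},
    "u2": {("bakery", "mon"), ("gas", "tue"), ("cafe", "tue")},
    "u3": {("bakery", "mon"), ("gas", "mon"), ("diner", "wed")},
}

def matches(points, traces):
    """Return the users whose trace contains every given point."""
    wanted = set(points)
    return [user for user, trace in traces.items() if wanted <= trace]

def unicity(traces, k):
    """Share of all k-point subsets (over all users) that match exactly one user."""
    total = 0
    unique = 0
    for trace in traces.values():
        for subset in combinations(sorted(trace), k):
            total += 1
            if len(matches(subset, traces)) == 1:
                unique += 1
    return unique / total
```

On this toy data the share rises from 2/9 with one known point to 5/9 with two, and every three-point subset is unique. A finer timestamp has the same effect as extra points: it makes each point rarer, so fewer are needed. None of this attaches a real name to a label, which is the parent's point.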
I always thought it was anonymized through aggregation.
Aggregation is not very useful. Much more useful is being able to look for relationships between purchases by the same user. Years ago department stores would have an "accessories" section. Then Wal-Mart crunched their data, and figured out that people don't shop for accessories randomly. They buy a belt when they are buying pants. They buy a necktie when they are buying shirts. So today, the belts are placed by the pants, and the neckties are placed by the shirts. This seems kind of obvious in hindsight, but it took data analysis to make it happen.
If a woman stops buying condoms and starts buying vitamin supplements, that means you should be showing her popup ads for maternity clothes. Nine months later, you can show her a different brand of condom, with ads that emphasize reliability.
This isn't actually privacy, and it's sad that people aren't clearer about what is and isn't privacy.
Though still a bit troubling.
Spafford, who wasn't part of the study, said it makes "one wonder what our expectation of privacy should be anymore."
Privacy can't be monetized and retailers can't profit from privacy so therefore we know how much privacy we have; it's the small fraction left after they collect everything useful. This will continue this way until we have laws that make data retention and privacy violation such a legal liability hot potato that businesses will be tripping over themselves to delete data and avoid unnecessary collection and retention.
I don't know about you, but I think it's pretty fair to say that a record without any information directly identifying the subject is "anonymous".
The ability to complete an analysis of multiple records and data sources and thereby make a reasonable guess (90% accuracy) at who the subject might be is insufficient to remove the title of anonymous.
For loose definitions of "identify" they could find sets of credit card transactions that would meet the given "pieces" of information. If Detective Paul Drake is looking for someone who went to a particular restaurant one night and then bought cake from some bakery next day, and Della Street knows the same person paid a toll the same evening, the super duper algorithm will tell Perry Mason all the sets of transactions that would match the given "pieces". But the data sets will not have any name or address attached to them. But still Ham Burger will make a mistake and his star witness will confess on the stand.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
Damn! You were demoted?
“He’s not deformed, he’s just drunk!”
Can't these articles link to the journal's entry for the paper? This is of professional interest to me and I'd like to read the abstract at least, maybe even purchase the damn thing.
The fact that someone calls him/her "Sandy" isn't useful information to me since we're not going to hang out and shoot the shit. Trim useless information from the summary.
I think her child's needs will consume more of the family budget. The adverts will concentrate on selling goods satisfying that need.
Maybe it was a planned pregnancy. Women go from the pill to condoms so they can choose the month of birth, or so they know when their ovulation cycle is regular.
They did NOT show that, from 3-4 transactions, they could provide your name, address and phone number, or even that if you have 3-4 transactions in a million transaction anonymized data set they can find out anything about you personally *unless they know you first*.
What they did is show that if they know that you, personally, had 3 to 4 types of transactions on specific dates (you went to a grocery store and a gas station today, and a restaurant yesterday), they could identify which anonymized data set you belong to. Their discovery requires specific outside knowledge not contained in the data.
This only matters if, say, a third party could identify specific purchases and dates - they could then comb the records and find the rest of your transactions on that specific card. IOW, someone has to be looking for you, and know at least something about you, to even start the search.
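The search described above can be sketched in a few lines. Everything here is hypothetical: the log layout, the labels, the shops, and the `link` and `history` helpers are made up for illustration. The outside knowledge is the `known` list, and the log itself never contains a name.

```python
# Hypothetical pseudonymized log: (user label, shop, day, amount). All values invented.
log = [
    ("User 12345", "grocery", "2015-01-29", 54.10),
    ("User 12345", "gas station", "2015-01-29", 40.00),
    ("User 12345", "restaurant", "2015-01-28", 23.50),
    ("User 12345", "pharmacy", "2015-01-20", 12.99),
    ("User 67890", "grocery", "2015-01-29", 18.75),
    ("User 67890", "restaurant", "2015-01-27", 61.00),
]

def link(known, log):
    """Return the labels whose records include every known (shop, day) pair."""
    seen = {}
    for user, shop, day, _amount in log:
        seen.setdefault(user, set()).add((shop, day))
    wanted = set(known)
    return [user for user, points in seen.items() if wanted <= points]

def history(user, log):
    """Return every record stored under one label."""
    return [row for row in log if row[0] == user]

# Outside knowledge about one specific person, not contained in the log.
known = [
    ("grocery", "2015-01-29"),
    ("gas station", "2015-01-29"),
    ("restaurant", "2015-01-28"),
]
candidates = link(known, log)
```

If `candidates` comes back with a single label, `history` on that label returns the rest of that card's purchases, including ones the searcher did not know about. With only the grocery visit as input, both labels match, which is why a few points are needed.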
Is it just my observation, or are there way too many stupid people in the world?
I read the article but I did not find what was left. In Belgium (Perhaps Europe) all that remains is the transaction number and the last 4 numbers of the card. The card company will only see the amount and will have no idea what is bought.
So if the last 4 digits are 1234 (And about 1 in 10000 will scream) they know if I pump gas, take out some cash, eat in a restaurant and buy at a supermarket that they know who I am?
I would really, really, really try to test that claim.
I assume some other data has been left.
Don't fight for your country, if your country does not fight for you.
Of course it's not bloody real.
For us to believe this data has been 'anonymized', we have to assume that a) the company is qualified to do what is required to anonymize the data, b) that they actually give a shit, and c) that they bear any penalty if they do a terrible job.
Entrusting these companies with this data in the first place is the problem. Allowing them to share it all over the place for profit and with no restriction is a terrible idea.
This is precisely why sane countries have data protection and privacy laws -- because corporations are greedy, self serving entities, who won't give a crap if the collateral damage of their stuff is to damage the privacy of everybody they deal with.
And this is precisely why all of those analytics companies in web pages are just parasites and not to be trusted.
Lost at C:>. Found at C.
From what I can tell, they first need to know the identity of the individual who made those 3 particular purchases. From that, they can link the individual to the entire set of his/her purchases in the "anonymized" CC data.
I'm very concerned about privacy issues, but this doesn't really surprise or disturb me. It would be quite a coincidence for another person to engage in transactions at the same three places I did and at approximately the same times.
Take a similar example; all that thoroughly scrubbed medical history that is sold wholesale these days. You think the 40 year old female with sickle cell and a glioma removal isn't identifiable? For instance.