Anonymity of Netflix Prize Dataset Broken
KentuckyFC writes "The anonymity of the Netflix Prize dataset has been broken by a pair of computer scientists from the University of Texas, according to a report from the physics arXiv blog. It turns out that an individual's set of ratings and the dates on which they were made are pretty unique, particularly if the ratings involve films outside the most popular 100 movies. So it's straightforward to find a match by comparing the anonymized data against publicly available ratings on the Internet Movie Database (IMDb) (abstract on the physics arXiv). The researchers used this method to find how individuals on the IMDb privately rated films on Netflix, in the process possibly working out their political affiliation, sexual preferences and a number of other personal details."
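To make the matching idea in the summary concrete, here is a minimal sketch in Python. It is not the researchers' actual algorithm: the scoring rule (one over the log of a film's rater count), the tolerances of 14 days and 1 star, and all names and data below are assumptions invented for illustration. It only shows why a few ratings of obscure films with roughly matching dates can single out one record.

```python
from datetime import date
from math import log

def match_score(public, anonymized, popularity, max_days=14, max_stars=1):
    """Sum rarity weights over movies where both records roughly agree.

    public / anonymized: dict of movie -> (stars, date) (hypothetical layout).
    popularity: dict of movie -> number of raters (hypothetical counts).
    """
    score = 0.0
    for movie, (stars, day) in public.items():
        if movie not in anonymized:
            continue
        anon_stars, anon_day = anonymized[movie]
        close_rating = abs(stars - anon_stars) <= max_stars
        close_date = abs((day - anon_day).days) <= max_days
        if close_rating and close_date:
            # Assumed weighting: rare films count more than blockbusters.
            score += 1.0 / log(1 + popularity[movie])
    return score

def best_match(public, dataset, popularity):
    """Return the highest-scoring anonymized ID and all scores."""
    scores = {uid: match_score(public, rec, popularity)
              for uid, rec in dataset.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[0], scores

# Toy data, entirely made up.
popularity = {"Blockbuster": 100000, "Obscure Documentary": 40, "Cult Film": 120}
dataset = {
    "user_17": {
        "Blockbuster": (5, date(2005, 3, 1)),
        "Obscure Documentary": (4, date(2005, 3, 3)),
        "Cult Film": (2, date(2005, 4, 10)),
    },
    "user_42": {"Blockbuster": (5, date(2005, 3, 2))},
}
public = {
    "Blockbuster": (5, date(2005, 3, 1)),
    "Obscure Documentary": (4, date(2005, 3, 5)),
    "Cult Film": (1, date(2005, 4, 12)),
}

best, scores = best_match(public, dataset, popularity)
print(best)
```

Both toy records agree with the public profile on the blockbuster, but that agreement carries almost no weight; the two obscure titles are what separate user_17 from user_42.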
"The researchers used this method to find how individuals on the IMDb privately rated films on Netflix, in the process possibly working out their political affiliation, sexual preferences and a number of other personal details"
This is a loaded statement. The most you can determine is that if a person liked movies A, B, C and D but hated E and F, there is a higher probability they are a guy. If they liked Z but didn't like X, there is a higher probability they might be a Republican than not. You're still anonymous.
Unless, of course, you're one of the three people that liked "Glitter". Then I think they might have something on you.
It doesn't sound like the anonymity of the prize set was broken through any fault of Netflix. It sounds like some sampling of users made the mistake of rating movies both on a site where the info is publicly available and on a site where it's not. All the researchers did was correlate the two.
So the lesson is, basically, don't post stuff that you don't want to be public to a website that makes it public, right? This sounds roughly like blaming the DMV for someone figuring out a car owner's likely political leanings from the bumper stickers on their car.
For those who haven't rated movies on IMDB, such as myself - and I imagine a large proportion of subscribers.
This is total hyperbole.
All the researchers are saying is that they can deduce some of your preferences based on your other preferences. Of COURSE you can do that; that was the whole point of the contest Netflix put up.
What they are _not_ saying is that they now know who you are, where you live, or anything uniquely identifying about you. So basically, you are still anonymous.
I'm starting to tire of news headlines that claim the world is on fire when someone actually just does something slightly derivative from the norm and thinks they are brilliant. The noise from these non-events masks actual brilliant achievements and makes it seem that everyone is doing banal work.
As far as I know, on IMDb you are rating the overall quality of the movie, not "I agree with it" or "I want to see more like this".
One example: Schindler's List. Great movie, do NOT want to see it again. Same with Grave of the Fireflies. Some movies just ain't for multiple viewings. They are my "favorite movies I never want to see again".
On the other hand, there are movies I can watch any day of the week but would NEVER rate as highly. Cannonball Run is one such movie. I watch it far too often, but I wouldn't call it a good movie. You can always find me ready for a Jackie Chan movie or a spaghetti western.
Is the Netflix rating system an "I liked this movie and want to see more like it" system or a "This movie was brilliant and I would highly recommend it to everyone else" type of rating system?
Granted, some people get it confused, probably the same people that use the Slashdot moderation system to silence views they don't like, but that only makes basing conclusions on user ratings even more problematic.
I can rate a movie highly even if I do not agree with it, simply because it is good. And I can rate a movie I really like to watch as crap simply because I know I like watching crap.
I don't like the Godfather movies. I can see they are high quality; I just don't like them. So my rating for them would be fairly high for quality, but low for 'I want to see more like this'.
I thought that the Netflix system was "I want to see more like this" based. Surely nobody is so stupid as to think a quality rating and an "I like this" rating system are the same? Or am I completely in the wrong in seeing a difference between the two? Am I insane in thinking that you can see a movie as a great artwork and still not like it, or vice versa?
Finding a paragraph like this in a research paper makes me call into question the motives and intentions of the 'researchers.' They seem sort of like the Jerry Springer of research (since he's just trying to help the families he has on his show...).
They imply that the person didn't like "Super Size Me" because he's probably fat (or are they trying to imply that he has a problem with gaining weight and is jealous?).
Also, they imply that because he rated two "predominantly gay theme" items as poor he must not be homosexual. Or are they implying that because he rented/rated these that he must be gay (because who would ever rent them otherwise).
The fact that they use the "there's more juicy stuff about this guy, but we can't tell because we're serious researchers" line at the end is the pièce de résistance that really shows what motivates these researchers.
Because it isn't a Credit Card # or SSN it isn't serious?
A) Some people would rather go to jail or commit suicide than admit to something embarrassing they'd rather keep private. Privacy isn't (just) about hiding (illegal) things from the Government.
B) Demographic information is something you can never take back and can never change.
At least I can get a new credit card & SSN.
Is that any more surreal than a form of "entertainment" in which people get shot at or blown up every five minutes or so?