Netflix Prize Sequel Cancelled Over Privacy Concerns
An anonymous reader writes "Netflix just announced that they have cancelled the sequel to the Netflix Prize, which was promised last year. Netflix made the choice after they were sued over privacy concerns. The prize involves releasing large amounts of data about users' movie preferences, which raised concerns from the Federal Trade Commission and a lawsuit from KamberLaw LLC. Netflix's Neil Hunt said, 'We have reached an understanding with the FTC and have settled the lawsuit with plaintiffs. The resolution to both matters involves certain parameters for how we use Netflix data in any future research programs.'"
From the previously linked story:
If a data set reveals a person's ZIP code, birthdate and gender, there's an 87 percent chance that the person can be uniquely identified.
Why does Netflix need to release something as precise as a birthday in order to make movie recommendations? I mean, TV ratings are done in demographic groups. Couldn't Netflix get away by just stating a birth year?
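A quick way to see what coarsening the birthdate buys you is to count how many records are the only one with their ZIP/birthdate/gender combination. The sketch below uses made-up records (not Netflix data) and a hypothetical helper, unique_fraction. It only illustrates the idea and says nothing about how Netflix actually stores or releases birthdates.

```python
from collections import Counter

def unique_fraction(records, key):
    """Fraction of records whose quasi-identifier combination is unique."""
    counts = Counter(key(r) for r in records)
    return sum(1 for r in records if counts[key(r)] == 1) / len(records)

# Hypothetical toy records: (ZIP code, birthdate, gender). Not real data.
records = [
    ("02139", "1975-03-14", "F"),
    ("02139", "1975-08-02", "F"),
    ("02139", "1975-11-30", "M"),
    ("02139", "1975-01-09", "M"),
    ("90210", "1982-06-21", "F"),
    ("90210", "1982-12-05", "F"),
]

# Full birthdate: every combination is distinct in this toy set.
full = unique_fraction(records, key=lambda r: r)

# Birth year only: keep the first four characters of the date string.
by_year = unique_fraction(records, key=lambda r: (r[0], r[1][:4], r[2]))

print(f"unique with full birthdate: {full:.0%}")
print(f"unique with birth year only: {by_year:.0%}")
```

On this toy data every record is unique with full birthdates and none is unique with birth year only. How much it helps on real data depends on how many people share each ZIP code, year and gender.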
This is definitely a sad outcome to all of this. However, couldn't Netflix just update their EULA and/or have an opt-in for users who want to make the experience better?
I don't know if people are just paranoid or what, but they seem to be intent on protecting EVERYTHING nowadays. Next thing you know, people will get sued for asking whether you put the toilet paper roll facing away from the wall or towards it.
For the record, it's away from the wall, you savages.
Maybe they are consulting the zodiac?
I bet there are thousands of guys out there scared to death that someone will find out they rented Twilight (for the girlfriend, honest!). I'd rather be known as a lawyer-happy jerk than a Twilight fan.
a whole lot more on the people to identify them from those three pieces of information.
As in, you would need to know which house has the account, then guess among all the Netflix accounts in that ZIP code.
I suppose you could stand outside, but I doubt that there is an 87% chance unless you know other supposedly private information.
I don't see how the exact birthdate matters. How would you confirm it by looking at people? Birthdate x/y/z lives at 144 Amazing Avenue? Really? And what gave that away?
Why don't they munge the private info? Instead of saying an account belongs to John Smith, say it belongs to 1122113.
No one except Netflix will know who exactly 1122113 is.
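Something like the following is what that munging could look like. It is a minimal sketch with made-up names and ratings, and pseudonymize is a hypothetical helper, not anything Netflix is known to use. Note that it hides only the name; the titles and ratings stay exactly as they were.

```python
import secrets

def pseudonymize(ratings):
    """Replace each account name with a random numeric ID.

    Returns (masked_ratings, lookup). The lookup table maps names to IDs
    and would have to stay private with whoever holds the original data.
    """
    lookup = {}
    used = set()
    masked = []
    for name, title, stars in ratings:
        if name not in lookup:
            # Draw random IDs until we get one that is not taken yet.
            new_id = secrets.randbelow(10**7)
            while new_id in used:
                new_id = secrets.randbelow(10**7)
            used.add(new_id)
            lookup[name] = new_id
        masked.append((lookup[name], title, stars))
    return masked, lookup

# Hypothetical sample rows: (account name, title, stars)
ratings = [
    ("John Smith", "Movie A", 4),
    ("Jane Doe", "Movie A", 2),
    ("John Smith", "Movie B", 5),
]

masked, lookup = pseudonymize(ratings)
for row in masked:
    print(row)
```

The IDs are random rather than derived from the name, so nobody can recompute them from a list of candidate names without the lookup table.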
They're definitely being more careful privacy-wise now, almost to the point of paranoia in my opinion. The best example is a feature that was supposed to be released over 7 months ago - still sitting in development while they debate privacy issues.
The feature was supposed to allow developers to get all the titles a user has rated, along with the associated ratings. You can already see the user's rating for a particular title by querying that title, and you can also see all the titles a user has watched on Netflix, but for some reason they think that an API method to return all ratings is a separate privacy issue. (To access any of the previously mentioned data, the user must explicitly sign into Netflix through your application, obviously.)
This is rather unfortunate: many Netflix apps could use these ratings, because people may rate items they haven't watched or rented from Netflix.
My gf knows a lot about astrology, and believes it has an influence on your personality. A good recommendation engine could use this information along with everything else. Your birth year is significant, but it doesn't run 1 Jan to 31 Dec. I'm not sure if this is based on the Chinese calendar or if that's a separate data point... but in order to identify your sign and year, you need to know the exact date.
I don't lend astrology much credence, but if you're going to tell me it doesn't work I'd like some repeatable science on it, or at least download the dataset yourself and see if any patterns emerge before just saying it doesn't work. I haven't tried it myself, because I don't think I know enough about astrology to correlate significant data points. Of course, people lie about their birth date (I do, and at least one other in this thread does) so you're analyzing self-reported birth dates, not actual ones, making science cry.
Interesting tie-in to this story over on ReadWriteWeb where there's a lively discussion on whether or not social networks should be able to sell data about users.
http://www.readwriteweb.com/archives/myspace_bulk_data.php#comment-196403
"The Netflix Prize II cancellation is another example of why we need a lot more discussion around these issues. Here we have a great example (Netflix Prize I) of how the simple availability of data had a huge impact on the science and the business of computational/algorithmic recommendation and machine learning. It seems that for a tiny sum, $50K, Netflix and all the others who want to help create a world in which advertising and recommendation are helpful rather than an annoyance, could have continued this outstanding work with a bit of standard automatic data masking. Crazy!"