Netflix Sued For Privacy Invasion
We've discussed the Netflix Prize numerous times as the contest ran, including the news two years ago that the anonymity of the dataset had been broken. Now reader azoblue sends in this excerpt from Wired: "An in-the-closet lesbian mother is suing Netflix for privacy invasion, alleging the movie rental company made it possible for her to be outed when it disclosed insufficiently anonymous information about nearly half-a-million customers as part of its $1 million contest to improve its recommendation system. ... The lead attorney on the new suit, Joseph Malley, recently reached a multimillion-dollar settlement with Facebook over its failed Beacon program, which drew fire in part for sharing users’ Blockbuster rentals with their friends. ... If a data set reveals a person's ZIP code, birthdate and gender, there's an 87 percent chance that the person can be uniquely identified." The suit turns on the question of whether Netflix should have known that their dataset's anonymity could be broken, two years before researchers demonstrated that.
How large an area is a ZIP code in the States? I think in the UK if a company publicly released sensitive data about people with their birthday and postcode attached there'd be outrage. Muppets.
I don't recall handing over my birthdate when I signed up for my account. I just went through all of the account screens and couldn't find it either. What part of their service expects you to tell them your birthday?
I want peace on earth and goodwill toward man.
We are the United States Government! We don't do that sort of thing.
The entire birthday? Holy crap! What did they expect?! Even just narrowing it down to birth year gives you a way to narrow the set considerably when combined with the other two items. What was wrong with the traditional "18-24, 25-40, etc." age ranges?
Learning HOW to think is more important than learning WHAT to think.
... this woman is a closeted lesbian. She came to the realization that, if some hypothetical person were to come along and get into the Netflix user data system, he could find out she's a lesbian. In order to protect herself from being potentially exposed, she decided to join a high-profile national lawsuit, charging that they had created a potential for people to find out her sexual preferences. How many days do you think it'll be before her picture is all over the web, sitting right next to the headline "formerly closeted lesbian pulled out of closet by attaching her name and face to a privacy lawsuit"?
How can a legal-aged adult file as Jane Doe just because of her secret of being 'in the closet'?
"The member’s movie data exposes a Netflix member’s personal interest and/or struggles with various highly personal issues, including sexuality, mental illness, recovery from alcoholism, and victimization from incest, physical abuse, domestic violence, adultery, and rape."
Isn't this a bit of a stretch? I've rented a rather broad range of films; over the past year the films I have watched include Apt Pupil, Lords of Dogtown, Girl, Interrupted, A History of Violence, A Beautiful Mind, Brokeback Mountain and Super High Me. Evidently I'm a mentally disturbed, abusive, homosexual, drug abusing, skateboarding, autistic nazi and didn't know it.
The woman who was outed wasn't outed by her movie choices but by her paranoia leading to her own disclosure.
If a data set reveals a person's ZIP code, birthdate and gender, there's an 87 percent chance that the person can be uniquely identified
What idiot answers all those questions correctly?
Oh wait. What if you do?
When some government agencies give statistical reports, they are very careful to suppress statistics that could lead to disclosure.
For example, in school accountability ratings and test results, if fewer than a certain number of students in a given grouping take the test, the average test scores for that grouping are suppressed. If I'm a parent of one of 2 White, non-Hispanic 3rd graders in a school and I know my Little Johnny scored a 73 on his Science standardized test, and I find out the average was a 60, and I know who the other White, non-Hispanic 3rd grader is, I now know his score. Oops.
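The arithmetic behind that "oops" is short enough to sketch. This is only an illustration of small-cell suppression: the threshold of 10 and the function names are made up for the example, not taken from any agency's actual rules.

```python
from statistics import mean

MIN_GROUP_SIZE = 10  # hypothetical threshold; real agencies set their own


def reported_average(scores, min_group_size=MIN_GROUP_SIZE):
    """Return the group average, or None when the group is too small to publish."""
    if len(scores) < min_group_size:
        return None  # suppressed to avoid disclosing individual scores
    return mean(scores)


def other_score_from_average(average, group_size, known_scores):
    """Recover the single unknown score in a group from its published average."""
    if group_size - len(known_scores) != 1:
        raise ValueError("exactly one score must be unknown")
    return average * group_size - sum(known_scores)


# The two-student case from the comment: published average 60, my kid scored 73.
print(other_score_from_average(60, 2, [73]))  # 47
# With suppression in place, the average for a group of 2 is never published.
print(reported_average([73, 47]))             # None
```

Without suppression, the published average of 60 plus one known score of 73 hands you the other kid's 47.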
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
She signed up for this netflix thing, then found out some actual researchers, not hypothetical ones, cracked the publicly available data for a couple of users. She then joined a class action suit but didn't use her real name.
So while before all that was available was a list of rented films, which she seems to think can indicate that whoever rents them is gay (I'm having a hard time making the leap to "if someone watches movies X, Y, and Z, that means they are gay"), now the whole world knows she is gay.
"I'm not sure I like the fugnutish tone you used in your post!" -RogL (608926)-
Is she hot?
"The average reporter we talk to is 27 years old......They literally know nothing." - Ben Rhodes
I watch movies with women gettin' it on together all the time. I guess my wife is in for quite a surprise when she finds out Netflix researchers discover she's a lesbian.
The way I thought it worked was that you can only sue in civil court when you actually suffer damages, even if the other party was doing something illegal.
For instance, you can't sue a drunk driver for almost hitting your car. You could press that they did something illegal and have him charged in criminal court, but there's no payday in that. Given that these types of cases seem to be this lawyer's modus operandi, I'm thinking that this case is more about the payday and not about building stronger standards for privacy.
Why are you giving Netflix your birthdate and gender in the first place? I never give those things to companies, and if I can't avoid it (forced to enter something when signing up) I give bogus information. Neither of those is any of Netflix's business.
Cory Doctorow talking about cloud computing makes as much sense as George W Bush talking about electrical engineering.
Stop watching gay movies. This will prevent people from learning that you are... gay.
then how was she outed?
Speaking as a gay guy with a lot of gay and lesbian friends, I can tell you that some people get really worked up over being "in the closet". They can start to worry about really stupid things that are outside of the bounds of possibility, and work themselves into all kinds of trouble.
Case in point: a friend of mine got herself fired over this. She knew that her supervisor didn't like gay people and so she was in the closet, as far as work was concerned. She got called up for jury duty. The court case didn't last long at all, but in the meantime, one of our mutual friends' father passed away. So, my friend was invited to the funeral which happened to fall on the day after her jury duty ended. She was so worked up over the idea that her boss would figure out that she's a lesbian if she took a personal day to go to her gay friend's dad's funeral that she lied and told her boss that she was still on jury duty for the day of the funeral. Well, the boss didn't like her and he called the court clerk to confirm that she was still on jury duty - and then fired her for lying about it.
Had she just taken a personal day and said "I'm going to the funeral of a friend's dad" nothing would have happened. As far as I know, there's no mechanism by which you can figure out if the relatives of a dead person (whose name you don't have) are gay or not.
Maybe this lawsuit lady should read up on the Streisand Effect (you know her name's going to come out eventually), stop worrying so much about what other people think about her sexual orientation, and concentrate on living her life. Can she truly be deluded enough to think that anyone in her life (work, social, government or otherwise) is going to trawl netflix's database to figure out if she's a lesbian and then use that information against her?
Seriously, this is like when my boss didn't want to have his pay directly deposited because he thought the payroll company could snoop in his bank account. It's just not grounded in reality.
Putting moderation advice in your
99.999% chance of AC being Bart Simpson.
This case shows the ridiculous extremes that "privacy" has come to. Netflix, apparently, has some sort of affirmative obligation to help this woman hide her illicit sexual escapades. The government is going to require Netflix to help cover up for her proclivities.
Lesbian romps are voluntary. Using Netflix is voluntary. Telling Netflix about yourself is voluntary. Netflix voluntarily rents you videos. Every aspect of this case involves people freely engaging in voluntary action. And now we're being asked to get the government involved to force Netflix to hide information against their will, and, by the way, hide it retroactive to several years ago.
Why shouldn't we just say no to people like this? No, we won't help you hide. No, we won't force other people (against their will) to help you hide. No. If you want to hide the things you do, try being more discreet next time.
I completely agree, even then... let us consider how many people are in a specific Zip Code, especially in places that are super heavily populated... 87% chance? doubtful.
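For anyone who wants to poke at the number, here is a back-of-the-envelope sketch. It assumes birthdates are spread uniformly over 80 birth years and that genders are split evenly, which real populations are not, so treat the output as rough. The ZIP code populations are made up for illustration, and this does not reproduce how the 87 percent figure was actually derived.

```python
def chance_unique(population, combinations):
    """Probability that nobody else in the ZIP code shares your combination,
    assuming every combination is equally likely (a simplifying assumption)."""
    return (1 - 1 / combinations) ** (population - 1)


# Assumed: 365 days x 80 birth years x 2 genders.
FULL_BIRTHDATE = 365 * 80 * 2
# Assumed: 5 age brackets ("18-24, 25-40, etc.") x 2 genders.
AGE_BRACKETS = 5 * 2

for population in (5_000, 10_000, 50_000):  # hypothetical ZIP code sizes
    print(
        population,
        round(chance_unique(population, FULL_BIRTHDATE), 3),
        round(chance_unique(population, AGE_BRACKETS), 3),
    )
```

Under these assumptions a full birthdate leaves most people in a mid-sized ZIP code unique, while age brackets leave essentially nobody unique. The bigger the ZIP code, the lower the odds, which is the parent's point about heavily populated areas.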
87% of all statistics are pulled from /dev/ass, including this one.
greg, REMEMBER ED CURRY!!!
The suit turns on the question of whether Netflix should have known that their dataset's anonymity could be broken, two years before researchers demonstrated that.
This is called a "state-of-the-art" defense, and generally doesn't work.
State of the art defense is the defense that permits a manufacturer to avoid liability in a design defect case if at the time of manufacture there was no safer design available, or in a failure to warn case if at the time of manufacture there was no way the manufacturer could have known of the danger he/she failed to warn against.
Let's say I was making asbestos oven mitts and no one knew asbestos was dangerous. The state of the oven mitt industry and materials science (the art) was that asbestos was fine. Then, 50 years later, we find out it's dangerous. The lawsuits will probably prevail because the "state of the art" defense doesn't stand up to strict liability.
On the upside, she'll probably make some new friends in PTA. And who doesn't love hot buttered soccer moms?
THL phish sticks
When the movies are sent to you, do they arrive in any type of packaging that indicates what type of movie you are getting? Or, as when ordering from an adult store, do they come in generic brown paper with an alias for the company that sent them? I always look forward to my brown paper mail deliveries, by the way.
Could whoever delivered the package to her door have figured out her taste in movies?
Which brings up another question I have. When ordering the movies, is the order a post card type of request or a sealed envelope?
Shoot, how many marketing firms have been sued for this same type of privacy issue?
5 out of 11 people in Cleveland, Ohio (now you know where I live, kind of) who were born in 1959 or earlier (how old am I really?) like to watch porn. Males make up 80 percent of this category, with women all admitting they LOVE porn (am I male or female?).
I understand her concern for privacy, I just don't see enough information being provided to support that concern.
Anonymous comments are as pathetic as the anonymous "sources" that contaminate gutless journalism from the New York Time
I didn't RTFA, but exactly how detailed is this information? Will my facade of sophistication bolstered by my renting/viewing of foreign films remain intact? Or will it be torn asunder when it is revealed I only fast forwarded to the sex scenes?
Movies are sent in a red Netflix envelope. There is a perforated piece of paper with your address which covers one side of the envelope (it covers the side with Netflix's return shipping address--the envelope you receive is the envelope you ship it back in.)
The movie itself is in a sleeve inside the envelope. The sleeve contains the movie, a description, and a barcode. A correctly inserted movie will only have the barcode revealed through a little window, presumably to make processing easier at the shipment facilities.
See:
http://blogs.courierpostonline.com/mojodojo/files/2009/03/netflix-1.jpg
http://i.zdnet.com/blogs/netflixenvelope.jpg
So anonymity in this case was simply a type of encryption. Making information less obvious doesn't mean the information is lost. True anonymity can only be achieved by purging information, and hence only no information is truly anonymous. Or is it?
Cracking Google's anonymity code is another related topic. It is good that these companies' anonymity cards are being challenged.
Keep in mind that strict liability only applies to tort cases and that the plaintiff must demonstrate that the tort actually happened. That is not the case in this case. Yeah, I said it that way.
There are a lot of comments about whether she could/should have been outed by the data set or not, but nobody answered the question as to whether Netflix really should have known that the data they published was personally identifiable. The answer is an unmitigated 'yes'. People have been working on privacy in databases issues for a very long time (US census in the late 1800s, anyone?) but with the EU's strict privacy regulations, it's 'recently' become an important area of computer science research (see the "Privacy in Statistical Databases" conferences like PSD 2008).
The fact is though that there are several standard techniques that are well understood, easy to implement, and would have let them release the data without releasing any information. Probably the best fit would be to just lie about the zip codes -- take the data and make sure that there are at least n people in each "zip code", merging adjacent codes into one until you get enough to protect the innocent. There's also a lot of research about generating fake records that maintain similar statistical properties to the original data set. Both techniques do result in some loss of information, but remember that's a good thing because it helps protect the privacy. Besides, if for example I live in Green Bay, I really fail to see how much additional information can be gained by associating my records with the individual zip code for Green Bay instead of grouping everyone together into a single zip for the entire city.
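A minimal sketch of the merging idea, assuming numerically adjacent ZIP codes are also geographically adjacent (only roughly true in practice). The function name, ZIP codes, and counts are illustrative, not real data and not anything Netflix used.

```python
def merge_zip_codes(counts, min_people):
    """Greedily merge numerically adjacent ZIP codes until each group holds
    at least min_people records. Returns {zip_code: group_label}."""
    groups = []  # each entry is (list_of_zip_codes, total_people)
    current, total = [], 0
    for zip_code in sorted(counts):
        current.append(zip_code)
        total += counts[zip_code]
        if total >= min_people:
            groups.append((current, total))
            current, total = [], 0
    # Leftover codes that never reached the threshold join the last group.
    # If no group ever reached it, everything ends up in one undersized group.
    if current:
        if groups:
            last_zips, last_total = groups.pop()
            groups.append((last_zips + current, last_total + total))
        else:
            groups.append((current, total))
    mapping = {}
    for zips, _ in groups:
        label = f"{zips[0]}-{zips[-1]}" if len(zips) > 1 else zips[0]
        for zip_code in zips:
            mapping[zip_code] = label
    return mapping


# Illustrative counts of subscribers per ZIP code.
counts = {"54301": 3, "54302": 4, "54303": 12, "54304": 1}
print(merge_zip_codes(counts, min_people=5))
```

Each record would then be published with its group label instead of its real ZIP code, so no published "ZIP code" covers fewer than `min_people` subscribers as long as the whole data set has at least that many.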
Netflix sends movies in a Tyvek sleeve with a label with the title, plot synopsis, and a few other details. This, in turn, goes in a paper envelope that hides everything on the label, except a barcode. I don't know what is encoded in the barcode; if I had to guess, it's a unique identifier Netflix uses for inventory purposes. Without a way to tie that to a movie title, the only way someone's going to know what you're ordering from Netflix would be if someone intercepted your mail and pulled out the movie.
20 January 2017: the End of an Error.
Thank you for the links. Like I said, I do not belong to Netflix so I did not know for sure how the packaging was done.
Most repliers are unsympathetic to this complaint, but if this dataset was hooked up to an online tool which quickly did the look-up it would be a major issue.
I've done enough work for companies in my years to know that zipcodes can be used to uniquely identify individuals. Since there are still parts of this country in which a person may own a very large piece of land and Zipcodes use the +4 to determine specific blocks within a zip code range, then all one needs is a name or the other info mentioned above to uniquely identify a person. This has been known by banks and the post office for as long as the +4 has been around. Banks have strict guidelines around uniquely identified people and what they must do if they are identified when dealing with offers of credit.
Netflix works with the post office for mass mailing, they would be aware of the ways to uniquely identify people.
Life takes interesting turns, but the most interest is when you're off the beaten path.
10-14-1986, male, 64111.
Go!
Netflix did not give out zip code, age, or gender. That was being offered in the second phase of the contest.
*****
The suit is also asking the court to stop Netflix from launching its promised second contest to improve the recommendations — this time giving out user data that includes ZIP codes, ages and gender, along with movie ratings and ID numbers substituted for user names
****
The actual data given out, which the lawsuit was filed over, is described as two databases, with only the following description of what was in them.
****
In order to get a better movie recommendation algorithm, the online DVD rental company gave more than 50,000 Netflix Prize contestants two massive datasets. The first included 100 million movie ratings, along with the date of the rating, a unique ID number for the subscriber, and the movie info. Based on this data from 480,000 customers, contestants had to come up with a recommendation algorithm that could predict 10 percent better than Netflix how those same subscribers rated other movies.
****
OK, I'm confused. I do not see anything in the first description that would identify a person. The ZIP code, age, and gender were NOT given out. YET.
They talk about two databases, but only describe the contents of one. They talk about a future release of data, which is age, gender, and ZIP. But it hasn't been given out.
What type of information was in the second database?
What specifically is the data the original lawsuit was filed about?
I know, stupid questions. But there is something missing here, or I'm just stupid, or blind (this I will admit to).
I agree this was poor and invasive on Netflix's part, but how is the suit known as "Doe v. Netflix" about "outing a lesbian"? Did Netflix release information about a bunch of gay movie rentals? Releasing private info is a fail; I just don't see the connection with the lesbian woman.
If one of my co-workers (of either sex) told me they loved me, that would make working with them at least a little uncomfortable.
Getting in trouble just for the fact that you're attracted to people of whichever gender is wrong.
Getting in trouble for making a co-worker uncomfortable by telling them you love them is a legitimate thing. It's totally inappropriate.
I will admit that in the past I did once have a crush on a male co-worker, but I would never have let him know. People have to behave professionally in the workplace so that everyone can be comfortable working there.
Netflix automatically keeps track of your "favorite genres". There is a top level genre "Gay & Lesbian", not to mention a pack of sub-genres with similar names. If you found out your mother's or sister's Netflix account had those at the top of the list, wouldn't you at least wonder? Imagine if those were your "favorite genres" and your worst enemy/boss/husband/wife saw that. Wouldn't they wonder about you? While they may not know you're gay, they would wonder...
I'm not sure whether the litigators have read this particular section of the Netflix prize rules:
So yes, you can match a set of reviews with someone else, but how will you know that it's really a person and not a random coincidence? 0.5 million review traces give plenty of opportunity for a false positive match. Netflix learned from AOL's data release disaster, which resulted in a few people getting fired.
The movies are ordered online, on a plain old http: rather than https: page. So no, not a sealed envelope.
Quis metamoderunt ipses metamoderatores?