Slashdot Mirror


AOL, Netflix and the End of Open Research

An anonymous reader writes "In 2006, heads rolled at AOL after the company released anonymized logs of user searches. With last week's announcement that researchers had been able to learn the identities of users in the scrubbed Netflix dataset, could the days of companies sharing data with academic researchers be numbered? Shortly after the AOL incident, Google's Eric Schmidt called the data release 'a terrible thing,' and assured the public that 'this kind of thing could not happen at Google.' Will any high-tech company ever take this kind of chance again? If not, how will this impact research and the development of future technologies that could have come from the study of real data?"

3 of 85 comments

  1. Opt-in by chiasmus1 · Score: 4, Interesting

    There are people who do not really care if their searches are added to the collection that is released. If Google had an opt-in option for data that they were going to release to academic researchers, I would opt in. I imagine that there are other people who do not care who is looking at their searches. Companies that want to release search data might also consider giving users the option to see what information gets released.

  2. Responsibility and rewards. by palegray.net · Score: 2, Interesting

    If companies don't do a thorough enough job of sanitizing statistical data before releasing it, they have to be prepared to deal with the consequences. I'm all for maintaining research access to large volumes of real-world data, but it does need to be obtained through responsible channels.

    All that said, I think an interesting question is: How can we build systems that appropriately compensate companies for access to their data, with strict enforcement of measures designed to thwart misuse of the data? Posters above have given links to research that provides frameworks for making sure data is safe for release; how would a good wrapper for such a system work to incorporate rewards for companies who participate?

  3. Medical records? by CheeseTroll · Score: 3, Interesting

    This puts the idea of analyzing "anonymous" electronic medical records in an interesting light. Even without a name, SSN, or other ID that explicitly links a record to a specific person, could researchers cross-reference the data with other databases well enough to identify people via patterns in their health record? I'm guessing yes.

    For the record, it's not my intent to troll, but I do think it's something that future researchers will need to take into account to ensure people's privacy.
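
    To make the cross-referencing concern concrete, here is a minimal sketch of a linkage check in Python. Everything in it is an assumption for illustration: the records are invented, the field names (zip, birth_year, sex) are hypothetical quasi-identifiers, and the "public" list stands in for any outside database that carries names. It is a toy, not a description of how any real study was done.

```python
# Toy linkage check: every record below is invented for illustration.
from collections import defaultdict

# Hypothetical quasi-identifiers shared by both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")

# "Anonymous" health records: no name or SSN, but quasi-identifiers remain.
health_records = [
    {"zip": "02139", "birth_year": 1961, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "94110", "birth_year": 1975, "sex": "M", "diagnosis": "migraine"},
    {"zip": "94110", "birth_year": 1975, "sex": "M", "diagnosis": "hypertension"},
]

# A second, named dataset (hypothetical) that shares the same fields.
public_records = [
    {"name": "Alice Example", "zip": "02139", "birth_year": 1961, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1975, "sex": "M"},
    {"name": "Carol Example", "zip": "94110", "birth_year": 1975, "sex": "M"},
]


def key(record):
    """Project a record onto its quasi-identifier fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(anonymous, public):
    """Map each name to a health record when the match is unambiguous.

    A match counts only if exactly one named person and exactly one
    anonymous record share the same quasi-identifier values.
    """
    anon_by_key = defaultdict(list)
    for record in anonymous:
        anon_by_key[key(record)].append(record)

    public_by_key = defaultdict(list)
    for record in public:
        public_by_key[key(record)].append(record)

    matches = {}
    for k, people in public_by_key.items():
        candidates = anon_by_key.get(k, [])
        if len(people) == 1 and len(candidates) == 1:
            matches[people[0]["name"]] = candidates[0]
    return matches


reidentified = link(health_records, public_records)
for name, record in reidentified.items():
    print(name, "->", record["diagnosis"])
```

    In this toy data the first two people are linked to a single record each, while the third is not, because two health records share the same zip, birth year, and sex. That is the basic intuition behind grouping or coarsening fields before release: the more records share a combination, the harder the join becomes.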

    --
    A post a day keeps productivity at bay.