Slashdot Mirror


Why Anonymized Data Isn't

Ars has a review of recent research, and a summary of the history, in the field of reidentification — identifying people from anonymized data. Paul Ohm's recent paper is an elaboration of what Ohm terms a central reality of data collection: "Data can either be useful or perfectly anonymous but never both." "...in 2000, [researcher Latanya Sweeney] showed that 87 percent of all Americans could be uniquely identified using only three bits of information: ZIP code, birthdate, and sex. ... For almost every person on earth, there is at least one fact about them stored in a computer database that an adversary could use to blackmail, discriminate against, harass, or steal the identity of him or her. I mean more than mere embarrassment or inconvenience; I mean legally cognizable harm. ... Reidentification science disrupts the privacy policy landscape by undermining the faith that we have placed in anonymization."
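A minimal Python sketch of the kind of uniqueness count that sits behind a figure like the 87 percent one: for each record, check whether its (ZIP code, birthdate, sex) combination is shared with anyone else. The records and the `unique_fraction` helper are hypothetical, invented for illustration; they are not taken from Sweeney's study or Ohm's paper.

```python
from collections import Counter

# Hypothetical toy records: (zip_code, birthdate, sex). Not real data.
records = [
    ("02138", "1965-07-31", "F"),
    ("02138", "1965-07-31", "F"),
    ("02138", "1971-02-14", "M"),
    ("02139", "1980-11-02", "F"),
    ("02139", "1980-11-02", "M"),
]


def unique_fraction(rows):
    """Return the fraction of rows whose quasi-identifier tuple appears exactly once."""
    if not rows:
        return 0.0
    counts = Counter(rows)
    unique = sum(1 for row in rows if counts[row] == 1)
    return unique / len(rows)


fraction = unique_fraction(records)
print(f"{fraction:.0%} of records are unique on (ZIP, birthdate, sex)")
```

A record that is unique on these three fields can be matched against any other dataset carrying the same fields plus a name, which is why removing names alone does not make the data anonymous.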

2 of 280 comments

  1. Re:Damn voyeurism is all it is by winkydink · Score: 0, Flamebait

    But the voyeurism slant isn't newsworthy. Oh wait. Neither is this.

    --

    "I'd rather be a lightning rod than a seismometer." -Ken Kesey

  2. This parallels encryption by DontLickJesus · Score: 0, Flamebait

    Anyone who understands encryption also understands that there is no such thing as perfect encryption. Anonymizing (sp?) data works using roughly the same methods as encryption, and there is no such thing as unbreakable encryption. We can only hope for "acceptable". I'd assume the most acceptable means of anonymizing data would be to let the user first choose what gets scrubbed out, then apply a sort of data "blacklist" compiled by experts. The real problem here is that companies selling this data have a vested interest in never getting it quite right.

    --
    Where genius and insanity become confused true wisdom is found