AOL Releases Search Logs of 657,427 Users
An anonymous reader writes "AOL has released the search logs of over 650,000 users for research purposes. This looks like it may become a public relations disaster for AOL, as well as a privacy nightmare for the users involved as Michael Arrington of TechCrunch notes: "AOL has released very private data about its users without their permission. While the AOL username has been changed to a random ID number, the ability to analyze all searches by a single user will often lead people to easily determine who the user is, and what they are up to. The data includes personal names, addresses, social security numbers and everything else someone might type into a search box." This is also being covered on The Paradigm Shift and Oh My News."
fantomas adds "Looks like they've just taken it down but it's still available on The Pirate Bay; not sure why but some of the academic researchers are going crazy musing on the ethical aspects of letting the world know who's searching for how to kill their wives ..."
Update: 08/07 21:32 GMT by T : amromousa writes "AOL is now apologizing for the release ..., calling it a "screw-up," which they're upset and angry about."
AOL users! ;)
You insensitive clod! The end of the world is a geographic location! Not everyone has been sold on the junk science of the round earth!
Where were you when the voynix came?
I hope that Google will now mark aol.com as an unsafe website to visit.
A friend of mine downloaded this dataset.
A teacher's credit union employee was searching for sexy underwear, how best to conduct a relationship with a co-worker, and how to have sex in a pickup.
Just before that, she was searching for cars. And appears to have cancer as well, or lives with someone with cancer. Maybe it's her sick husband.
I wonder if that demonstrates why someone wouldn't want their Google searches or AOL info to make it into the public realm. AOL is obviously a bastion of consumer rights.
Saskboy's blog is good. 9 out of 10 dentists agree.
I have. I want to know if it's out there anywhere on the public internet. Same reason I search for my phone number, full name, etc.
657,437 searches for "how to cancel AOL"
I've read of someone who tried it only to find that a group/department at his college had his SSN posted.
Of course, a partial SSN with a wildcard match might be a better idea.
Ahh...great...maybe I can expect a call from authorities if Google ever caves. I got one of those stupid ICQ Child Porn spams one day and started googling for reporting agencies. Not that I think it would do much good, but hey...I would rather have reported it and have it do nothing than to not have reported it and have no chance of it doing anything.
In Soviet....err...In America the government watches you! Ahh...how the times have changed...Working on losing the 1st Amendment and 4th Amendment in 8 years. As Thomas Jefferson said "The beauty of the 2nd Amendment is that you don't need it until the government tries to take it away"... I recently had a picture taken of my baby girl at the National Archives with those 3 terribly important documents, honestly wondering if they will mean anything or even exist by the time she is old enough to show her kids the picture.
But hey...may just be me being a pessimist...so maybe the spooks won't get up in arms datamining slashdot and seeing my TJ quote and come interrogate me for being a terrorist...just in case...
Last post!
The only change I can believe in is what I find in my couch cushions.
But I thought AOL was the Internet? Now I'm confused..
The file is available here:
http://www.gregsadetsky.com/aol-data/
There are 14 mirrors listed there. They have all been added after this first mirror went live less than 20 hours ago.
I have already transferred 863 GB of data in that short period of time.
Loopsh of fury.
so I don't really see the privacy issue
Then you're an idiot. The info itself can contain private info, and being linked by ID makes it much easier. Imagine this set of searches:
Susan Smith phone number
britney spears
Smallville high school
shoe store near smallville
Smallville abortion clinic
dr. joe jones
6 searches and already we can assume the user lives in Smallville, is young, knows Susan Smith, and is looking for information on abortions.
Now, if instead of 6, we had every search for a month or two. How much more information about this "anonymous" user do you think we could find?
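To make the point concrete, here's a minimal sketch of how little work the linking takes. It assumes each log line is "ID, tab, query" (my assumption for illustration, not a statement about the exact file layout), and the sample rows and the name group_by_user are made up:

```python
from collections import defaultdict

def group_by_user(lines):
    """Group search queries by the anonymous ID in the first column."""
    profiles = defaultdict(list)
    for line in lines:
        # Assumed layout: "<id><TAB><query>"; any extra columns are ignored.
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2 or not parts[0].strip():
            continue  # skip blank or malformed lines
        profiles[parts[0].strip()].append(parts[1].strip())
    return dict(profiles)

# Made-up sample rows, not taken from the released data.
sample = [
    "1001\tSusan Smith phone number",
    "1001\tbritney spears",
    "1001\tSmallville high school",
    "2002\tchair repairs seattle",
    "1001\tSmallville abortion clinic",
]

profiles = group_by_user(sample)
for user_id, queries in profiles.items():
    print(user_id, len(queries), queries)
```

One dictionary and a loop, and every "anonymous" number turns into a per-person search history.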
FTA:
Mmmmmm. . . Steak and cheese. . .
What?
13455621 how to fucking bury someone
13455621 funky gibbon
13455621 chair repairs seattle
13455621 addams family
13455621 OSS cancer
13455621 FUD spreading
Also, a couple weeks ago I booked a room at a hostel over the internet, and apparently I mistyped my credit card information, so they asked me if I could send it to them again over email. You know, I just said "No, I'll call you."
I send my credit card numbers over email all the time. But I only use "throw-away" numbers that are generated on the fly and can only be charged by a single vendor up to a specific amount (pre-set by myself). Most of the big card issuers offer a similar service for free. Last I heard, MBNA, which has offered it for at least 5-6 years now, has not had a single instance of successful fraud involving such throw-away numbers. Never mind free, they ought to be paying me to use the service.
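For anyone wondering why a leaked throw-away number is nearly worthless, here's a rough sketch of the two rules described above: one vendor only, and a cap I set myself. This is a toy model under my own assumptions, not how MBNA or any issuer actually implements it; ThrowawayCard and authorize are hypothetical names:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ThrowawayCard:
    """Hypothetical model of a single-vendor, capped virtual card number."""
    limit_cents: int
    merchant: Optional[str] = None  # locked to the first vendor that charges it
    spent_cents: int = 0

    def authorize(self, merchant: str, amount_cents: int) -> bool:
        """Approve a charge only from the locked vendor and within the cap."""
        if amount_cents <= 0:
            return False
        if self.merchant is not None and merchant != self.merchant:
            return False  # a different vendor is trying to use the number
        if self.spent_cents + amount_cents > self.limit_cents:
            return False  # charge would exceed the pre-set amount
        self.merchant = merchant
        self.spent_cents += amount_cents
        return True

card = ThrowawayCard(limit_cents=5000)
hostel_ok = card.authorize("hostel", 4000)     # first vendor locks the card
thief_ok = card.authorize("someone else", 500)  # different vendor
over_ok = card.authorize("hostel", 2000)        # would pass the cap
print(hostel_ok, thief_ok, over_ok)
```

Anyone who sniffs the number out of an email can neither charge it as a different vendor nor push it past the cap.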
Back in January, related to the story on how the DoJ demands and gets ISP data, AOL had said that "We did not comply with the request made in the subpoena," spokesman Andrew Weinstein said. "Instead, we gave the Department of Justice a list of aggregate anonymous search terms that did not include results or any personally identifiable information."
AOL, you need to rethink that phrase "personally identifiable", because it doesn't seem to mean what you think it means. You're hiding behind one technical definition of PII, without concern about whether or not the results actually contain PII. If you're releasing results with personally identifying information, then you cannot say you're not releasing PII. In January I wrote: "I question this assumption by Yahoo, AOL, etc. that search terms, by themselves, have no privacy considerations because they've been separated from personal info. What if the search itself contains personal information? Are the search companies deleting the timestamps and randomizing the order of the search terms themselves? Because otherwise I could see personal info showing up." Obviously, half a year later, they still think that replacing a name with a number takes away the PII. They need to have a talk with, say, the Census Bureau, about why it will withhold data about *groups* of businesses in a region. Grouped data can easily become PII if you can tease out characteristics. AOL didn't even group the data!
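Here's a small sketch of what I mean by "replacing a name with a number" not removing PII. Everything in it is an assumption for illustration: the function names, the made-up log, and the deliberately invalid 000-00-0000 placeholder. It swaps usernames for sequential IDs, then checks whether the query text still carries SSN-shaped strings:

```python
import re

# Matches SSN-shaped text such as 123-45-6789 (shape only, no validity check).
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def pseudonymize(log):
    """Replace each username with a sequential ID, leaving queries untouched."""
    ids = {}
    released = []
    for username, query in log:
        user_id = ids.setdefault(username, len(ids) + 1)
        released.append((user_id, query))
    return released

def queries_with_ssn_like_text(records):
    """Return the records whose query text still contains an SSN-shaped string."""
    return [(uid, q) for uid, q in records if SSN_LIKE.search(q)]

# Made-up log; 000-00-0000 is an invalid placeholder number.
log = [
    ("screenname_a", "john doe 000-00-0000"),
    ("screenname_a", "chair repairs seattle"),
    ("screenname_b", "funky gibbon"),
]

released = pseudonymize(log)
leaks = queries_with_ssn_like_text(released)
print(released)
print(leaks)
```

The usernames are gone from the output, but the personal data typed into the search box rides along untouched, and both of the first user's queries still share one ID.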
As always, relevant quotes from the best.essay.evar on why privacy is a fundamental human right: "If information that is actually about someone else is wrongly applied to us, if wrong facts make it appear that we've done things we haven't, if perfectly innocent behavior is misinterpreted as suspicious because authorities don't know our reasons or our circumstances, we will be at risk of finding ourselves in trouble in a society where everyone is regarded as a suspect. By the time we clear our names and establish our innocence, we may have suffered irreparable financial or social harm..."
"...agents of the state in Canada cannot order Canada Post to photocopy the address on every envelope we send, nor can they order bookstores to keep a record of every book we buy, let alone of every page of every magazine we leaf through. There is no reason why they should be able to exercise such powers with regard to every e-mail someone sends or every Web site he or she visits."
"I do not see any reason why e-mails should be subject to a lower standard of privacy protection than letters or telephone calls. And I do not see why Internet browsing should be subject to a lower standard of protection than book purchasing or researching in a reference library. Canadians should not be subject to greater state monitoring or scrutiny just because they choose to use new communication technologies."
- Go to http://www.ssa.gov/employer/statewebcali.htm and pick an SSN prefix for a particular state (say, CA, which is from 545 to 573).
- Go to Google, click Advanced Search, and in "With all of the words:" enter "SSN".
- In "Return web pages containing numbers between" enter 545000000 "and" 574000000.
- Click Search and stare in horror at all the student listings, bankruptcy filings, etc. posted with names, SSNs, addresses, etc.
I'm sure I'm not the first to think of this, but if you abuse any of this information, the Erinyes will come after you!