AOL Releases Search Logs of 657,427 Users
An anonymous reader writes "AOL has released the search logs of over 650,000 users for research purposes. This looks like it may become a public relations disaster for AOL, as well as a privacy nightmare for the users involved as Michael Arrington of TechCrunch notes: "AOL has released very private data about its users without their permission. While the AOL username has been changed to a random ID number, the ability to analyze all searches by a single user will often lead people to easily determine who the user is, and what they are up to. The data includes personal names, addresses, social security numbers and everything else someone might type into a search box." This is also being covered on The Paradigm Shift and Oh My News."
fantomas adds "Looks like they've just taken it down, but it's still available on The Pirate Bay; not sure why, but some of the academic researchers are going crazy musing on the ethical aspects of letting the world know who's searching for how to kill their wives ..."
Update: 08/07 21:32 GMT by T : amromousa writes "AOL is now apologizing for the release ..., calling it a "screw-up," which they're upset and angry about."
Finally, for all the support nightmares I endured over the years from AOL users I know (and there are many!), a misstep that may offend and bother them as much as supporting AOL has bothered me for the last bazillion years. Go away AOL! (But, leave a few of your coasters at the store counters, those did come in kind of handy.)
So, all of that aside (the court of public opinion stipulates AOL as stupid and insensitive), how equally egregious and offensive is it that others would propagate and perpetuate this misguided release of data? Any mirrors still carrying this information (and they are there) serve few purposes by continuing to provide access, and none are defensible: either they are happy and willing to allow potentially embarrassing or damaging data to continue to be distributed, or they are sticking it to AOL when AOL has already fallen on its own sword -- enough is enough. It's not okay.
(So, how many wives are either not going to be home tonight, or are going to fix hubby his very favorite dish?)
personal names, addresses, social security numbers and everything else someone might type into a search box.
Who in their right mind would type their social security number in a search box, in plain text??? I mean, really???
I got nothin'
Way to jump to conclusions. How do you know that they weren't working on a screenplay, or simply trying to find a phrase they heard mentioned somewhere?
If "End of the world" was searched for, how do you know if they are looking for the lyrics to an REM song, or trying to build a WMD?
Since most people search for their own name, this really isn't very private. I imagine law enforcement may use this to track AOL users. I wonder what the legal implications are...
You insensitive clod! The end of the world is a geographic location! Not everyone has been sold on the junk science of the round earth!
Where were you when the voynix came?
I hope that Google will now mark aol.com as an unsafe website to visit.
Company calls data posting a mistake.
Hmm, I wonder if this "sorry" will be enough
A friend of mine downloaded this dataset.
A teacher's credit union employee was searching for sexy underwear, how best to conduct a relationship with a co-worker, and how to have sex in a pickup.
Just before that, she was searching for cars. And appears to have cancer as well, or lives with someone with cancer. Maybe it's her sick husband.
I wonder if that demonstrates why someone wouldn't want their Google searches or AOL info to make it into the public realm. AOL is obviously a bastion of consumer rights.
Saskboy's blog is good. 9 out of 10 dentists agree.
It occurs to me that it would be pretty difficult to trace back to the user who is doing the searching by knowing what they are searching for. Sure, I have Googled myself and have entered my address into Google Maps, MapQuest, etc. But I have Googled about a hundred other people and thousands of addresses. It would be an interesting game of "what do all these things have in common" for someone to triangulate all this information back to who I am. Granted, I have never done a search on my or anyone else's Social Security Number; that's just asking for it.
This is the last nail in the coffin for AOL, I would say. This is a horrible invasion of privacy for people. Many people, myself included, have probably searched for our own names, addresses, cities, credit card numbers, etc. I really hope that an attorney somewhere sues AOL into oblivion over this.
Some interesting tidbits:
17556639 how to kill your wife 17556639 how to kill your wife
17556639 wife killer 17556639 how to kill a wife
17556639 poop 17556639 dead people
17556639 pictures of dead people 17556639 killed people
17556639 dead pictures 17556639 dead pictures
17556639 dead pictures 17556639 murder photo
17556639 steak and cheese
17556639 photo of death 17556639 photo of death
17556639 death 17556639 dead people photos
17556639 photo of dead people 17556639 www.murderdpeople.com
17556639 decapatated photos 17556639 decapatated photos
17556639 car crashes3 17556639 car crashes3
160689 light brown colored semen 3/2/2006 16:30 9 http://experts.about.com/
6497 dog eat monkey 5/22/2006 5:39
6497 dog eat monkey 5/22/2006 5:39
6497 capuchin monkey dog 5/22/2006 5:39
6497 dog eating monkey 5/22/2006 5:40
6497 dog eating monkey 5/22/2006 5:40
6497 dog eating monkey 5/22/2006 5:40
6497 dog eats monkey 5/22/2006 5:40
6497 dog eats monkey 5/22/2006 5:41
6497 eating capuchin monkey 5/22/2006 5:41
6497 eating capuchin monkey 5/22/2006 5:41
6497 eating capuchin monkey 5/22/2006 5:41
6497 kill capuchin monkey 5/22/2006 5:41
6497 killing capuchin monkey 5/22/2006 5:41
6497 slaughter capuchin monkey 5/22/2006 5:42
6497 feeding capuchin monkey 5/22/2006 5:42
6497 feeding capuchin monkey 5/22/2006 5:42
6497 eyes capuchin monkey 5/22/2006 5:42
6497 tail capuchin monkey 5/22/2006 5:42
6497 tail capuchin monkey 5/22/2006 5:43
6497 tail capuchin monkey 5/22/2006 5:43
6497 beach stud speedo 5/23/2006 1:24
6497 beach martin ricky 5/23/2006 1:24
6497 beach martin ricky 5/23/2006 1:25
6497 beach martin ricky 5/23/2006 1:25
6497 beach martin ricky 5/23/2006 1:25
6497 beach martin ricky 5/23/2006 1:25
6497 beach martin ricky 5/23/2006 1:27
6497 beach martin ricky 5/23/2006 1:27
6497 beach martin ricky 5/23/2006 1:28
6497 beach martin ricky 5/23/2006 1:28
6497 beach martin ricky 5/23/2006 1:28
6497 beach martin ricky 5/23/2006 1:28
6497 beach martin ricky 5/23/2006 1:29
6497 - 5/23/2006 1:55
6497 - 5/23/2006 1:55
6497 recent 5/23/2006 1:55
6497 speedo triathlete 5/23/2006 1:55
3302 children who have died from moms postpartum depression
3302 children who have died from moms postpartum depression
3302 rotovirus 2006-03-24 19:55:12
3302 statistics on infancide
3302 statistics on infantcide
3302 statistics on infanticie
3302 statistics on infanticide postpartum depression
3302 statistics on infanticide postpartum depression
3302 statistics on infanticide postpartum depression
3302 pictires of tom cruise and his wife
3302 people magazines pictures of tom cruise and katie holmes
2652898 my space.com (about 100 times)
2652898 different ways to jerk of
2652898 how to not ejaculate so early
2652898 my penis has a big erection
2652898 free videos of big dicks
Thanks to FARK.com for the snippets.
657,437 searches for "how to cancel AOL"
Ahh...great...maybe I can expect a call from authorities if Google ever caves. I got one of those stupid ICQ Child Porn spams one day and started googling for reporting agencies. Not that I think it would do much good, but hey...I would rather have reported it and have it do nothing than to not have reported it and have no chance of it doing anything.
In Soviet....err...In America the government watches you! Ahh...how the times have changed...Working on losing the 1st Amendment and 4th Amendment in 8 years. As Thomas Jefferson said "The beauty of the 2nd Amendment is that you don't need it until the government tries to take it away"... I recently had a picture taken of my baby girl at the National Archives with those 3 terribly important documents, honestly wondering if they will mean anything or even exist by the time she is old enough to show her kids the picture.
But hey...may just be me being a pessimist...so maybe the spooks won't get up in arms datamining slashdot and seeing my TJ quote and come interrogate me for being a terrorist...just in case...
Last post!
The only change I can believe in is what I find in my couch cushions.
No privacy issues? Just look at some of the data that you can link to a specific user ID over that 3 month period. It is not too hard to figure out who it is. As TFA points out, many people type in their own name to search engines to see if they show up anywhere on the internet. Tied with birth dates, horoscope searches, SS #'s etc, it is not too hard to figure out who a particular user is.
But wait, you were being sarcastic right?
"To strive, to seek, to find, and not to yield." - Tennyson
The file is available here:
http://www.gregsadetsky.com/aol-data/
There are 14 mirrors listed there. They have all been added after this first mirror went live less than 20 hours ago.
I have already transferred 863Gb of data in that short period of time.
Loopsh of fury.
so I don't really see the privacy issue
Then you're an idiot. The info itself can contain private info, and being linked by ID makes it much easier. Imagine this set of searches:
Susan Smith phone number
britney spears
Smallville high school
shoe store near smallville
Smallville abortion clinic
dr. joe jones
6 searches and already we can assume the user lives in Smallville, is young, knows Susan Smith, and is looking for information on abortions.
Now, if instead of 6, we had every search for a month or two. How much more information about this "anonymous" user do you think we could find?
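To make the linkage point concrete, here is a minimal Python sketch. It assumes each log row is an (ID, query) pair; the IDs 1001 and 2002 and the name group_by_user are made up for illustration, and the queries are the hypothetical Smallville ones above, not rows from the released file.

```python
from collections import defaultdict


def group_by_user(rows):
    """Collect every query under the pseudonymous ID that issued it."""
    profiles = defaultdict(list)
    for anon_id, query in rows:
        profiles[anon_id].append(query)
    return dict(profiles)


# Hypothetical rows: the IDs are invented, the queries are the made-up ones above.
rows = [
    ("1001", "susan smith phone number"),
    ("1001", "britney spears"),
    ("1001", "smallville high school"),
    ("1001", "shoe store near smallville"),
    ("1001", "smallville abortion clinic"),
    ("1001", "dr. joe jones"),
    ("2002", "weather"),
]

profiles = group_by_user(rows)
# Count how often the town name recurs within one user's history.
town_hits = sum("smallville" in q for q in profiles["1001"])
print(len(profiles["1001"]), "queries,", town_hits, "mention smallville")
```

No single query says much; it is the shared ID that turns six unrelated strings into one profile.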
FTA:
Mmmmmm. . . Steak and cheese. . .
What?
13455621 how to fucking bury someone
13455621 funky gibbon
13455621 chair repairs seattle
13455621 addams family
13455621 OSS cancer
13455621 FUD spreading
I hate to break it to you, but there are a ton of stories out there dealing with morbid topics. Either seriously (e.g., horror stories, a la Lovecraft or Edgar Allan Poe) or as a sort of dark/macabre humour.
And especially pay attention to the last alternative: there are a lot of stories and sites that are just supposed to be obviously humorous, not actually to be a DIY guide to the subject in their title. E.g., I think there was a humorous site somewhere titled something like "how to pick up underage girls", or something to that effect, and it wasn't actually a paedophile's field guide. E.g., take sites like the Evil Overlord's List, which are just a parody of common movie cliches, not actually a guide to be followed by someone. (Unless they're writing a story involving a stereotypical Evil Overlord.)
So how do you know if that guy didn't google for the title of such a story? Or for some random phrase he remembered from one?
E.g., I remember reading an absurdist play by Eugène Ionesco about some murderer who tempted people to come see the colonel's photo, and then pushed them into some lake. What if I googled for that? Remember, I don't know the title of the play any more, so I can't just google for that. Not that it would make it any better, because the title IIRC was something about an unpaid assassin.
The whole thing didn't even make much sense, other than maybe as a metaphor for something or another. It's an absurdist play, so don't ask me for what it was a metaphor. It contained such gems as the everyman hero asking a police officer something to the effect of "and didn't you send cops to get him?" and getting an answer like "yeah, but they too wanted to see the colonel's photo." Nowhere does it say what colonel or what's special about that photo. I guess it wouldn't be absurdist if it did.
So if I tried googling for that play on the net, would you use your amazing deductive powers to conclude that I'm looking for a hitman willing to do some pro-bono work? Maybe to whack-off some colonel?
A polar bear is a cartesian bear after a coordinate transform.
You should be logging the ips downloading the file and leak that in a few days...
The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
Smallville abortion clinic? Did this person get knocked up by Clark Kent?
Thou shalt not begin a subject line or post with the word "Umm".
I'm absolutely stunned by the number of people who are on one hand saying "This is evil! We must protect privacy!" and yet at the same time have downloaded the list and commented on the information therein.
Back in January, related to the story on how the DoJ demands and gets ISP data, AOL had said that "We did not comply with the request made in the subpoena," spokesman Andrew Weinstein said. "Instead, we gave the Department of Justice a list of aggregate anonymous search terms that did not include results or any personally identifiable information."
AOL, you need to rethink that phrase "personally identifiable", because it doesn't seem to mean what you think it means. You're hiding behind one technical definition of PII, without concern about whether or not the results actually have PII. If you're releasing results with personally identifying information, then you cannot say you're not releasing PII. In January I'd written "I question this assumption by Yahoo, AOL, etc. that search terms, by themselves, have no privacy considerations because they've been separated from personal info. What if the search itself contains personal information? Are the search companies deleting the timestamps and randomizing the order of the search terms themselves? Because otherwise I could see personal info showing up." Obviously, half a year later, they still think that replacing a name with a number takes away the PII. They need to have a talk with, say, the Census Bureau, about why the bureau will withhold data about *groups* of businesses in a region. Grouped data can easily become PII data if you can tease out characteristics. AOL didn't even group the data!
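A rough sketch of the grouping idea, in Python: publish a query only if at least k distinct users issued it, and drop everything rarer. The threshold k=5, the function name and the sample rows are my own assumptions for illustration, not the Census rule or anything AOL did, and a count threshold alone is not a complete anonymization scheme.

```python
from collections import defaultdict


def suppress_small_cells(rows, k=5):
    """Return {query: distinct-user count}, keeping only queries with >= k users."""
    users_per_query = defaultdict(set)
    for anon_id, query in rows:
        users_per_query[query].add(anon_id)
    return {q: len(u) for q, u in users_per_query.items() if len(u) >= k}


# Invented sample: five users share a common query, one user types an address.
rows = [(str(i), "weather") for i in range(5)]
rows.append(("0", "jane doe 12 elm street"))

released = suppress_small_cells(rows, k=5)
print(released)
```

Note that the per-user IDs never leave the function; only aggregate counts do, which is the opposite of what was published.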
As always, relevant quotes from the best.essay.evar on why privacy is a fundamental human right: "If information that is actually about someone else is wrongly applied to us, if wrong facts make it appear that we've done things we haven't, if perfectly innocent behavior is misinterpreted as suspicious because authorities don't know our reasons or our circumstances, we will be at risk of finding ourselves in trouble in a society where everyone is regarded as a suspect. By the time we clear our names and establish our innocence, we may have suffered irreparable financial or social harm..."
"...agents of the state in Canada cannot order Canada Post to photocopy the address on every envelope we send, nor can they order bookstores to keep a record of every book we buy, let alone of every page of every magazine we leaf through. There is no reason why they should be able to exercise such powers with regard to every e-mail someone sends or every Web site he or she visits."
"I do not see any reason why e-mails should be subject to a lower standard of privacy protection than letters or telephone calls. And I do not see why Internet browsing should be subject to a lower standard of protection than book purchasing or researching in a reference library. Canadians should not be subject to greater state monitoring or scrutiny just because they choose to use new communication technologies."
This may surprise you, but guess how many searches there are for slashdot, or by people using slashdot.
Take a look here for the building archive.
Ok fess up.. WHO on here is using slashdot that is an AOL lover. For a long time we have poked jokes at AOLers but it seems they are in our midst.
Nice to see you care so much about users' privacy that you're willing to distribute half a million users' private data.
Oh, but they're AOLers, so they don't have any rights. Rights only apply to the technologically literate, I suppose. Never mind then.
I use google's package tracking number all the time -- seems like some other people enjoy this, too.
... and so on ...
user-ct-test-collection-01.txt:11218337 http to track the status of this shipment on line please use the following;http www.fedex.com tracking action track&tracknumbers 604041010003308 2006-04-28 18:31:15
This person lives in Stamford, CT and ordered a "SL150T-12 Battery" for Home Delivery (5.0 lbs.) from California. Their barcode got messed up in transit. Left at front door. Signature Service not requested.
user-ct-test-collection-01.txt:2433634 tracking 9102013196683232299662 2006-03-19 17:33:48
Your item was delivered at 8:54 am on March 24, 2006 in CROWLEY, LA 70526.
user-ct-test-collection-01.txt:5736530 ups tracking number 1z05r57w0299803522 2006-04-12 04:01:29
Delivered on: 04/12/2006 9:59 A.M. Delivered to: SOUTH BELOIT, IL, US Service Type: 2ND DAY AIR
user-ct-test-collection-01.txt:11989465 ups tracking 1z5628500342774976 2006-05-31 17:14:22
Delivered on: 05/31/2006 6:12 P.M. Delivered to: FORT WAYNE, IN, US Service Type: GROUND
user-ct-test-collection-02.txt:2103248 tracking 91025562344468252800 2006-03-02 02:11:13
There is no record of this item.
user-ct-test-collection-02.txt:2371993 tracking 1z7e49v20341755740 2006-05-08 12:22:41
Delivered on: 05/08/2006 10:25 A.M. Delivered to: BOTHELL, WA, US Service Type: GROUND
user-ct-test-collection-02.txt:2749649 usps tracking 9121010521297356081254 2006-04-04 17:11:49
Info has been stored off-line, but USPS will send it to your email
user-ct-test-collection-02.txt:5847446 www.ups.com and enter the tracking number 1z00v4270380899979 2006-03-18 16:53:15
Delivered on: 03/20/2006 2:56 P.M. Delivered to: TEMPLE CITY, CA, US Service Type: GROUND
There were about 120 searches for UPS "1Z..." numbers. I didn't bother parsing for USPS & FedEx numbers, but there are plenty of those, too. I'm sure you'd be able to pull some names when the signature service is requested.
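For anyone wondering how a count like that could be made, a minimal sketch: the UPS numbers in the examples above are "1Z" followed by 16 letters or digits, so a regular expression can pick them out of the query text. The pattern, the function name and the sample queries are my assumptions (the sample numbers are invented), not the script actually used to get the figure of about 120.

```python
import re

# Assumed shape of a UPS number, taken from the examples above:
# "1Z" plus 16 alphanumeric characters, matched case-insensitively.
UPS_PATTERN = re.compile(r"\b1z[0-9a-z]{16}\b", re.IGNORECASE)


def count_ups_queries(queries):
    """Count queries that contain at least one UPS-style tracking number."""
    return sum(1 for q in queries if UPS_PATTERN.search(q))


# Invented sample queries, not rows from the released file.
sample = [
    "ups tracking number 1z999aa10123456784",
    "www.ups.com and enter the tracking number 1Z999AA10123456784",
    "tracking 9102000000000000000000",
    "dog food coupons",
]

print(count_ups_queries(sample))
```

The same pattern could just as easily be used to redact such numbers before a release, which is arguably the more useful application.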
HIV Crosses Species Barrier... into Muppets