The 7 Ways That People Search the Web
SpaceAdmiral writes "After the recent release of AOL search logs, Paul Boutin used the site splunkd.com to analyse the logs. His analysis groups searchers into seven categories: The Pornhound, the Manhunter, the Shopper, the Obsessive, the Omnivore, the Newbie, and the Basketcase. My favorite example search is in the Basketcase category: 'i hurt when i think too much i love roadtrips i hate my weight i fear being alone for the rest of my life.'"
Another reason to believe AOL is biased.
The seven ways that people post on Slashdot.
The First Poster - Although this phenomenon has been addressed and has somewhat lessened, there are still echoes of "First Post". These people wait on a "Mysterious Future" story and post stupidities just to get in first.
The Fisher - These posters, rarely named Bobby, check-in with a kingly posts to generate replies and nothing more. Their posts, perhaps at first, seem to make sense, but on closer review contain mnay misstakes, intentionally designed to garner replies.
The old-timer - These posters, who hang around slashdot land, have forgotten to move on. They post just to show off their low slashdot id. This makes some drool, and others comment that low id does not mean more intelligent. However, they're all wrong anyway.
The reposter - Reposters wait for old stories to come up again and find modded-up comments from the old stories to repost. If this is the first time such a story is up, they post a bunch of old buzzwords that realign synergistic paradigm shifts.
The soap stander - Soap-Standers have what to say, and don't care where they say it, such as about why Bush is beery good, and that the UN and its anonymous leader are drunkards, and no amount of coffee will help.
The idiot - Idiots can't count, post moronic comments, and quickly type in useless garbage to fill in a little more space.
Have you read my journal today?
I'm obsessive, but don't blame me! Great analysis...
Who is asked to slow down every now and then.
You forgot number seven. Should it be a troll? Or perhaps you forgot Poland?
Beyond your ability to count, the article seems quite interesting. My PhD supervisor made an interesting comment about Google the other day: he said that people at Google must have very interesting information about trends in "common knowledge." That is, before September 11, 2001, a Google search for "september wtc" would have yielded totally different results, which would surely show the most "common" things people were searching for.
Likewise, if you searched for "Katrina" in Google before August 2005, you probably ended up on the page of someone with that name.
These are basic examples of information that can be obtained from the "time" factor of the Google logs. Remember that time gives another dimension to your data, which lets you extract more information from it. It's something along the lines of image-pattern recognition: it is easier to match patterns in a moving image than in a static one.
The funniest meta-comment I've ever read around here. Chacham fits in most of the groups he described.
Somethingawful posted what is presumably the first part in a series of gold from the AOL search logs: http://www.somethingawful.com/index.php?a=4016 These would definitely fit in the 'basketcase' category...
"'Yrch!' said Legolas, falling into his own tongue."
Is this like a Slashdot poll where we whine about missing options?
Where does Cowboy Neal fit into the 7?
Are politicians their own category, or are they basketcases, or Pornhounds?
Oh You POS
So was Neo a manhunter, an obsessive, or just an omnivore?
I'm not sure what category I fit in. I live in a padded cell, and just used AOL search for the first time to obsessively shop for Manhunter porn while eating a meat-and-vegetable stew.
Some attitudes replaced or by cgi optimizes
I know that I often can't recall websites I've been to once but want to revisit. I will, however, often remember the search terms that got me there -- sometimes very specific search terms, since I've narrowed it down from my first wide-net search.
For some reason I stubbornly don't use bookmarks often (as when you have too many, they quickly become worthless) so that obscure search term might be in my profile 300 times over the course of a year if it's a site that I visit daily from the office.
Then again, I post on Slashdot a ton... I'm sure it's pretty obsessive anyway.
"Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
Ok, a lot of this AOL search data is quite amusing, in a sad, pathetic way. Too many people are having their jollies over it, while secretly being scared someone's going to get a peek at their searching record when Google finally loses its mind and makes the data available. It's easy to laugh, and be downright frightened, but in the end, we type our searches in, click the button and don't give it another thought. People wish to judge (myself included); it was a survival instinct in a far distant past and now it manifests itself as a morbid curiosity with the lives of other people.
People come in all colors, sizes, and mental states, AOL users undoubtedly more so. So in there you'll find your fair share of freaks or freak wannabes, but mostly you'll just find people trying to find out things. What makes them freakish is not what they type in, but what they do with the information.
GetOuttaMySpace - The Anti-Social Network
The people who switch Tor nodes for every search they perform, so that later, they don't end up having articles written about them calling them weirdos and porn-freaks. Sheesh, what's wrong with horses?
This is a sig. It is appended to the end of comments I post.
"For we are all the Pornhound, the Manhunter, the Shopper, the Obsessive, the Omnivore, the Newbie, and the Basketcase, sincerely, the Breakfast Club"
Probably most people on this board are too young to remember anyway....
The article is written by someone that works for splunk and has a bunch of links to a splunk server (currently responding too slowly to use) to show you the logs, and pointlessly mentions numerous times how he clicked something in Splunk(tm)(C) to get some results...
... it is nothing compared to the tremendous fallout that would befall the Interweb, should AOL ever unleash accidentally almost 13 years of collected AOL chatroom dialogue. It's one thing to see the search strings of User #24601, but quite another to see just what he says when emboldened by conversational anonymity. Of course, AOL would say now that they don't have that kind of data, that they haven't been logging chat since the earliest days of version 2.0 ... but come on, would you throw away all of that beautiful demographic fodder?
"Ye shall know the truth and the truth shall make you mad."
-- Aldous Huxley
One of the search results from the famous pornhound.
69 927 3d molestation and rape porn 2006-05-20 17:20:16 9 http://slashdot.org/
Now we know why this site is so popular.
Although AOL represents a certain niche market, i.e. it's heavily skewed towards n00bs.
I wonder if a similar Google sample would show different results or identify other archetypes?
I definitely fall into the "Omnivore" type. I would imagine most Slashdotters do.
Actually, maybe the Basket Case one is a better fit for most Slashdotters.
Execute? [Y/N] _
From TFA: The searches of AOL user No. 672368, for example, morphed over several weeks from "you're pregnant he doesn't want the baby" to "foods to eat when pregnant" to "abortion clinics charlotte nc" to "can christians be forgiven for abortion."
That, right there, tells you why we need to worry about "Uncle Sam" having access to *everyone's* search logs - search terms alone contain an implicit picture of what should be some of the most private aspects of your life. Now imagine if user number 672368 turns out to be, say, John McCain's daughter, and Karl Rove got his hands on this just before the Republican presidential primaries...
what do you think would happen? what do you think Joe McCarthy (http://en.wikipedia.org/wiki/Joseph_McCarthy) could have done with this kind of data? Write to your elected official and ask them these questions, and what safeguards they are putting in place to prevent any such abuse - and tell them you will be voting this fall. Then call your local news channel, and ask them to run a story on it, and ask the candidates for comment. The big networks won't start a story like this, but if a small station is lucky enough to get a clip of a politician stumbling over an answer, it'll be syndicated faster than you can say "feeding frenzy".
(and for those of you naive enough to think that Karl Rove doesn't have access to the equivalent government databases through some back-room contact or another, I have a bridge you might be interested in buying...)
who doesn't see the subtle self-deprecation the "idiot" category contained.
I never spellcheck and I freely admit it. Save your karma for more worthwhile "lol erorrs" replies
What we need to know is how much one type intersects with another.
Example: Obsessive Pornhounds (typical behaviour: spends inordinate time on Usenet, loves tenta..)
or Manhunter Shopper (typical behaviour: posts on Craigslist under 10 different profiles, e/n queen at somethingcrappy or somethinsomething)
or perhaps Newbie Basketcase (typical behaviour: reloads /. like crazy, trying to desperately be NOT terrible)
or heck maybe Newbie Pornhounds or Basketcase Omnivore..
Purely in the name of research of course.
Timang tinggi tinggi
parang sudah asah
alang alang mandi
biar sampai basah
I thought this was going to be a George Carlin skit.
In a way, it sort of is.
Innovation makes enemies of all those who prospered under the old regime... -- Machiavelli
This guy makes a lot of assumptions in his analysis. I often search for a single topic multiple times - not out of obsession, but to refine my search. Sometimes I didn't get what I was looking for the first time, so I'll go back and sift through the 2nd and 3rd pages. Sometimes I search again because I can't remember where the best page was. Each new search for the same topic may lead me to change my search target - at first I might be looking to buy a product at a major retailer, only to realize later that it might be available used. These are all reasons to repeat a search that have nothing to do with obsession. Also, the author may have labelled someone as "Obsessive" when they are searching for "texas real estate" when in fact they work in the real estate industry.
The article is an interesting read but I'm not buying into his category system.
... the first link in the article is to a porn site.
The porn site has now been slashdotted.
Get off my born, bitches!
I think your onomatopoeia is wrong.
Moo is the sound cows make. You're thinking of "baah."
sex,celebrities,porn,lesbian,voyeur,amateur
I think the interpretation of why users google the same words over and over again is wrong. It's not obsessive or OCD at all.
For me, I will google words that I know will bring up links that I want to see, but never remember to bookmark. It's much easier to just go to a search engine, type a keyword and scroll for the link in the first 10 hits, rather than go through your hundreds of bookmarks to find exactly the one you're looking for.
Based on some of the wacky and random things that have gotten sent to google by me. Mostly happens when I'm trying to middle click on a link to open it in a new tab, accidentally miss and end up activating that stupid middle click search thing that tries to find whatever was selected last.
Finally found the pref to kill that but it was annoying as hell.
The search data released by AOL could be great for research purposes. Even a stupid person would never release this kind of data. This seems very strategic.
If you analyze the search data you'll see that the video market is growing rapidly. Search engines are surely driven by the porn market. It explains why Google was fighting for that data. It could have brought down their revenue. Just as search engines are useful for the development of the internet, user data is useful for the development of future products, because you know in advance who the potential customers for the new product are.
Spam: Any activity on internet to gain popularity without paying to advertising companies like Google.
"Do niggers have x-ray vision" Truly frightening. Also note the large religious influence in a lot of the searches.
Perhaps the 7th category is for people who miss the joke?
That would make the 7th category nothing but a subset of the 6th.
...that nobody knows how to spell "beastiality"?
But if you've googled yourself and other people, it's a little trickier to determine from the list which one is you.
Though if the list of names contains 25 celebrities and "Joe Smith," it might not be hard to narrow down. At that point, you're the guy in the red shirt who beamed down to the hostile planet with Kirk, Spock, McCoy and Scotty. Yeah, the monsters could kill anyone in the party, but it doesn't take much effort to guess who it'll be.
You've hit upon something there. Perhaps 90% of my wife's usage of the internet is visiting 4 sites: Moviefone, Hotmail, MSN games, and IMDB. Does she use the convenient bookmark function... nope! Instead, her preferred solution is to make Google her home page and search for the sites there. I've explained the inherent wastefulness of using search for something where just typing into Firefox's address bar will do the trick... but no dice.
I do have fun with it and occasionally block Google on my DNS and watch as she complains that the internet is down.
The Bitcher: Will endlessly complain about minor things. May add to a discussion, but in the wrong thread, so as to whore karma multiple ways.
I have freaks! I did something right...
The analysis denotes an astoundingly low level of understanding of how people actually use the web. What the author is seeing is absolutely normal and obvious. The only abnormal thing is his surprise.
The Pornhound. The fact that people search for porn on the web must rank as the discovery of the year!
The Manhunter. Who ever bookmarks other people's web pages? I just type the people's names in Google, and most people I know do just that. We are all manhunters I guess.
The Shopper. Same as above, who uses bookmarks? If I am interested in a treo 700 and I type it 37 times in 3 days, this just means that I find it more convenient to type treo 700, then select from the search results, than to bookmark the result pages that I am interested in. And this is reasonable: why should I create bookmarks that become useless once I do buy the treo?
The Obsessive. See above. People that search often for A are simply people who don't bother creating a bookmark for some results about A. Big discovery.
The Omnivore. Ok, so when the pattern is complex, the author gives up. This is a really informative category.
The Newbie. Again, it must rank as one of the big discoveries of the year that there are newbies on AOL...
The Basket Case. This seems to be a repeat of "the omnivore", except that the author found these queries weirder.
Who posted this on Slashdot? It's not interesting research at all! It's junk!
927 3d molestation and rape porn 2006-05-20 17:20:16 9 http://slashdot.org/ ------ I'd run that search to see why slashdot popped up but I'm too scared that in the next AOL search release list there I'll be searching for 3d pr0n
Although it's much more amusing to think otherwise, these are most likely to be more than one person using the same account.
I would be even more amused if many of the search results would be non-zero.
Actually, now that I think about it, this makes me more disappointed in AOL. I mean, I'd expect "feces extraction" to produce at least one relevant result: the Goatse guy, for example.
This proves, however, that the categories are bitmasks and not discrete values.
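To make the bitmask point concrete, here is a minimal sketch in Python. The `Searcher` flag names simply mirror the article's seven categories, and `has_category` and the sample `profile` are made up for illustration; nothing here comes from the article or from Splunk.

```python
from enum import Flag, auto


class Searcher(Flag):
    """The seven categories as bit flags, so one user can carry several at once."""
    PORNHOUND = auto()
    MANHUNTER = auto()
    SHOPPER = auto()
    OBSESSIVE = auto()
    OMNIVORE = auto()
    NEWBIE = auto()
    BASKETCASE = auto()


def has_category(profile: Searcher, category: Searcher) -> bool:
    """Return True if every bit of `category` is set in `profile`."""
    return (profile & category) == category


# Hypothetical user who fits three categories at the same time.
profile = Searcher.OBSESSIVE | Searcher.SHOPPER | Searcher.NEWBIE

print(has_category(profile, Searcher.SHOPPER))   # membership test for one flag
print(has_category(profile, Searcher.OMNIVORE))  # a flag that was never set
```

With discrete values a user would get exactly one label; with flags, combinations like the Newbie Basketcase are just an OR of two bits.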
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Coz who doesn't want a post to the evil pink hole?
While the pr0n crowd gets its own category, it would seem those who use the Internet to illicitly acquire copyrighted materials would simply fall into a subcategory of the Obsessive, and not an important enough one to be mentioned in the article. What of those brave souls who search for cracks, keygens, nocd patches, torrents, dvd rippers, and the like? Are they less prevalent than some would have us believe, or is it perhaps that, because AOL appeals to a less tech-savvy demographic, its searches underrepresent them?
Okay, the bit about Omnivores who hit IMDb all the time hits a little too close to home here. :)
I use IMDb as much as I use Google. A merging of those two would be quite convenient for me.
Oh, and let's throw in Wikipedia while we're at it. While it may not be as accurate as a paper-published encyclopedia, it's still a zillion times more accurate than the average one-off webpage you're likely to find on any given topic.
My number is lower than yours, newbie!
Boobies never hurt anyone. - Sherry Glaser.
It begins in a trailor park in Kentucky. "Well boyah. If y'all's old 'nuff to git that Con-fed-er-ate flag tat and yer first shotgun, I's thinkin' y'all's old 'nuff to be knowin' the REAL story bout this world of ours. Here, put on this bedsheet. No, t'ain't Hal-o-ween, this here's a family hair-loom."
I think it also says a lot about the sorts of ignorant and idiotic people that use AOL
Good job again, but do you think your ISP/proxy doesn't cache/store what all comes from your IP?
Don't get me wrong it is great and good to try to protect your identity and all that jazz, but your information is worth $$$ and that is why companies will go to great lengths to get it. So, if it helps you sleep at night to take those "precautions" then good for you. But I have no doubt that despite your best efforts someone somewhere has you profiled.
On a side note: instead of trying to fight it, why not use it to your advantage? I don't click on the win a free iPod links, but I'll complete a survey here and there if I can get a discount on future purchases or things of that nature. Stats/usage patterns, opinions, etc are power and money. I have stats and opinions, so why not get something in return for them?
When I have a kid, I want to put him in one of those strollers for twins and then run around the mall looking frantic.
Notice one key factor here: These people all use AOL. That's naturally going to self-select your data towards certain segments of the population which might exhibit different inclinations than the rest of the population.
I am officially gone from
Those that RTFA and those that haven't.
~CYD
//Nothing to see here, please move along.
My personal opinion is that list would have been funnier if "pornhunters" were called "carnivores" to go with the theme... I object to being listed as an omnivore just because I know how to use a search engine, though. For everything, that is. Though I guess I haven't been labelled because this has to do with AOL and I knew to steer clear of them since I was a total newbie to the internet.
At least, before this leak -- as beautiful as it is, this might finally be the tipping point in getting Joe Average AOLer to understand the gravity of the drastic erosions of privacy the Western world has experienced since 9/11, and stop trusting the unencrypted text submission these logs prove we often so completely and utterly, soul-baringly do. And no one acts anywhere near the same when they have even the slightest feeling they're being watched (and, more importantly, judged). In a world where Diaries are implicitly public, who have you ever trusted more than your search bar?
Especially as, judging by these search logs, Joe Blow has a lot more to hide than even my cynical ass ever imagined. Might make some people realize the terr'rists aren't the only ones who'll be caught, charged, sentenced and executed for having something to hide.
And this leak has finally given credence to the long-cynically-mocked, longer-held Sci-Fi ideal that, in teh big, unknowable futar, all Art will be on, be of, Technology. And this horrific breach of privacy is also the greatest set of Artistic and statistical data to have ever been released to the public. I would say, since it's raw data and not just a single interpretation, it's more important than the Kinsey Report. Which is tragic, because it can never be allowed to happen again, if we want any semblance of a feeling of privacy and freedom in our civilization. It's becoming unexpectedly apparent that this will be the form of major (mainstream, big-A-)Art of the future.
Don't believe me? Read 'The Search Engine Confessions of AOL User 23187425' and tell me it expresses any smaller torrent of the raw, beautiful essence of what it is to be human than any Keats or Basho. And that's only one piece among the very many a quick search can reveal. Many more at SomethingAwful's special edition of the Weekend Web, one of the primary progenitors, whether it was intended to be or not, of this kind of art.
{
The Pornhound: Lust,
the Manhunter: Envy,
the Shopper: Greed,
the Obsessive: Gluttony,
the Omnivore: Sloth,
the Newbie: Anger,
the Basketcase: Pride
};
*This is my post-RTFA relational array.
I don't know... those kinda look like lyrics...
---k--
</stupid>
So maybe some of the Obsessive A A A B A B C A A B B behavior (describing people who type the same searches in again and again) represents people using the search bar as an easier way to get back to a specific page or set of pages than remembering and typing the entire URL.
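As a rough sketch of how you might pull those repeat queries out of a log before deciding whether they are obsession or just navigation, the Python below counts how often each query recurs. The function name `repeated_queries`, the `min_repeats` threshold, and the sample session are assumptions for illustration, not anything from the article.

```python
from collections import Counter


def repeated_queries(queries, min_repeats=3):
    """Return queries seen at least `min_repeats` times, with their counts.

    Queries are normalized (trimmed, lowercased) so "IMDb" and "imdb " match.
    """
    counts = Counter(q.strip().lower() for q in queries)
    return {q: n for q, n in counts.items() if n >= min_repeats}


# The article's example pattern, with each letter standing in for one query.
session = "A A A B A B C A A B B".split()
print(repeated_queries(session))
```

A count alone can't tell a bookmark substitute from an obsession; you would still have to look at what the query is, for instance whether it is just a site name.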
While the pr0n crowd gets its own category, it would seem those who use the Internet to illicitly acquire copyrighted materials would simply fall into a subcategory of the Obsessive, and not an important enough one to be mentioned in the article
Some of us work at universities and we call it research. We have these things called Fair Use exemptions.
Besides, everyone knows we have a severe lack of pirates, which is causing the current global warming crisis.
-- Tigger warning: This post may contain tiggers! --
The mods must be on crack again. The parent poster is a well-known troll.
Man is a slave because freedom is difficult, whereas slavery is easy.
I would have thought that the seven deadly sins would have been better categories for grouping searchers by. eg Pr0n hunter == Lust.
See my art -> http://herbevore.deviantart.com
Well the same user had such queries as "video pics of men fucking mares and cows free", "killing voyeur neighbors who are satanic cult mem", "old russian nuns for sex", "hillary clinton for sex", "west indian troublemakers in harlem", "female collies afgans vaginas free pics-beastial", "black gay boy sex with overbites porn site free", "fat ass gay black teens sex .com", "explain why people disbelieve the obvious", "emigrating to japan if you are mixed with blk white", "why can't america die", "who were the nazi's of japan and why", "latino pre-teen boys who love older men", "why people say best is'nt good enough", "is george w. bush jr. a american nazi party member", "american nazi party propaganda", "tuskegee university and people who have nightmares about it after attending it", "extermination of niggers in all boros of nyc", "how to end nightmares in the home" and "what makes an adult bully tic".
Truly a scary person. Bzzz.
Try this fun search to see who searched for their SSN.
It's amazing the kind of information [click link to see, it's not just SSN] people put into a public search engine.
Even among AOL accounts, I doubt the use of AOL search field or its browser is that huge a percentage. You could use an outside browser with AOL dialup before there even were the first creaky attempts to rig a browser into the AOL client itself, around version 3 or so. Although clicking on an URL in email will use the AOL browser. Just saying the population being spunkd is even a smaller subset.
It's really hard to say that the AOL data released is representative of the search habits of the Internet population as a whole, not only because it hewed strictly to AOL users, but also because it included only AOL users who were using the AOL client software -- an even more rarefied sub-species of the online animal, and one that is typically (although probably with some exceptions) a bit more novice, a bit more unsophisticated, and/or a bit less familiar with all the untended parts of the WWW outside the garden walls of AOL's manicured interface. That's not to say that they don't use the Web, but if their home base and persistent point of departure is always the AOL client, they are unique in so many ways that make it hard to extrapolate.
It begins in a trailor park in Kentucky
If you read this user's other searches, you'll find out that he most likely lives in public housing projects in NYC, probably in Harlem, and although he probably wants to "exterminate blk negroes in all boros in nyc", he's pretty much likely black or partially black. Considered his love for jazz going as far back as the 1940's, he might be pretty old, and even maybe fed up with the youth of his own colour, maybe a bit like Bill Cosby. This being said, it seems like his name might be Joseph Wendell Johnson Jr., and even that he might live at the 2871 on the 8th avenue in NYC, and that apparently he is disrepected and bullied by 'harlem negroes'.
This is all pretty scary and pathetic if you ask me.
You just got troll'd!
I think you mean he forgot Roland.
Please, for the good of Humanity, vote Obama.
Damn it
That's a big assumption. I imagine many people leave their internet on all day, or at least share it when it is on. Maybe some of these people are in fact more than 1 person. That might explain the weirdos and omnivores. Of course in a way that makes it even scarier (when someone is searching for school stuff then Daddy pops on to look for child rape videos). Still, you have to laugh at someone who wants to be an ordained minister whilst their partner is potentially looking for sex videos.
'Most of them would rather use a blow up doll'
'This group is exactly like an abortion/anti-abortion group'
'The group is a cesspit filled with babies like sinister'
'Jerry Springer has nothing on COLA'
Or over on uk.misc we have some UK nutter who says that MI5 has been trying to kill him since 1999. He knows this as newscasters speak to him through the television. I regret to say I emailed him and asked why he doesn't just switch off the tv. He replied that he might miss something.
I saw that as well, and think it says something about AOLers.
Maybe people who don't know much about computers, don't know much about other things as well?
If you ever noticed, searching for a web address usually returns the website in question. If you enter the website in question, say your favorite p()rn site, then the dropdown bar shows what you entered. This is very bad if your computer is also used by your partner, who may have issues with said site. Even worse is if underage occupants of the household also use the computer without full supervision (quite different from unsupervised). If the sites are held in history, then it will take an active look-see to find these questionable sites, where if the dropdown has the site that may be offending on it one click shows the crime.
Keep this in mind when checking up on the delinquents - check history if they are suspicious. Chances are they did not think to clear it out.
On another note, where I work has a hellish proxy system (and old software that gets re-ghosted nightly) that blocks even legitimate, work-related entries because many manufacturers' home pages have tags that are blocked. If you can skip the highly graphic entryway your problem is solved!
Phil
Laugh, it's good for you!
Sadly, the people who screw up URLs or don't use bookmarks are the kind of people who could most benefit from using OpenDNS, which does take common typos and automatically redirects you to the correct site.
Seven is a good number. We like Seven. But sometimes we need Eight. Or Nine...
It would appear that either the analysis indicates AOL is truly the onramp for the Information Superhighway (i.e., they're all actually Newbies), or it draws a faulty conclusion.
There are as many uses for search engines as there are results. One could wish to verify a fact - or learn something new - by looking for it in a variety of ways; this would falsely look like the Omnivore category.
The fact that a Researcher or Fact Checker category was not included indicates Mr. Boutin may need to continue analyzing the data.