Yes, Google Does De-List Pages; But When?
In 2004, when Google users discovered that the top search result for the word "Jew" was the anti-semitic site Jew Watch, Google ran a disclaimer in the space usually reserved for ads, explaining that their results only reflected the reality of link counts on the Web, and that they did not endorse any Web sites which appeared at the top of their listings. Now the disclaimer has been dusted off again, as the top result on Google Images for "Michelle Obama" is a picture of a monkey's face with Michelle's hairdo. (Ironically, it looks as if the original image would have fallen out of the rankings, if it hadn't been for a follow-up blog post about the controversy, which itself now comes up as the first result.)
I first heard about the controversy from Dennis Prager's column in which he takes a New York Times columnist to task, because the columnist complained about "racially offensive images of the first couple" that come up in Google searches. Prager was unable to find any examples from Googling "first couple" or "Michelle and Barack Obama pictures," so he concluded that the NYT columnist "wildly exaggerated, if not made up" his claims. I tried Google Image searches for "first couple," "Barack Obama," and some other terms, and I couldn't find anything controversial either. However, it only took 10 seconds to enter "first couple google images controversy" on the regular Google Web search and find multiple blog posts explaining what all the fuss was about. Back to Google 101 for Dennis.
Many of the blog posts refer to Google's disclaimer about not tampering with search results. Those on one side are urging Google to make an exception and "fix" the results, while others sagely observe that Google just reflects reality, it doesn't create it.
All of this punditry is starting from a premise that's wrong. Google has actually removed pages from their search results — not because the pages were illegal or because the webmasters were search engine spamming, but because of the page's "offensive" content. In the "Chester's Guide" incident, a councilman in Chester, England discovered that one of the search results for "chester guide" was a satirical page titled "Chester's guide to picking up little girls." Although the page itself was obviously just someone's idea of sick humor, a Chester city councilman (who admitted that he hadn't looked at the page, saying that the title told him everything he needed to know) urged Google to remove the page from their index. Google at first refused, but later manually blacklisted the page to prevent it from appearing in their search results.
Whether or not you think this was the right decision probably depends on what you think the purpose of Google is. If Google's purpose is to return the most useful results, then it made sense to remove the link, as Danny Sullivan of Search Engine Watch argued at the time, since it almost certainly was not a useful result for people searching for "Chester Guide." On the other hand, if the primary purpose of Google is to reflect the reality of what pages on the Web feature certain words most prominently (combined with all the other factors that Google weighs, of course), then the results shouldn't be altered.
But more people should at least realize that it happened. The Google disclaimer doesn't precisely say that they never blacklist pages or modify search results ("Google reserves the right to address such requests individually"), but it seems to give most people the impression that that's the case. According to that crudest of Googling techniques for which novice searchers are so frequently lampooned, there appear to be about 400 times as many stories on the Web about the Google "Jew Watch" controversy (where Google stood their ground) as there are stories about the "Chester's Guide" incident (where Google caved).
And Google-number-three Matt Cutts posted on his blog back in March explaining why Google does not remove "offensive" pages from search results; over a hundred comments followed, debating the pros and cons of the position, but none of them mentioned the Chester incident or any other case where Google actually had removed pages except as a result of a court order. One isolated comment from "Anonymous" said:
This is not quite true. I know of at least one web site that was de-listed for containing illegal content and/or promoting illegal activity.
which may or may not have been a reference to the Chester Guide incident. And that was it.
Is this a lot of hay to be making over something that happened years ago? Well, for one thing, I doubt if it happened just once. Consider that the Chester Guide incident involved a public declaration of outrage by a city council, and a public statement from Google, and still hardly anyone knows that it ever happened. If other incidents occurred without those high-profile elements, it would be even harder to discover them now. We'll probably never know how many such incidents took place, unless someone sues Google (maybe the owner of a blacklisted website, or maybe the victim of a RipOffReport hatchet job wondering why that site hadn't been blacklisted long ago), subpoenas Google for a list of cases where pages were de-indexed, and publishes the list if it's not sealed by a court order.
But whether it was one time or a handful, consider that political candidates like Arnold Schwarzenegger and Al Franken got asked during their campaigns about things they did 20 years earlier, and it's fair to ask a candidate about their past, because it's the same person standing in front of you now. Why did you do that? Have you stopped? Why?
And in the big scheme of things, Google is probably more powerful than a single US senator or the governor of California. So, can't we ask? What are their real rules about page removal? Have those rules changed since the Chester's Guide controversy? Can they even tell us what their rules are, or do they consider it a trade secret?
It is well known, of course, that Google censors some results in their search engines branded for different markets like China and even in liberal democracies like Germany. But nobody would call that a slippery slope towards censorship in the US version of Google, because the censorship in the Chinese and German versions is done at the behest of the governments there. On the other hand, Google does admit that they will de-index pages which include credit card numbers or social security numbers (which are all too easy to find on the Web). This might not seem like a controversial position, but even this act of voluntary self-censorship may be dipping their toe in the water further than it seems. Most people do consider their credit card information more private than their home address. But surely there are people like J.D. Salinger who care less about the privacy of their credit card number (which is easily changeable) than their home address (which isn't). If someone finds Salinger's address and posts it on the Web, should Salinger be able to demand that Google de-index the page? Why should Google cater to the majority who want to keep their credit card number secret, but not to the minority who care more about keeping their address secret? Another commenter on Matt Cutts's blog post asked:
"hi. I have a question. My mom 'googled' herself and it shows some of her medical problems. She wants/needs these pages removed from search engines."
Again, why shouldn't that be considered at least as private as a credit card number?
And finally, even Google's decision to display an "offensive results" disclaimer, for some results but not for others, raises the same "Where do you draw the line?" questions as the issue of page removal. The Michelle Obama monkey picture gets a disclaimer. But search for 'george w bush' and the first row includes a photoshopped (I think!) image of Bush flipping off the press. Does that warrant a disclaimer as well? (Maybe that's considered less unfair because, even though the picture is fake, it does depict something that actually happened.) The first image result for "bristol palin" is a photo of her engaged in underage drinking — a real photo, but probably unfair to call it the single most relevant photo of her on the Web.
So while Google might consider credit cards and social security numbers and search engine spam to be on one side of a "bright line," with everything else served up without alteration, I think the line is blurrier than that, for at least three reasons: (a) credit cards and SSNs are less private than some other things that Google serves up anyway; (b) Google has unambiguously removed some content that fell outside that bright line, as in the Chester's guide incident; and (c) they make other "slippery slope" judgment calls about search results all the time (as in the question of when to show the disclaimer). So I hope that Google someday comes out with a more complete answer to the question. What is their real policy on what they will remove? Would they do what they did in the Chester's guide incident if the same situation came up today, or have their rules changed? If they want to go really deep, then is there a general set of principles from which their rules follow, explaining why, for example, they treat credit card numbers as more private than sensitive medical information? (Google did not respond to my request for comment, either through official channels or the unofficial back channels of friends who work there.)
I hope Google gives an answer some day. Even just to say, "It's a classified internal policy and that's all we're going to tell you." But once and for all, the answer is not "Google doesn't remove content just because it's 'offensive' or 'harmful.'"
Meanwhile, a modest suggestion about the disclaimer displayed above the search results: Put it where people will actually see it, in a separate line below the ads, but above the search results. Right now the link to the disclaimer is displayed as one of three ads across the top, and people don't look at the ads. But hey, people do buy ads, so if you push the disclaimer down a bit where people will read it, you also free up space for 50% more ad revenue!
Am I alone in thinking that whoever Bennett is, I have no interest in his vague ramblings?
If each IP address can give a mark to each web page based on whether they think the result is relevant and useful enough, then that should filter the "problems". On the other side... bye bye anonymity!
It seems to me like it would be in Google's interest to remain neutral regarding search results. Now that Google has started censoring sites at their discretion I would think this forces them to take responsibility for all the results they provide. They are no longer simply a neutral party providing indexed results.
Is there a firm algorithm for how Google ranks relevant pages, or is that a proprietary black box? Because if it is, I don't understand the problem - we are already unsure what they're doing behind the curtain, so who cares if they follow their usual algorithm for a page or treat it specially because someone finds it offensive? Incidentally, it'd be nice if Google kept their inner workings a mystery so we didn't have companies devoted entirely to increasing websites' rankings for more page views.
is just one step prior to suing them if someone obtained your credit card info via google and used it to rip you off.
Nullius in verba
Time for a new format of robots.txt:
User-agent: *
Allow: /
Conditions: only_when_delist_possible
Remarks: please_leave_your_name_when_done
If Pandora's box is destined to be opened, *I* want to be the one to open it.
It's not google that is hosting the questionable content so why should anyone sue google for it? The problem is though, if you would sue the hoster or webmaster instead you might run into problems of websites being hosted in certain countries with almost no jurisdiction on this area or countries that simply don't give a damn.
No wonder I couldn't find Lloyd C. Blankfein SS on Google...I guess I'll have to try finding it via Bing
Ave Molech Setting
What is their real policy on the issue?
I thought it was obvious simply because they're a publicly-traded company: Protect their own asses first. If Google could be subjected to substantially negative press, delist the site. Rationalizations come later in the form of policies, laws, rules, and procedures.
I'm pretty much convinced they delist more than the author suspects. Many court orders have gone out in cases where the resolution was sealed, and I would expect those related to internet postings could be buried the same way. Of course nothing stops the listing of material faster than leaving it out for all to see and having aggrieved parties (direct or indirect) going after hosting sites if not the actual people who generated the offensive content. Still, much of this has to do with which side of the political spectrum you are on.
As in, I find the author's examples amusing: questionable photos/links about people associated with conservatives, set against the most obnoxious example in regard to the current Administration. I am quite sure many can remember all the similarly racist and hate-based pictures of Condoleezza Rice or Colin Powell. Yet where was the outrage? I guess it's OK if the one side improperly credited with doing the most for minorities is in turn the most likely to turn a blind eye to those minorities if they leave the "plantation".
Nah, deification of elected officials is dangerous and now I bet any picture which distorts Obama or members of his family's appearance is automatically sacrilegious.
* Winners compare their achievements to their goals, losers compare theirs to that of others.
I know of one web site that was un-indexed for over a year because the site allegedly hosted illegal activity or encouraged it.
The site categorically did not host illegal content. In fact it used both technical and human screening to keep illegal activity off the site.
Whether it encouraged illegal activity or not is a matter of opinion. It categorically prohibited blatant or even *wink wink* encouragement of illegal activity. However the nature of the site was such that Google might have thought its very existence encouraged illegal activity. If so, that would be Google's opinion.
If Google wants to remove all web sites that they think encourage illegal activity, that is their prerogative, but to single out one while leaving many similar web sites, as well as web sites that were equally encouraging different illegal activities, is playing favorites. Playing favorites when you are the biggest web search engine on the planet is being evil.
Are you his mom or something? Got big plans for the basement once he moves out?
Do you even lift?
These aren't the 'roids you're looking for.
I'm waiting for the Google Labs option that automatically filters out the "direct download" sites that don't actually offer any added value, things like "freewareseeker.com" and "findyourdownload.net". You can drop individual search results, but where's the "never show me this domain or any other domain from this company ever again" button?
What are you doing in their bathroom?
The first point that needs to be thought about is the U.S. privacy laws regarding health/medical records. There is absolutely no reason for any pages on those two topics to be in the search results, particularly as Google is a United States corporation. That means they can be sued/fined heavily under HIPAA for violations.
Another is the censorship issue in general. I'll agree that I don't like the idea of them caving in to China's demands, but only the People of China have any say in their government's decision unless you are willing to declare war and attempt to enforce those requirements upon them by force of arms.
In regards to the Chester Guide, I'm open to debate on whether the page should have been removed from the index or simply gotten the disclaimer. It's important to note that censorship of any kind is the beginning of a very slippery slope, and who's to say that Google hasn't already started the long slide into irrelevancy by caving in to both China's and Germany's demands? That's the bigger issue. Google has stated that they want to make all known information available, but if they're censoring pages at the request of governments, who's to say they aren't censoring pages that governments have not requested? On the China and German censorship issues, keep in mind that the censorship only applies within the country that asked for it. Outside still gets access to it. This means the information is only censored on a regional level instead of worldwide, as happened with the "Chester Guide".
Mod me up/Mod me down: I wont frown as I've no crown
Google is a business. It is giving users a service (useful search results), and selling your eyeballs to advertisers (customers). I have no problem with that.
If I searched for "Chester's Guide" because I was planning a trip to England and got a link to (even in-jest) pedophilia, that's not a search result that I would be looking for - it's a failure for Google's search engine. Frankly, if I were Google, I would want people to tell me when they think my search results weren't working well, so I could update my algorithms to serve the users better so I could get more money from the customers.
This doesn't need to involve blacklists - all it requires is Google rejiggering its algorithms to move more relevant links higher in the returned results, and less relevant links lower. They must do that on a regular basis anyway - heck they already (claim to) do it in cases of detected SEO abuse. Now, if it's the case that a book on Pedophilia is more relevant given the search terms than a guide to a city in England, not only is Western Civilization in serious jeopardy, a certain city in England has its own issues of irrelevance. /frank
And the worms ate into his brain.
The main difference here is that listing people's social security numbers, credit card numbers, etc. makes Google a one-stop source for identity theft on a large scale. Indeed, there's almost no other use for these types of personally identifying numbers. On the other hand, millions of people do want their addresses published in order to conduct business or maintain correspondence, and millions of other people want to find those same addresses for perfectly legitimate reasons. Since Google can't economically deal with these issues on a case-by-case basis, their policy seems like the only option they have.
What the hell? This is Slashdot! If I wanted to read an article then I would have gone somewhere else!
From TFA: "All of this punditry is starting from a premise that's wrong. Google has actually removed pages from their search results -- not because the pages were illegal or because the webmasters were search engine spamming, but because of the page's "offensive" content."
Search on certain child porn terms - and you'll find that Google does censor pages and remove them from their index because of illegal content. They even serve you up a link to the third party that requested the page be taken down because of illegal content.
I think that at least the latter half of this post is missing one huge piece of information. How can you be sure that it is Google that is defining public/private data? A lot of organizations are required to follow federal standards which in themselves define what is public and private. Social Security Numbers and Credit Card Numbers are always at the top of every one of those lists. For reference look at PCI DSS (http://en.wikipedia.org/wiki/PCI_DSS), FERPA (http://www.ed.gov/policy/gen/guid/fpco/ferpa/index.html), and HIPAA (http://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act). Now while FERPA and HIPAA might not have anything talking about SS#s and CC#s, take this excerpt from FERPA: "Schools may disclose, without consent, "directory" information such as a student's name, address, telephone number, date and place of birth, honors and awards, and dates of attendance." If a school can publicly disclose, without consent, addresses which it collects mandatorily, why can't Google, which is just indexing information that was already public, link to it? The problem with the private data online isn't Google; they are just the biggest target that happens to have been hit by this guy's particular wild shooting.
Perhaps you have a short attention span.
Or perhaps he's just not a very good writer. I got several paragraphs in, and had no idea where he was going or what point he was trying to make. He presented some interesting facts, then started to recap them, and showed no signs of drawing any conclusions, so I skimmed ahead a little, and it still didn't seem like he was adding anything more. It was interesting till it started to ramble aimlessly, at which point, I gave up. I read extensively, and polish off several books a week usually. I read Knuth for fun. I plowed my way through Stephenson's Baroque Trilogy, and even managed to enjoy parts of it. I don't think anyone could accuse me of having a short attention span. But I couldn't finish this--or, more precisely, I had no interest in finishing this. The topic seemed interesting, but if he had a conclusion, it was lost in the noise.
The man seems to be in love with his own words, but, unfortunately, has no idea who his audience is. If he's writing for anyone but himself, it doesn't show.
"Google probably de-lists anybody who doesn't offer to install a google toolbar or other similar junk(ad)ware like Earth, or Chrome." Maybe he did, but who could tell in that mess? I get as few as five results on searching for common error messages. Google is not a search engine in the way most people think it is. Guess I gotta send out my own droids if I want it done right.
For justice, we must go to Don Corleone
Uhh... why? I thought that Google was this great big evil company that made me lots of money. But what was just pointed out makes me want to sell my share in Google.
And not kike? Or for that matter, nigger, faggot and chink. None of these hold that warning and yet they're more likely to produce "distributing" results.
Google may not be evil, but they /are/ a corporation, which means they are primarily concerned with their own persistence. If you actually want an index engine that "just reflects reality, [and doesn't] create it," I would suggest using a more open search plugin, such as majestic
And BTW, kdawson, well written article. I'd like more of these.
6th Street Radio @ddombrowsky
"only the People of China have any say in their government's decision" Ha Ha Ha :-)
I think you got that wrong. China is a one party state with a habit of putting dissenters in prison.
Save your, "it's up to the people of China" for some other chump.
Thanks for summing it up. I couldn't finish it either.
makes a change for the summary to be in the comments
Twitter, thanks for responding. You are exactly the person who should have responded so that I can make another point.
You also are obviously a very intelligent person. But you don't use your intelligence well, or carefully enough.
Bennett Haselton is an example of someone who uses intelligence sensibly, for the good of everyone. Even if he doesn't yet have all the answers, he is helpful in my thinking.
To answer your point: Maybe I vaguely remember the name Bennett Haselton, possibly from his previous article on Slashdot, linked in this story. Otherwise, I didn't remember he existed until I read this Slashdot story and re-read the linked story.
Google has some very tough public relations problems. Fixing those problems will require serious original thinking. Bennett Haselton is beginning that thinking.
I've thought about the issues myself. I wrote to Google CEO Eric Schmidt expressing some opinions about the challenges Google faces. There was no reply.
In case you don't know Twitter: He has several Slashdot accounts, including the one he used in the parent comment. Here is a Slashdot post he wrote, the first one I found in a Google search: Nothing massive here. Below that is a discussion about him. (The "Twitter" in that Slashdot story title is a different Twitter.)
I don't know Twitter, either, except from reading about him on Slashdot.
Twitter, your anger comes from childhood. Stop acting it out toward everyone. That just makes you more conflicted.
First off, home addresses and phone numbers never used to be a private matter. Everyone was always in the phone book, and student directories with phone numbers AND addresses were passed out to all the students every year. If someone had an unlisted number, it seemed to be noteworthy for some reason. Of course, I'm talking 20+ years ago. Now, people seem to be much more cautious about having their home addresses and phone numbers listed. Of course, now that you can be targeted for prank calls by anyone on the internet... perhaps hiding this information seems to make more sense.
As for medical information, how did that end up on the searchable internet to begin with? Hospitals don't tend to create public webpages detailing the medical conditions of their patients, complete with real names. About the only way news of her extreme toenail fungus would end up on the internet is if she were blogging about it.... or telling friends about it, who in turn feel the need to discuss it in front of a world audience.
As the post made clear, if you want something to disappear, the quickest way to do so is to STOP TALKING ABOUT IT. Nothing stirs up popularity in the age of the internet more quickly than someone complaining about, and then posting a link to, offensive content.
Also, while Google can pretty much do whatever they want as far as delisting or rank adjusting, it's not in their best interests to censor information just because it's mildly offensive to someone, as it provides precedent and opens them up to potential lawsuits when they don't... or do... Common carrier defense and all that. However, in the age of pedophile witch-hunts, they can pretty safely de-link something of that nature without getting anyone too upset about it. Nobody is going to mount a strong opposition to the removal of that type of material, and anyone who supported it has no fight once it has been removed, so nobody talks about it. No talking, no linking, and therefore no Googling.
-Restil
Play with my webcams and lights here
Google may have delisted a few pages. I wonder if it happens in situations where they could have been concerned about (or maybe faced) legal action. Note that the Chester example is a person in the UK, about whom were written harmful things that are untrue. I believe the UK has libel laws that strongly favor the complainant.
In many cases what Google has done is updated their algorithm. This is not the same as delisting, as the content is still findable. For instance it was not long before the monkey image of Michelle Obama was no longer on the first page for a GIS of "Michelle Obama." However if you searched "Michelle Obama monkey," it was the very first result. From the point of view of Google, this is probably an improvement to their product. IIRC when they defused the "miserable failure" Googlebomb of George W. Bush, many Googlebombs were shuffled out of the top spots as well.
Google says their mission is to organize the world's information and make it findable. My guess is that they are firmly on the side of "search represents general relevance" rather than "search reflects online popularity at that moment in time." I think people too easily fall into thinking about how Google works, rather than what its ideal results should be. If I opened a history book 30 years from now and looked up "Michelle Obama" in the index, it would not make sense for that monkey image to be the illustration.
Build a man a fire, he's warm for one night. Set him on fire, and he's warm for the rest of his life.
Nah, deification of elected officials is dangerous and now I bet any picture which distorts Obama or members of his family's appearance is automatically sacrilegious.
The article writer, I think, was even-handed; mentioning a specific instance about Michelle Obama, and contrasting it with different handling of images of the 'opposing' political spectrum. That's reasonable. Now, the way Google and others reacted to the actual pictures isn't reasonable... how Michelle Obama's caricature was handled is a sad indication of a twisted form of racism; but it's a culturally approved racism, so they probably didn't have much choice. It'll be embarrassing a few decades from now, though.
Who was Aramark's mob-connected CEO during the 1996 Summer Olympics in Atlanta? In spite of Aramark taking out back cover ads in Time Magazine that summer, there is no trace on the internet of him ever being associated with Aramark or the Olympics. That's one example of Google's mysterious delisting practices, and the Wayback Machine's as well.
Gee, I always assumed Google simply ranked 1st page search results depending on who paid them the most, with people that COST them money simply disappearing.
They do otherwise?
deification n. 1. a. The act or process of deifying. b. The condition of being deified. 2. One that embodies the qualities of a god. deification != defecation.
You can go to Chilling Effects and read up on many, many cases of Google censorship. The bottom line is that Google gets something like 1000 C&D notices per week, and they just can't afford to fight them, so in most cases, Google just immediately rolls over and complies.
I am curious why the submitter of this article did not include a link to his own search engine, that works as well as Google does but does not abide by any laws and actively breaks them as he suggests search engines should do.
I'd definitely use it for the few hours it was in operation before the owner was hauled through court and the servers confiscated...
Bennet has an interesting article?
and you will see the real search results on the experts-exchange page
this situation has evolved over time i noticed
experts-exchange used to hide the real results a few years ago
then they seemed to be delisted from google results
then they came back to google results, and all you had to do was scroll all the way down to the bottom of the page to see the genuine text you searched for and the entire unobfuscated threaded discussion
so there's a story behind that. what i don't know, but i would guess google saw the scamminess of what they were doing, punished their results ranking, and experts-exchange belatedly "corrected" their obfuscations, and google gave them back their ranking
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
I'm not even sure what he's suggesting.
I own a site that had some old IRC chat stats. Somewhere along the line, Google's spiders reached it and indexed it. Not a big problem, there was nothing really suggestive or problematic... ... except that if you knew certain things about me, and certain things about another online alias of mine, you could piece the two together and discover quite a bit about the real me.
I realized the problem, submitted it to be delisted. I checked back a few weeks later and it was properly delisted, and I haven't seen the page on a search since.
Is that not how it works?
I think you are just not interested in the subject. He's not writing to entertain you. He is writing about very complicated social problems. His thinking is in progress.
Someone must think about the issues. The investigation will certainly continue to be messy. Only when the solutions are found will everything seem clear.
If you think you can do better, please write your ideas here.
Imagine Google's dilemma when someone asks them to delist a page from a Google group or a Blogger-hosted blog or a Google-Pages site.
Or a popular page carrying lucrative AdSense ads on that page ?
Google rather happily hands over the identity of the page owner to any complainants. I know of at least one case in which Google has handed over the IP address of a blogger to Indian Police. And the blogger was not based in India. Google may have their own reasons; after all, they are a commercial company.
Someone suggested algorithm changes to bring the "relevant" links on top.
Imagine a company X which produces really bad products/services. Naturally company X gets bad reviews from consumers. They complain to Google and Google says, we just search the web.
Now company X gets smarter and sets up blogs and web sites and posts press releases to various sites (astroturf) full of praise for their products and services.
Presto... these sites get higher up in Google's rankings. Company X likes it, but the result is bad for the "public good".
Isn't this realistic ?
Google's ranking has no way of discovering the "truth". But the problem is they just can't !
If you say Google should change the algorithms, there is no algorithm out there which can follow a conversation and figure out whether it is relevant, or has any merit. Isn't that what Turing's test was about ?
or members of his family's appearance is automatically sacrilegious.
Only the black side, not the white side of course.
Posts pointing out truths the left likes: modded up.
Posts pointing out truths the left doesn't like: modded down.
Why?
Gamingmuseum.com: Give your 3D accelerator a rest.