Seven Words You Can't Say On Google Instant
theodp writes "Back in 1972, George Carlin gave us the Seven Words You Can Never Say on Television. Thirty-eight years later, Valleywag reports on The Definitive List of Words Google Thinks Are Naughty. You've probably noticed how the new Google Instant tries to guess what you're searching for while you type - unless it thinks your search is dirty, in which case you'll be forced to actually press ENTER to see your results. Leave it to the enterprising folks at 2600 to compile an exhaustive list of words and phrases Google Instant won't auto-search for."
White Power (but not "Black Power" - it's all in the marketing, after all)
I read the list. I was expecting words that usually mean something everyday but have broadened to include potentially offensive material. Amateur, for example.
What surprises me is the list includes words where the definition would have to be known, and the person consciously wants to find the subject matter. a2m for example.
But it's broader. A few choice ones on the list: fecal (legitimate medical/anatomical usage), lesbian, and finally, redtube gets the censor treatment.
I like the comment next to "cuckold" - this one dates back to 1250, but it dies here.
And Google has the gall to climb on a soapbox about censorship, the great wall filters of Australia, etc.
In post Patriot Act America, the library books scan you.
Could it be that this system blacklists the words based on the content to be displayed and not based on the input itself?
The above-mentioned "a2m" could easily be a part of a serial code I'm entering, and I appreciate Google's assuming that, if I want potentially embarrassing content, I can be bothered to press enter.
I also don't want to become sexually aroused during work, and appreciate this rare display of understanding of human nature.
I doubt the second - very processor intensive.
However, I propose a third option, that the blacklist is automatically maintained.
That is, they classify web-pages: offensive, Y/N? And then their index automatically tags terms strongly associated with offensive web-pages, which are automatically blacklisted. This is how you'd get "white power" (present on many offensive webpages), but not "black power" (present mainly in scholarly articles, let's be blunt). This is why you'd get "futanari" and not "hermaphrodite", this is why "schoolgirl" is offensive, etc.
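To make the third option concrete, here is a rough Python sketch of how such an automatically maintained blacklist could work. This is pure guesswork about the approach, not Google's actual method: the function name `build_blacklist`, the `threshold` and `min_pages` parameters, and the placeholder terms are all made up for illustration.

```python
from collections import defaultdict


def build_blacklist(pages, threshold=0.8, min_pages=2):
    """Return terms that appear mostly on pages flagged as offensive.

    pages: iterable of (terms, is_offensive) pairs, where terms is a list
    of strings found on the page and is_offensive is the Y/N classification.
    threshold and min_pages are hypothetical tuning knobs.
    """
    total = defaultdict(int)      # pages containing each term
    offensive = defaultdict(int)  # offensive pages containing each term
    for terms, is_offensive in pages:
        for term in set(terms):   # count each term once per page
            total[term] += 1
            if is_offensive:
                offensive[term] += 1
    return {
        term
        for term, count in total.items()
        if count >= min_pages and offensive[term] / count >= threshold
    }


# Placeholder corpus: "alpha" shows up only on flagged pages,
# "beta" mostly on clean ones, "gamma" only on clean ones.
pages = [
    (["alpha", "beta"], True),
    (["alpha"], True),
    (["beta", "gamma"], False),
    (["gamma"], False),
    (["beta"], False),
]
blacklist = build_blacklist(pages)
```

With something like this, nobody has to decide by hand that one phrase is in and its near-twin is out; the asymmetry just falls out of which pages each term tends to appear on.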
The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
Some years ago, I wrote an Internet chat system for a major Australian bank (which bank? No comment). Ok, innovative enough at the time, but not too exciting.
But here's the interesting bit - they sent me a list of words they considered offensive. I had to write a special scanner to handle this - the most challenging part being dick. I was supposed to reject "dick", but accept "dick smith" [which is a major Australian techie shop, equivalent to Tandy or Radio Shack, perhaps].
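For the curious, a scanner like that can be sketched in a few lines of Python. This is not the code I wrote for the bank, just one way to do it, assuming whole-word matching and a hypothetical list of allowed phrases that get masked out before the blocked words are checked. `BLOCKED_WORDS`, `ALLOWED_PHRASES`, and `is_rejected` are names invented for this example.

```python
import re

# Hypothetical lists; the real list is not reproduced here.
BLOCKED_WORDS = ["dick"]
ALLOWED_PHRASES = ["dick smith"]


def is_rejected(message):
    """Return True if the message contains a blocked word outside an allowed phrase."""
    text = message.lower()
    # Mask the allowed phrases first so they cannot trigger a match.
    for phrase in ALLOWED_PHRASES:
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", " ", text)
    # Match whole words only, so "dickens" is left alone.
    return any(
        re.search(r"\b" + re.escape(word) + r"\b", text)
        for word in BLOCKED_WORDS
    )
```

So `is_rejected("I bought it at Dick Smith")` comes back False, while a message that uses the word on its own is still caught even if it also mentions the shop.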
So anyway, I was left in possession of a list of words banks don't like. Maybe I should publish it.
"Cats like plain crisps"
I bet if you asked the Puritans' wives, you'd get a different story.
Anyway, C.S. Lewis is not known for truth-telling so much as comforting fairy tales, and yes I'm referring to his non-fiction essays.
You are welcome on my lawn.