Poor Spelling Beats Google's China Filter
antifoidulus writes "CNN's money section contains a blurb (among other blurbs) about how poor spelling can beat Google's Chinese filter. The example given in the article is that a search for "Tiananmen" will yield peaceful pictures of the square, but a search for common mis-spellings such as "Tienanmen" will yield plenty of photos of tanks."
I are a gud spelr!
The simple truth is that interstellar distances will not fit into the human imagination
- Douglas Adams
that not everything can be filtered, but this is a search using the English alphabet. How good (read: horrible) is the filter for searches in the Chinese language?
They called me mad, and I called them mad, and damn them, they outvoted me. -Nathaniel Lee
This gives me an idea of how I can get past Bush and Co. monitoring my internet usage. I'll be able to say with a straight face that I never searched for Porn, but rather I was hoping to find information about shellfish
So, the Chinese can successfully access slashdot?
...as a Leader of the Revolution.
Kind of reminds me of when Napster installed that half-assed search filter. Midonna and Mitallica suddenly became quite popular.
People who want to get information will get it, and you can't stop them.
As we all know, Google has a patented page ranking system that calculates the correlation of words with websites. It does this (primarily) by reading links from all of its cached websites and parsing html links to determine what words are being used to describe the page in the link.
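To make the anchor-text idea above concrete, here is a minimal sketch of collecting the words that links use to describe the pages they point to. It is not Google's ranking system, just the general idea of reading link text; the class name, function name, and example URL are made up for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collects the words used inside <a> tags, grouped by link target."""

    def __init__(self):
        super().__init__()
        self.words_for_url = defaultdict(list)
        self._href = None  # target of the link we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None

    def handle_data(self, data):
        # Only text that appears inside a link counts as a description.
        if self._href:
            self.words_for_url[self._href].extend(data.lower().split())


def collect_anchor_words(html):
    """Hypothetical helper: map each linked URL to the words describing it."""
    collector = AnchorTextCollector()
    collector.feed(html)
    collector.close()
    return dict(collector.words_for_url)
```

If many pages link to the same URL with the same misspelled word, a search engine that weighs link text would presumably start associating that misspelling with the page.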
A while back, this was known as Google Bombing and certain individuals exploited Google's system very effectively by linking to pages with words that, by all rights, were not very accurate. After all, do a Google search for the word 'failure' and the top site is George W. Bush's Whitehouse domain Biography.
So what do you do to help the Chinese? Perhaps you could make a page with two columns. In one column would be the correct text with no link and the key word. In the other column would be all the permuted misspellings with links to the real sites. You could host this on your website and send it to friends asking them to also host it. They would need to slightly alter it and host it, but it would effectively provide the page ranks for the misspellings and allow anyone in China (who has access to your page) a key if they need it.
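The "permuted misspellings" column could be generated rather than typed by hand. A rough sketch, assuming the only kind of misspelling we care about is swapping a single vowel (the function name and that restriction are mine, not anything from the article):

```python
VOWELS = "aeiou"


def vowel_variants(word):
    """Return every spelling that differs from `word` by exactly one vowel."""
    variants = set()
    for i, ch in enumerate(word):
        if ch in VOWELS:
            for v in VOWELS:
                if v != ch:
                    variants.add(word[:i] + v + word[i + 1:])
    return variants
```

For "tiananmen" this includes "tienanmen", the misspelling from the story. Real typos also involve dropped or doubled letters, which this sketch does not cover.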
My work here is dung.
This is a perfect example of why I've been saying all along that Google is making the right decision in cooperating with the Chinese Government: http://yro.slashdot.org/comments.pl?sid=175251&cid=14571383
Isn't it somewhat obvious that a word filter would not filter out words with no 'evil' meaning? I mean, why would Google want to block people from searching for words like "deedom" if the word "freedom" was to be blocked?
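A toy illustration of the point, assuming the filter is nothing more than an exact-match blocklist (nobody outside Google knows how the real one works, and the names here are invented):

```python
BLOCKED_TERMS = {"tiananmen"}  # hypothetical blocklist, for illustration only


def is_blocked(query, blocked=BLOCKED_TERMS):
    """Return True if any whole word of the query is on the blocklist."""
    words = query.lower().split()
    return any(word in blocked for word in words)
```

With this kind of filter, "Tiananmen square" is caught but "Tienanmen square" sails through, because the misspelling is simply a different string.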
Now was this simply a failure of the filter method used, or did google deliberately create a weak filter to subvert the effort?
...search for common mis-spellings such as "Tienanmen" will yield plenty of photos of tanks.
So I did a Google search and all those pictures of tanks are basically one photo hosted on different sites.
Thanks for your feedback. We will endeavour to respond to your bug report as soon as possible, and release an update if appropriate. Sincerely, Google information liberation management team Google Inc. "Do no evil."
Who would have thought a technique spammers use to beat filters would have real-world value?
Is Google's filter Bayesian-based?
Ignorance is curable, stupid is forever.
So.. Chinese people speaking the same broken Engrish on the Internet as they typically do elsewhere beats the Great Firewall of China.
Engrish in the spirit of Freedom!
--- We need more Ron Paul!
It would probably be better to *NOT* point these things out.
LSA is useful for dealing with synonyms, so I cannot see any reason why it wouldn't work with misspellings (assuming that they're common).
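LSA itself needs a corpus and a matrix factorization, so as a much simpler stand-in here is a sketch that uses plain string similarity from Python's standard difflib module to catch near-misses of a blocked word. This is not LSA, and the blocklist and the 0.8 cutoff are arbitrary assumptions.

```python
import difflib

BLOCKED_TERMS = ["tiananmen"]  # hypothetical blocklist, for illustration only


def near_blocked_terms(word, blocked=BLOCKED_TERMS, cutoff=0.8):
    """Return the blocked term that `word` closely resembles, if any."""
    return difflib.get_close_matches(word.lower(), blocked, n=1, cutoff=cutoff)
```

A filter like this would catch "Tienanmen", at the cost of possibly flagging innocent words that merely look similar to a blocked one.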
bang goes my karma... again...
People whining about Google's actions with respect to China fail to realize that the alternatives (even more dreadful Chinese filtering, Google being banned entirely, etc.) are worse for Chinese freedom.
...and so the weakness of computers is revealed: people and their presumption of perfection.
Sig? - yeah, whatever.
I could do some silly Jurassic Park quote : "Life finds a way" or something equally wise and witty, but all that's needed to be said is what's in the subject. The very concept of trying to control information on the medium of the internet is like the perpetual motion machine. Nice idea, great money sink, but utterly impossible to implement. The only way for this kinda censorship to work is for all users to agree to abide by its rules (including correct spelling), which isn't gonna happen I think in the cases mentioned in the article. Just like file sharing, it's still possible to steal music, but most people don't bother anymore because it can be downloaded legally and as a group we generally agree to abide by that rule. Once we didn't but now we do with the existence of legal alternatives.
The rock, the vulture, and the chain
Friedums just anoder werd for nuthin lef 2 looze, and nuthin aint werth nuthin but it's Frie.
:(
I'm so dam Ronery
He who knows best knows how little he knows. - Thomas Jefferson
Of course, why do people think l33t sp33k was invented in the first place?
READY.
#
...is a big hit around the world
"The more you try to out-think the plumbing, the easier it is to stop up the drain." - Cmdr. M. Scott
If there is one thing that many of us have learned over the course of our internet-connected lives, it is the simple fact that there is a work-around for EVERYTHING.
There has yet to be a copy protection scheme that hasn't been defeated. There is no internet filter that can't be bypassed, and no blocking that can't be dodged.
What the Chinese need to learn is that their efforts are as futile as attacking a funny farm with a banana. Someone will be able to find a way around the blocking and will get to information that the government doesn't want them to get to.
Someone needs to wake them up with a clue-by-four and explain how the real world works
-- Wiccan Army, 13th Airborne Division "We will not fly silently into the night"
Only two mistakes. The "I" should not be capitalized and the "a" should be replaced with the word "sum" - Othur then thees purfict spallin duude!.
I guess this workaround will be quietly blocked at some stage ... until the next workaround emerges. Google are in too deep now, though. Their China venture is a whopping mistake, imho. The company whose business pitch is that we should trust it with the world's information falls at the first hurdle by showing it cannot be trusted with even a part of the world's information if the bribe is large enough.
Las qué passoun
tournoun pas maï
It was good while it lasted.
Free Software: Like love, it grows best when given away.
I think AOLers are quite safe from Chinese censorship.
I was going to reply with something along the lines of a resounding "DUH!!!" (remember the last days of Napster?), but Taco's from the see-thats-why-i-misspell-stuff dept. made me laugh out loud and forgot what I wanted to say. Well done :)
ClutterMe.com - easiest site creation on the Net. Just click and type.
...when he said something to the effect:
"The more you overtake the plumbing, the easier it is to clog the drain."
China has a Maginot-Line mentality, and their censorship efforts will eventually fail just as miserably.
(ST flames and corrections, and French jokes, may commence now.)
Cloned foods give the statement "We had that last week!" a whole new meaning.
Those in China can easily use the uncensored and unblocked www.google.com.hk or www.google.com.tw
I like that idea of the deliberately misspelled words. Once the Chinese dissidents find out they should be searching for D3M0C@CY and HUM@N R16HTS, the censors will be a step or two behind.
This sig, aah-ah, is comin' like a ghost-sig...
Why the Chinese link didn't work for me. I could see everything that was suppose to be block. I see the reason now is because I am the worlds worst speller.
Star Trek, there maybe hope.
SHUT UP!
Do you want to ruin it?
Come on, damnit! Shutupabout it.
Consider this the "getting your foot kicked under the table" move.
Check out my sysadmin blog!
I guess that means they can still find ./ huh?
They're filtering English misspellings, but what about French, Spanish, or German? A Chinese person could just search for what they're looking for under different languages. Granted, English is taught in China in their schools to everyone, but the folks who know other languages can start getting things and spreading it to the others.
I think they do, I just don't think it corrects it in your search results (When it says "Did you mean" and then what it thinks is the correct spelling). I know there are plenty of times I have actually wanted to search for something and it kept suggesting the wrong thing! - nosebreaker.com
A) Google guesses what you are trying to spell, and does it very well.
B) This is an oversight that would be easily corrected.
C) You just announced it publicly and unignorably.
D) Most of the people censored don't spell it with latin characters anyway.
As much as many of you would like to think that Google "slipped this in" on purpose I have news. Google announced they shall do business in China, and will do whatever it takes to do so.
This is no intentional 'hack' of the system. It's a new content filter and there's going to be holes to be patched and creative solutions to be found for creative problems.
So before you go hail the Google dev team as being revolutionary, maybe you should consider they just missed the mark the first time around and have a lot of clean up to do with this "feature."
If you're half as beautiful naked, you'd be 4 times as beautiful with twice as many clothes on.
The following two searches both lead to images of tanks.
http://images.google.cn/images?svnum=10&hl=zh-CN&lr=&q=Tiananmen&btnG=%E6%90%9C%E7%B4%A2
http://images.google.cn/images?svnum=10&hl=zh-CN&lr=&q=Tienanmen&btnG=%E6%90%9C%E7%B4%A2
It appears that they are not filtered out at all, regardless of the correctness of the spelling.
-Eric
SJW: Someone who has run out of real oppression, and has to fake it.
why'd y'all have to go and blab about this? don't you think the people who most benefit from this loophole could learn by word of mouth? Now the Chinese govt knows to go beat up Google. can't you just see the RFQ: "prease submit bid to peopers minsitry of truth. We seek bids and proposars for sperring checker prug-ins and key roggers"
SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
Hm. Most Chinese wouldn't search for a romanized string "Tiananmen". They would search using Chinese characters. This sort of off-by-one "misspelling" is rare when you have to pick among actual pictographs. Second, assume that someone would search in pinyin: it's not like "Tiananmen" is actually one word. The first instinct for a search in pinyin would be "Tian An Men", as there are in fact three characters being mapped to. Also kind of difficult to misspell, as these are atomic sounds in pinyin or very simple, two-sound combinations taught from first grade up, and "en" and "an" are seriously different sounds mapping to two very different sets of pictographs.
To hit a misspelling like Tianenmen, you'd have to be thinking in an English-oriented mindset. Heh, or have prior knowledge of this little issue and are deliberately out looking for trouble.
Erotic image here.
Nice lighting.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
Writing of the damaged Engrish is inducement of Great Firewall failure! China is go for many information of the "freedom" by using spelling of the internet Engrish! 31i73 breaking of the language, for great justice!!
USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
It looks like slashdot will always be visible in China. :-)
http://saveie6.com/
So I am curious now. If people run around and make a lot of noise about how to defeat the censorship, won't the Chinese Government demand a fix to close those holes or send Google packing? This sorta puts the whole thing back at square one. Normally I would say full disclosure of security and things like this is a good thing, but in this case it may not be the best. Sort of a Google 0-day situation: this isn't something you want the authorities to know about, and something you don't want fixed for as long as possible.
The only change I can believe in is what I find in my couch cushions.
Thanks for blowing it for the Chinese...putting a link to some backwater news site on the front page of Slashdot.
On a more serious note, couldn't people who are not in China put up a little proxy to return Google results? For example, I have a domain hosting a few pages. Could I put a little script to take a query entered at my site and return results obtained from Google?
"Me fail English? That's unpossible." - Ralph Wiggum
Do a search for Falun Gong using both the regular Google and google.cn. The China version not only censors the results, but pushes propaganda to the top of the search results. "Don't be evil" indeed. I absolutely love Google and won't be switching anytime soon. But maybe for the China version they should just change the "G" in Google to a hammer and sickle.
The combination should be quite amusing and effective...
Oh well, what the hell...
... so in Communist China, I can safely get my jollies by typing "pl0n"?
When talking to a colleague back in 1989, they revealed that even back then there was a KFC (Kentucky Fried Chicken) in Tiananmen Square. It was never seen on any of the news footage, though.
Environmentalism is the new Victorianism. Everyone ties on a green corset and pretends we're virtuous.
Chinese web users can see full, uncensored results for their Google search by replacing "&meta=" with "&meta=cr%3DcountryBR" in the URL. Once the string is replaced, the censorship will not affect the results.
This is what a Chinese search for democracy looks like after this method has been applied:
http://www.google.cn/search?hl=zh-CN&q=democracy+china&btnG=%E6%90%9C%E7%B4%A2&meta=cr%3DcountryBR
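For anyone who wants to script the URL substitution described above, here is a small sketch using Python's standard urllib.parse. The function name is made up, the "countryBR" value is simply the one quoted in the comment, and there is no guarantee the trick still works on the live site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_country_restrict(url, country="countryBR"):
    """Rewrite the `meta` query parameter to `cr=<country>` (URL-encoded)."""
    parts = urlsplit(url)
    # keep_blank_values=True so an empty "meta=" is not silently dropped
    params = parse_qsl(parts.query, keep_blank_values=True)
    params = [(k, "cr=" + country) if k == "meta" else (k, v) for k, v in params]
    return urlunsplit(parts._replace(query=urlencode(params)))
```

A URL without a `meta` parameter comes back with its query re-encoded but otherwise the same; the sketch only rewrites a parameter that is already there.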
looks like it's not working anymore. by this i mean a search for the correct spelling on google.cn turns up tanks aplenty. what gives?
Am I the only one thinking "why are we advertising this so they modify their filters and improve them"? That's great that people are finding ways around the filters... but maybe keep that on the down low??
Just as in English, where it is common to use "misspellings" to write things quickly, such as "u", "luv", etc, it is also common in Chinese to use homonyms that are easier to write as a "shorthand" for more complex characters. I have seen people do this when writing quick notes, especially waitresses.
Imagine, if you will, the reverse process. I have noticed that others can't get Chinese characters to display properly here, so I won't try. However, the characters for Tiananmen are all relatively simple characters. Perhaps people, for the benefit of our Chinese fellow net-users, could use more complex homonyms (keeping the tones the same for maximum effect and ease of remembering). tian1 (1st tone) has two homonyms listed in my dictionary, one meaning light-yellow and the other meaning oppose.
an1 has twelve, two of which are simpler and ten of which are more complex than the "correct" character in this case.
men2 has 5.
If you add syllables with different tones you significantly increase the possibilities.
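A quick back-of-the-envelope count of those same-tone substitutions, assuming the dictionary figures above (2, 12 and 5) count alternatives in addition to the "correct" character, and that each syllable is chosen independently. The function name is made up.

```python
from math import prod


def variant_count(homonyms_per_syllable):
    """Number of alternate spellings when each syllable may keep its
    original character or use any one of its listed homonyms."""
    # (n + 1) choices per syllable; subtract the all-original spelling.
    return prod(n + 1 for n in homonyms_per_syllable) - 1


print(variant_count([2, 12, 5]))  # (3 * 13 * 6) - 1 = 233
```

So even before mixing in other tones, there would be a couple of hundred written variants for a filter to keep up with.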
Changes in capitalization also work, for now.
Look... as much grief as Google is getting for this, they know hackers are going to get past the wall. The Great Fire Wall of China will work about as well as the original did. It's there to make a point and it's not going to stop anyone.
Google has been getting all the news on this, for first not applying filters, then capitulating. However, the other major search engines all agreed to apply search filters for China right off the bat, and got no press at all.
What I wonder, and am too lazy to figure out for myself, is whether the "misspelling" workaround is functional not just for Google China, but for Yahoo China, MSN China, etc. I suspect it is.
Web 2.0 == Giant Blogspam Circle Jerk
Google, this is unbelievably disappointing. You lost a lot of grassroots support when you decided to support the suppression of freedom. How does it feel to be a participant in a communist government, and how does it feel to be fighting against the mindset that put you where you are today? A couple of college nerds, one from Russia, now billionaires. You're not fooling anyone - you did this to cash in on an emerging market. This situation makes me sick.
Deal with this, google.cn!
---
Wants pawn term, dare worsted ladle gull hoe lift wetter murder inner ladle cordage, honor itch offer lodge dock florist. Disk ladle gull orphan worry ladle cluck wetter putty ladle rat hut, an fur disk raisin pimple colder Ladle Rat Rotten Hut.
Wan moaning, Rat Rotten Hut's murder colder inset, "Ladle Rat Rotten Hut, heresy ladle basking winsome burden barter an shirker cockles. Tick disk ladle basking tutor cordage offer groin-murder hoe lifts honor udder site offer florist. Shaker lake! Dun stopper laundry wrote! An yonder nor sorghum-stenches, dun stopper torque wet strainers!"
"Hoe-cake, murder," resplendent Ladle Rat Rotten Hut, an tickle ladle basking an stuttered oft. Honor wrote tutor cordage offer groin-murder, Ladle Rat Rotten Hut mitten anomalous woof. "Wail, wail, wail!" set disk wicket woof, "Evanescent Ladle Rat Rotten Hut! Wares are putty ladle gull goring wizard ladle basking?"
"Armor goring tumor groin-murder's," reprisal ladle gull. "Grammar's seeking bet. Armor ticking arson burden barter an shirker cockles."
"O hoe! Heifer blessing woke," setter wicket woof, butter taught tomb shelf, "Oil tickle shirt court tutor cordage offer groin-murder. Oil ketchup wetter letter, an den - O bore!"
Soda wicket woof tucker shirt court, an whinney retched a cordage offer groin-murder, picked inner widow, an sore debtor pore oil worming worse lion inner bet. Inner flesh, disk abdominal woof lipped honor bet an at a rope. Den knee poled honor groin-murder's nut cup an gnat-gun, any curdled dope inner bet.
Inner ladle wile, Ladle Rat Rotten Hut a raft attar cordage, an ranker dough belle. "Comb ink, sweat hard," setter wicket woof, disgracing is verse. Ladle Rat Rotten Hut entity bet rum an stud buyer groin-murder's bet.
"O Grammar!" crater ladle gull, "Wood bag icer gut! A nervous sausage bag ice!"
"Battered lucky chew whiff, doling," whiskered disk ratchet woof, wetter wicket small.
"O Grammar, water bag noise! A nervous sore suture anomolous prognosis!"
"Battered small your whiff," insert a woof, ants mouse worse waddling.
"O Grammar, water bag mousy gut! A nervous sore suture bag mouse!"
Daze worry on-forger-nut gulls lest warts. Oil offer sodden, thoroughing offer carvers an sprinkling otter bet, disk curl and bloat-thursday woof ceased pore Ladle Rat Rotten Hut an garbled erupt.
Mural: Yonder nor sorghum stenches shut ladle gulls stopper torque wet strainers.
If you disagree with me on social issues, then it's pretty clear that you are a narrow-minded bigot.
1n Ch1n3s3, a s1ngl3 charact3r ( f0r 3xampl3 -- th0ugh 1'm n0t sur3 1f
th1s w1ll d1splay pr0p3rly) r3pr3s3nts a wh0l3 syllabl3 (as w3ll as a
m3an1ng 0r 1d3a), rath3r than a c0ns0nant 0r v0w3l, as m0st 3ngl1sh
l3tt3rs d0 (s0m3 ar3 unpr0n0unc3d, 0r just chang3 th3 s0und 0f an0th3r
l3tt3r).
Th1s 3l1m1nat3s c3rta1n typ3s 0f bad sp3ll1ngs, 0bv10usly, but 0p3ns
c3rta1n av3nu3s that ar3n't ava1labl3 1n 3ngl1sh, such as ch00s1ng
charact3rs w1th s1m1lar m3an1ngs but d1ff3r3nt s0unds, 0r s1m1lar
s0unds but d1ff3r3nt m3an1ngs.
F0r th3 T1ananm3n 3xampl3, th3 charact3rs f0r T1anAnM3n () m3an
"H3av3n," "P3ac3," "Gat3." H3av3n c0uld b3 r3plac3d w1th "Sky," wh1ch
has a c0mpl3t3ly d1ff3r3nt s0und, 0r "M0n3y," wh1ch (1f 1 rcall
c0rr3ctly) 1s pr0n0unc3d "Q1an" (Q s0unds cl0s3 t0 3ngl1sh CH). Th1s
c0uld als0 happ3n w1th w1th th3 0th3r tw0 charact3rs 1n th1s w0rd, and
0f c0urs3 f0r many 0th3r 'bad' w0rds.
Th3 r3as0n that c0mm0n w0rds l1k3 "pr0n" hav3 b3c0m3 ass0c1at3d w1th
p0rn, 0r 0th3r 3xampl3s, 1s that a c0mmun1ty 0f us3rs agr33d up0n a
c3rta1n m1ssp3ll1ng 0f th0s3 w0rds, and th3 sam3 can and W1LL happ3n
1n Ch1na t0 3vad3 what3v3r f1lt3rs s3arch 3ng1n3s us3. Th3r3 1s n0 way
t0 hav3 an 3v3n s3m1-0p3n s3arch syst3m that d03sn't all0w human
1ng3nu1ty t0 0v3rc0m3 1ts f1lt3rs, and th3 br13f h1st0ry 0f th3
1nt3rn3t 1n th3 w3st 1nd1cat3s that th3s3 f1lt3rs w1ll, ult1mat3ly, b3
0nly part1ally and t3mp0rar1ly 3ff3ct1v3.
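The flip side is that a fixed substitution scheme is easy to undo. A minimal sketch of a normalizer a filter might run before matching; the character map is my own guess at common substitutions, not anything a real search engine is known to use:

```python
# Hypothetical map of common digit/symbol substitutions back to letters.
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})


def normalize_leet(text):
    """Lower-case the text and undo simple one-for-one leet substitutions."""
    return text.lower().translate(LEET_MAP)
```

This turns "T1ananm3n" back into "tiananmen", but it also mangles legitimate digits in queries (years, model numbers), and it does nothing against a community agreeing on a new spelling next week, which is the point made above.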
If this was going to be so insightful, you'd think I would've gotten a mod up when I posted this the first time.
I found many pictures of tanks the other day, when the news of GIS.cn's censorship was posted on metafilter. Including a few chinese character queries (including tian-an-men tan-ke). One of the things to remember is that the chinese are going to be searching in chinese characters, not english.
Searching for something as simple as "tank man" or "tank square" on GIS.cn will get you the pic you're looking for, btw. As long as you don't include "tiananmen" in the query, you'll get it.
autopr0n is like, down and stuff.
Could bring a whole new meaning to the expression "spelling/grammar nazi" if the Chicoms decide to start rejecting queries with too many non-OED words.
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
If you search for "square massacre" you get this lovely image, and then there's this one you get in a search for "china tank".
autopr0n is like, down and stuff.
God Hates Shrimp?
...even spell it wrong, you just need to capitalize the first letter and you go from happy gardens to tanks.
Ryan - http://www.thecosmotron.com/
Now please report to the education center for re-Nedification
-William Shatner can be neither created nor destroyed.
In the back of my mind, I am beginning to think that images are the only way to really get the point across. Images are hard to parse for text, and meaning.
If you could post an image of a tank rolling over someone in China with a good imbedded caption, it might get the point across without alerting the Chinese government.
Next year, it won't work, thanks to antifoidulus.
http://outcampaign.org/
r33l!, th3 b!7 @b0u7 $p3llin9 !$ n0 $urpr!$3 70 31337 h@><0rz....
www.wavefront-av.com
. . . the Internet reroutes around the damage.
];)
Regards;
Thanks for alerting the Google Nazis!
While possible, this tactic may not be a good idea. If you are in China and attempt to repeatedly circumvent the filter, I suspect sooner or later you will get a knock on the door.
Ninjas don't carry tic tacs
Mixing up "tian" and "tien" has a very low probability for native Chinese. The syllable "tien" does not exist in the pinyin romanization of Mandarin Chinese. So blocking out the general public is ensured, which is good enough for "them". Searching for terms in English or German also produces quite a number of "useful" sites that are visible through the great firewall.
As more and more Chinese people come online, they're going to wonder why entire websites suddenly disappear for weeks and reappear. They're going to find ways to defeat filters and internet cops to satisfy their curiosity about forbidden knowledge. They're going to find clever ways to encrypt and disguise their communications with each other and the outside world. They're going to remember that information is power. It is inevitable and unstoppable.
Surely I've just missed the post where someone considered the possibility that Google did this on purpose. If I were going to try to do no evil yet still allow the Chinese people to have access, I would comply very explicitly with the requirements provided by the government, even though you could easily see holes in the implementation. Google has made its name in part by going above and beyond to supply the "correct" results regardless of how inept the user is. Surely they must have thought of this, and I'm pretty sure they will have overlooked this on purpose.
Second, the misspelling list keeps growing, and Google has to keep updating, which takes up resources.
Third, China, or maybe even Google, just puts in a phonetic parsing bit and bans on phonetics. Maybe they apply some other ingenious code (something Google can and does come up with all the time) to the problem.
Result, a more efficient and adaptive censoring machine.
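To show what a "phonetic parsing bit" could look like, here is a sketch of the classic Soundex code, under which words that sound alike get the same four-character key. Whether any search engine filters this way is pure speculation, and this is a simplified take on Soundex for illustration.

```python
# Consonant groups used by Soundex; vowels, h, w and y get no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return a 4-character Soundex key, e.g. 'Robert' -> 'R163'."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    digits = []
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (word[0].upper() + "".join(digits) + "000")[:4]
```

Because vowels are ignored, "Tiananmen", "Tienanmen" and even "Teeanamen" all collapse to the same key, so a ban on the key would catch them together. It would also catch unrelated words that happen to share the key, which is the usual trade-off with phonetic matching.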
Face it people, Google has jumped the "evil" shark.
I'm a fiscal conservative, it's a pity we don't have a political party anymore
Didn't people in WoW reject chinese players who couldn't spell sentences correctly?
This reminds me of the phrase: "Your famine is my feast".
Enuf sed
When the people fear their government, there is tyranny; when the government fears the people, there is liberty.
Example: Teeanamen Skware.
An incorrect spelling like that gets published, say HERE, and is noted by some Chinese equivalent of Winston Smith in the Chinese Minitrue, and it's passed over to the directorate for inclusion on words to ban. Eventually you run out of room to run, even if you spell (correctly or otherwise) in variants of 1337.
The only way to HELP the Chinese find this info is to keep it on the QT, as a sub rosa info exchange. Of course, the Big Businesses that own the Major Media are *not* interested in that - they support the fascist pigfuckers in Beijing because they're the ones supporting our idiotic adventures in Babylonia by buying our debt, they're the ones who are keeping the rapacious maw of the town busting barns of WalMart in stock with cheap goods, and they're the ones who are most interested in watching the USA drive right off the energy / debt cliff.
We have to be clear: the Fascist Chinese
( calling them communist is an insult to the memory of the likes of Rosa Luxemburg, Karl Liebknecht, Friedrich Engels, Karl Marx, Georg Lukacs, Adorno, Benjamin, etc. and all the other great left-wing thinkers of our modern era, just as calling fascists like G.W. Bush and DICK Cheney "conservatives" is an insult to the memory of great conservative thinkers like Hayek, Hamilton, Burke, etc. )
will do whatever it takes to stay in power, including killing innocent people. The greatest political threat to humanity in this world today is this fascism - so deeply entrenched in China, and flowering so madly in the TV addled American Middle Class. We, as the "intelligent" bunch, need to be much SMARTER in how we deal with these fascist pigfuckers. And providing inverse roadmaps for greater repression is NOT a way to help them. We need to be quieter about our solutions and louder in our criticism. Oil production is peaking, and it's going to be a fight to the death for the rest of this century over what remains. The Chinese want more - WAY MORE - and they will cheerfully use the laziness, greediness, and shortsighted stupidity of Americans against Americans in order to direct us to the cliff of self-destruction. The greedy fascist pigfuckers in China .gov AND the Bush Junta must be stopped - the Bush bunch and their big business buddies are too stupid to know they are being played, big time.
RS
Shoes for Industry. Shoes for the Dead.
Well... now we've got a new justification for 1337 speak. Haven't had one since the good ol' days of dialup BBS, but the bastard is back.
Check back tomorrow, and learn that anything will beat slashdot's dupe filter.
-- We don't understand software, and sometimes we don't understand hardware, but we can *see* the blinking lights
This is how child pornographers have evaded filters for years.
If you have a site with some process-handling C code, which puts the word 'child' in a lot of strange places, weird Google searches will show up as referrals in your web logs. Quite creepy actually.
Not even Google can protect you from your own stupidity.
This reminds me of the page on the many permutations of the spelling of "Aargh."
Google so far has been taking the high ground by saying in effect that the Chinese public now has more information than they previously had (an argument I disagree with, BTW, because substitution of content by a major media outlet is thought control, such as Clear Channel's refusal to play John Lennon's Imagine). Now, if Google takes away information it had once granted, it will constitute blatant censorship. The direction of information will be ebbing rather than flowing. And Congress was trying to hold hearings over just the (restricted) flowing part.
Google may now be in a place of having to choose between looking really nasty or losing the business in China.
The Chinese will want it fixed. This should be the "worst kept secret" not news. :-)
Think Deeply.
s/arms/code/
Sounds like a great opensource project.
1. establish a correspondence/permutation table.
1a. Start ECMA fast-track standard to ISO.
1b. auto generate "gahtchya"(TM) images of table entries to foil crawlers, keyword censors and image processing for text. Patent pending on "gahtchya" (TM) synthesis of Chinese character strings for decensorization (TM).
1c. Encode images in DNS records ala DeCSS
1d. Get grant from government agency (DHS? DoD? It should make a twisted kind of sense to some bureaucrat)
1e. Create adSenseless and banner ads (G won't casually block the revenue sources)
1f. Pay for ads with Grant money (unmarked, non-sequential $50s)
2. ???
3. Freed Information!
PS Hey Google, what is the appeals process for incorrectly censored sites?
There is no right to feel safe thru security vaudeville at the expense of everyone's freedom, privacy and tax money.
The notion that "information wants to be free" seems more valid all the time.
sigs, as if you care.
O RRY?
How ya like dat?
How would China stop their citizens from using www.google.com instead of www.china.cn? Or, how about a proxy website? Do the internet providers in China filter out websites they are even allowed to use? Where I work, we use a program called Websense on our servers that does not allow certain websites to come up.
Click Click Bloody Click PANCAKES!
to leave as many leaks as Google will? Neither do I? Nor will their search be as effective. Chinese citizens win both ways.
This seems to be happening at the DNS level. "google.cn" resolves to "216.239.39.99", which is assigned to Google in Mountain View CA. A traceroute doesn't show a path to China at all.
Now, interestingly, if you look up "google.cn" in US Google, and get the cached page, you're really seeing the censored view of Google, in its English language edition.
To try this, go to the cached page above, and enter "falun gong". The top search results are "The Cult of Falun Gong", "Falun Gong Evil and Harmful", "Falun Gong Members Found in Slander Case.", "Heretical Cult -- The True Colors of Falun Gong", and "Outlawing Falun Gong Cult". That's obviously the censored version. The search doesn't come up blank. There's no message about censorship. You get the Official Approved Propaganda Results. It's very Orwellian. And it's not what Google has been telling the US press.
Now try the same search with US Google. You'll get all the real Falun Gong sites, and the Wikipedia entry.
So that's Google's Ministry of Truth in action. Try it yourself.
Yes, and Both Romes fell, the third endures, and a fourth there will never be, which was a statement of the Orthodox Church, but also was used to support the Ultimate Victory of Communism(TM) for a time.
But skipping back to your closing statement, how do you know that?
Let me take it as a statement of your faith that Now That We Have Modern Accomplishments (TM), that good guys will win, and indeed must win.
I like the sounds of that. It reminds me of WWI, the War to End all Wars(TM). Clearly, though, their technology was not as advanced as ours, and so the ultimate human spirit didn't shine through. Or maybe poverty, which was supposed to go away with the New Deal type programs (or with the UN), hadn't quite vanished by then, since Poverty is the Source of All Evil(TM).
But I have seen conflicting statements of faith that just might go against that. For example, certain evangelical Baptists think that China will rise up with a million-man army for a final battle in the Middle East. That doesn't sound like Technology in Service of Humanity(TM) to me.
So exactly which statement of faith should I believe, and why?
Because right now, I'm not convinced that Technology Solves the Problem of Human Evil. Call me a skeptic.
Correct Horse Battery Staple: 72 bits of entropy. Enter "Correct H" into google. When it generates the phrase, that's
Why do I think that Google is going to be slow to fix this bug?
No, I will not work for your startup
While I am sure there are all kinds of censorship of free speech pages and politics, I don't think this is a valid example, for this reason: Tiananmen Square is a very famous location in Beijing, whereas people foreign to China have probably only heard of Tiananmen Square from the protest back in the 80s. So obviously a search on Google China would turn up pictures of Tiananmen Square, whereas a search on Google USA would turn up images of the protest. For example, a search for "World trade Centers" on Google USA would probably give mostly images of the World Trade Center, and not images of the 9/11 attacks. Another reason would be that many people in China simply don't know about the event, because they never talk about it. Naturally, there wouldn't be many images of the protest. So yes, it is still an example of censorship, but not exactly specific to the Google search engine. My point is that this may just be a problem of context, and doesn't really indicate censorship from Google. Maybe a link to a politics site would give more of an indication. I just went and searched "" (Tiananmen Square incident, in Chinese) on Google China and it gave me tons of results for pictures and articles of the protest, much more in depth than Google USA actually. Also, all Chinese sites referring to the Tiananmen Square incident will be talking about it in CHINESE, OBVIOUSLY. And the only people who are going to write Tiananmen and spell it wrong are most likely foreigners, who are far more likely to be talking about the incident than about Tiananmen as a LOCATION.
Link went to pictures of shellfish!
I didn't just pull 'q = ch' out of my ass, it's the standard-use pinyin.
The reasons they chose to use 'q' are that:
a) it wasn't taken for any other sound, as it doesn't represent a unique sound in English
b) it is recognized by native speakers as a separate sound from the one spelled "ch." They appear in a complementary distribution, hinting that in the past they were the same sound, and different following vowels affected that sound in different ways. I haven't studied the history of Chinese much, but I do have a degree in Linguistics, so take this however you want.
c) it isn't exactly the same as the English "ch," so using that spelling would be confusing. That sound can be spelled many different ways with the roman alphabet, no need to use English as a guide anyway.
In summation, look it up on wikipedia, or read the other posts for some more info.
Although the moon is smaller than the earth, it is farther away.
"CNN reports...how poor spelling can beat Google's Chinese filter."
Dolts. Not for much longer.
google doesn't censor anything themselves, they simply run the chinese webcrawler from within the great firewall. the problem is with the great firewall not being too bright about misspellings, not anything google does. it's not google's job to make sure china gets their software right.
http://notanumber.net/
I can't believe it, but it seems that "Tiananmen" and "tiananmen" give different results. Try http://images.google.cn/images?q=Tiananmen You'll be surprised. In fact, a Chinese user would enter Chinese characters instead of letters. And the servers hosting the pictures are not Chinese. Heipi
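The case and spelling sensitivity reported in this thread is the behavior you would expect from an exact-match keyword list. Here is a minimal sketch of that idea, purely hypothetical: nobody outside knows how the real filter works, and the blocklist, function names, and 0.8 threshold are all made up for illustration. The second function shows how a fuzzy comparison (Python's standard `difflib`) would catch the misspelling that the exact match lets through.

```python
from difflib import SequenceMatcher

# Hypothetical blocklist; the real filter's contents are unknown.
BLOCKED_TERMS = {"tiananmen"}


def exact_filter(query):
    """Return True if a word in the query exactly equals a blocked term.

    The comparison is case-sensitive and spelling-sensitive on purpose,
    to mimic a naive filter.
    """
    return any(word in BLOCKED_TERMS for word in query.split())


def fuzzy_filter(query, threshold=0.8):
    """Return True if a word in the query is close to a blocked term.

    Words are lowercased and compared with SequenceMatcher.ratio();
    the threshold is an arbitrary illustrative value.
    """
    return any(
        SequenceMatcher(None, word.lower(), term).ratio() >= threshold
        for word in query.split()
        for term in BLOCKED_TERMS
    )


if __name__ == "__main__":
    for q in ["tiananmen", "Tiananmen", "Tienanmen", "square"]:
        print(q, exact_filter(q), fuzzy_filter(q))
```

Under these assumptions the exact filter only blocks the all-lowercase correct spelling, while the fuzzy one also blocks "Tiananmen" and "Tienanmen" and still lets "square" through.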
I think the Google Goons did it on purpose. It is funny though. About as funny as their April Fools' Day stuff.