Poor Spelling Beats Google's China Filter
antifoidulus writes "CNN's money section contains a blurb (among other blurbs) about how poor spelling can beat Google's Chinese filter. The example given in the article is that a search for 'Tiananmen' will yield peaceful pictures of the square, but a search for common misspellings such as 'Tienanmen' will yield plenty of photos of tanks."
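For anyone wondering why a one-letter typo would get through: nobody outside Google knows how the filter actually works, but the behavior described above is what you would expect from a plain exact-match blocklist. Here is a toy Python sketch of that idea. The blocklist contents, the function names, and the 0.8 similarity threshold are all made up for illustration; the fuzzy version uses the standard library's difflib and is just one way such a gap could be narrowed, not anything the article attributes to Google.

```python
import difflib

# Hypothetical blocklist; the real filter's rules are not public.
BLOCKED_TERMS = {"tiananmen"}


def exact_filter(query: str) -> bool:
    """Return True if any word in the query exactly equals a blocked term."""
    return any(word in BLOCKED_TERMS for word in query.lower().split())


def fuzzy_filter(query: str, threshold: float = 0.8) -> bool:
    """Return True if any word in the query is similar to a blocked term.

    Similarity is difflib's SequenceMatcher ratio; the threshold is an
    arbitrary value chosen for this example.
    """
    for word in query.lower().split():
        for term in BLOCKED_TERMS:
            ratio = difflib.SequenceMatcher(None, word, term).ratio()
            if ratio >= threshold:
                return True
    return False


if __name__ == "__main__":
    for q in ("Tiananmen", "Tienanmen"):
        print(q, "exact:", exact_filter(q), "fuzzy:", fuzzy_filter(q))
```

The exact matcher blocks the correct spelling and lets the misspelling through, while the fuzzy matcher catches both. The trade-off is that a looser threshold starts blocking innocent words, which is one reason a misspelling arms race never really ends.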
Kind of reminds me of when Napster installed that half-assed search filter. Midonna and Mitallica suddenly became quite popular.
People who want to get information will get it, and you can't stop them.
This is a perfect example of why I've been saying all along that Google is making the right decision in cooperating with the Chinese Government: http://yro.slashdot.org/comments.pl?sid=175251&cid=14571383
Who would have thought a technique spammers use to beat filters would have real-world value?
Is Google's filter Bayesian-based?
Ignorance is curable, stupid is forever.
It would probably be better to *NOT* point these things out.
Can you spell Bukcake? Or Pusy? Or AZZ? Get that by the filters!!!! But seriously, this is where pr0n comes from, the spelling that is, to get by filters...
...and so the weakness of computers is revealed: people and their presumption of perfection.
Sig? - yeah, whatever.
Google has really good suggested search terms for typos. Hint, hint. Skeet, skeet.
A NYC lawyer blogs. http://www.chuangblog.com/
SHUT UP!
Do you want to ruin it?
Come on, damnit! Shutupabout it.
Consider this the "getting your foot kicked under the table" move.
In Chinese, a single character ( for example -- though I'm not sure if this will display properly) represents a whole syllable (as well as a meaning or idea), rather than a consonant or vowel, as most English letters do (some are unpronounced, or just change the sound of another letter).
This eliminates certain types of bad spellings, obviously, but opens certain avenues that aren't available in English, such as choosing characters with similar meanings but different sounds, or similar sounds but different meanings.
For the Tiananmen example, the characters for TianAnMen (天安门) mean "Heaven," "Peace," "Gate." Heaven could be replaced with "Sky," which has a completely different sound, or "Money," which (if I recall correctly) is pronounced "Qian" (Q sounds close to English CH). This could also happen with the other two characters in this word, and of course for many other 'bad' words.
The reason that common words like "pr0n" have become associated with porn, or other examples, is that a community of users agreed upon a certain misspelling of those words, and the same can and WILL happen in China to evade whatever filters search engines use. There is no way to have an even semi-open search system that doesn't allow human ingenuity to overcome its filters, and the brief history of the internet in the west indicates that these filters will, ultimately, be only partially and temporarily effective.
Although the moon is smaller than the earth, it is farther away.
They aren't necessarily out to defeat the determined. They can, however, quickly and easily sanitize popular perceptions by sweeping things under the rug. The average citizen does a little search and never sees anything particularly shocking. Mission accomplished. And as I said, given time, the determined will eventually get their message across. The Internet just adds another layer to a game that's been going on since the dawn of government.
Am I the only one thinking "why are we advertising this so they can modify their filters and improve them"? It's great that people are finding ways around the filters... but maybe keep that on the down low??
I'd rather see Google grandstand about not bowing to China's governmental pressure to assist in the forceful suppression of ideas. Yes, that may get Google banned in China. However, Google is so big and powerful everywhere else in the world that news of its existence and popularity would become known to some curious folks in China, who would begin to resent their government for banning it. In that resentment you'll find the seeds of transforming change. That's a more self-aware path to change than embracing the half-truth of letting the Chinese people think: "Google? Oh yes. We have that too."
Look... as much grief as Google is getting for this, they know hackers are going to get past the wall. The Great Firewall of China will work about as well as the original did. It's there to make a point and it's not going to stop anyone.
meh. english romanization is not at all intuitive to non-english speakers: "cough", "ghost", "cant", "cent", "through", "trough". at least pinyin is consistent.
This is a tautology.