Slashdot Mirror


Poor Spelling Beats Google's China Filter

antifoidulus writes "CNN's money section contains a blurb (among other blurbs) about how poor spelling can beat Google's Chinese filter. The example given in the article is that a search for "Tiananmen" will yield peaceful pictures of the square, but a search for common misspellings such as "Tienanmen" will yield plenty of photos of tanks."

10 of 248 comments

  1. Heh. by Perseid · · Score: 5, Insightful

    Kind of reminds me of when Napster installed that half-assed search filter. Midonna and Mitallica suddenly became quite popular.

    People who want to get information will get it, and you can't stop them.

  2. Perfect Example... by oneiron · · Score: 5, Insightful

    This is a perfect example of why I've been saying all along that Google is making the right decision in cooperating with the Chinese Government: http://yro.slashdot.org/comments.pl?sid=175251&cid=14571383

    1. Re:Perfect Example... by rlthomps-1 · · Score: 3, Insightful

      Damn right -- the ultimate censorship would be if nobody provided search services except some sort of gov't-run site where every page is cleared ahead of time.

  3. Valuable Lesson from Spammers by TFGeditor · · Score: 3, Insightful

    Who would have thought a technique spammers use to beat filters would have real-world value?

    Is Google's filter Bayesian-based?

    --
    Ignorance is curable, stupid is forever.
  4. Not for long by GoatMonkey2112 · · Score: 5, Insightful

    It would probably be better to *NOT* point these things out.

  5. The weakness of computers by ColdCoffee · · Score: 3, Insightful

    ...and so the weakness of computers is revealed: people and their presumption of perfection.

    --
    Sig? - yeah, whatever.
  6. Re:Interesting. by darkmeridian · · Score: 3, Insightful

    Google has really good suggested search terms for typos. Hint, hint. Skeet, skeet.

    --
    A NYC lawyer blogs. http://www.chuangblog.com/
  7. On Behalf of Google, Freedom, and common sense by Bandman · · Score: 3, Insightful

    SHUT UP!

    Do you want to ruin it?

    Come on, damnit! Shutupabout it.

    Consider this the "getting your foot kicked under the table" move.

    1. Re:On Behalf of Google, Freedom, and common sense by wumingzi · · Score: 5, Insightful

      This seems as good a place to bring it up as any.

      Let's do a thought experiment.

      On one side, we have a reasonably interesting search engine company.

      On the other, we have a control-minded, autocratic government.

      The search engine company (that wants to operate in China) is told by the autocratic government "We don't want Bad Things sneaking in through the search engine. Keep Bad Things out."

      The search engine company says "OK. We'll play along. Give us a list of things you don't want to see. We'll get rid of them".

      "Taiwan Independence" returns 0 results.

      "Free Tibet" is delinked.

      Various combinations of Tiananmen, 6 and 4 mysteriously vanish.

      Unfortunately, Bad Things do not fit into nice little boxes. People mis-spell words. While it is easy to come up with a list of sites that contain Bad Things you do not want to see, new sites come up all the time. Is my friend's picture gallery from Tiananmen just some postcards to the folks back home, or is there some subtle political commentary in there? Well, you'll have to read it and find out.

      If I search on (former Taiwanese president) Lee Teng-Hui, does that contain Bad Things? Does it link to Bad Things? How dangerous is a stooped 85 year-old former college professor anyhow?

      Is Gandhi axiomatically Bad? Martin Luther King? Dostoyevsky? The list goes on and on and on.

      The censors can control the obvious things. Ultimately, they will lose.

      The real problem is that China is, for all its faults, a modern country. People come in, people fly out. When I go to China, lots of people ask what's going on in the outside world. I am a little circumspect in what I say, but my memory banks don't magically get erased when I cross over from Hong Kong to Shenzhen. Over 90% of the Chinese students you see toiling away at your local research university will ultimately go home. That's just the way it goes. They too don't forget whatever subversive thoughts may have crept into their heads during five or six years of study abroad.

      The deck is stacked, and the good guys will ultimately win.

  8. Re:Obvious by 246o1 · · Score: 5, Insightful

    In Chinese, a single character ( for example -- though I'm not sure if this will display properly) represents a whole syllable (as well as a meaning or idea), rather than a consonant or vowel, as most English letters do (some are unpronounced, or just change the sound of another letter).

    This eliminates certain types of bad spellings, obviously, but opens certain avenues that aren't available in English, such as choosing characters with similar meanings but different sounds, or similar sounds but different meanings.

    For the Tiananmen example, the characters for TianAnMen mean "Heaven," "Peace," "Gate." Heaven could be replaced with "Sky," which has a completely different sound, or "Money," which (if I recall correctly) is pronounced "Qian" (Q sounds close to English CH). This could also happen with the other two characters in this word, and of course for many other 'bad' words.

    The reason that common words like "pr0n" have become associated with porn, or other examples, is that a community of users agreed upon a certain misspelling of those words, and the same can and WILL happen in China to evade whatever filters search engines use. There is no way to have an even semi-open search system that doesn't allow human ingenuity to overcome its filters, and the brief history of the internet in the west indicates that these filters will, ultimately, be only partially and temporarily effective.
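
    To make the point concrete, here is a minimal sketch in Python. It is not how Google's filter works (nothing here describes that); the blocklist, the function names, and the 0.8 similarity cutoff are all made up for illustration. It compares an exact-match blocklist with a fuzzy one built on the standard library's difflib:

```python
import difflib

# Hypothetical blocklist, for illustration only. The real filter's terms
# and matching rules are not described anywhere in this thread.
BLOCKED_TERMS = ["tiananmen"]


def exact_filter(query: str) -> bool:
    """Return True if any word of the query exactly equals a blocked term."""
    words = query.lower().split()
    return any(word in BLOCKED_TERMS for word in words)


def fuzzy_filter(query: str, cutoff: float = 0.8) -> bool:
    """Return True if any word of the query is a close match to a blocked term.

    The cutoff is an assumed value; difflib scores similarity from 0.0 to 1.0.
    """
    words = query.lower().split()
    return any(
        difflib.get_close_matches(word, BLOCKED_TERMS, n=1, cutoff=cutoff)
        for word in words
    )


if __name__ == "__main__":
    for query in ["Tiananmen square", "Tienanmen square", "tian an men"]:
        print(query, exact_filter(query), fuzzy_filter(query))
```

    In this toy version the exact filter misses "Tienanmen", the fuzzy filter catches it, and simply splitting the word into "tian an men" slips past both. Tightening the filter just moves the goalposts, which is the sense in which such filters are only partially and temporarily effective.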

    --
    Although the moon is smaller than the earth, it is farther away.