Slashdot Mirror


User: gam3cub3

gam3cub3's activity in the archive.

Stories: 0
Comments: 5

Comments · 5

  1. The plagiarism has been confirmed by Google on Google Faces Plagiarism Questions Over Chinese Software · · Score: 3, Informative

    Plagiarism has been confirmed officially by Google, Sohu and IDG news reporter Sumner Lemon.

    Google admits word database came from third party - Network World

    http://www.networkworld.com/news/2007/040907-update-google-admits-word-database.html

    An earlier report by the same reporter: Sohu to Google: Take down copycat software
    http://www.networkworld.com/news/2007/040707-sohu-to-google-take-down.html

    Google China's Official Apology to Sohu.com (in Chinese)
    http://googlechinablog.com/2007/04/blog-post.html

  2. Re:Good on Google Faces Plagiarism Questions Over Chinese Software · · Score: 1

    You are totally wrong on two counts.


    1. There is no one-to-one mapping between Pinyin and Chinese characters: one Pinyin syllable corresponds to about 17 Chinese characters on average. To improve first-choice accuracy in Pinyin->Chinese character conversion, the IME needs a Chinese word list together with word frequency information and phonetic annotations (one Chinese character may have several different pronunciations) to serve as a language model (think of continuous speech recognition in English). Such data are usually derived from a huge corpus, manually checked and proofread search engine keywords, etc. They take a huge amount of time to maintain and are copyrighted. There are some public domain data in this area, but they perform far worse than the proprietary data maintained by Sogou Pinyin.

    2. Sohu.com (NASDAQ: SOHU), the owner of Sogou Pinyin Input, has nothing to do with Baidu (NASDAQ: BIDU). SOHU, Baidu, and Google.cn are just three competitors in China.
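
    To make point 1 concrete, here is a minimal sketch of a frequency-ranked candidate lookup. It is not how Sogou Pinyin or Google Pinyin actually work; the names LEXICON, build_index and candidates are hypothetical, and the frequency counts are invented purely for illustration.

```python
from collections import defaultdict

# Toy word list: (pinyin without tones, word, frequency count).
# The counts are made up; a real IME derives them from a large corpus.
LEXICON = [
    ("shi", "是", 9000),
    ("shi", "时", 4000),
    ("shi", "事", 3500),
    ("shi", "十", 3000),
    ("beijing", "北京", 800),
    ("beijing", "背景", 600),
]


def build_index(lexicon):
    """Group words by Pinyin and sort each group by descending frequency."""
    index = defaultdict(list)
    for pinyin, word, count in lexicon:
        index[pinyin].append((word, count))
    for entries in index.values():
        entries.sort(key=lambda item: item[1], reverse=True)
    return index


def candidates(index, pinyin):
    """Return the candidate words for a Pinyin string, most frequent first."""
    return [word for word, _ in index.get(pinyin, [])]


if __name__ == "__main__":
    index = build_index(LEXICON)
    print(candidates(index, "beijing"))
    print(candidates(index, "shi"))
```

    The first candidate is only as good as the frequency data behind it, which is why the word list is the valuable part of an IME.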

  3. Re:Maybe more to the story on Google Faces Plagiarism Questions Over Chinese Software · · Score: 1

    No. As a native Chinese speaker, I can tell you that most Chinese internet users were *enraged* by Google China's recent public announcement. In it, Google China acknowledged that Google Pinyin "used" data from non-Google sources and said those data had been removed in Google Pinyin's latest update, but it did not acknowledge Sogou Pinyin, and Google has not apologized in public for the plagiarism so far. According to Sogou programmers, there are still undisclosed Sogou easter eggs even in the latest version of Google Pinyin.

    Please check the following message for more details. "Google Pinyin's plagiarism" has been one of the most influential internet news stories in China over the past few days. Many Chinese internet users find it funny to see an American company whose motto is "Don't be evil" steal encrypted, copyright-protected data from a competitor in China so blatantly.

    http://slashdot.org/comments.pl?sid=229975&cid=18658353

  4. Re:Google Suggest was released in 2004 on Google Faces Plagiarism Questions Over Chinese Software · · Score: 1

    Google Suggest != Pinyin Input Method. Pinyin (http://en.wikipedia.org/wiki/Pinyin) is a romanization system that represents the pronunciation of Chinese characters/words in alphabetical form. A Pinyin input method is a system that translates Pinyin (e.g. Beijing) into Chinese characters (e.g. 北京). Most Chinese people use Pinyin to input Chinese characters on a QWERTY keyboard. Since there are only around 400 distinct syllables in Chinese and around 6763 commonly used Chinese characters, one Pinyin syllable corresponds to around 17 Chinese characters on average. That is why we need new data sets and algorithms to train the Pinyin input method for higher accuracy. It's completely different from what Google Suggest is supposed to do.
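
    The "around 17" figure is just the ratio of the two numbers quoted above. A quick check, using only those two approximate figures (the variable names are mine):

```python
# Both figures are the approximate ones quoted in the comment above.
distinct_syllables = 400       # around 400 distinct syllables (tones ignored)
common_characters = 6763       # around 6763 commonly used characters

# Average number of characters sharing one toneless Pinyin syllable.
average_per_syllable = common_characters / distinct_syllables

if __name__ == "__main__":
    print(round(average_per_syllable, 1))
```

    This is only an average; the real distribution is uneven, so some syllables map to far more characters than others.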

  5. Re:evidences ARE clear on Google Faces Plagiarism Questions Over Chinese Software · · Score: 1

    There IS plenty of evidence that the Google Pinyin IME (a.k.a. Google Pinyin) did copy the data library of the Sogou Pinyin IME (a.k.a. Sogou Pinyin). The developers of Sogou Pinyin planted some easter eggs in their product (e.g. the names of all the Sogou development team members, and a few spelling typos). Programmers at Google China copied all these easter eggs and typos verbatim into their Google Pinyin product.
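
    The easter-egg argument boils down to a set-membership check: entries that no one would produce independently show up in both word lists. Here is a minimal sketch of that idea, with hypothetical function names and placeholder strings standing in for the real entries (I am not reproducing the actual easter eggs here, and this is not the method anyone is confirmed to have used).

```python
def shared_markers(markers, other_word_list):
    """Return the planted marker entries that also appear in another word list."""
    other = set(other_word_list)
    return sorted(m for m in set(markers) if m in other)


def marker_overlap_ratio(markers, other_word_list):
    """Fraction of planted markers found in the other word list (0.0 to 1.0)."""
    unique_markers = set(markers)
    if not unique_markers:
        return 0.0
    return len(shared_markers(unique_markers, other_word_list)) / len(unique_markers)


if __name__ == "__main__":
    # Placeholder data only: stand-ins for team-member names and deliberate typos.
    planted = ["team_member_name_1", "team_member_name_2", "deliberate_typo_1"]
    suspect_list = ["common_word_a", "team_member_name_1", "deliberate_typo_1"]
    print(shared_markers(planted, suspect_list))
    print(marker_overlap_ratio(planted, suspect_list))
```

    Ordinary words overlapping between two word lists proves nothing; it is the overlap on deliberately planted entries that is hard to explain as coincidence.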

    Sohu.com (NASDAQ: SOHU), the owner of Sogou Pinyin, accused Google China of plagiarism in its official announcement today (in Chinese), asking Google to stop the copyright infringement and apologize to SOHU in public media.

    http://tech.sina.com.cn/i/2007-04-08/17041454175.shtml

    The PR officer of Google China (NASDAQ: GOOG) also released an official response a few hours later today (in Chinese).

    http://tech.sina.com.cn/i/2007-04-08/18351454194.shtml

    Google China's official response acknowledged that "the Google Pinyin IME input method included some data not created by Google itself, and those data have been removed in the latest update". The announcement still did not acknowledge the original creator of the data, nor did it apologize for the copyright infringement. According to SOHU, there are still undisclosed "easter eggs" created by Sogou Pinyin programmers even in the latest update of Google Pinyin.

    FYI: Here are screenshots of a few easter eggs and typos from Sogou Pinyin that are found verbatim in Google Pinyin.

    http://www.donews.com/Content/200704/69ce12fbc8264b76b78f44791dad8379.shtm
