Slashdot Mirror


Google Faces Plagiarism Questions Over Chinese Software

yaohua2000 writes "Google's laboratory in China has launched its first product, a Pinyin Input Method Editor. The software lets users type romanized Pinyin on a QWERTY keyboard and converts it to Chinese characters. Users soon discovered that the data Google used for the product was unusually similar to the data used by a Chinese rival, Sogou. Google has evaded the question about software similarities, reports PC World. 'The similarities, which included an error involving the name of a celebrity, were noted on a Google Labs discussion board about its Pinyin IME. Users noted that entering the Pinyin pinggong into the Google IME incorrectly produced the name of Feng Gong, an actor and comedian.'"

16 of 187 comments

  1. Identical typos... by pedantic+bore · · Score: 4, Insightful
    Funny, that's how we catch students who plagiarize, too.

    Coming up with the same algorithm isn't terribly unlikely. Structuring it in the same way is not uncommon either. Making exactly the same mistakes, however, is hard to believe.

    --
    Am I part of the core demographic for Swedish Fish?
    1. Re:Identical typos... by Plutonite · · Score: 5, Insightful

      Not really. I'm not defending Google here, but you seem to be talking about an essay, not an algorithm. If you have algorithms that are similar enough, they do not even need to be "structured the same way" to produce the same output (errors included). Anybody who has been to an ACM contest will tell you this.

      As such, this story is useless. The internet needs no more speculation as it is; it's hard enough arguing what is wrong or right when concrete evidence is available. Our flamewars should be founded on solid ground.

    2. Re:Identical typos... by ReallyEvilCanine · · Score: 5, Insightful
      According to TFA, Sohu has patents in several areas related to how popular Internet search terms can be used for predictive text input. Google does, too. And unlike most others, Google constantly tweaks algorithms. Have you noticed how the Google Toolbar now predicts your search terms? And every time you deviate, they make modifications for you personally and tabulate in general to see if others are also going after similar versions.

      I work in I18N and deal with IMEs all the time, from the basic, non-learning MS Windows versions to the ones which come with the NJ Star and give preference to lesser-used terms previously selected, to various other proprietary variants. There are only so many ways to write an IME, and there are only so many ways to do good prediction. If I type "go" in Japanese, my first choice will usually be "5" followed by the symbol for "language" and the game "Go", then various other possibilities. Only when I next type a "z" or a "g" do the symbols for a.m. and p.m. move to the front. Now if I'd written an IME and wanted to protect it I might have it always bring up "Mifune Go" as the fifth selection or, more subtly, bring up "Go" as the fifth possibility if you typed a "G" or "Go" after "Mifune". This isn't the case here.
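
      A rough sketch of the idea above, not how any real IME works: candidates are ranked by how often the user picked them and then by a base frequency, and a planted "canary" entry can reveal a copied dictionary. The class name, the lexicon, the frequencies, and the canary string are all invented for illustration.

```python
from collections import defaultdict


class ToyIME:
    """Toy candidate ranker (hypothetical; real IMEs are far more complex)."""

    def __init__(self, lexicon):
        # lexicon maps a typed reading to {candidate: base_frequency}.
        self.lexicon = lexicon
        # Per-user selection counts, used to promote earlier choices.
        self.picked = defaultdict(int)

    def candidates(self, reading):
        entries = self.lexicon.get(reading, {})
        # Rank by the user's own selections first, then by base frequency.
        return sorted(
            entries,
            key=lambda c: (self.picked[(reading, c)], entries[c]),
            reverse=True,
        )

    def select(self, reading, candidate):
        # Record that the user chose this candidate for this reading.
        self.picked[(reading, candidate)] += 1


def shares_canary(candidates, canaries):
    # True if any deliberately planted entry shows up in the output.
    return any(c in canaries for c in candidates)


# Invented data: ASCII glosses stand in for the actual characters.
LEXICON = {
    "go": {"5": 90, "language": 70, "Go (game)": 50, "after": 40},
    "mifunego": {"Mifune Go [canary]": 1},  # planted on purpose
}
CANARIES = {"Mifune Go [canary]"}

ime = ToyIME(LEXICON)
before = ime.candidates("go")   # ordered by base frequency only
ime.select("go", "Go (game)")
after = ime.candidates("go")    # the selected candidate moves to the front
```

      A shared bug that falls out of frequency data would not trip `shares_canary`; only a planted entry would, which is the distinction drawn above.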

      With Google's work and implementation of prediction methods, I find it hard to accuse the company of plagiarism for having the same bug (which comes as a result of predictive methods) as some other company. This is a bug, not some zyzzyx or easter egg which a programmer included to catch thieves. It was unintentional on Sogou's part and likely equally unintentional on Google's.

      Then again, there's a lot of pressure to excel at Google and maybe someone gave in to temptation despite working for a company that knows more about data than anyone else out there. Unlikely, but possible... and if Google issue a statement that someone did indeed plagiarise Sohu's work, fine. It could happen anywhere. Doesn't make Google bad, only one programmer. It makes the company culpable, but it hardly looks malicious.

  2. Ironic, isn't it? by catdevnull · · Score: 2, Insightful

    Of all the countries in the world to bitch about someone stealing or copying...

    --

    I might know what I'm talkin' about, but then again, this is Slashdot...
    1. Re:Ironic, isn't it? by Anonymous Coward · · Score: 1, Insightful

      Why is such a blatantly prejudiced comment modded insightful?
      Shall I make a comment about the US being nothing but greedy lying bullies because Microsoft or Diebold is from the US?

    2. Re:Ironic, isn't it? by fermion · · Score: 2, Insightful

      OTOH, Google is desperately trying to show that it offers an original and innovative product, and does in fact owe its profit to stealing and repacking the content of others. The lifting of code sort of indicates that the case is the former and not the latter, and may tend to have an impact in cases where Google is claiming it need not make royalty payments.

      --
      "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
    3. Re:Ironic, isn't it? by ScrewMaster · · Score: 3, Insightful

      Strange how you wouldn't have said this if it was Microsoft.

      You're definitely new here. We complain about Microsoft pinching other people's work continuously here on Slashdot, mainly because Microsoft does, continuously. We also regularly bitch about how the current patent and copyright systems here in the United States are seriously flawed. And the OP is correct in pointing out that China has always been, shall we say, less than respectful of others' rights in this regard ("blatantly ripping them off" is as good a description as any.)

      What was your complaint again?

      --
      The higher the technology, the sharper that two-edged sword.
    4. Re:Ironic, isn't it? by ScrewMaster · · Score: 4, Insightful

      Why is it that saying anything negative about another country is always turned into a discussion about racism and bigotry? It immediately poisons further dialog when it is applied without reason. If you have some reason to think the OP is prejudiced I'd like to hear it, because I didn't read that into his comment. I hear a lot of negative comments about the United States on Slashdot (yours, for one, which is interesting) but I don't immediately conclude that prejudice is the root of it. Sometimes it is, but it's nice to find that out first before jumping to any conclusions.

      The unfortunate fact of the matter is that China's government and industry are completely unconcerned about the source of the technology that they mass-produce and sell to everyone. They just don't care, period, and I suppose when you get right down to it there's no reason they should. On the other hand, that just means there's no reason why we should respect their "intellectual property" either, and when their scientists and engineers come up with something good they damn well shouldn't expect us to concern ourselves over their rights either. If Google did indeed rip off their Chinese counterparts my feeling is ... more power to 'em.

      So, it's not a statement of prejudice (e.g. "I dislike Chinese people because they are Chinese, or have yellow skin, or slanted eyes, or talk funny") but a legitimate observation on the state of affairs in that country.

      Just watch it when you start playing the race card without a good reason ... it prejudices any argument you make after that point.

      --
      The higher the technology, the sharper that two-edged sword.
    5. Re:Ironic, isn't it? by Gwwfps · · Score: 2, Insightful

      The race card is actually perfectly played here.

      If Google did indeed rip off their Chinese counterparts my feeling is ... more power to 'em.

      If you said "If Google did indeed rip off a competitor who ripped off previously..." or "If Google did indeed rip off their Chinese counterparts my feeling is that they are just in an environment where this is not a big deal.", then you might have some credibility. Instead, you are now advocating plagiarising all Chinese IPs because an admittedly large number of companies and individuals in China do not respect Western IPs. Sogou never did violate any Western IP, so why are they harmed in this? You might not have realized this, but by what you have said the answer would have been "because they are Chinese". If advocating hurting a company because of their country of origin, and let's face it, that would mean race in China's case, is not racism, I don't know what is.

  3. not saying it's the case by creativeHavoc · · Score: 5, Insightful

    While I am not insisting that it is the case, it seems like it could easily be the same logic flaw. Different algorithms and code can produce the same mistake if you are using the same misguided logic behind the problem. That's why you see the same bugs in students' code in university, even when worked on separately during a lab.
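
    A small illustration of that point, using a made-up example that has nothing to do with either IME: two functions that are structured differently but share the same misguided rule ("every fourth year is a leap year") make the identical mistake. All function names here are hypothetical.

```python
def is_leap_modulo(year: int) -> bool:
    # Flawed rule: ignores the century exceptions.
    return year % 4 == 0


def is_leap_counting(year: int) -> bool:
    # Same flawed rule, written differently: subtract 4 until nothing is left.
    while year >= 4:
        year -= 4
    return year == 0


def is_leap_correct(year: int) -> bool:
    # Gregorian rule, for comparison.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Both flawed versions agree with each other, including on the wrong answer.
flawed_agree = all(
    is_leap_modulo(y) == is_leap_counting(y) for y in range(1800, 2101)
)
shared_bug = is_leap_modulo(1900) and is_leap_counting(1900)
```

    Neither function was copied from the other, yet both wrongly call 1900 a leap year, so a shared error alone does not prove shared code.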

    --
    insight through the mind
  4. Re:This wouldn't be the first time... by limecat4eva · · Score: 2, Insightful

    If you didn't want society as a whole to benefit from your code, why did you release it under an open source license in the first place? God almighty, you GPL whiners are the best argument going for BSD-style licenses.

    --
    comma
  5. Or, basically... by mattgreen · · Score: 4, Insightful

    "This is our groupthink, it doesn't need to make sense. Now shut up and conform so you get your mod points!"

  6. Re:This wouldn't be the first time... by Anonymous Coward · · Score: 5, Insightful

    the dozens of person-years that went into writing the actual dictionaries for aspell were simply co-opted by Google. Get off your high horse - you're just another holy roller.

    Thousands of people donate their time, money, and code to GPL-licensed projects. As one of those contributors, I can tell you that I don't believe that Google is doing anything wrong at all with aspell. The terms of the license are clear. Users are in no way required to give attribution. In fact, there is not even a suggestion, hint, or implication that attribution would be nice. You suggesting that it should be that way is fine, but to state that aspell was "co-opted" is factually incorrect and falsely implies that Google is doing something against the GPL license.

    If you, as a contributor to aspell, don't like aspell's license terms, you are free to start another project with similar goals under different license terms.
  7. Combing by eMbry00s · · Score: 5, Insightful

    Everybody who says something along the lines of "bah, Chinese complaining about stealing" should note that the Chinese are not all connected into one single conscious entity, but are different individuals.

    The people who own this IP need not have stolen any other IP.

    It is as dumb as saying that all Americans are Christian, gun-toting, fat fuckasses.

  8. Re:My girlfriends pussy.... by microbee · · Score: 2, Insightful

    This is not just about China. Both GOOG and SOHU are NASDAQ companies, and the software is released to the world (including the US). So SOHU could sue GOOG.

    If GOOG or whatever US companies think a Chinese company infringed on their rights, they can sue, instead of whining on online forums.

    So, what's your exact point?

  9. Re:Do no evil my ass by Anonymous Coward · · Score: 1, Insightful

    I have no idea why you can't edit your posts here
    One word: goatse. If people could edit posts, they could make a +5 post, get it visible, and then edit it so it showed a giant ASCII ass. Sure, you could revert the moderation, but the situation still sucks.

    Or, they could "flash" a goatse, meaning they would make a nasty FP, goatse the first few people to view the thread, and then change it to something acceptable before moderators could mod it down, thus saving karma. Speaking as one who is experienced in this sort of thing, I can say it generally takes a fair bit of time before a moderator comes on the scene. Slashdot's karma system is a game, and adding editing would totally fuck with the rules.

    Besides, I think it is preferable to make a single, immutable post that is archived for eternity.