Slashdot Mirror


Google Brings Offline Neural Machine Translations For 59 Languages To Its Translate App (techcrunch.com)

Google is rolling out offline Neural Machine Translation (NMT) support for 59 languages in the Translate apps. Some of the supported languages include Arabic, Chinese, English, German, Japanese, Spanish, French, and Korean (TechCrunch has a full list of the languages in their report). From the report: In the past, running these deep learning models on a mobile device wasn't really an option since mobile phones didn't have the right hardware to efficiently run them. Now, thanks to both advances in hardware and software, that's less of an issue and Google, Microsoft and others have also found ways to compress these models to a manageable size. In Google's case, that's about 30 to 40 megabytes per language. Users will see the updated offline translations within the next few weeks.

46 comments

  1. Meanwhile by Anonymous Coward · · Score: 0

    Meanwhile, ol' Slashdot here still occasionally borks English. Part of the charm I guess.

  2. Re:Oh brother by Anonymous Coward · · Score: 0

    Fluoride in the vodka again, Beau?
     
    I TOLD you, rainwater, not tap water.

  3. Re:Overkill by Anonymous Coward · · Score: 0

    Other languages are all about us not knowing when they're planning to murder us. That is why they refuse to learn English so we don't know what they be saying. That is why if you don't understand someone that you need to call 911 and flee screaming for help.

  4. Save translations in the SD card by williamyf · · Score: 1

    Will Google allow saving these NMT translations (@40Meg a pop) to the SD card?

    Currently, they DO NOT allow saving these translations to the SD card on Android, while Microsoft Translator does.

    It's the only reason I use the Microsoft app and not both (for the languages I care about both are about the same, so I'd be delighted to have both).

    --
    *** Suerte a todos y Feliz dia!
  5. Re: Overkill by Anonymous Coward · · Score: 0

    Don't ever say you don't speak one of their garbage languages. They'll rape and murder you for that.

  6. pivot language? by epine · · Score: 2

    Is English considered to be the pivot language, or do all of these models produce the same intermediate representation?

    Rather useless article, with no shred of a deep understanding, whatsoever.

    I'm guessing you run the input model from language to IR, and the output model from IR back to language, so you need to have at least two models to use this app. (I suppose you could translate from English to IR and back to English again, for perverse joy.)

    Only I haven't read anything about training multiple machine translation models with a shared IR. That strikes me as technically difficult, and I would have thought I'd have seen some loud crowing out there, had it been achieved (it's now been a couple of months since I gave the Internet a good shake on machine learning, and things move fast).

    1. Re:pivot language? by Lenbok · · Score: 3, Informative

      I'm not sure to what extent it relates to the specific offline translation modules in the Translate app, but a while back the Google Research blog had a post on multilingual machine translation models (which let them translate between two languages for which they didn't have a direct translation training corpus). So at least in that case, there is just a single translation model rather than separate input and output models that go to and from an IR.

      https://ai.googleblog.com/2016...

    2. Re:pivot language? by Anonymous Coward · · Score: 0

      Yes, my impression is that English is still the pivot language for most language pairs, which doesn't work well and produces all kinds of gibberish.
      What do you mean by IR in Google Translate? Of course the neural network has (volatile) internal state, but it doesn't produce a string of intermediate bytecode in some abstract grammar. Models are trained per language pair.
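
      For what it's worth, pivoting through English with per-pair models can be sketched roughly like this. It is a toy illustration only: the lookup tables, the `MODELS` registry, and the `translate` helper are all made up here and say nothing about how Google actually does it.

```python
from typing import Callable, Dict, Tuple

Translator = Callable[[str], str]


def make_toy_model(table: Dict[str, str]) -> Translator:
    """Build a hypothetical word-by-word 'model' from a lookup table."""
    def run(text: str) -> str:
        # Unknown words are passed through unchanged.
        return " ".join(table.get(word, word) for word in text.split())
    return run


# Hypothetical per-pair models, keyed by (source, target) language codes.
MODELS: Dict[Tuple[str, str], Translator] = {
    ("es", "en"): make_toy_model({"gato": "cat", "negro": "black"}),
    ("en", "de"): make_toy_model({"cat": "Katze", "black": "schwarz"}),
}


def translate(text: str, src: str, dst: str, pivot: str = "en") -> str:
    """Use a direct model if one exists, otherwise chain two models via the pivot."""
    if src == dst:
        return text
    direct = MODELS.get((src, dst))
    if direct is not None:
        return direct(text)
    to_pivot = MODELS[(src, pivot)]
    from_pivot = MODELS[(pivot, dst)]
    return from_pivot(to_pivot(text))


print(translate("gato negro", "es", "de"))
```

      Assuming one model per directed pair, 59 languages would need 59 x 58 = 3,422 direct models, versus 2 x 58 = 116 when everything goes through English. The price is that mistakes from the first hop get carried into the second (the toy output above keeps the Spanish word order).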

    3. Re:pivot language? by AmiMoJo · · Score: 4, Interesting

      I don't have that detail but I can tell you my experience as a user of machine translation for 15+ years.

      Originally it only really worked on formal documents, and even then only produced something you could barely understand. The biggest issue seemed to be that it didn't understand context at all.

      Google made some early improvements in making the translated text sound more natural. They also managed to fix a lot of common phrases that didn't quite fit the standard grammar model and thus didn't used to get translated properly. Apparently they did that by using the web as a resource for natural language and by allowing users to submit corrections.

      Then AI started to be used. Baidu were the first I think and their Chinese/English translation was a huge improvement over everything else. It seemed to work slightly better going from English to Chinese though, and when Google released their AI updates not long after Chinese to English became nearly perfect.

      It's actually incredible how good it is now. Often the resulting translation is not only accurate and seemingly context aware, it sounds like something a person might actually say. You don't have to think about what you are writing either. Before you had to be careful to phrase things so that the software could understand it, but not any more.

      There are still some issues, like the way Japanese newspaper headlines often get translated as if it was a person speaking about their own experience (e.g. some houses were flooded, but the translation is "my house was flooded" because the software assumes that context), but for conversations between two people it's like Star Trek or something.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    4. Re:pivot language? by Anonymous Coward · · Score: 1

      but for conversations between two people it's like Star Trek or something.

      I think you must be using a different Google Translate than I do.

      I do ja->en quite often, usually on informal (but not in any way slangy) text and dialogue and I honestly cannot think of the end result as anything more than a source of amusement. That's for English; it's much worse to/from my native language...

      I'll have an example now (romanised, because slashdot). Just a random sentence I had lying around.

      Source text:

      ore ha mitame kara tekkiri aoyanagi no hou ga tosiue da to omotte ita kara, siturei wo syouti de tazuneta.

      Google:

      I thought Aoyagi was the older from the appearance, so I asked for rude and asked.

      Manual translation:

      I was certain that Aoyanagi was the older one of the two, so though I knew that asking would be rude, I decided to enquire:

      Luckily I am reasonably proficient in Japanese so I do not have to rely on the translation. I mostly use it to be able to quickly skim a paragraph of text.

      If I had to rely on Google for my understanding, I'd be very sad...

    5. Re:pivot language? by Anonymous Coward · · Score: 0

      I was certain that Aoyanagi was the older one of the two, so though I knew that asking would be rude, I decided to enquire

      Watashi wa, aoyagi ga futari no uchi no furui katadatta koto o kakushin shite itanode, tazuneru koto wa shitsureidearu to shitte imashitaga,

      I was aware that the villa was the sifter of the two, so I was told that she was obsessive,

      [repeat]

      I was told that she is obsessed because I knew that the villa was two brothers,

      I knew that the villa was two brothers, so it was said that she is obsessed.

      I knew that the villa was two brothers, so she was said to be obsessed.

      I was told persistently that I knew that the villa was two brothers.

      I was told strongly that I knew that the villa was two brothers.

      I strongly told that the villa knew that it was two brothers.

      I strongly said the villa knew it was two brothers.

      I strongly said the villa knew it was two brothers.

      I strongly said the villa knew it was two brothers. ...

    6. Re:pivot language? by TheDarkMaster · · Score: 1

      Here the Google translator never, ever gets Brazilian Portuguese (or even Portuguese) right. It is unable to understand Portuguese verb agreement or the correct order of verbs, and sometimes it simply invents expressions that have nothing to do with the original text, to say nothing of the most obvious problems. As a Brazilian Portuguese speaker I have to first "translate" what I mean in the most basic and simple possible way or the translator will completely fail to execute the translation.

      --
      Religion: The greatest weapon of mass destruction of all time
    7. Re:pivot language? by AmiMoJo · · Score: 4, Informative

      Try using Google Translate and Bing Translate on a random story from srad.jp. Srad used to be Slashdot Japan before the name change, and the story summaries are written in an informal tone similar to Slashdot ones (but better edited!)

      Google:


      Ministry of Economy, Trade and Industry, Internet mail order order due to erroneous judgment of smart speaker will not be contracted

      First of all, about electronic commerce using smart speakers, smart speakers have the ability to order orders to net mail orderers by voice. However, when an order occurred due to misunderstanding or misunderstanding, guidelines on how to handle that order were not shown. In this revised bill, it is clearly stated that "contract through AI speaker has not been established" for misrecognized orders, and it is said that businesses must properly deal with these problems. Also, even if the ordering party makes a mistake, if the system is such that confirmation is not made for the order, the ordering party may be able to argue the invalidity of the contract.

      You can see some trivial mistakes, like how two different words in Japanese are translated into the same English word "misunderstanding", but the meaning is clear and things like the ministry name are correct and the sentences are actual English.

      Bing:

      Due to misjudgment of transdermal production Ministry said, smart speaker e-store purchase contract would suppose

      There are features for e-commerce using the first source to speaker smart speaker audio to Internet mail order companies order allows. But the positives and say mistakes in order occurs, the order what to do of the guidelines was not shown. We're in this amendment, and describing "the agreement through the AI speakers has not been established" in probable order operators must respond properly on these issues. In addition, says is possible if the system check do not for the order if the buyer did mistakes, the officials can claim contract invalid.

      It's like something out of the 90s era Babelfish. Not only is the interpretation of the original Japanese poor, but the resulting English sentences are broken too.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    8. Re:pivot language? by AmiMoJo · · Score: 1

      I guess it's going to be different for every language, depending on how much time they have put into it, if they have staff who speak that language on the team etc. It seems to work okay for Spanish and French, but I don't know how different Portuguese is...

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    9. Re:pivot language? by Hognoxious · · Score: 1

      Is English considered to be the pivot language, or do all of these models produce the same intermediate representation?

      Pragmatically speaking it's a good choice, since from the users' POV to & from English would be priorities anyway.

      Technically I'd say it's appalling. Irregular, words with multiple meanings, phrasal bloody verbs and did I mention it's irregular?

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    10. Re:pivot language? by laie_techie · · Score: 1

      Here the Google translator never, ever gets Brazilian Portuguese (or even Portuguese) right. It is unable to understand Portuguese verb agreement or the correct order of verbs, and sometimes it simply invents expressions that have nothing to do with the original text, to say nothing of the most obvious problems. As a Brazilian Portuguese speaker I have to first "translate" what I mean in the most basic and simple possible way or the translator will completely fail to execute the translation.

      What do you expect? Look at how many words have multiple meanings! Is [i]canjica[/i] sweetened cream corn with cinnamon or a dessert made of hominy, coconut milk, shredded coconut, and cloves? Is [i]bomba[/i] bomb, pump, or a pastry in this context? How should the English [i]to be[/i] be translated in this context (ser, estar, ficar, passar)? Is [i]manga[/i] a sleeve, mango, or Japanese art? Is yucca called [i]aipim[/i], [i]mandioca[/i], or [i]cassava[/i]? Are you talking about a computer mouse ([i]mouse[/i]) or the rodent ([i]ratinho[/i])? How many Brazilians drop the 'r' at the end of infinitive verb forms? Human language is hard for machines to get 100% correct 100% of the time.

    11. Re:pivot language? by jebrick · · Score: 1

      I just have experience with the MS translation APIs but I am guessing that Google will be similar. With MS, you can "train" their cognitive hub in a language by putting paired documents into it. They need to be the same document that has already been translated. The Hub learns and is able to (eventually) pick up tone. It is useful in industries that can have more specialized language usages.

      Downside is that a company must have one "Hub" for each language pair. Takes a lot of training.

    12. Re:pivot language? by Anonymous Coward · · Score: 0

      > Is English considered to be the pivot language, or do all of these models produce the same intermediate representation?

      If anything they should be using french as the "master language". There was a reason why international treaties used to be written in french and the name of many int'l organisations is still in french, be it sports or science. English is very imprecise and excessively infatuated with ambiguous abbreviation, that can cause only trouble.

      Apparently, the most precise language on Earth is arabic, it differentiates so many shades of a statement and that was one factor why the Middle East was the place to preserve the sci/art knowledge of antiquity while Europe was in the Dark Ages. Furthermore, the 3 or 4-syllable word root tree system of arabic, hebrew and the other afro-semitic languages is a dead ringer for the platonic ideals scheme, making those languages superbly conducive to logic-based thinking.
      On the other hand the great difficulty of printing arabic script (intricate calligraphy) in the pre-computer era made it impossible to mass-produce books vs. hand-written parchments, causing the muslim world to decline while renaissance and early modern Europe was emerging in the fields of science, arts and tech.

    13. Re:pivot language? by Anonymous Coward · · Score: 0

      > I do ja->en quite often, usually on informal (but not in any way slangy) text and dialogue and I honestly cannot think of the end result as anything more than a source of amusement.

      Some years ago there was a /. story about Fuji (?) of Japan offering a photocopier with built-in japanese to/from english translation capabilities, supposedly for automatic localization of tech blueprints and user guides, based on a semi-cloudified background infrastructure. Then they looked at the stats collected by early prototypes and found by far the most prolific words were "mahou shojo, imouto, siscon, seifuku, tsundere, lolicon, isekai, ZR, baka, sempai, ecchi, kawaii, etc." and decided not to market the manga scanlation auto-mail concoction...

    14. Re:pivot language? by Anonymous Coward · · Score: 0

      > Source text: ore ha mitame kara tekkiri aoyanagi no hou ga tosiue da to omotte ita kara, siturei wo syouti de tazuneta.

      You already have a problem there, because Hepburn transliteration is ambiguous in and of itself, as it cannot be considered true phonetics. Furthermore, japanese language printed in "romaji" is always ambiguous, even when transcribed to latin script with honest phonetics (e.g. using the hungarian alphabet). That's because living spoken japanese heavily depends on face-to-face parties gesticulating to augment the restricted set of sounds and phrases existing in their language. They have difficulty over the telephone.

      This problem is partly fixed in writing by their use of 4 different alphabets (hiragana, katakana, kanji, furigana) where different calligraphies can resolve to the same spelling but carry very different meanings. Some jokes, fixed phrases, old wisdoms just don't make any sense in japanese unless written in calligraphy.

      It would be dishonest to expect a computer or even a true AI to produce consistently good english (or any indo-european language) translation from a japanese corpus which is already "castrated" by being shoe-horned into the latin alphabet. There was a reason Japan decided not to latinize after WW2, even though they were totally at the mercy of US occupiers and under heavy pressure to fully westernize.

    15. Re:pivot language? by Anonymous Coward · · Score: 0

      I won't argue about google's being superior, but if that is your comprehension of English as being correct that argues for you lacking anything more than a pidgin grasp of the language. The language ranges from awkward to stilted to wrong.

    16. Re:pivot language? by Anonymous Coward · · Score: 0

      native english speaker here living in a foreign country. This is *FAR* better than 90% of the locals who try to write 'complex' legalistic english in my workplace, which unfortunately deals with a lot of legalistic english. :(

      to get more accurate you'd suddenly have to jump to a proficiency that is higher than most native high schoolers in the US. Seriously grab a random 15 year old and have them write a legalistic document.

  7. Re:Overkill by Anonymous Coward · · Score: 0

    For their kind, they're proud of their ignorance and will get violent with anyone that points out their lack of IQ.

  8. Re:Overkill by Anonymous Coward · · Score: 0

    Immigrants used to learn the language, but not they're apparently not smart enough to do so.

  9. Re:Overkill by omnichad · · Score: 1

    the ganja language

    Hindi?

  10. Re:Overkill by Anonymous Coward · · Score: 0

    That is why they refuse to learn English so we don't know what they be saying.

    They "be" saying?

    Ebonics was not on the list of languages.

    Posting as Anonymous so I don't get nailed in the bad karma fallout from this thread.

  11. maybe by phantomfive · · Score: 2

    I'm not sure the new model is actually better than the old model. In recent months, I've seen it make bizarre mistakes, like translate "man" as "woman" in contexts where there was no room for mistake. Also it translated 10,000 as a million. Something is wrong with it.

    --
    "First they came for the slanderers and i said nothing."
    1. Re: maybe by rworne · · Score: 1

      Man as woman?

      Is it possible that it is just trying to be PC and substitute woman for man when the context isn't necessarily important?

      Like the phrase "Every man for himself" would occasionally translate to "Every woman for herself" - you know, for the sake of inclusiveness.

      Does it ever screw up the other way around?

      --
      I tried every decent and legal way I could think of to resolve the issue w/the business before I rented the chicken suit
    2. Re: maybe by phantomfive · · Score: 1

      It might have been trying that, but it was wrong. It was referring to a concrete man in every way, gender, sex, self-identification......it was just wrong.

      I also found this gem, "colleag" which is not even English.

      --
      "First they came for the slanderers and i said nothing."
    3. Re: maybe by Anonymous Coward · · Score: 0

      I've mostly used machine translation for Japanese to English. Apple's is pathetically poor and Google's is sometimes almost good enough to be useful, but I've never seen it get more than a trivial amount of text even close. As in, parts of a sentence being okay but paragraphs are (relatively) gibberish.

      The worst problem is when it is simply flat out wrong, where it translates the opposite of the intent. For example, the previous sentence being rendered as "The best problem is when it is just correct" or some other nonsense.

      Maybe some day machine translation will actually reach the quality of OCR* but I'm not holding my breath. Currently machine translation is only useful for casual work where any comprehension (even wrong) is better than none.

      * OCR accuracy, outside of certain standardized texts, seems to have peaked at about 95% accuracy. Which makes it useful for cases where there is not going to be any manual entry, much less proofing, but it is generally faster to simply type the text in than spend the time proofing it. Sadly, there was a DOS program for OCR that was functionally superior to anything current simply because its interface was designed to facilitate proofing. The accuracy was at least 90% and it would cycle through each non-identified glyph showing the user what it couldn't grok. If it was an 'a' then hitting 'a' followed by return advanced to the next. If it was 'te' then hitting 't', 'e' then return advanced. Very simple, very quick with a final text having close to 100% accuracy. But the emphasis went to unattended OCR as a passable substitute for having the actual text.
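
      The proofing loop described in the footnote could look something like the sketch below. This is only a guess at the idea, not the actual DOS program: the `Glyph` representation (None for anything the engine couldn't identify) and the `ask_user` callback are assumptions made for illustration.

```python
from typing import Callable, List, Optional

# A recognized piece of text, or None when the engine could not identify the glyph.
Glyph = Optional[str]


def proof(glyphs: List[Glyph], ask_user: Callable[[int], str]) -> str:
    """Walk the glyphs in order and ask the user only about the unidentified ones."""
    out = []
    for position, glyph in enumerate(glyphs):
        if glyph is None:
            # The user types one or more characters (e.g. 'a' or 'te') for this position.
            glyph = ask_user(position)
        out.append(glyph)
    return "".join(out)


# Canned answers standing in for what the user would type at positions 1 and 5.
answers = {1: "a", 5: "te"}
print(proof(["c", None, "t", " ", "la", None], answers.__getitem__))
```

      The point is that the user only touches the glyphs the engine flagged, instead of rereading the whole page.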

  12. What?????? by lucasferraz · · Score: 1

    Is this true?

    --
    https://sitefactory.com.br/pt-br
  13. Re:Overkill by Anonymous Coward · · Score: 0

    the ganja language

    Hindi?

    Rasta

  14. Re:Oh brother by Anonymous Coward · · Score: 0

    Lol, beau. Are you the lovechild of nitrous and pot? Miss Mash is obvious, but YOU...

  15. Re:Oh brother by Anonymous Coward · · Score: 0

    msmash is a bot. Beau is a real human being. And a sorry excuse for one.

  16. I wish someone could translate ebonics by Anonymous Coward · · Score: 0

    Yo n da past, running deez deep learning models on uh mobile device wasn't really an option since mobile phones didn't gots da right hardware ta efficiently run dem. Now, thanks ta bofe advances in hardware an' software, dat's less o' an issue an' Google, Microsoft an' others gots also found ways ta compress deez models ta uh manageable size. In Google's case, dat's 'bout 30 ta 40 megabytes per language Jus' like Orenthawl James.

  17. how about chinese numbers? by Anonymous Coward · · Score: 0

    Have they learned yet to translate Chinese numbers?

  18. Google has been bad at this for a long time by CaffeinatedBacon · · Score: 1

    Google will do that 95% of the time. Even if it was 100% wrong consistently you would be able to work around it, but it's random. And really quite bad. If you didn't already know what number to expect you wouldn't even know it was wrong.

    Chinese have a concept of 10,000 being a standard division for counting, so things will be measured in 10k's but Google changes them to either just thousands or millions instead depending on how it feels that particular time.

    Not quite the same as Windy just making up numbers, but the effect is similar. Completely unreliable results and not trustworthy without alternate sources.
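
    The arithmetic for the 10,000-based counting mentioned above is simple enough, which makes the errors all the more annoying. A minimal sketch, with helper names (`from_wan`, `group_by_myriads`) made up for illustration and non-negative numbers assumed:

```python
WAN = 10_000  # one "wan", the standard counting unit of ten thousand


def from_wan(count_in_wan: float) -> int:
    """Convert a count expressed in units of 10,000 into a plain integer."""
    return round(count_in_wan * WAN)


def group_by_myriads(n: int) -> str:
    """Format a non-negative integer with a separator every four digits."""
    digits = str(n)
    groups = []
    while digits:
        groups.append(digits[-4:])
        digits = digits[:-4]
    return "'".join(reversed(groups))


print(from_wan(100))               # 100 wan is one million
print(group_by_myriads(1_000_000)) # same number, grouped in fours
```

    So "100 wan" is a million, "1 wan" is ten thousand, and mixing up the unit shifts the result by a factor of ten or a hundred with nothing in the output to warn you.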

  19. Re:Overkill by Anonymous Coward · · Score: 0

    No they're now.

  20. Privacy opportunity by mrwireless · · Score: 1

    The main reason I like these 'edge computing' developments is that they give access to advanced functionality without constantly reporting to the cloud what you are doing.

    Then again, this being Google, I suspect this opportunity has not been taken..

  21. Available for several months by TuringTest · · Score: 1

    Uh? I had this enabled several months ago. Is this one of those features that gets rolled progressively, and I was lucky?

    Or maybe it's because I'm not in the US and they launched it in countries with non-English languages first?

    --
    Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
  22. Re:Oh brother by Anonymous Coward · · Score: 0

    And some people wonder how creimer was able to issue DMCA takedown notices to Russian websites.

  23. the only important question by slashmydots · · Score: 1

    But the 1 remaining question on everyone's mind: have they fixed the "beep beep lettuce" translation?

  24. Re:Overkill by omnichad · · Score: 1

    It's pretty likely that Hindus moving to Jamaica are who originally introduced cannabis.