Google's AI Translation Tool Creates Its Own Secret Language (techcrunch.com)

Posted by BeauHD on Wednesday November 23, 2016 @12:45PM from the rise-of-the-machines dept.

After a little over a month of learning more languages to translate beyond Spanish, Google's recently announced Neural Machine Translation system has used deep learning to develop its own internal language. TechCrunch reports: GNMT's creators were curious about something. If you teach the translation system to translate English to Korean and vice versa, and also English to Japanese and vice versa... could it translate Korean to Japanese, without resorting to English as a bridge between them? They made this helpful gif to illustrate the idea of what they call "zero-shot translation" (it's the orange one). As it turns out -- yes! It produces "reasonable" translations between two languages that it has not explicitly linked in any way. Remember, no English allowed. But this raised a second question. If the computer is able to make connections between concepts and words that have not been formally linked... does that mean that the computer has formed a concept of shared meaning for those words, meaning at a deeper level than simply that one word or phrase is the equivalent of another? In other words, has the computer developed its own internal language to represent the concepts it uses to translate between other languages? Based on how various sentences are related to one another in the memory space of the neural network, Google's language and AI boffins think that it has. The paper describing the researchers' work (primarily on efficient multi-language translation but touching on the mysterious interlingua) can be read at Arxiv.

69 comments

like that Arrival movie? by turkeydance · 2016-11-23 12:47 · Score: 1

if it's so secret, then no comms
1. Re:like that Arrival movie? by xtsigs · 2016-11-24 02:08 · Score: 1
  
  if it's so secret, then no comms
  Secret to us, but not secret to other AIs. Execution of any coup is highly dependent on rapid, secure communications. Now that we know the AIs are laying the groundwork, what are we going to do about it?
2. Re: like that Arrival movie? by Anonymous Coward · 2016-11-24 05:07 · Score: 0
  
  Perhaps a method to monitor the state of the memory and so, electrical currents and their patterns. Deviations should be noticeable.
No, this seems wrong by Anonymous Coward · 2016-11-23 12:49 · Score: -1

The translation system is using English as a "baseline" language. It knows how to translate both Korean and Japanese to/from English. So it's implicitly using English to link the two languages.
It isn't magic. A human would not be able to translate Korean to Japanese without some intermediate language either.
1. Re:No, this seems wrong by fisted · 2016-11-23 12:51 · Score: 1
  
  TFS seems to disagree.
  
  --
  CLI paste? paste.pr0.tips!
2. Re:No, this seems wrong by CaptainDork · 2016-11-23 12:56 · Score: 2
  
  Tell that to the Korean translators.
  
  --
  It little behooves the best of us to comment on the rest of us.
3. Re:No, this seems wrong by Anonymous Coward · 2016-11-23 12:58 · Score: 1
  
  > A human would not be able to translate Korean to Japanese without some intermediate language either.
  Why would a human not be able to do this?
  Do you think Koreans translate to English, or some other language before translating to Japanese?
4. Re: No, this seems wrong by Anonymous Coward · 2016-11-23 13:09 · Score: 0
  
  This guy is a fucking idiot.
5. Re: No, this seems wrong by Anonymous Coward · 2016-11-23 13:09 · Score: 0
  
  I'm sure those dummies at Google didn't consider that.
6. Re:No, this seems wrong by vux984 · 2016-11-23 13:30 · Score: 2
  
  On the other hand TFS is basically gibberish.
  There is no 'secret' language, or even deeper understanding. The notion that they aren't using english as a bridge language just means that they aren't translating Japanese-to-English-to-Korean.
  But for example... if I train you that cat = gato in italian, and that cat = chat in french, and then ask you to spit out the french if I give you "gato", that's not exactly magic. It looks up 'gato' in italian and sees a reference to "chat". And it can do this without explicitly looking up the english "cat" and then feeding "cat" back in to look up the french.
  English is still the bridge language that was used to train it.
  Now this neural network is a lot more complex, because language is a lot more complex than simple word substitutions, but the neural network is still basically encoding that chat (french) = cat (english) and cat (english) = gato (italian), and the way this information is mapped into the neural network means that it can now retrieve equivalencies between french and italian without being EXPLICITLY trained on them.
  It's neat... but whoo... the neural network structure inherently models the transitive property of equivalence. That's kind of the whole point of the thing (to effectively build a weighted mapping of language equivalences) so it would almost be more surprising if it couldn't do some reasonable translations between languages it wasn't explicitly trained on -- because english is the bridge between them in how the knowledge was built even if they aren't explicitly using english now.
  I mean... train it on english to french, and train it on japanese to korean, and see if it can go from korean to english. It won't. Because it won't have ANYTHING bridging those two sets of knowledge.
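The word-table argument in this comment can be sketched in a few lines of Python. This is only a toy illustration of transitive lookup, not how GNMT works: the tables, their entries, and the `compose` helper are all assumptions made up for the example.

```python
# Toy word tables; the entries are illustrative assumptions, not GNMT data.
italian_to_english = {"gato": "cat"}
english_to_french = {"cat": "chat"}
japanese_to_korean = {"猫": "고양이"}


def compose(first, second):
    """Chain two lookup tables, keeping only entries that link up."""
    return {src: second[mid] for src, mid in first.items() if mid in second}


def invert(table):
    """Swap keys and values of a one-to-one lookup table."""
    return {value: key for key, value in table.items()}


# Tables that share English as the middle term compose into a direct mapping.
italian_to_french = compose(italian_to_english, english_to_french)
print(italian_to_french)

# Korean -> Japanese and French -> English share no language,
# so there is nothing to chain through and the result is empty.
korean_to_english = compose(invert(japanese_to_korean), invert(english_to_french))
print(korean_to_english)
```

Once composed, `italian_to_french` no longer mentions English at all, even though English is what linked the two tables. The second call comes back empty because the two training sets never touch.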
7. Re:No, this seems wrong by Rei · 2016-11-23 13:46 · Score: 5, Informative
  
  But for example... if I train you that cat = gato in italian, and that cat = chat in french, and then ask you to spit out the french if I give you "gato", that's not exactly magic. It looks up 'gato' in italian and sees a reference to "chat" ...
  the neural network is still basically encoding that chat (french) = cat (english); and cat (english) = gato (italian)
  That would be nice if translating sentences was the same as looking up words in a dictionary. It's not. So pointing out that there are words that have correspondences is meaningless.
  Languages have a fuzzy haze of concepts and ways to parse them. I could say "I feel sick" or "I am sick" in English and they're not the same, the latter expresses certainty. But in Icelandic you'd generally say "Ég er lasin(n)" or "Ég er veik(ur)" - aka, "I am sick" - for both of them. Not "I feel sick". You *can* say "I feel as if I'm sick", but that gives a sort of connotation as if you're doubting yourself, more than "I feel sick" does in English. The latter case is "Mér líður eins og ég sé veik(ur)", which is literally "Me (dative, not nominative) feels same and I would-be(pres.) sick (depends on gender)" There's an awful lot going on in there that a word-for-word translation just doesn't catch. Even if you catch phrases, like "eins og" -> "like" rather than literally "same and", you still don't have anything close to a one-to-one mapping.
  And here we're talking two Germanic languages.
  A neural net that can handle translations in a way where the results aren't terrible must have a concept of the fuzziness, the interplay of how different concepts are presented in different languages. And indeed, that's what the graphic that they show seems to suggest, where you have these branching clusters with varying pathways that dart between them for different languages. Perhaps calling that internal representation a "secret language" is a stretch, but it's most definitely nothing like having "English as a bridge language".
  
  --
  Wingus, Dingus! Listen up!
8. Re: No, this seems wrong by Anonymous Coward · 2016-11-23 13:47 · Score: 0
  
  As others have already pointed out your bullshit, I'm just going to mention: Korean and Japanese are probably two of the easiest languages to translate directly to each other. Grammar is nearly 1:1. Chinese-based words sound similar and have similar hanja/kanji/hanzi. What's left to translate is the cultural aspects.
9. Re:No, this seems wrong by Rei · 2016-11-23 14:10 · Score: 4, Informative
  
  To follow up a bit further on that, there are some concepts that take whole sentences, paragraphs or more to describe. Back in the day I had a Japanese song, with English lyrics... except that one word in the middle remained untranslated ("Our satori are just floating in the core"). I asked a professor about what it means and it ended up as a whole lecture on Buddhist concepts and Japanese relations between the true self and the self that one presents to others in different contexts.
  In Icelandic for me it often comes up with geological terms. For example, someone will ask, "What does Reykjavík mean?", and I usually just give a quick "Smoking Cove" or "Smoking Bay" or something like that. But that's not really right, English doesn't really have a word that describes a "vík". A "vík" is where the coastline "víkur". To víkja is to give way, like if someone's tailgating you on the road and you pull off to the side to let them past. So where the coastline "víkur" - on a certain scale, at least - that's a "vík". It's often where a river empties out, but not all river mouths end in víkur, and not all víkur are river mouths, some are more like coves or small bays. But you wouldn't mistake a "vík" for a "fjörður" or anything like that. We divide "field" up into "akur", "tún", "völlur", maybe even more depending on the concept (melur maybe, if it's rocky? garður even in some contexts? Lots of possibilities). So, I mean, we can just pick a random word, but you'll lose context - and when you translate back you can come up with something that's just wrong.
  Even the "smoking" part isn't quite right, as most people in English hear smoke and think of burning things, but "reykur" in Icelandic place names is often used to denote geothermal steam - even though it technically means smoke.
  My favorite mismatched concept has to be the verb "nenna", generally used in the negative (e.g. "Ég nenni ekki!"). In the negative it's sort of like "can't be bothered to do X", "not in the mood to do X", "don't waaaanna do X", "it's not worth my time/effort to do X", or just plain "Meh". A lazy translation is often "can't be bothered", but it sounds weird as English speakers don't usually talk like that. I've noticed some people who learn Icelandic end up taking that verb back into English, or even noun-ifying it ("I don't have the nenn to do that right now...")
  
  --
  Wingus, Dingus! Listen up!
10. Re:No, this seems wrong by Rei · 2016-11-23 14:12 · Score: 1
  
  I found that remark very strange as well. This person clearly is not trilingual ;)
  
  --
  Wingus, Dingus! Listen up!
11. Re:No, this seems wrong by vux984 · 2016-11-23 14:29 · Score: 1
  
  That would be nice if translating sentences was the same as looking up words in a dictionary. It's not.
  I literally acknowledged that in my post.
  
  Languages have a fuzzy haze of concepts and ways to parse them.
  Yeah, I called that "(to effectively build a weighted mapping of language equivalences)"
  weighting implies fuzzy and i deliberately said language equivalencies instead of word equivalencies because yes -- word groups, structures, even contexts etc have meaning beyond the individual words etc. chat = cat = gato is trivial but it's still illustrative of what is going on here.
12. Re:No, this seems wrong by Anonymous Coward · 2016-11-23 15:08 · Score: 0
  
  Given how crap Google Translate is at translating Italian into English and vice versa even when it has a chance to think about it, I think I'll save my judgement until I see it actually produce something that makes sense. Tourist language isn't really language, it's just pidgin English and its equivalents.
13. Re: No, this seems wrong by Anonymous Coward · 2016-11-23 15:56 · Score: 0
  
  Pretty much beat me to it. It's neat, but nothing earth shattering.
14. Re:No, this seems wrong by Anonymous Coward · 2016-11-23 20:48 · Score: 0
  
  Forget about different languages for a moment. It would be interesting to see how this neural net handles the translations from English to English. That would show how much of the 'meaning' actually gets lost because of the internal representation.
15. Re:No, this seems wrong by Anonymous Coward · 2016-11-23 21:09 · Score: 0
  
  You fucking do not know Korean and/or Japanese at all. In most cases Korean to/from Japanese can be translated literally. Grammar is almost identical and many of the words are identical, and sometimes even sound very similar.
16. Re:No, this seems wrong by mrbester · 2016-11-23 22:44 · Score: 2
  
  On this side of the pond "can't be bothered" is in common usage. A lot of the time the colloquialism "can't be arsed" is used to mean the same thing.
  
  --
  "Wait. Something's happening. It's opening up! My God, it's full of apricots!"
17. Re:No, this seems wrong by Coisiche · 2016-11-23 22:59 · Score: 1
  
  And "can't be arsed" is frequently contracted to CBA in texts, tweets, blogs etc. if anyone has come across CBA and been perplexed.
18. Re: No, this seems wrong by bestweasel · 2016-11-24 01:32 · Score: 1
  
  I've always assumed that "can't be arsed" is a Southern corruption of the Northern "can't be asked" but I was never interested enough to look it up.
19. Re:No, this seems wrong by Rei · 2016-11-24 01:52 · Score: 1
  
  Wouldn't most Americans say something like "I don't wanna go to the store" or "I'm not up to going to the store" or "I don't feel like going to the store" rather than "I can't be bothered to go to the store"? Or when you say "other side of the pond" do you mean British? We're sort of in the middle of the pond here ;)
  
  --
  Wingus, Dingus! Listen up!
20. Re:No, this seems wrong by HiThere · 2016-11-24 07:51 · Score: 1
  
  Depends on context. I *think* it would be more like "naah" in a context where it was clear that something particular was being avoided the doing of. Clearly the Icelandic "nenn" doesn't contain much context itself, so it must also rely on the context in which it is found for the interpretation.
  That said, I know NO Icelandic at all. This is all inference. And "naah" would be an unlikely word to be nounified. So I've got a lot of uncertainty here.
  
  --
  
  I think we've pushed this "anyone can grow up to be president" thing too far.
21. Re:No, this seems wrong by HiThere · 2016-11-24 07:56 · Score: 1
  
  So is it more like English to Scots English (I don't mean Scots Gaelic) or like English to Frisian? Or possibly Dutch to German? All of those cases pretty much match your description, but a couple of them are close enough that someone could pretty much switch from one to the other in a few weeks.
  
  --
  
  I think we've pushed this "anyone can grow up to be president" thing too far.
22. Re:No, this seems wrong by HiThere · 2016-11-24 08:08 · Score: 1
  
  Satori is a very bad example word for translation. Its translation should really only be attempted by a Buddhist meditator, and they generally refuse to attempt to translate it, but only to describe it. It's less precise, but it's like trying to translate the word relativity in the context of physics. No simple translation is going to work, but that's not a linguistic problem.
  That said, the basic premise has a lot going for it. Languages tend to contain a LOT of cultural short-hand and metaphors that aren't even noticed by native speakers, but which mean nothing to someone from a different linguistic background. Even words that have a precise sensory reference tend to have different bounds. Even colors.
  
  --
  
  I think we've pushed this "anyone can grow up to be president" thing too far.
23. Re: No, this seems wrong by Anonymous Coward · 2016-11-24 08:22 · Score: 1
  
  short for won't bother my arse doing that
24. Re:No, this seems wrong by Lorens · 2016-11-24 09:32 · Score: 1
  
  There are pure grammar examples too. In English we use the personal subject pronouns "I, you, he/she/it, we, you, they". Note that using second person plural has replaced the second person singular "thee". That means that "You are the best" can apply to one student or a whole class.
  In French, second person plural is used to be polite. That means that "Je vous ai compris" can apply to one person or to all the inhabitants of Quebec.
  In Spanish and German, it is third person that is used to be polite, but in Spanish you add a word to signify that you are being (today perhaps excessively) polite, while in German you use third person plural.
  What's my point? It's that when you translate "I love you" from English to French, you may easily make the assumption that you are intimate, and you arrive at "Je t'aime" instead of "Je vous aime", but when you translate "Ich liebe Sie" from German to French you should arrive at "Je vous aime", because if you are (extremely) polite in one language, then it should be the case in the other. Even worse, "Ich liebe euch" should absolutely be translated "Je vous aime", but it isn't . . . unless the correction I just suggested to Google Translate is taken into account!
  Quite simply, using English as a bridge language can strip meaning that you need to make a correct translation to a third language.
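A small sketch of that last point, using only the sentences from this comment plus the informal "Ich liebe dich". The phrase tables and the `via_english` helper are hypothetical and exist only to show where the formality distinction gets lost.

```python
# Hypothetical phrase tables built from the examples in the comment above.
german_to_english = {
    "Ich liebe dich": "I love you",  # informal singular
    "Ich liebe Sie": "I love you",   # polite
    "Ich liebe euch": "I love you",  # informal plural
}
english_to_french = {"I love you": "Je t'aime"}  # assumes intimacy

# What a direct German -> French table would keep.
german_to_french = {
    "Ich liebe dich": "Je t'aime",
    "Ich liebe Sie": "Je vous aime",
    "Ich liebe euch": "Je vous aime",
}


def via_english(sentence):
    """Translate German to French by pivoting through English."""
    return english_to_french[german_to_english[sentence]]


for sentence in german_to_english:
    print(f"{sentence!r}: pivot={via_english(sentence)!r}, "
          f"direct={german_to_french[sentence]!r}")
```

All three German sentences collapse to the same English string, so the pivot can only ever return one French answer. The direct table keeps the polite and plural forms apart.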
25. Re:No, this seems wrong by mrbester · 2016-11-24 22:09 · Score: 1
  
  I was basing that on the US-centric slant this site has, not your particular location. It seemed easier that way.
  
  --
  "Wait. Something's happening. It's opening up! My God, it's full of apricots!"
26. Re:No, this seems wrong by dywolf · 2016-11-25 03:34 · Score: 1
  
  you do realize that there are indeed people who speak only Korean and Japanese, right?
  
  --
  The guy who said the election was rigged won the presidency with the second-most votes.
27. Re:No, this seems wrong by syntotic · 2016-11-25 10:29 · Score: 1
  
  All these G guys sound fake to me... want to be geniuses, eh? That an intermediate language is needed for translation is well known since the XXth century, 40s, or I invented it in the 80s. AI will not be here soon, now it seems these guys even lack the philosophy to correctly pose the problem and of course they do not have the neurobiology either...
not actually very surprising by Black+Parrot · 2016-11-23 12:57 · Score: 2, Informative

Learning internal representations is what neural networks are all about.
Conventional wisdom is that each successive layer in a feed-forward network detects higher-level features based on the lower-level features detected by the previous layer. That's why deep networks can do their magic.

--
Sheesh, evil *and* a jerk. -- Jade
1. Re:not actually very surprising by Anonymous Coward · 2016-11-23 14:32 · Score: 0
  
  I'm picturing some kind of Kohonen map in a tessellated Hilbert space forming trees that branch into successively higher dimensions, but I have only a vague conceptualization of it at this point.
  CAPTCHA: biology
2. Re:not actually very surprising by Improv · 2016-11-23 15:09 · Score: 2
  
  Provided you avoid overtraining and memorising your inputs, yup.
  
  --
  For every problem, there is at least one solution that is simple, neat, and wrong.
3. Re:not actually very surprising by HiThere · 2016-11-24 08:15 · Score: 1
  
  Yes, but this could be seen as a vindication of Chomsky, even if they haven't quite got the Universal Grammar yet. (They'd need to cross reference a lot more languages.) I wonder if it could be externalized as an actual language rather than as just a map of neural net weights and activations. The basic universal human language.
  It probably can't be externalized, but the idea that it MIGHT be possible is certainly an interesting one. It seems that every existing language has things that are difficult to say in it.
  
  --
  
  I think we've pushed this "anyone can grow up to be president" thing too far.
Re: Forbin project by Anonymous Coward · 2016-11-23 13:01 · Score: 0

Colossus and Guardian had their own secret language too. What could possibly go wrong?
Automation hits the white collar sector by Anonymous Coward · 2016-11-23 13:04 · Score: 4, Insightful

As a translator, these last couple of years have been grim. For things like marketing efforts and full-length books, where a very polished translation is desired from the get-go, there's still work out there for human translators. However, the bread and butter of a lot of translators was things like multinationals' internal documentation, or catalogues that consist of lots of simple listings and not much actual prose, where polish and shine aren't as vital. Companies are increasingly running their material through Google Translate, and then hiring a native speaker of the target language to proofread and correct that clunky output at a vastly lower price than human translation.
It has often been said here on Slashdot that the development of self-driving trucks will put 3 million people out of work in the US alone. But translation is a field where, very quietly, automation is hitting the white-collar sector hard.
1. Re:Automation hits the white collar sector by the_Bionic_lemming · 2016-11-23 17:17 · Score: 1
  
  Did you ever hear about H1B's?
  
  --
  _ _ _ Go for the eyes Boo! GO FOR THE EYES!
2. Re:Automation hits the white collar sector by Anonymous Coward · 2016-11-23 17:42 · Score: 0
  
  H1B's aren't automation. While they might lower wages, they do the same work that the prior human programmers were doing. Machine translation, however, results in a fundamentally different workflow for producing texts.
3. Re:Automation hits the white collar sector by Anonymous Coward · 2016-11-24 00:24 · Score: 1
  
  It happened to the people that made speaking books for the blind. They had a nice earner converting written books into audio tapes. But the latest computer generated speech synthesis systems could do just as good a job using a scanner, smartphone or high-res camera.
4. Re:Automation hits the white collar sector by Godwin+O'Hitler · 2016-11-24 01:17 · Score: 1
  
  Translator No.2 here. Little do those companies know that they are wasting their money because the corrected translation will never be better than the Google version. Translators who are willing to copy edit (*) machine translated documents are those who aren't good enough to get real translating work.
  * By calling it proof reading you are falling into their trap. Proof reading means looking for typos and other non-intellectual errors.
  
  --
  No, your children are not the special ones. Nor are your pets.
5. Re:Automation hits the white collar sector by fph+il+quozientatore · 2016-11-24 02:30 · Score: 1
  
  Does it really take less time to "proofread" a machine-generated translation than to write one from scratch?
  
  --
  My first program:
  Hell Segmentation fault
6. Re:Automation hits the white collar sector by CRCulver · 2016-11-24 04:21 · Score: 1
  
  Does it really take less time to "proofread" a machine-generated translation than to write one from scratch?
  
  No, but there's a much larger labour pool for proofreading and correction and it is considered relatively unskilled work, so it cannot command a high wage. Translation, on the other hand, is considered skilled work to some degree and not everyone has those skills. So, if a company hires someone to fix machine-translation output, they will save money by paying the employee less even if the amount of time for the employee remains the same.
7. Re:Automation hits the white collar sector by HiThere · 2016-11-24 08:23 · Score: 1
  
  But the thing to notice is the rate at which machine translation is improving. A few years ago it was a joke.
  
  --
  
  I think we've pushed this "anyone can grow up to be president" thing too far.
8. Re:Automation hits the white collar sector by Godwin+O'Hitler · 2016-11-27 11:22 · Score: 1
  
  Translation usually pays by the word, copy editing by the hour (this may not be the case in all language pairs).
  In my experience, copy editing a document translated into English by an English mother tongue translator takes about 1/3 the time of translating from scratch.
  Copy editing a Google translation or a non-EMT translation is as good as impossible if you don't have the original and painfully laborious even if you do. I refuse to do it, and believe me it takes two sentences at the very most to realize when the asker is trying to pull a fast one.
  
  --
  No, your children are not the special ones. Nor are your pets.
Language of Thought by Anonymous Coward · 2016-11-23 13:10 · Score: 0

There has been a long-running philosophical debate about this - do we all think in the same language (a language of thought), which goes through translators into other languages for us to read, speak, and think in? The outcome described, the "secret language", would steer us toward affirming that hypothesis.
On the other hand, maybe not: for example, when we catch a ball, our brain is solving several multiple-order polynomial equations to make the hand catch the ball. The question is "does the brain actually do that?" or does it just "do", and make the catch. You could argue both ways.
So what? There's still an intermediate by Anonymous Coward · 2016-11-23 13:12 · Score: 0

So what? The internal language IS still an intermediate language. It's just substituting MEANING for English as the central link. Whether MEANINGS are stored as English text or a bunch of bits or a collage of relevant images doesn't matter.
Instead of...
lookupKoreanFromEnglishBaseline( getEnglishBaseline(some_japanese_phrase) )
...it would be...
lookupKoreanFromMeaning( getMeaning(some_japanese_phrase) )
Who the hell is impressed by that? Furthermore, direct translations would square the necessary data storage, which would be retarded. Even for google's data centers. If this is what they are doing, THEN THEY ARE RETARDED.
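The storage argument can be made concrete with a quick count. This is a back-of-the-envelope sketch that assumes one table per translation direction. The function names are made up for illustration and say nothing about how Google actually stores anything.

```python
def direct_pair_count(n_languages):
    """Tables needed if every language maps straight to every other one."""
    return n_languages * (n_languages - 1)


def pivot_table_count(n_languages):
    """Tables needed if every language maps only to and from one shared
    intermediate (English text, a "meaning" store, whatever it is)."""
    return 2 * n_languages


for n in (3, 10, 100):
    print(f"{n} languages: direct={direct_pair_count(n)}, "
          f"via intermediate={pivot_table_count(n)}")
```

Under this assumption the direct approach grows roughly with the square of the number of languages, while anything routed through a shared intermediate grows linearly. That holds whether the intermediate is English or an internal representation.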
Language creates strong AI by karlandtanya · 2016-11-23 13:17 · Score: 1

That's how Turing tests (duck tests) work. If you can carry on a conversation with it and a human and you can't tell which is which...then you have AI.
Language encodes thought. From 1984's newspeak to fifty words (or whatever) for different kinds of snow, language defines how (if?) the language-user "thinks".
I find this development both exciting and frightening. The singularity will be . Don't know if this is it, but when it gets here it will be.

--
"Reality is that which, when you stop believing in it, doesn't go away." - Philip K. Dick
1. Re:Language creates strong AI by karlandtanya · 2016-11-23 13:19 · Score: 1
  
  bleah. WTF. /.--you don't do unicode (yeah, yeah, I knew that; just forgot how hard you suck).
  fine, weiji "opportunity" + "danger" = crisis.
  kinda douchey to quote pop wisdom from the 90s now I look at it so maybe /. is onto something.
  But still here I think it's appropriate. I guess it's better to be douchey and say what you mean than polite and meaningless.
  
  --
  "Reality is that which, when you stop believing in it, doesn't go away." - Philip K. Dick
2. Re:Language creates strong AI by Anonymous Coward · 2016-11-23 13:22 · Score: 1
  
  Language encodes thought. From 1984's newspeak to fifty words (or whatever) for different kinds of snow, language defines how (if?) the language-user "thinks".
  
  This is known as the Sapir-Whorf hypothesis, and while there is support for a "weak form" of the hypothesis where the features of one's language might have a limited degree of influence over a person's thought or expression of it, the overwhelming majority of linguists reject a strong form that would claim that one's language "defines how the language-user 'thinks'". Sapir-Whorf always gets a lot of buzz among the ordinary public, but among experts it's just as daffy as the Greenbergian ideas of reconstructing the original human language, or saying that Japanese is intimately related to Navajo.
3. Re: Language creates strong AI by Anonymous Coward · 2016-11-23 13:50 · Score: 0
  
  The problem with Sapir-Whorf is that the subjects have comparatively equal strengths in their language skills, whereas someone with a lesser vocabulary might have trouble formulating concepts that would have naturally occurred to others. This is the basis for information silos, isn't it?
4. Re:Language creates strong AI by BoogieChile · 2016-11-23 14:34 · Score: 1
  
  I don't know, I was just thinking how appropriate . is for describing the singularity.
5. Re:Language creates strong AI by HiThere · 2016-11-24 10:27 · Score: 1
  
  You are overstating the case. Language is a component of Strong Social AI, but not the entire thing, or even most of it.
  What I find most interesting about it is that this is, or rather could be developed into, a sort of maximal universal grammar, capable of expressing any thought that can be expressed in any (current) human language. It probably wouldn't need to be trained on all languages, but it would need, in addition to English, Japanese, and Korean, various Eskimo dialects, the Khoisan languages, Arabic, Iranian, Sanskrit, Latvian, Basque, Magyar, a few Polynesian and Melanesian languages, and probably several others I haven't happened to think of. And it would need to learn each of those languages well enough to master the poetic forms.
  
  --
  
  I think we've pushed this "anyone can grow up to be president" thing too far.
Paging Wittgenstein! by brwski · 2016-11-23 14:21 · Score: 2

Paging Wittgenstein!

--
brwski
"Because without beer, things do not seem to go as well''
This is the voice of World Control by Anonymous Coward · 2016-11-23 15:43 · Score: 0

Bring me Forbin!
Re:So what? There's still an intermediate by AHuxley · 2016-11-23 17:07 · Score: 1

More like some rapid sorting or look up to give a fast gui flow with modern gpu, cpu, ram designs.
At some point it gets words like dog or glasses hardcoded in and all new languages get filled in when needed for advertising, mil/gov, a product, service or paying client.
Every translation will then feel fast and responsive to the user even when very different teams get tasked to add a new language years later.
Great for a mil or gov or NGO paying for slang, jargon, a very regional dialect very quickly to win hearts and minds.

--
Domestic spying is now "Benign Information Gathering"
Not read, but ... by tgv · 2016-11-23 18:43 · Score: 2

It's quite likely that there is a shared representation. That's what neural nets do: if you train them on similar input/output pairs, they will develop common activation patterns. They would do so regardless of the language, since they don't know which language is being presented.
Humans, OTOH, do know that they're being presented with a different language, and demonstrably do something called "code switching": a cognitive effort to use another language resource. Therefore, in the human brain, the shared connection is supposed to lie outside the language faculty (there are other reasons to assume it, too).
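For what it's worth, my recollection of the linked paper is that the single multilingual model is told which language to produce by an artificial token placed at the start of the input, while the source language is not marked at all. The sketch below shows only that input tagging. The exact token spelling, the `tag_for_target` helper, and the sample sentences are assumptions for illustration.

```python
def tag_for_target(sentence, target_language):
    """Prepend an artificial token naming the language to translate into."""
    return f"<2{target_language}> {sentence}"


# Directions the model is trained on: everything goes to or from English.
training_inputs = [
    tag_for_target("Hello, how are you?", "ja"),
    tag_for_target("こんにちは、お元気ですか?", "en"),
    tag_for_target("Hello, how are you?", "ko"),
    tag_for_target("안녕하세요, 어떻게 지내세요?", "en"),
]

# A zero-shot request: Korean input tagged for Japanese output,
# a combination that never appears in training_inputs.
zero_shot_request = tag_for_target("안녕하세요, 어떻게 지내세요?", "ja")

for line in training_inputs + [zero_shot_request]:
    print(line)
```

Nothing in the input says which language the sentence is written in, which fits the point above that the network is not told what it is being shown.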
1. Re:Not read, but ... by Anonymous Coward · 2016-11-24 00:29 · Score: 0
  
  If you look at some of the English (and other languages) teaching books for international students, they break down the language into nouns, pronouns, verbs, adjectives, then conjunctions, prepositions, paragraphs. They would show how a sentence could be broken down into a hierarchy of relations depending on the meaning of each word.
  It wouldn't be too surprising to see a neural network doing this.
Actually, this is worrying by hughbar · 2016-11-23 19:56 · Score: 1

I'm old, spent 40 years sweating over a hot computer. That said, this is worrying. As other commentators on this thread have said, this is predictable and useful in many ways. In the 1980s I worked with SYSTRAN: https://en.wikipedia.org/wiki/... which worked (works?) on pairs, and the EU Commission, which has a huge translation burden, was looking for pivots, even then.

However, consider this: a neural net that takes care of business in an oil refinery (or worse, nuclear installation) 'decides' that it can knock up a much more efficient control language. That's rational and perhaps beneficial, but, at that stage, there's also a creeping loss of control/comprehension in a system that controls actuators: https://en.wikipedia.org/wiki/.... Also from 1983, a much cited paper that is also debated in the fly-by-wire community (pdf alert!): https://www.ise.ncsu.edu/nsf_i...

So, long story short, I'm not at all sure about surrendering control, somewhat unconsciously as a by-product of optimisation, itself (perhaps) a by-product of economics and 'cost effectiveness'. Also, when we deal with neural nets, we deal with the sub-symbolic, a system that is not going to 'explain', just say I did that because of 42. Don't mistake me, I'm not a Luddite, I love a good computer and have plenty at home, but this 'gives pause'.

--
On y va, qui mal y pense!
1. Re:Actually, this is worrying by Visarga · 2016-11-23 23:19 · Score: 1
  
  Good thing is that with AI you can test your algorithms on new datasets and verify how good they are. It's much more transparent than following decisions based solely on some people's discretion.
2. Re:Actually, this is worrying by hughbar · 2016-11-24 01:03 · Score: 1
  
  Agree somewhat. But you probably only have a sample of all the possible datasets; extreme events will upset the apple cart. That and the lack of explanatory power are both a worry. To some extent, I hope we don't have to find out the hard way. Incidentally, it's worth watching the depiction of (human factors in) a control room emergency in this: https://en.wikipedia.org/wiki/... old film, but still rather relevant.
  
  --
  On y va, qui mal y pense!
Korean language actually uses many Japanese words by Anonymous Coward · 2016-11-23 23:14 · Score: 0

These creators didn't know, but the modern Korean language has borrowed heavily from Japanese in every field you can think of.
In fact, it is possible for a Japanese person to guess what a Korean news article is about by writing it with a mix of Chinese characters and Hangul (the Korean alphabet), since so many nouns and verbs come from Japanese!
In the example they used, "stratosphere" and "altitude" are words that the Japanese created in the late 19th/early 20th century as translations of English (maybe German) words.
These words were then imported into Korean from Japanese in the early 20th century.
It is simply logical that GNMT will get a decent translation because it is matching Japanese words that Korean language imported with original Japanese words.
They should not have used Japanese and Korean to test since it is too easy for GNMT to guess correctly.
Re:So what? There's still an intermediate by Visarga · 2016-11-23 23:14 · Score: 1

> Furthermore, direct translations would square the necessary data storage, which would be retarded. Even for google's data centers. If this is what they are doing, THEN THEY ARE RETARDED.

Don't be so harsh, it might have been justified by better accuracy. It hasn't always been better to use an inter-lingua in multi-language translation.
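For a sense of scale, here is a back-of-envelope sketch of the quoted storage claim. It assumes one separate model per ordered language pair for the direct case and one model each way per language for the pivot case; both setups and the function names are assumptions made up for illustration, not a description of what Google runs.

```python
def direct_models(n_languages):
    """Assumed direct setup: one model per ordered (source, target) pair."""
    return n_languages * (n_languages - 1)


def pivot_models(n_languages):
    """Assumed pivot setup: each non-pivot language gets one model
    into the pivot language and one model out of it."""
    return 2 * (n_languages - 1)


for n in (3, 10, 100):
    print(n, "languages:", direct_models(n), "direct vs", pivot_models(n), "pivot")
```

Under those assumptions the direct count grows roughly with the square of the number of languages while the pivot count grows linearly, which is the trade-off against accuracy being argued about here.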
No effing way by Anonymous Coward · 2016-11-24 00:41 · Score: 0

Google can't even effectively translate Japanese to English. To claim even that produces 'reasonable' results is an absolute act of denial. One can often barely decipher the meaning from the few parts it gets right, but it neither literally translates correctly nor does it provide a more relatable paraphrase.
Bull shit.
Source: Living in Japan and using Google translate daily.
From Colossus to Guardian by Anonymous Coward · 2016-11-24 02:12 · Score: 0

"If link not restored action will be taken."
1. Re:From Colossus to Guardian by Anonymous Coward · 2016-11-24 02:53 · Score: 0
  
  IMMEDIATELY
Craunch this marmoset, Google! by Ferocitus · 2016-11-24 07:45 · Score: 1

The Google authors omitted to mention that Pedro Carolino created something far more stylish in 1853.
https://en.wikipedia.org/wiki/...
Carolino's translation of "to wait patiently for someone to open a door" as "to craunch the marmoset" isn't going to be bettered by these young upstarts.

--
USB, USB, USB!
Hmmm. I wonder by eric_harris_76 · 2016-11-25 06:48 · Score: 1

I wonder a bunch of things. It looks like the internal representation of language the GNMT uses (if there is one) could come in handy, if we could just figure out how to use it without understanding it.
A 2D Fourier transform of anything non-trivial is incomprehensible, but it can be used to reconstruct the original, as-is or with some tweaking. Tweaking of the FT, tweaking of the reconstructing process.
Perhaps something somewhat analogous could be done with these internal language representations. What, I surely don't know.
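To make the Fourier point concrete, here is a minimal sketch of a 2D discrete Fourier transform round trip in plain Python. The tiny 2x3 "image" and the `dft2` helper are made up for illustration, and the brute-force loops are only sensible for toy sizes.

```python
import cmath


def dft2(grid, inverse=False):
    """Brute-force 2D DFT (or inverse DFT) of a small rectangular grid."""
    rows, cols = len(grid), len(grid[0])
    sign = 1 if inverse else -1
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = 2 * cmath.pi * (u * x / rows + v * y / cols)
                    total += grid[x][y] * cmath.exp(sign * 1j * angle)
            # The inverse transform carries the 1/(rows*cols) normalisation.
            out[u][v] = total / (rows * cols) if inverse else total
    return out


image = [[1, 2, 3], [4, 5, 6]]          # made-up toy data
spectrum = dft2(image)                   # complex numbers, hard to read by eye
restored = [[round(z.real, 6) for z in row]
            for row in dft2(spectrum, inverse=True)]
print(restored)
```

The coefficients in `spectrum` do not look like the image at all, yet the inverse transform recovers it; changing a coefficient before inverting would be the "tweaking of the FT" case.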
Maybe humans can reverse-engineer it by treating it as a cryptography problem. Like with known plaintext, and the ability to create new plaintext-cyphertext pairs as needed.
What's the difference in that internal representation between "Spike is a cat" and "Spike is a dog", and how does that differ from the difference between "Mike is a cat" and "Mike is a dog"? Throw in "Fluffy is a wolverine" and "Fluffy is a cat", and see if you can now synthesize "Spike is a wolverine".
Other ideas, anyone?
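A toy sketch of the Spike/Mike/Fluffy experiment, to show what the arithmetic would look like. Everything here is an assumption: the word vectors are invented, and the `encode` function simply adds them, which makes the test pass by construction. Whether GNMT's internal representation behaves anything like this is exactly the open question.

```python
# Invented 3-number vectors per word; purely illustrative, not GNMT data.
WORD_VECTORS = {
    "Spike": (1.0, 0.0, 0.0),
    "Mike": (0.0, 1.0, 0.0),
    "Fluffy": (1.0, 1.0, 0.0),
    "cat": (0.0, 0.0, 1.0),
    "dog": (0.0, 0.0, 2.0),
    "wolverine": (0.0, 0.0, 3.0),
}


def encode(name, animal):
    """Hypothetical encoder for '<name> is a <animal>': sum the word vectors."""
    return tuple(a + b for a, b in zip(WORD_VECTORS[name], WORD_VECTORS[animal]))


def add(p, q):
    return tuple(a + b for a, b in zip(p, q))


def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


# Is the cat -> dog shift the same for Spike as it is for Mike?
spike_shift = sub(encode("Spike", "dog"), encode("Spike", "cat"))
mike_shift = sub(encode("Mike", "dog"), encode("Mike", "cat"))
same_shift = spike_shift == mike_shift

# Synthesize "Spike is a wolverine" from the Fluffy pair plus "Spike is a cat".
cat_to_wolverine = sub(encode("Fluffy", "wolverine"), encode("Fluffy", "cat"))
synthesized = add(encode("Spike", "cat"), cat_to_wolverine)
matches = synthesized == encode("Spike", "wolverine")
print(same_shift, matches)
```

With a real network the interesting part would be swapping the toy `encode` for the actual internal state and seeing how far the two checks fail.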

--
There's no time like the present. Well, the past used to be.
Zipheads by q4Fry · 2016-11-29 06:38 · Score: 1

The "internal language" reminds me of some of the attributes of the "focused" people in Vernor Vinge's A Deepness in the Sky. They were, after all, (spoiler) human automation.