Google's Computing Power Refines Translation


Posted by kdawson on Tuesday March 9, 2010 @11:48AM from the throwing-silicon-at-it dept.

gollum123 sends an excerpt from the NY Times on how Google has taken a lead in language translation, in one of the company's few unqualified successes as it attempts to broaden its offerings beyond search. "...Google's quick rise to the top echelons of the translation business is a reminder of what can happen when Google unleashes its brute-force computing power on complex problems. The network of data centers that it built for Web searches may now be, when lashed together, the world's largest computer. Google is using that machine to push the limits on translation technology. Last month, for example, it said it was working to combine its translation tool with image analysis, allowing a person to, say, take a cellphone photo of a menu in German and get an instant English translation. ...in the mid-1990s, researchers began favoring a so-called statistical approach. They found that if they fed the computer thousands or millions of passages and their human-generated translations, it could learn to make accurate guesses about how to translate new texts. It turns out that this technique, which requires huge amounts of data and lots of computing horsepower, is right up Google's alley. ...Google's service is good enough to convey the essence of a news article, and it has become a quick source for translations for millions of people."
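
The statistical approach described in the excerpt can be illustrated with a toy sketch. This is not Google's system: the three-sentence corpus, the function names, and the Dice-coefficient scoring are all assumptions chosen for brevity. It only shows the core idea that word correspondences can be guessed from paired passages, with no hand-written grammar rules.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: German passages paired with human English translations.
PAIRS = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]

def train(pairs):
    """Count how often each source word shares a sentence pair with each target word."""
    cooc = defaultdict(Counter)
    src_totals, tgt_totals = Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_totals.update(src_words)
        tgt_totals.update(tgt_words)
        for s in src_words:
            cooc[s].update(tgt_words)
    return cooc, src_totals, tgt_totals

def guess_word(word, model):
    """Return the target word most strongly associated with `word` (Dice coefficient)."""
    cooc, src_totals, tgt_totals = model
    if word not in cooc:
        return word  # unseen word: pass it through untranslated

    def dice(t):
        return 2 * cooc[word][t] / (src_totals[word] + tgt_totals[t])

    return max(cooc[word], key=dice)

def translate(text, model):
    """Word-by-word guess; ignores word order, which real systems must also model."""
    return " ".join(guess_word(w, model) for w in text.split())

model = train(PAIRS)
print(translate("ein haus", model))  # "ein haus" never appears in PAIRS; prints "a house"
```

With only three sentence pairs the guesses are fragile, which is the excerpt's point about why this technique rewards huge amounts of data.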

Converting that article from English to Chinese to by Rei · 2010-03-09 11:53 · Score: 5, Interesting

English, with Google Translate:
---
Google's rapid rise to the translation of business executives is a result of what Google released a complex problem, and its powerful computing power for reminding me. The data center, and its Web search, it may be now, when attacked with the network, is the world's largest computer. Google's machine translation technology is being used to push forward the limit. Last month, for example, it indicated that it was a combination of image analysis of the translation tools to enable a person, says that while walking in the German mobile phone menu, photos and immediately the English translation. ... In the mid-90s, researchers began to favor a so-called statistical methods. They found that if they ate the computer or hundreds of thousands of millions of paragraphs and the translation of humans, it can learn how to make an accurate translation of the new text of speculation. Facts have proved that this technology requires large amounts of data and a lot of computing power, is the right of Google's alley. ... Google's service is sufficient to convey the essence of news articles, it has become a quick translation of millions of people everywhere.
---
Okay, perhaps not spectacular... but compared to Babelfish:
---
...Is anything the prompt possible to occur to the translation business's crown trapezoid's Google quick rise, when Google unties it when the complex question violence computing power. Perhaps the data central network it for the net search establishment now is, when attacks together, world large-scale computer. Google uses that machine to push in the translation technology limit. The previous month, for example, it said that it operates and the image analysis unifies its translation tool, allows the human to adopt a menu the handset picture and obtains one with German immediately English translation. ... in the mid-1990s, researcher started to favor the so-called statistical method. They have discovered that if they have fed the translation which the computer thousands or the tens of thousands of paragraphs and their person cause, its possibly academic society does about what kind of guesses translator accurately the new text. _ it this technology, requests the huge large amount data finally and completely the calculated horsepower, is correct Google the alley. ... The Google service is enough good expresses the news article the essence, and it has become translation quick origin tens of thousands of people
---

--
Stale pastry is hollow succor to one who is bereft of ostrich.
Not from NY Times by Anonymous Coward · 2010-03-09 12:01 · Score: 3, Informative

Last week's The Economist addressed this issue (http://www.economist.com/specialreports/displaystory.cfm?story_id=15557431). NY Times recycled it.
Re:Converting that article from English to Chinese by Daniel Dvorkin · 2010-03-09 12:02 · Score: 5, Insightful

Yeah, that's actually a pretty good test. Google's version is odd but comprehensible, while Babelfish's is a bunch of ... well ... babble.

--
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Re:Similar languages by hardburn · 2010-03-09 12:07 · Score: 5, Funny

I've worked on payment processing for web sites in Korea before. The translations of error messages we get from the system, then passed through Google translate, are exactly as good as the translations we get back from a human translator. That is, not useful at all.

--
Not a typewriter
Re:Similar languages by MBCook · 2010-03-09 12:25 · Score: 4, Interesting

This seems like the ideal opportunity to mention Translation Party. You give it English, and it translates it to and back from Japanese until the input and output English are the same.
It can be a ton of fun.
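
The loop described above amounts to a fixed-point iteration. A minimal sketch follows, assuming two stand-in translator functions; the tiny dictionaries are hypothetical and lossy on purpose, and none of this is Translation Party's actual code.

```python
# Hypothetical, deliberately lossy word tables standing in for real translators.
EN_TO_X = {"huge": "big_x", "big": "big_x", "computer": "machine_x", "machine": "machine_x"}
X_TO_EN = {"big_x": "big", "machine_x": "machine"}

def to_foreign(text):
    """Stand-in for English -> foreign; unknown words pass through unchanged."""
    return " ".join(EN_TO_X.get(w, w) for w in text.split())

def to_english(text):
    """Stand-in for foreign -> English; unknown words pass through unchanged."""
    return " ".join(X_TO_EN.get(w, w) for w in text.split())

def round_trip_until_stable(text, forward, backward, max_rounds=20):
    """Round-trip the text until the English stops changing or max_rounds is hit."""
    history = [text]
    for _ in range(max_rounds):
        result = backward(forward(history[-1]))
        if result == history[-1]:
            return result, history  # equilibrium: input and output English match
        history.append(result)
    return history[-1], history  # gave up without reaching equilibrium

final, history = round_trip_until_stable("huge computer", to_foreign, to_english)
print(history)  # ['huge computer', 'big machine']
```

The `max_rounds` cap is there because nothing guarantees that a real pair of translators ever settles on a stable sentence.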

--
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
Re:Converting that article from English to Chinese by timeOday · 2010-03-09 12:27 · Score: 4, Interesting

OK, here is something better than a round-trip translation test.
Der Spiegel offers English versions of some of its stories. They aren't direct translations, but quite similar.
Here's part of a story published in English:

Those wanting to own a McDonald's or Subway franchise in Germany must be prepared to offer up intimate personal details, including health information. One German official says the questionnaires violate the law. ...
According to information obtained by SPIEGEL, those wanting to partner with the fast-food chain Subway must agree to a background check "in accordance with anti-terror legislation" such as the US Patriot Act.
The report must also include information about the applicant's character, lifestyle and relationships. Future franchise owners are also asked whether they have ever been part of a terrorist organization.

And the same story, published in German, translated to English by Google:

McDonald's and Subway asking intimate data from franchisees
From its franchisees in Germany require the American fast food McDonald's and Subway deep insights into the intimate and the political convictions. Who wants to be partner of Subway, for example, must create an audit report in accordance with the anti-terror laws "such as the USA Patriot Act to agree." This report will contain information about "character", "lifestyle" and "relationships". The applicant shall provide information, even if she "ever directly or indirectly involved in terrorist activities were"

And the Babelfish translation of the same story:

McDonald' s and Subway demand most intimate data of franchise takers
Of their Franchise takers in Germany the American high-speed restaurant chains McDonald' require; s and Subway deep views of the privacy and the political convicition. Who for example partner of Subway would like to become, must the production of a test report " in agreement with the anti- terror Gesetzen" as " The USA patriot Act" agree. This report is information over " Charakter" , " Lebensweise" and " Beziehungen" contained. The applicants have to give even information whether them " ever at activities of terror beteiligt" directly or indirectly; were.

I do think the Google version is significantly better.
Re:Their search parsing tech probably helps too by MichaelSmith · 2010-03-09 12:34 · Score: 4, Funny

But it also concluded that a hot dog was the same as a boiling puppy.
There is nothing wrong with that. My son forms connections like that all the time, and he is only slightly younger than Google.

--
http://michaelsmith.id.au
Re:Converting that article from English to Chinese by spazdor · 2010-03-09 12:47 · Score: 3, Insightful

This doesn't actually mean the translation is any better: all it means is that the Chinese generated by Babelfish is more easily translated back to English, perhaps because it makes even less sense in Chinese. A translation function could be conceived which is a strict, reversible bijection, so that playing this translation game would give you your original English back, word-for-word. Doesn't guarantee that the intermediate Chinese step is in any way comprehensible.
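
To make the bijection argument concrete, here is a minimal sketch that uses ROT13 as a stand-in "translator". The choice of ROT13 and the names `TO_FAKE` and `FROM_FAKE` are assumptions for illustration only. The round trip is perfect, yet the intermediate text is not any real language.

```python
import string

_LOWER = string.ascii_lowercase
_UPPER = string.ascii_uppercase

# ROT13 letter substitution: a strict, reversible bijection on the alphabet.
TO_FAKE = str.maketrans(
    _LOWER + _UPPER,
    _LOWER[13:] + _LOWER[:13] + _UPPER[13:] + _UPPER[:13],
)
# Invert the mapping explicitly so the reverse direction is visibly the inverse.
FROM_FAKE = {dst: src for src, dst in TO_FAKE.items()}

original = "is right up Google's alley"
intermediate = original.translate(TO_FAKE)     # unreadable "translation"
restored = intermediate.translate(FROM_FAKE)   # word-for-word original

print(intermediate)
print(restored == original)  # True
```

A round-trip score would rate this "translator" as flawless, which is why round-tripping alone says little about the quality of the intermediate step.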

--
DRM: Terminator crops for your mind!
Asian languages and vastly different grammar by penguinchris · 2010-03-09 12:53 · Score: 5, Interesting

Several others have noted this as well - for Asian languages, Google has a lot of work to do. The Chinese translation near the top is impressive, but while Chinese and Japanese translations are probably pretty good on Google, other Asian languages suffer greatly.
I've been translating a lot of Thai lately, and initially I thought Google was great - the interface is really slick, and it seemed to give a decent result. Passing the translation back through often gave me really weird stuff, but I was expecting that. So it was great, until I tried using it to communicate with someone in Thai - even for really, really basic stuff, often they had absolutely no idea. It was just way off.
While you can feed western languages through it and get great, usable results, for Asian languages besides Chinese and Japanese it's next to useless. I'm guessing there isn't much of an incentive for Google to focus on other Asian languages - for example, in Android 2.1 on the Nexus One there is no way to even install fonts for less-popular Asian scripts like Thai, much less inputting text in those scripts - despite this capability being available on certain other Android phones (you can install it on the Nexus One if you root it, of course).
Based on what their technique for learning translation is, though, hopefully this will improve over time. It's an impressive system as it is, but very much limited to "popular" languages and those very similar to English.
Re:Converting that article from English to Chinese by RavenousBlack · 2010-03-09 12:53 · Score: 4, Insightful

Not to disagree with the results of your test, but I think a better test would be actual translations from authentic Chinese text to English. Going from English to Chinese to English is like taking an English interpretation of what the Chinese are trying to interpret from what someone was saying authentically in English instead of just interpreting into English what someone was authentically saying in Chinese.
Re:Converting that article from English to Chinese by Jurily · 2010-03-09 14:45 · Score: 3, Insightful

A translation function could be conceived which is a strict, reversible bijection, so that playing this translation game would give you your original English back, word-for-word.
That's the main problem with translations: they're not strict, and sometimes not even reversible. In every language there are common phrases which make perfect sense to someone thinking in the language, but are untranslatable to the point where you as a translator just rephrase the whole sentence (example: "is right up Google's alley"). Then, if you get another translator to translate it back to the original language, you sure as hell won't get the original phrase back (assuming both translations are perfect in terms of understandability and conveying the message).
Then you have words that don't exist in the target language, like "brute-force" or "computing horsepower", or even concepts that don't exist.
I think the fact that we can understand machine translations is more a tribute to the error correction mechanisms in our brain than anything else.
Translation is hard for people. by Estanislao Martínez · 2010-03-09 18:58 · Score: 3, Informative
Why can't software translate as easily as a human? Is it really that difficult to come up with a set of rules so things are worded correctly?
But translation isn't easy for humans, so there's no reason to expect it should be easy for computers.
Translating from one language to another, for a human translator, basically comes down to this:
1. Reading the source text and understanding it as deeply as possible.
2. Writing an "equivalent" text in the target language.
But the problem is that there is never a unique "equivalent" text in the target language, but rather, a lot of alternatives that make different tradeoffs. This is because a foreign language is part of a foreign culture that has many concepts that are foreign to the source language, and likewise, the source language is part of a source culture that is foreign to the target language. So translators repeatedly find themselves in situations where either they must leave out something that the source text says or implies, or else say something unnatural in the target language to convey that information.
Comparing the grammar of dramatically different languages makes this really clear. For example, many languages have grammatical evidentiality, where statements are subject to grammatical rules that depend on the source of the speaker's information for the statement. Imagine a language where the equivalent of the sentence "Joe kicked Tom" requires the verb to be conjugated differently depending on whether the speaker saw Joe kick Tom or only heard about it. If you had to translate an English text to a language like that, you'd have to decide, for each clause in the English text, who is the speaker of the sentence, and whether they know the event first-hand or second-hand, and either of those may often be unclear from the English text.
In the converse case, imagine if we're translating from a language like that into English. Then every sentence in the source language encodes some claim about how the speaker knows the information conveyed in that sentence. A completely literal translation, in which every English sentence had that information, would be extremely unnatural English writing. Leaving it out of every single sentence, on the other hand, might leave out something important to understand the text in some cases. So the translator has to decide in which cases the evidential conjugations of the source language must be translated into a longer English sentence than otherwise necessary.
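The asymmetry in both directions can be sketched as a small data model. Everything here is hypothetical: the `Clause` type, the invented suffixes, and the English prefixes are assumptions meant only to show where a translator is forced to make a decision.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Evidence(Enum):
    WITNESSED = "witnessed"
    HEARSAY = "hearsay"

@dataclass
class Clause:
    text: str
    evidence: Optional[Evidence] = None  # plain English usually leaves this unstated

# Invented markers for an imaginary evidential language.
SUFFIX = {Evidence.WITNESSED: "-wit", Evidence.HEARSAY: "-hrd"}
# English circumlocutions a translator might use when the evidence matters.
PREFIX = {Evidence.WITNESSED: "I saw that ", Evidence.HEARSAY: "I heard that "}

def to_evidential(clause):
    """English -> evidential language: the marker is mandatory, so it must be known."""
    if clause.evidence is None:
        raise ValueError("source text does not say how the speaker knows this")
    return clause.text + SUFFIX[clause.evidence]

def to_english(clause, keep_evidence):
    """Evidential language -> English: the translator chooses whether to spell it out."""
    if keep_evidence and clause.evidence is not None:
        return PREFIX[clause.evidence] + clause.text
    return clause.text

heard = Clause("Joe kicked Tom", Evidence.HEARSAY)
print(to_english(heard, keep_evidence=False))  # Joe kicked Tom
print(to_english(heard, keep_evidence=True))   # I heard that Joe kicked Tom
```

The `ValueError` branch and the `keep_evidence` flag mark exactly the two judgment calls described above; no rule in the code can make them for the translator.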
This is one extreme example, but this sort of problem occurs at every level in translation. Translators often find themselves adding in information that the source text doesn't say, having to use circumlocutions in the target language to express really simple things from the source language, leaving out information that the source text has because it would be too cumbersome to phrase it in the target language, adopting strange conventions in the target language, or having to write supplementary materials to help the readers understand the translation (footnotes, introductions).
Or in a few cases, the translators write for people who don't know the source language but are familiar with some of the customs and concepts, or willing to learn them to understand the translation, and then they just leave untranslated words in. (Examples: lots of philosophy translations from German or French; anime fansubs that leave Japanese honorifics like -san or -sempai in, because the people who use them are anime fans, are at least a bit familiar with them, and actually understand more nuances that way.)
So, translation is not a mechanical task, and thus, there can't be a simple set of rules to do it. It's, as I said at the top, understanding a text in the source language, and writing another in the target language, tailored toward a different audience. And it requires understanding the audiences of the original text and the translation, and making many informal decisions based on that.
--
Are you adequate?