Slashdot Mirror


Google's Computing Power Refines Translation

gollum123 sends an excerpt from the NY Times on how Google has taken a lead in language translation, in one of the company's few unqualified successes as it attempts to broaden its offerings beyond search. "...Google's quick rise to the top echelons of the translation business is a reminder of what can happen when Google unleashes its brute-force computing power on complex problems. The network of data centers that it built for Web searches may now be, when lashed together, the world's largest computer. Google is using that machine to push the limits on translation technology. Last month, for example, it said it was working to combine its translation tool with image analysis, allowing a person to, say, take a cellphone photo of a menu in German and get an instant English translation. ...in the mid-1990s, researchers began favoring a so-called statistical approach. They found that if they fed the computer thousands or millions of passages and their human-generated translations, it could learn to make accurate guesses about how to translate new texts. It turns out that this technique, which requires huge amounts of data and lots of computing horsepower, is right up Google's alley. ...Google's service is good enough to convey the essence of a news article, and it has become a quick source for translations for millions of people."

142 comments

  1. Converting that article from English to Chinese to by Rei · · Score: 5, Interesting

    English, with Google Translate:

    ---
    Google's rapid rise to the translation of business executives is a result of what Google released a complex problem, and its powerful computing power for reminding me. The data center, and its Web search, it may be now, when attacked with the network, is the world's largest computer. Google's machine translation technology is being used to push forward the limit. Last month, for example, it indicated that it was a combination of image analysis of the translation tools to enable a person, says that while walking in the German mobile phone menu, photos and immediately the English translation. ... In the mid-90s, researchers began to favor a so-called statistical methods. They found that if they ate the computer or hundreds of thousands of millions of paragraphs and the translation of humans, it can learn how to make an accurate translation of the new text of speculation. Facts have proved that this technology requires large amounts of data and a lot of computing power, is the right of Google's alley. ... Google's service is sufficient to convey the essence of news articles, it has become a quick translation of millions of people everywhere.
    ---

    Okay, perhaps not spectacular... but compared to Babelfish:

    ---
    ...Is anything the prompt possible to occur to the translation business's crown trapezoid's Google quick rise, when Google unties it when the complex question violence computing power. Perhaps the data central network it for the net search establishment now is, when attacks together, world large-scale computer. Google uses that machine to push in the translation technology limit. The previous month, for example, it said that it operates and the image analysis unifies its translation tool, allows the human to adopt a menu the handset picture and obtains one with German immediately English translation. ... in the mid-1990s, researcher started to favor the so-called statistical method. They have discovered that if they have fed the translation which the computer thousands or the tens of thousands of paragraphs and their person cause, its possibly academic society does about what kind of guesses translator accurately the new text. _ it this technology, requests the huge large amount data finally and completely the calculated horsepower, is correct Google the alley. ... The Google service is enough good expresses the news article the essence, and it has become translation quick origin tens of thousands of people
    ---

    --
    Stale pastry is hollow succor to one who is bereft of ostrich.
  2. Not from NY Times by Anonymous Coward · · Score: 3, Informative

    Last week's The Economist addressed this issue (http://www.economist.com/specialreports/displaystory.cfm?story_id=15557431). NY Times recycled it.

  3. Re:Converting that article from English to Chinese by Daniel+Dvorkin · · Score: 5, Insightful

    Yeah, that's actually a pretty good test. Google's version is odd but comprehensible, while Babelfish's is a bunch of ... well ... babble.

    --
    The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
  4. Try using google voice transcription by colin_faber · · Score: 1

    Voice to text attempt 1: "What is. Thank you. Hey Faber what I AM slot. People just want to let you know like Hello Colin, this is already the decision. I think it's going to ask." Voice to text attempt 2: " Hi. This is the level Dell Computers, I'm doing a follow, or on the error basement far start up top. If that happens. I still have the problem in 16 Keith dispatch number and I gave you so that into at that back and call us back and we could double shifts order with the problem. Thank you." The first one was silence that got recorded by accident, the second was from our favorite Indian's over at Dell computer, calling to pester me about how my repairs are going. =)

    1. Re:Try using google voice transcription by amRadioHed · · Score: 2, Insightful

      Yeah, google voice is fun, it's what you get when you combine voicemail and mad-libs.

      --
      We hope your rules and wisdom choke you / Now we are one in everlasting peace
  5. Similar languages by Jurily · · Score: 2, Informative

    Sure, you might get something decent if you try to translate from English to German, but what about languages with entirely different thought models behind them, like Chinese or Hungarian? Last time I tried using it, it confused "has been" with "Latvian".

    1. Re:Similar languages by hardburn · · Score: 5, Funny

      I've worked on payment processing for web sites in Korea before. The translations of error messages we get from the system, then passed through Google translate, are exactly as good as the translations we get back from a human translator. That is, not useful at all.

      --
      Not a typewriter
    2. Re:Similar languages by Jurily · · Score: 1

      And what about countries where people actually speak English?

    3. Re:Similar languages by MBCook · · Score: 4, Interesting

      This seems like the ideal opportunity to mention Translation Party. You give it English, and it translates it to and back from Japanese until the input and output English are the same.

      It can be a ton of fun.

      --
      Comment forecast: Bits of genius surrounded by a sea of mediocrity.
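The back-and-forth loop described above can be sketched as a fixed-point search. This is only a rough illustration: `round_trip_until_stable` is a made-up name, and the two toy dictionaries are hypothetical stand-ins for a real translation service, not how Translation Party is actually implemented.

```python
def round_trip_until_stable(text, to_foreign, to_english, max_rounds=20):
    """Translate back and forth until the English stops changing.

    Returns (history, status); status is "equilibrium", "cycle" or "gave up".
    """
    history = [text]
    for _ in range(max_rounds):
        next_text = to_english(to_foreign(history[-1]))
        if next_text == history[-1]:
            return history, "equilibrium"   # input and output English match
        if next_text in history:
            history.append(next_text)
            return history, "cycle"         # flipping between earlier results
        history.append(next_text)
    return history, "gave up"


# Hypothetical lossy word-for-word "translators" (illustration only).
TO_FOREIGN = {"big": "ookii", "large": "ookii", "dog": "inu", "hound": "inu"}
TO_ENGLISH = {"ookii": "big", "inu": "dog"}


def to_foreign(sentence):
    return " ".join(TO_FOREIGN.get(word, word) for word in sentence.split())


def to_english(sentence):
    return " ".join(TO_ENGLISH.get(word, word) for word in sentence.split())


history, status = round_trip_until_stable("large hound", to_foreign, to_english)
print(status, history)
```

The "cycle" branch covers the case where a phrase just flips between two translations instead of settling, so the loop can stop instead of running forever.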
    4. Re:Similar languages by amRadioHed · · Score: 1

      Are you sure the human translators aren't just using google translate?

      --
      We hope your rules and wisdom choke you / Now we are one in everlasting peace
    5. Re:Similar languages by zunger · · Score: 2, Interesting

      Are you sure that the error messages are even meaningful in Korean?

    6. Re:Similar languages by hardburn · · Score: 1

      No, I'm quite sure they aren't.

      --
      Not a typewriter
    7. Re:Similar languages by hardburn · · Score: 2, Interesting

      That is fun. Your sig breaks it.

      --
      Not a typewriter
    8. Re:Similar languages by Anonymous Coward · · Score: 0

      woosh

    9. Re:Similar languages by aixylinux · · Score: 1

      Pfft. That site doesn't work with Firefox 3.6, only IE.

    10. Re:Similar languages by istartedi · · Score: 1

      It doesn't work with my setup of IE or Chrome, the only two browsers I'm fiddling with these days. It must want you to totally drop your pants on security settings or something.

      --
      For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
    11. Re:Similar languages by rockNme2349 · · Score: 2, Funny

      It's strange, when translated from Korean, all they say is Whoosh!

      --
      Sewage Treatment Facilities - "Our duty is clear."
    12. Re:Similar languages by chadenright · · Score: 1

      It worked on firefox for me, try turning off flash block.
      However, it thinks "Who were you before you were who you are" has the same meaning as "Many people before the eyes of many people?"
      I think I broke it.
      http://translationparty.com/#6824917

    13. Re:Similar languages by hardburn · · Score: 1

      Not quite what I meant by "broke it". The phrase just changed back and forth between two different translations before the script gave up.

      --
      Not a typewriter
    14. Re:Similar languages by chadenright · · Score: 1

      Well, there's broken and then there's broken. Check out what it does to the phrase "Be evil".

    15. Re:Similar languages by Mr.Radar · · Score: 1

      What it does to "Be evil!" is also pretty good.

      --
      What if this signature were clever?
    16. Re:Similar languages by bunkymag · · Score: 0

      Jeez.. that was fun, but its Japanese translation ability makes even Babelfish look like an absolute translation genius. Very, very, very, very basic.

    17. Re:Similar languages by Anonymous Coward · · Score: 0

      Try this one:

      http://translationparty.com/#6834010

      Awesome hilarity ensue.

    18. Re:Similar languages by Anonymous Coward · · Score: 0

      Works fine in Safari.

    19. Re:Similar languages by CopaceticOpus · · Score: 1

      Your sig also breaks it, unless one adds a period.

      If I enter your comment, it ends up commenting on your sig.

      Finally, the requisite phrase leads to messages of peace.

  6. Re:Converting that article from English to Chinese by Cassius+Corodes · · Score: 2, Interesting

    What's interesting is that there are a couple of sentences where babelfish is actually better than google and the rest is way off.

    --
    Control is an illusion, order our comforting lie. From chaos, through chaos, into chaos we fly
  7. Altavista's Babel Fish by Pojut · · Score: 1

    I remember using Altavista's offering back in the day...the results were shoddy at best. It could make anything sound like engrish :p

  8. Re:Converting that article from English to Chinese by jandoedel · · Score: 0, Redundant

    Well d'uh... that's why it is called babble-fish!

  9. Re:Converting that article from English to Chinese by timeOday · · Score: 2, Interesting

    I would call it a very rigorous test, since you can get by in a foreign country with far, far less expressiveness than it takes to read a news article. ("Where's the toilet?" "How much for this?" Or for DoD applications, "Stop or we'll shoot!")

    Plus, round-trip translation at least doubles the error compared to an actual application which would involve one-way translation (and probably more, since the "return-trip" translation is starting with a poor quality input). A much more fair test would be comparing a one-way translation, man vs. machine.
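To put rough numbers on the round-trip point, here is a toy model. It assumes each direction garbles a sentence with some fixed probability, independently of the other direction, which is a simplification and not a measured property of any real translator. The function name and the error rates are made up for illustration.

```python
def round_trip_error(outbound_error, return_error=None):
    """Chance a sentence is garbled after going there and back.

    Assumes the two legs fail independently (a simplifying assumption).
    If return_error is omitted, the return leg is as bad as the outbound leg.
    """
    if return_error is None:
        return_error = outbound_error
    return 1 - (1 - outbound_error) * (1 - return_error)


for e in (0.05, 0.10, 0.20):
    print(f"one-way {e:.2f} -> round trip {round_trip_error(e):.4f}")

# If the return leg does worse because its input is already mangled,
# pass a higher (hypothetical) error rate for it:
print(round_trip_error(0.10, return_error=0.30))
```

Under strict independence the round-trip figure is 2e - e^2, a bit under double. It only goes past double once the return leg is allowed to be worse than the outbound leg, which is the "poor quality input" effect mentioned above.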

  10. Pffft... by plasticsquirrel · · Score: 2, Insightful

    For Chinese, just using a character dictionary is better because the translations in Google are so bad. Unfortunately, I must do this on a daily basis. Google is good at search, but cataloging the entire Web is a much easier job than learning Chinese.

    --
    Systemd: the PulseAudio of init systems
    1. Re:Pffft... by Anonymous Coward · · Score: 0

      oh hell no... As someone who has studied Chinese for 5 years and lived in China for 4 years this is a very very bad idea. Chinese English dictionaries are terrible...

  11. Re:Converting that article from English to Chinese by Anonymous Coward · · Score: 0

    Perhaps the data central network it for the net search establishment now is, when attacks together, world large-scale computer.

    Is that thing writing a science fiction novel in its spare time or something?

    I like how it's rooted out Google as the "net search establishment".

  12. somebody has to say this by zlel · · Score: 1, Insightful

    Granted that Art is not a field foreign to computing, translation is an art that is difficult to satisfactorily automate. It's not about getting the semantics right, or the meaning right, but to translate a piece of work into another cultural context for another person, is a bit like trying to read somebody's mind. The Turing test for translation would probably be something like automatically translating a new contemporary musical into another language? IMHO that's more difficult than getting a computer to write its own musical. I believe there is a niche for automated translation, but even for the niche it's trying to fill, it's not good enough. Especially not in my part of the world, where there is not only a diversity of languages, but also a great diversity in the language families from which these languages take their characteristics.

    1. Re:somebody has to say this by Vintermann · · Score: 1

      Remember that the niche Google Translate is currently trying to fill is not doing your translation jobs for you, but letting you know the rough content of a text in a language you don't speak. For many languages and contexts (example: French newspaper articles) it is very, very good at this. For others (Ukrainian IRC logs) it is only slightly better than useless.

      --
      xkcd is not in the sudoers file. This incident will be reported.
  13. For western languages... by IANAAC · · Score: 2, Interesting

    For western languages, I have no doubt that this will eventually be a decent option for general text.

    Just not now. It still needs a lot of work.

    I'm in the translation business, and the general trend in internet communications such as websites, etc. at least, is to simplify the language being used.

    For specialized text, we're a long way off yet.

    1. Re:For western languages... by Anonymous Coward · · Score: 0

      If you get worse results with specialized text, either the text is crap or your corpus is crap. Specialized text is orders of magnitude easier to translate as contextual gaps are way smaller than in normal prose. Of course, if you or your clients are using Newspeak in order to avoid translator fees, you probably don't care enough about quality to build a decent corpus. Move to Google translate and forget about it.

    2. Re:For western languages... by greenguy · · Score: 1

      I'm also in the business, and frankly, I'm not impressed. Google Translator is a stopgap at best. A lot of posts here have said it's good enough for basic phrases, and that may be true, but how far is that going to get you? Great, you can read short phrases... assuming they're not too obscure, and that they're written correctly and legibly in the source language, and that there's not some double entendre going on, and that Google understands both the dialect being translated and your dialect, and so on...

      Basically, good translation requires a vast amount of context -- both within a given document and in the broader culture. Google can accrue billions of documents that are reasonably good translations, but it can't accrue their context. The very fact that it's lumped them all together strips out their context. And what's appropriate in one context is quite inappropriate in another. [Insert well-worn anecdote about the Spanish verb "coger" here.] This simply can't be automated, because the same translator works in a variety of contexts, and will make different decisions at different times.

      An obvious example: most languages have two or more levels of address, depending on social distance. English does not. English doesn't distinguish between the subject and object form, or even the singular and plural, in the word "you," and nearly every other language does. That means there are three independent decisions to make when translating that word, with two or three or four reasonable choices for each. And that doesn't count word order, or colloquial usage that wouldn't translate directly at all. That's all dependent on context.

      In short, professionals won't be in danger any time soon.

      --
      What if I do the same thing, and I do get different results?
    3. Re:For western languages... by Vintermann · · Score: 1

      Google can accrue billions of documents that are reasonably good translations, but it can't accrue their context.

      What makes you say that? I'm pretty sure Google Translate "remembers" whether its training data came from a newspaper article, a UN document or a Gutenberg book. Otherwise it would hardly be able to make as good translations as it in fact does. One problem they have is that some forms of texts (like UN documents) are heavily overrepresented in their corpus, while others (like informal dialogue) are almost non-existent. People are not likely to paste in a legal document, people are more likely to paste in what they are talking about at their Korean friend's Facebook wall. Yet it works OK.

      Google Translate does not aim to replace professionals, and if you're a professional it's pretty sad you aren't aware of that. Google translate is to help you understand a text in a language you don't know. If you need to do a translation for someone else, then Google's offering is not the translator, but the translator toolkit.

      --
      xkcd is not in the sudoers file. This incident will be reported.
    4. Re:For western languages... by hughperkins · · Score: 1

      > In short, professionals won't be in danger any time soon.

      Sounds like you're worrying about that though ;-)

      Technology changes pretty quickly. Give it twenty years or so.

      I remember when I tried drawing a 3d graph of z = cos ( x * x + y * y ) on my Sinclair ZX Spectrum, in 1982 or so. Each single pixel took roughly a whole second to plot!

      Now, you can draw such a graph in realtime, 50 frames a second, whilst rotating the whole thing with the mouse. 3D graphics in Doom and then Quake, and now Counter-Strike are increasingly realistic, and run at fluid frame-rates, and simply because the underlying engine - the CPU and GPU - got very very fast.

  14. Re:Converting that article from English to Chinese by timeOday · · Score: 4, Interesting

    OK, here is something better than a round-trip translation test.

    Der Spiegel offers versions of some of its stories in English. They aren't direct translations, but quite similar.

    Here's part of a story published in English:

    Those wanting to own a McDonald's or Subway franchise in Germany must be prepared to offer up intimate personal details, including health information. One German official says the questionnaires violate the law. ...

    According to information obtained by SPIEGEL, those wanting to partner with the fast-food chain Subway must agree to a background check "in accordance with anti-terror legislation" such as the US Patriot Act.

    The report must also include information about the applicant's character, lifestyle and relationships. Future franchise owners are also asked whether they have ever been part of a terrorist organization.

    And the same story, published in German, translated to English by google:

    McDonald's and Subway asking intimate data from franchisees

    From its franchisees in Germany require the American fast food McDonald's and Subway deep insights into the intimate and the political convictions. Who wants to be partner of Subway, for example, must create an audit report in accordance with the anti-terror laws "such as the USA Patriot Act to agree." This report will contain information about "character", "lifestyle" and "relationships". The applicant shall provide information, even if she "ever directly or indirectly involved in terrorist activities were"

    And babelfish translation of the same story:

    McDonald' s and Subway demand most intimate data of franchise takers

    Of their Franchise takers in Germany the American high-speed restaurant chains McDonald' require; s and Subway deep views of the privacy and the political convicition. Who for example partner of Subway would like to become, must the production of a test report " in agreement with the anti- terror Gesetzen" as " The USA patriot Act" agree. This report is information over " Charakter" , " Lebensweise" and " Beziehungen" contained. The applicants have to give even information whether them " ever at activities of terror beteiligt" directly or indirectly; were.

    I do think the google version is significantly better.

  15. I noticed that they were using my web site by Anonymous Coward · · Score: 2, Interesting

    I have a web site where every page is available in English and German. When I tested Google's translation with it, I noticed that Google reliably translated one sentence in the opposite direction, i.e. from English to German when I had asked for a German to English translation: On every page in German, there is one sentence in English which leads to the corresponding page in English. Google's translator appeared to pick the translation right from that page, which of course has that sentence in German (leading to the German version of that page). Google doesn't do this anymore, but when I saw it, I realized that Google's translator did not at all "understand" what it was translating.

    1. Re:I noticed that they were using my web site by DollyTheSheep · · Score: 1

      Sorry, but that means exactly the opposite to me: Google Translation very well understood, that one sentence on every page is in a different language and "reversed" that sentence as well. Wouldn't have been possible, if Google Translate would "understand" exactly nothing about language.

    2. Re:I noticed that they were using my web site by Anonymous Coward · · Score: 0

      If Google recognized that that sentence is already in the target language, why would it translate it into the source language, which I probably don't understand when I'm using the translator for real? This question and the fact that Google no longer does the reverse translation lead me to the conclusion that Google's translator recognized that they were corresponding sentences, but didn't understand much else about them.

    3. Re:I noticed that they were using my web site by Anonymous Coward · · Score: 0

      You could translate all your pages using babblefish, and feed that to the google bot....

  16. Their search parsing tech probably helps too by Phat_Tony · · Score: 2, Interesting

    Wired recently had this article on Google's search algorithm, which mentioned how far ahead it was in parsing language for things like bi-grams to figure out what the meaning of the search was by "figuring out" the relationships between related words in a very human-like way. They have also built an impressive synonym system. These technologies, developed for search, strike me as really critical for good translation.

    An excerpt from the article:

    "People change words in their queries. So someone would say, 'pictures of dogs,' and then they'd say, 'pictures of puppies.' So that told us that maybe 'dogs' and 'puppies' were interchangeable. We also learned that when you boil water, it's hot water. We were relearning semantics from humans, and that was a great advance." But there were obstacles. Google's synonym system understood that a dog was similar to a puppy and that boiling water was hot. But it also concluded that a hot dog was the same as a boiling puppy.

    --
    Can anyone tell me how to set my sig on Slashdot?
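The "boiling puppy" failure in the excerpt is easy to reproduce in miniature. The sketch below is a guess at the general idea (learn word swaps from how people rephrase queries, then apply them word by word). It is not Google's actual system, and the function names and sample queries are invented for illustration.

```python
from collections import defaultdict
from itertools import product


def learn_substitutions(sessions):
    """sessions: (query, rephrased query) pairs from hypothetical search logs.

    If two consecutive queries differ in exactly one word, treat the two
    words as interchangeable.
    """
    subs = defaultdict(set)
    for before, after in sessions:
        a, b = before.split(), after.split()
        if len(a) != len(b):
            continue
        diffs = [(x, y) for x, y in zip(a, b) if x != y]
        if len(diffs) == 1:
            x, y = diffs[0]
            subs[x].add(y)
            subs[y].add(x)
    return subs


def naive_rewrites(query, subs):
    """Swap each word independently, ignoring what the phrase means."""
    options = [sorted({w} | subs.get(w, set())) for w in query.split()]
    return {" ".join(choice) for choice in product(*options)} - {query}


sessions = [
    ("dog pictures", "puppy pictures"),
    ("boiling water", "hot water"),
]
subs = learn_substitutions(sessions)
print(naive_rewrites("hot dog", subs))
```

Each learned swap is reasonable on its own. The trouble only shows up when they are applied without checking that "hot dog" is a unit, which is presumably where the bi-gram handling mentioned above comes in.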
    1. Re:Their search parsing tech probably helps too by MichaelSmith · · Score: 4, Funny

      But it also concluded that a hot dog was the same as a boiling puppy.

      There is nothing wrong with that. My son forms connections like that all the time, and he is only slightly younger than google.

    2. Re:Their search parsing tech probably helps too by Vintermann · · Score: 1

      Google Translate has become impressively baby-like lately. If you enter "I'm watching a movie", you get out "Jeg ser en film" in Norwegian, nice and correct. But if you enter the incorrect phrase "I'm watching a movi", you get out the creative response "Jeg ser på en filmdel" - I'm looking at part of a movie!

      Another: Ice cream in spanish is "Helado", and is translated correctly. But what do you get if you forget the H, and enter "elado"?

      "ais krihm"!! See for yourself :-)

      --
      xkcd is not in the sudoers file. This incident will be reported.
    3. Re:Their search parsing tech probably helps too by MichaelSmith · · Score: 1

      Google has read those jokes somewhere and is repeating them to you. I sense emergence.

      Of course google doesn't understand what it repeats to us, but I question the idea that we understand things any more than google does. There may be many non-human intelligences in the world, but google is the first really smart system designed to (process|comprehend) our languages.

      I wonder what happens if I dial MYCROFTXXX in google voice? Will google checkout issue a payment for an unlikely amount of money?

    4. Re:Their search parsing tech probably helps too by DollyTheSheep · · Score: 1

      Wired recently had this article on Google's search algorithm, which mentioned how far ahead it was in parsing language for things like bi-grams to figure out what the meaning of the search was by "figuring out" the relationships between related words in a very human-like way. They have also built an impressive synonym system. These technologies, developed for search, strike me as really critical for good translation.

      OK, so they introduced contextual knowledge (or "world knowledge" or "semantics" if you will) when they saw that page rank and keyword-based search didn't cut it for many search queries? Shouldn't that have come not as an afterthought but long before? I mean, how can anyone expect that search would never involve some contextual knowledge to be successful?

      My guess is that Google of course knows this. What they do is build up contextual knowledge through their own search engine, from how people relate words to each other, and not by imposing a predefined rule set or ontology beforehand like Cyc.

    5. Re:Their search parsing tech probably helps too by Vintermann · · Score: 1

      You don't get it. It's not jokes, it's Google Translate's attempts at figuring out the meaning of "partial" words. Google has seen "movie", "movies", "moving", and concludes that the word "movi" must have some sort of meaning in the same cluster.

      It's similar to how my sister thought convenience stores were called "rønst", because our local store was called "Rønstad" after the man who owned it. The d is silent, so she heard "rønsta". "-a" is the definitive article ending for feminine nouns in Norwegian. So if the particular is "rønsta", the general must be "rønst", right? Not quite, but it was a reasonable inference. Thinking "movi" is an English word meaning "part of movie" is less grammatical, but nonetheless a plausible inference - it shows how the translator "thinks".

      But "ais krihm" is even more shocking, in my opinion. It actually translates a (probably accidental) misspelling of ice cream into a deliberate misspelling of ice cream.

      --
      xkcd is not in the sudoers file. This incident will be reported.
    6. Re:Their search parsing tech probably helps too by Anonymous Coward · · Score: 0

      pretty standard stuff within the Natural Language Processing community, except they do it large-scale.

      plus, these things are *not* used in translation systems.

  17. As a foreigner in Japan by Anonymous Coward · · Score: 0

    I use Google translate frequently, and the translations are not very good, but when I pair them with some basic knowledge of the idiosyncrasies of the Japanese language, I am at least able to get a basic understanding of the text. But in some cases the results are barely any better than the Babble-fish example above.

    Having some basic understanding of the language, I can often divide the text into smaller pieces, which seems to improve quality.

    1. Re:As a foreigner in Japan by Archon-X · · Score: 1

      I guess it depends on the language a lot.
      I've found that Japanese translations are often awkward, and you have to 'force' a correct translation by changing context, structure, etc.

      Alternatively, the French translation is very, very good, picking up subtleties of formal / informal speech, slang and abbreviations.

  18. Good thing /. didn't use the original NYT headline by Anonymous Coward · · Score: 0

    It was the lovely "Google's Computer Might Betters Translation Tool" (since changed in the HTML title to "Using Computing Might, Google Improves Translation Tool" and "Google’s Computing Power Refines Translation Tool" in the online heading):
    http://languagelog.ldc.upenn.edu/nll/?p=2169

    There's also some commentary about the article from Ben Zimmer at Language Log...
    http://languagelog.ldc.upenn.edu/nll/?p=2170

  19. Re:Converting that article from English to Chinese by spazdor · · Score: 3, Insightful

    This doesn't actually mean the translation is any better: all it means is that the Chinese generated by Babelfish is more easily translated back to english, perhaps because it makes even less sense in Chinese. A translation function could be conceived which is a strict, reversible bijection, so that playing this translation game would give you your original English back, word-for-word. Doesn't guarantee that the intermediate Chinese step is in any way comprehensible.

    --
    DRM: Terminator crops for your mind!
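A concrete, if silly, version of the reversible-bijection argument: ROT13 stands in for the "translator" here. The function names are made up, and the output is obviously not Chinese. The point is only that a perfect round trip says nothing about whether the middle step is readable.

```python
import codecs


def to_fake_chinese(english):
    """Hypothetical 'translator': a reversible letter substitution (ROT13)."""
    return codecs.encode(english, "rot_13")


def back_to_english(fake_chinese):
    """Inverse of the mapping above, so the round trip is lossless."""
    return codecs.decode(fake_chinese, "rot_13")


original = "Google is using that machine to push the limits on translation."
intermediate = to_fake_chinese(original)
restored = back_to_english(intermediate)

print(intermediate)           # gibberish to any reader
print(restored == original)   # yet the round trip is word-for-word perfect
```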
  20. Asian languages and vastly different grammar by penguinchris · · Score: 5, Interesting

    Several others have noted this as well - for Asian languages, Google has a lot of work to do. The Chinese translation near the top is impressive, but while Chinese and Japanese translations are probably pretty good on Google, other Asian languages suffer greatly.

    I've been translating a lot of Thai lately, and initially I thought Google was great - the interface is really slick, and it seemed to give a decent result. Passing the translation back through often gave me really weird stuff, but I was expecting that. So it was great, until I tried using it to communicate with someone in Thai - even for really, really basic stuff, often they had absolutely no idea. It was just way off.

    While you can feed western languages through it and get great, usable results, for Asian languages besides Chinese and Japanese it's next to useless. I'm guessing there isn't much of an incentive for Google to focus on other Asian languages - for example, in Android 2.1 on the Nexus One there is no way to even install fonts for less-popular Asian scripts like Thai, much less inputting text in those scripts - despite this capability being available on certain other Android phones (you can install it on the Nexus One if you root it, of course).

    Based on what their technique for learning translation is, though, hopefully this will improve over time. It's an impressive system as it is, but very much limited to "popular" languages and those very similar to English.

    1. Re:Asian languages and vastly different grammar by Anonymous Coward · · Score: 0

      It works by using existing translations to gather examples. More examples better translations. Popular means more examples, hence
      better translations...
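For a feel of how "learning from existing translations" can work, here is a minimal sketch of IBM Model 1, a textbook word-alignment model trained with EM on sentence pairs. It is swapped in purely as an illustration. Nothing here is claimed to be what Google runs, and the three-sentence corpus is a toy.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=20):
    """Estimate t[(f, e)], roughly P(foreign word f | English word e), by EM."""
    foreign_vocab = {f for _, f_sent in pairs for f in f_sent}
    uniform = 1.0 / len(foreign_vocab)
    t = defaultdict(lambda: uniform)          # start with no preference
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: split each foreign word's "credit" among the English
        # words in the same sentence pair, in proportion to current t.
        for e_sent, f_sent in pairs:
            for f in f_sent:
                norm = sum(t[(f, e)] for e in e_sent)
                for e in e_sent:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share
        # M-step: re-estimate probabilities from the expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


def best_translation(word, t):
    """Most probable foreign word for an English word under the model."""
    candidates = {f: p for (f, e), p in t.items() if e == word}
    return max(candidates, key=candidates.get)


corpus = [
    ("the house".split(), "das haus".split()),
    ("the book".split(), "das buch".split()),
    ("a book".split(), "ein buch".split()),
]
t = train_ibm_model1(corpus)
for word in ["the", "house", "book", "a"]:
    print(word, "->", best_translation(word, t))
```

No dictionary or grammar goes in, only paired sentences, so the model can only be as good as the amount of paired text available for a language.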

    2. Re:Asian languages and vastly different grammar by Cyberax · · Score: 2, Informative

      Russian, Polish and Ukrainian translations are laughable as well.

      Even Ukrainian-Russian translation is mediocre, even though it's pretty trivial (other translators have almost 100% perfect translations).

      So, good job but still lots to do.

    3. Re:Asian languages and vastly different grammar by Anonymous Coward · · Score: 0

      Ah, translation to Japanese from either English or Chinese sucks. I've just confirmed it. For years Chinese to English has been known to work well, as Europe and China are on the same continent and share a basic language structure. Japanese is slightly different stuff and we've got a bit more to do......

      giving http://chinese.engadget.com/2010/03/09/samsung-prices-tl500-tl350-aq100-and-sl605-shooters/ will return unreadable Japanese, if I'd translate it to English that'd be something like:

      timesamsung TL500TL350AQ100 and SL605, yet to talked on price for telling everything, announced previous. Especially, also TL500, when walks like Ricoh Loewe System, Pana LX3, small size will, specification, like hot-shoe line of strong strobe. Someone, itch inside my mind, at very last what level of price curious about, saw this digital camera? current sentence will be given. 14300 of NT is about 449 dollars, asks price. TL350 349 dollars or so. Not only TL500 and TL350, double shake-reduction has RAW format.

    4. Re:Asian languages and vastly different grammar by amaupin · · Score: 1

      Several others have noted this as well - for Asian languages, Google has a lot of work to do. The Chinese translation near the top is impressive, but while Chinese and Japanese translations are probably pretty good on Google, other Asian languages suffer greatly.

      I have all but given up on Google's Japanese translation. AltaVista's (now Yahoo's) Babel Fish is much more reliable when it comes to Japanese. Sometimes the Google translation is so wrong that I can't even understand how it came up with the response returned. At least with Babel Fish I can usually figure out where it missed an idiom or failed to choose the correct meaning of a certain kanji character.

    5. Re:Asian languages and vastly different grammar by Mcgreag · · Score: 1

      This is because Google first translates everything to English and then to the target language. I first noticed this when I was trying to translate into Swedish a Chinese site that was selling Go game boards. The Swedish translation used the word "styrelse", which means "board of directors", instead of "bräde", which means "gaming board". These words are homonyms in neither Swedish nor Chinese, but they are in English. It did make for a funny read at least :)
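      A minimal sketch of how routing through English could produce that mistake. The three tiny dictionaries below are hypothetical illustrations, not Google's data, and the assumption that the English-to-Swedish step always picks one fixed sense of "board" is mine:

```python
# Toy lexicons (hypothetical, for illustration only).
ZH_TO_EN = {
    "棋盘": "board",    # game board
    "董事会": "board",  # board of directors
}
# Assumption: the second step keeps only one Swedish word per English word.
EN_TO_SV = {"board": "styrelse"}
# What a direct Chinese-to-Swedish lexicon could keep apart.
ZH_TO_SV = {"棋盘": "bräde", "董事会": "styrelse"}


def pivot_translate(word: str) -> str:
    """Translate Chinese -> English -> Swedish; the sense is lost at 'board'."""
    return EN_TO_SV[ZH_TO_EN[word]]


def direct_translate(word: str) -> str:
    """Translate Chinese -> Swedish without the English bottleneck."""
    return ZH_TO_SV[word]


if __name__ == "__main__":
    for zh in ZH_TO_EN:
        print(zh, "pivot:", pivot_translate(zh), "direct:", direct_translate(zh))
```

      Both Chinese words collapse into the same English word, so whatever the second step chooses, it has to be wrong for one of them.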

    6. Re:Asian languages and vastly different grammar by egghat · · Score: 1

      Google Translate is 100% based on statistics, so there are no special algorithms for translating from one language to another. The translation gets better when Google has *a lot* (a googol) of sentences in a pair of languages and knows that they have the same meaning. If the language pair is Russian - Ukrainian or German - Swahili, it's almost guaranteed to fail.

      Artificial intelligence god Peter Norvig (guess where he works) always says: We don't have better algorithms, we just have more data. And if they do not have enough data, well, then Google Translate fails.
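      To make the "just more data" idea concrete, here is a rough sketch of the simplest statistical approach: count how often each source phrase was paired with each target phrase in human translations, and pick the most frequent. The function names and the tiny corpus are made up for illustration; real systems are far more elaborate than a lookup table.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Count how often each source phrase is paired with each target phrase."""
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        counts[src][tgt] += 1
    return counts


def probability(counts, src, tgt):
    """Relative-frequency estimate of P(tgt | src); 0.0 if src was never seen."""
    seen = counts.get(src, Counter())
    total = sum(seen.values())
    return seen[tgt] / total if total else 0.0


def translate(counts, src):
    """Return the most frequent translation, or None when there is no data."""
    if src not in counts:
        return None
    return counts[src].most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical aligned phrase pairs (French -> English).
    corpus = [("pain", "bread")] * 3 + [("pain", "loaf")]
    model = train(corpus)
    print(translate(model, "pain"), probability(model, "pain", "bread"))
    print(translate(model, "usurier"))  # no examples, so no translation
```

      With no examples for a phrase the sketch has nothing to say at all, which is the failure mode described above for language pairs with little parallel text.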

      There's a very interesting video of a presentation given by Peter Norvig on YouTube. Highly recommended.
      Norvig - TODAY: Innovation in Search and Artificial Intelligence

      --
      -- "As a human being I claim the right to be widely inconsistent", John Peel
  21. Re:Converting that article from English to Chinese by RavenousBlack · · Score: 4, Insightful

    Not to disagree with the results of your test, but I think a better test would be actual translations from authentic Chinese text to English. Going from English to Chinese to English is like taking an English interpretation of what the Chinese are trying to interpret from what someone was saying authentically in English instead of just interpreting into English what someone was authentically saying in Chinese.

  22. Why is machine translation so difficult? by ArchieBunker · · Score: 1

    That's what I've never understood. Why can't software translate as easily as a human? Is it really that difficult to come up with a set of rules so things are worded correctly?

    --
    Only the State obtains its revenue by coercion. - Murray Rothbard
    1. Re:Why is machine translation so difficult? by slimjim8094 · · Score: 2, Informative

      Is it really that difficult to come up with a set of rules so things are worded correctly?

      Yes.

      Longer answer - computers are very bad at context and meaning. Take French to English - it would be one thing if words had the same exact connotations and grammar, and you could just do a find-replace. But, unfortunately, that's not the case. There are many words in French that - depending on context - have many different meanings. In mathematical terms, the mapping of French words to English words is not bijective, nor vice-versa. Take the French word bête - it most literally means "beast", but is often used to mean "stupid". How is a computer supposed to figure out which one to use?
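      As a small illustration of the "not bijective" point, the sketch below checks a toy French-to-English sense table. The table is hypothetical and far from a real dictionary; it only shows that one word can map to several words in each direction, so plain find-replace cannot work.

```python
# Toy sense table (hypothetical): each French word maps to a set of English senses.
FR_TO_EN = {
    "bête": {"beast", "stupid"},
    "animal": {"animal", "beast"},
    "stupide": {"stupid"},
}


def invert(mapping):
    """Build the reverse table: English sense -> set of French words."""
    reverse = {}
    for source, senses in mapping.items():
        for sense in senses:
            reverse.setdefault(sense, set()).add(source)
    return reverse


def ambiguous_sources(mapping):
    """Words with more than one possible translation, in sorted order."""
    return sorted(word for word, senses in mapping.items() if len(senses) > 1)


def is_bijective(mapping):
    """True only if every word has exactly one translation in both directions."""
    one_way = all(len(senses) == 1 for senses in mapping.values())
    other_way = all(len(words) == 1 for words in invert(mapping).values())
    return one_way and other_way


if __name__ == "__main__":
    print(is_bijective(FR_TO_EN))
    print(ambiguous_sources(FR_TO_EN))
    print(invert(FR_TO_EN)["beast"])
```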

      I just checked and Google Translate actually gets the connotation right, but it's a relatively simple example. Consider the French word "baise" - either kiss or fuck - and a more complicated example. Now... Google gets this right too (creepy!)

      In any case, the only way to get perfect translation is to make the computer understand the relevant meanings and connotations of words and stylistic choices... How would you convey a Cockney accent, or Cockney phrasing, in Chinese? In short, you'd need an AI.

      --
      I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
    2. Re:Why is machine translation so difficult? by Jer · · Score: 1

      Just FYI - the "creepy" results you're getting off of Google probably indicate that there are a lot of French-language pages out there for Google to gather data on.

      My guess is that it's even better than that - there are a lot of French pages out there that have corresponding English pages that Google can mine for information. Anytime you can find parallel corpora the algorithms are going to do better. I'd guess that the French/English pairing is probably a fruitful one for Google's algorithms because of Quebec - forcing information to be available by law is going to generate a lot of data online for Google to learn translation from.

      (There are other reasons that French/English, German/English, and generally "Insert European Language Here"/English are going to be easier for machine translation algorithms to learn than other pairings. But more data is almost ALWAYS helpful for these kinds of algorithms. Parallel corpora are expensive to generate and it's always nice when you can find them for free on the web.)

    3. Re:Why is machine translation so difficult? by h4rr4r · · Score: 1

      Your sig is wrong,

      MS is doing that too. Nice OS you got there, you might infringe on some of our patents, how bout you pay us so we don't sue you.

    4. Re:Why is machine translation so difficult? by BZ · · Score: 1

      Yes, it really is that difficult. Consider this classic example in English:

          Time flies like an arrow.
          Fruit flies like a banana.

      There happen to be two ways to read the latter sentence. One is in a way analogous to the former one: the subject is "Fruit", the intransitive verb is "flies", and "like a banana" is an adverb phrase. The other way to read it is that the subject is the noun phrase "Fruit flies", the transitive verb is "like" and the direct object is "a banana". Heck, this case is difficult for _humans_ to get "right" at times.
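      The two readings can be counted mechanically. Below is a rough sketch of a CYK-style chart parser that counts parse trees under a tiny hand-written grammar; the grammar and lexicon are my own toy assumptions, just big enough for these two sentences.

```python
from collections import defaultdict

# Toy grammar: (parent, left child, right child). Hypothetical and minimal.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "N", "N"),    # compound noun, e.g. "fruit flies"
    ("NP", "Det", "N"),
    ("VP", "V", "PP"),   # "flies [like a banana]"
    ("VP", "V", "NP"),   # "like [a banana]"
    ("PP", "P", "NP"),
]

# Toy lexicon: each word lists every category it may take.
LEXICON = {
    "fruit": ["N", "NP"],
    "time": ["N", "NP"],
    "flies": ["N", "V"],
    "like": ["V", "P"],
    "a": ["Det"],
    "an": ["Det"],
    "banana": ["N"],
    "arrow": ["N"],
}


def count_parses(sentence, goal="S"):
    """Count distinct parse trees for the sentence under the toy grammar."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    chart = defaultdict(int)  # (start, end, symbol) -> number of trees
    for i, word in enumerate(words):
        for symbol in LEXICON[word]:
            chart[(i, i + 1, symbol)] += 1
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, mid, left)] * chart[(mid, end, right)]
                    )
    return chart[(0, n, goal)]


if __name__ == "__main__":
    print(count_parses("Fruit flies like a banana."))
    print(count_parses("Time flies like an arrow."))
```

      Under this grammar both sentences get two parses; nothing in the syntax alone says which reading is the intended one.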

      There are various situations like that in which the meaning is ambiguous, but even worse are situations in which the concepts used are just nonexistent in one of the languages/cultures. For example, English names are made up of three parts, typically: first, middle, last. Russian names are also made up of three parts: given, patronymic, family. Russian family name is a good match for English last name. Russian given name is a good match for English first name. Russian patronymic is ... not really a match for English middle name (in that for example it's not up to the parents to choose it), but occupies a similar position in names, obviously.

      So when translating a phrase containing the Russian word for patronymic into English, what word you use to translate it really needs to depend on what's being said. If it's a technical discussion about the concept, translate as "patronymic". If it's a casual discussion about names, "middle name" may be appropriate. If the Russian text says that X addressed Y by name and patronymic then that _is_ what they did, but that happens to be the standard polite form of address. The equivalent English form is a Mr/Mrs/whatever followed by the last name, and the translation should reflect that.

      There are also often situations in which two phrases are technically the same in terms of denotation but have different connotations (or heck, just different emphasis on which words are important; compare "I saw him" to "I saw HIM" and note that such emphasis differences can be expressed with word order in many languages). Getting that right can be very difficult unless you really understand what's being said. Pattern matching might work if you've seen that exact pattern before (which is Google's approach), but even small differences in the surrounding sentence structure can totally change the meaning of the part you're trying to translate.

      I leave you with this short story (or rather stories):

          Jack was walking across the meadow when he saw a spring. The spring glinted in the
          sunlight, and he thought that he'd never seen something quite so beautiful. He bent
          down and...

          1) ... put it in his pocket.
          2) ... had a drink.

      How much lookahead in the translation is needed to translate the first sentence correctly? If the rest of the story is option 2 above, how do you know that he didn't just take a swig from his flask before picking up the bit of metal?

    5. Re:Why is machine translation so difficult? by XanC · · Score: 1

      But they can only do that because of state enforcement.

    6. Re:Why is machine translation so difficult? by Anonymous Coward · · Score: 0

      undoing accidental mod. damn the new interface.

    7. Re:Why is machine translation so difficult? by DavidShor · · Score: 1

      This is true. Also, a scary number of English idioms exist verbatim in French too ("The grass is always greener on the other side" --> "l'herbe est toujours plus verte ailleurs"). This is because a) our language is descended from theirs, and b) we tirelessly work to steal phrases from each other.

  23. Pretty good and impressive as it translated by al0ha · · Score: 1

    Eier von Satan correctly - except for Augenballgroße which is essentially Eye-ball-large.

    --
    Did you ever wake up in the morning, with a Zombie Woof behind your eyes? -- FZ
    1. Re:Pretty good and impressive as it translated by RavenousBlack · · Score: 1

      except for Augenballgroße which is essentially Eye-ball-large.

      Actually, gross in this usage is referring to relative size in a way, so it would mean eye-ball sized.

    2. Re:Pretty good and impressive as it translated by al0ha · · Score: 1

      Right you are, thanks for noticing.

      --
      Did you ever wake up in the morning, with a Zombie Woof behind your eyes? -- FZ
  24. Re:Converting that article from English to Chinese by uglyMood · · Score: 2, Interesting

    In Philip K. Dick's obscure 1969 novel Galactic Pot-Healer, the characters play a game based on this very idea. They take common sayings and figures of speech, and feed them through several language-translation computers. The results are then sent to a friend, who attempts to figure out what the original phrase was.

    Sometimes when you're reading PKD you get the uncomfortable feeling he really could see into the future.

    --
    "No matter where you go, there you probably are." -- Buckaroo Heisenberg
  25. Re:Converting that article from English to Chinese by Anonymous Coward · · Score: 0

    And that's why they called it Babelfish.

  26. Re:Converting that article from English to Chinese by glwtta · · Score: 1

    "I had a small house of brokerage on Wall Street... many days no business come to my hut, but Jimmy has fear? A thousand times no. I never doubted myself for a minute for I knew that my monkey strong bowels were girded with strength like the loins of a dragon ribboned with fat and the opulence of buffalo dung."

    --
    sic transit gloria mundi
  27. Chess translations by JordanH · · Score: 1

    If you are into chess, Google Translate opens up a whole world of chess blogs to you. I haven't used it extensively, but I was quite impressed with this translation.

    To the chess players out there, note how it picks up notation interspersed with the text. It's not perfect and seems to fall back into Spanish algebraic in odd places, but I think they are the only translation tool that even tries to do chess notation.

    I wonder if there are other "special purpose" translations that Google Translate attempts. It's pretty impressive to me that they even bother with the small chess blog reading public.

    Oh, Google Translate does a lot better job on the non-chess parts of blogs, too.

    1. Re:Chess translations by Vintermann · · Score: 1

      > I think they are the only translation tool that even tries to do chess notation.

      It's all the more impressive considering it must be wholly automatic - no way they are adding exceptions for chess notation.
      Similar impressive results: the name "John" is not translated, but in the context "The gospel according to John" it usually is. This is correct, biblical names are traditionally translated.
      If you say "I'll travel by airplane", airplane stays airplane. But if you say "I'll watch 'Airplane' (awful 80's comedy)", you will get the proper, non-literal translation of that title, at least into Norwegian.

      --
      xkcd is not in the sudoers file. This incident will be reported.
    2. Re:Chess translations by Aighearach · · Score: 1

      That's really sweet, I'm a chess player and I really appreciate you pointing out this resource.

      I don't think it needs a special purpose capability because it uses the web data in a more generic way and has lots of chess sites as data already.

  28. Re:Converting that article from English to Chinese by wtbname · · Score: 1

    They just need to do what video card manufacturers do to thwart your little test Mr. Man. Cheat in the translation code to recognize your test, and just regurgitate your original text.

    Then how would you choose the best translation software to buy???? Oh... it's free?

  29. How different is this from AI research? by Anonymous Coward · · Score: 1, Interesting

    Obligatory Chinese Room mention.

    If a translation engine grows strong enough to adequately translate the phrases "give us our daily bread," "sharks are predatory carnivores," and "the loan shark wants his bread," that implies a significant ability to contextually infer meaning. Could someone opine on (or point to a work exploring) how similar the task of building an accurate translator is to the task of building a competent, world-aware (if perhaps not absolutely Turing-quality) AI?

    1. Re:How different is this from AI research? by Archon-X · · Score: 1

      Google's French translations are very strong:

      - Give us our daily bread (unsure what the catch is w/ this phrase, but)
      - Donnez-nous notre pain quotidien

      - Sharks are predatory carnivores
      - Les requins sont des carnivores prédateurs

      - The loanshark wants his bread
      - L'usurier veut son pain

      All translated with the correct context

    2. Re:How different is this from AI research? by biryokumaru · · Score: 1

      The Chinese Room is stupid, because if I had a mathematical model of the human brain, I could calculate these kinds of ridiculous ideas just as easily as the dude with the book calculates Chinese. The logical extension of the Chinese Room is that no one thinks, which is a pointless conclusion.

      --
      When you're afraid to download music illegally in your own home, then the terrorists have won!
    3. Re:How different is this from AI research? by Anonymous Coward · · Score: 0

      No, "bread" in terms of the loanshark means money. So "pain" is the incorrect translation.

    4. Re:How different is this from AI research? by Anonymous Coward · · Score: 0

      Incidentally, the third sentence was the only one with a catch; the others were just framing it as an example. Bread in the first sentence means "food made from dough;" shark in the second sentence means "carnivorous marine animal." No tricks at all there; the most common definitions for each word are the contextually correct ones -- food and fish. "Loan shark" (with a space in my stated example) from the third means predatory moneylender; "bread," when taken in consideration of who seems to be desiring it, means money. That's what I was talking about. If a translation engine COULD interpret the difference between yeasty bread and economic "bread," the ability of that engine to meaningfully connect a slang term for money with a vulgar phrase referring to a moneylender would imply a non-trivial functional understanding of what money is and what moneylenders do. Hence, the Chinese Room mention.

      Now, I'm not interested in the metaphysical aspect of Searle's thought experiment; I'm an atheist and a fairly strict physicalist, and I hold that I am a biological computer, one with some properties that are incompletely documented by current science. I'm more curious about its implications for definitions of sentience. Let me ask more clearly: if a device can interpret written language well enough to understand context and nuance for translation purposes, how far removed is it from a truly sentient entity capable of holding rational discourse?

    5. Re:How different is this from AI research? by Anonymous Coward · · Score: 0

      "veut ses sous" might be better

    6. Re:How different is this from AI research? by Anonymous Coward · · Score: 0

      if a device can interpret written language well enough to understand context and nuance for translation purposes, how far removed is it from a truly sentient entity capable of holding rational discourse?

      It isn't. Perfect natural language translation is an AI-complete problem for the exact reasons you state.

      Now, will we have "good enough" natural language translation before we have strong AI? Certainly. For some purposes we're already there.

    7. Re:How different is this from AI research? by LanMan04 · · Score: 1

      Unless "pain" is slang for money in French too. Anyone?

      --
      With the first link, the chain is forged.
    8. Re:How different is this from AI research? by biquet · · Score: 1

      Nope, "pain" is not used as slang for "money" in French. The use of "bread" to mean "money" in English comes from Cockney Rhyming Slang, where "bread and honey" == "money".

  30. From the Menu Example Given... by Anonymous Coward · · Score: 0

    Google can now track what I order for dinner. I feel so naked.

  31. what is the other side by Anonymous Coward · · Score: 0

    A huge surveillance infrastructure that can be used to monitor conversations in real time, as they can be converted to text and searched for inference.

  32. Re:Converting that article from English to Chinese by Ihmhi · · Score: 1

    That's a whole lot better than it was a few years ago.

    They still need to work on their Japanese a good bit, though. Translating my first sentence from English to Japanese to English spit out:

    This is the way it is much more than a few years ago the entire

    .

    I believe they are getting very strong on the vocabulary and context clues bit but having a difficult time translating between different Subject-Object-Verb formats.

  33. Re:Converting that article from English to Chinese by rolfwind · · Score: 1

    Yeah, that's actually a pretty good test.

    I don't think so. Things get lost in translation with humans already. There are phrases I simply can't translate in languages I'm fluent in, idioms and the like. And when humans pass along information, it also gets distorted, simplified, and the like - language is a vague, flexible thing. So we're trying to give the machine a test impossible to pass, a Turing test where most of us don't even have any real experience how well a human would do it as a frame of reference.

    It would be better just to translate many pieces one time, both ways, and have a fluent bilingual judge the quality. Although, I agree, checking the Chinese/Japanese to English capability is a good test.

    My personal test was to take reviews off of amazon.co.jp, translate them, and see how the translator fared. Babelfish is indeed a bunch of babble; Google's translation is far from perfect (or even all that good), but it's obviously better.

  34. Re:Converting that article from English to Chinese by Jurily · · Score: 3, Insightful

    A translation function could be conceived which is a strict, reversible bijection, so that playing this translation game would give you your original English back, word-for-word.

    That's the main problem with translations: they're not strict, and sometimes not even reversible. In every language there are common phrases which make perfect sense to someone thinking in the language, but are untranslatable to the point where you as a translator just rephrase the whole sentence (example: "is right up Google's alley"). Then, if you get another translator to translate it back to the original language, you sure as hell won't get the original phrase back (assuming both translations are perfect in terms of understandability and conveying the message).

    Then you have words that don't exist in the target language, like "brute-force" or "computing horsepower", or even concepts that don't exist.

    I think the fact that we can understand machine translations is more a tribute to the error correction mechanisms in our brain than anything else.

  35. Re:Converting that article from English to Chinese by Anonymous Coward · · Score: 0

    I often find that English to/from Chinese translations are better than English to/from Japanese. Chinese characters have concise meanings, whereas part of Japanese's character set is based on sounds.

  36. Malay English dictionary by Anonymous Coward · · Score: 0

    I have found that for Malay-English translations, Google's translation was initially better than that of Dewan Bahasa dan Pustaka (DBP), i.e. the people in charge of developing the Malay language. Since then, however, I have found more errors creeping into their translations. I wonder if somebody is poisoning the well, because when I first used Google Translate 99% of the translations were accurate, and the other 1% were unknown words. In my last use of Google Translate, 50% of the words were wrong (as opposed to being unknown).

  37. Re:Converting that article from English to Chinese by GF678 · · Score: 2, Informative

    Going from English to Chinese to English is like taking an English interpretation of what the Chinese are trying to interpret from what someone was saying authentically in English instead of just interpreting into English what someone was authentically saying in Chinese.

    Exhibit A: http://winterson.com/2005/06/episode-iii-backstroke-of-west.html

  38. Re:Converting that article from English to Chinese by FatdogHaiku · · Score: 1

    ..In every language there are common phrases which make perfect sense to someone thinking in the language, but are untranslatable to the point where you as a translator just rephrase the whole sentence (example: "is right up Google's alley"). Then, if you get another translator to translate it back to the original language, you sure as hell won't get the original phrase back (assuming both translations are perfect in terms of understandability and conveying the message).

    Then you have words that don't exist in the target language, like "brute-force" or "computing horsepower", or even concepts that don't exist.

    I think the fact that we can understand machine translations is more a tribute to the error correction mechanisms in our brain than anything else.

    Awl hour translate spume waffle. Ewe no other gender knot exchangeable!

    --
    You have the right to remain sentient. If you give up the right to remain sentient, you will be elected to public office
  39. Re:Converting that article from English to Chinese by AniVisual · · Score: 1

    Japanese is unusual in that it leaves out the subject, and sometimes the object, of a clause when the meaning is understood from context. This is, however, very frustrating for machine translators. In addition, Japanese sentences have a topic, whose function is very ambiguous in a language like English.

  40. Re:Converting that article from English to Chinese by biryokumaru · · Score: 1

    Soon the super karate monkey death car would park in my space... but Jimmy has fancy plans, and pants to match!

    Feel my scales donkey donkey donkey donkey donkey.

    Also...

    I stole a car! I mean, a sycamore tree...

    --
    When you're afraid to download music illegally in your own home, then the terrorists have won!
  41. Re:Converting that article from English to Chinese by Anonymous Coward · · Score: 0

    I think this should become a new internet meme. Every time anyone says anything about a new technology, just post a response that says "that reminds me of what Philip K. Dick wrote about in his obscure short story / novel [madeUpName]"

  42. Re:Converting that article from English to Chinese by Anonymous Coward · · Score: 0

    this is what is dimensional analysis for physic:
    given the maximum password length, determine the computing power of the provider:
    example gmail.com, hotmail.com and others
    tips: hashtables

  43. What does that mean? by ScrewMaster · · Score: 1

    in one of the company's few unqualified successes

    What does that mean? Google has had more successes in the online world than most of its competitors.

    --
    The higher the technology, the sharper that two-edged sword.
  44. Low-volume languages? by vampire_baozi · · Score: 1

    While this works well for the more widely-spoken languages (Western/European languages, Chinese, Japanese), I suspect there is a massive drop-off for some of the less common languages, especially those spoken in countries less connected to the internet. The article mentioned they feed the algorithm human translations from the EU and UN proceedings; what about less-common Asian languages, the Indian subcontinent languages, central Asian languages? The volume simply doesn't exist.
    Where the volume does exist, what about Russian and Korean, which are dominated by Yandex and Naver? Might be interesting to run a comparison, but unfortunately all the languages I speak are covered fairly well by Google at this point :(

  45. Try iSnapit and translate by billwallace · · Score: 1

    Try iSnapit on your iPhone or Android, and you can translate everything you see with a single click - and much more.

  46. Re:Converting that article from English to Chinese by Hipcatjack · · Score: 1

    I am definitely down with that meme. Ironically, on a totally unrelated story, I just borrowed a book from my friend by P.K.D. talking about obscure authors suddenly finding themselves idolized superstars on par with the biggest Sports players or Rockstars, and their lives dealing with paparazzi - huh

  47. Re:Converting that article from English to Chinese by Hipcatjack · · Score: 1

    Google translate English>traditional Chinese>simplified Chinese> Italian >back to English from what i copied from my above post: I believe that with the Mei-Mei. Ironically, the story is completely independent, I just borrow a book from my friend from polycystic kidney disease, clear speech, suddenly found itself the greatest athletes in the worship of the stars or Rockstars. Their lives and to deal with mosquitoes - ah

  48. Forkbomb by Anonymous Coward · · Score: 0

    The translator can't seem to figure out exactly how many times the road has diverged...

    "two roads diverged in a yellow wood"
    http://translationparty.com/#6827987

    1. Re:Forkbomb by pszilard · · Score: 1

      The translator can't seem to figure out how many times the road has diverged...

      "two roads diverged in a yellow wood"

      Interestingly if you enter proper sentences, e.g. at least use punctuation marks, it doesn't go nuts anymore. Only one "." makes this much difference: http://translationparty.com/#6834172

  49. Youtube? by Anonymous Coward · · Score: 0

    Now that they turned on automatic sub-titles for many youtube videos, how long until these subtitles can be read in any language?

    And how long until they are synchronized by a synthetic voice in any language?

  50. Forkbomb by jlintern · · Score: 1

    The translator can't seem to figure out how many times the road has diverged...

    "two roads diverged in a yellow wood"

  51. Translation is hard for people. by Estanislao Martínez · · Score: 3, Informative

    Why can't software translate as easily as a human? Is it really that difficult to come up with a set of rules so things are worded correctly?

    But translation isn't easy for humans, so there's no reason to expect it should be easy for computers.

    Translating from one language to another, for a human translator, basically comes down to this:

    1. Reading the source text and understanding it as deeply as possible.
    2. Writing an "equivalent" text in the target language.

    But the problem is that there is never a unique "equivalent" text in the target language, but rather, a lot of alternatives that make different tradeoffs. This is because a foreign language is part of a foreign culture that has many concepts that are foreign to the source language, and likewise, the source language is part of a source culture that is foreign to the target language. So translators repeatedly find themselves in situations where either they must leave out something that the source text says or implies, or else say something unnatural in the target language to convey that information.

    Comparing the grammar of dramatically different languages makes this really clear. For example, many languages have grammatical evidentiality, where statements are subject to grammatical rules that depend on the source of the speaker's information for the statement. Consider, for example, a language where the equivalent of the sentence "Joe kicked Tom" requires the verb to be conjugated differently depending on whether the speaker saw Joe kick Tom or only heard about it. If you had to translate an English text to a language like that, you'd have to decide, for each clause in the English text, who is the speaker of the sentence, and whether they know the event first-hand or second-hand, and either of those may often be unclear from the English text.

    In the converse case, imagine if we're translating from a language like that into English. Then every sentence in the source language encodes some claim about how the speaker knows the information conveyed in that sentence. A completely literal translation, in which every English sentence had that information, would be extremely unnatural English writing. Leaving it out of every single sentence, on the other hand, might leave out something important to understand the text in some cases. So the translator has to decide in which cases the evidential conjugations of the source language must be translated into a longer English sentence than otherwise necessary.

    This is one extreme example, but this sort of problem occurs at every level in translation. Translators often find themselves adding in information that the source text doesn't say, having to use circumlocutions in the target language to express really simple things from the source language, leaving out information that the source text has because it would be too cumbersome to phrase it in the target language, adopting strange conventions in the target language, or having to write supplementary materials to help the readers understand the translation (footnotes, introductions).

    Or in a few cases, the translators write for people who don't know the source language but are familiar with some of the customs and concepts, or willing to learn them to understand the translation, and then they just leave untranslated words in. (Examples: lots of philosophy translations from German or French; anime fansubs that leave Japanese honorifics like -san or -sempai in, because the people who use them are anime fans, are at least a bit familiar with them, and actually understand more nuances that way.)

    So, translation is not a mechanical task, and thus, there can't be a simple set of rules to do it. It's, as I said at the top, understanding a text in the source language, and writing another in the target language, tailored toward a different audience. And it requires understanding the audiences of the original text and the translation, and making many informal decisions based on that.

  52. nokia had this years ago by dwater · · Score: 1

    I guess what Google is talking about must be something different because Nokia had s/w for the N95 that could take a picture of a Chinese menu and provide a translation in English.

    --
    Max.
  53. google skynet ? by Atreide · · Score: 2, Funny

    Wasn't Skynet used for translation
    before it decided for a better future for humanity ?

    --
    The world belongs to those who get up early. - I'm far from being the king of Earth then :-(
  54. Re:Converting that article from English to Chinese by wye43 · · Score: 1

    It's an easy, or perhaps entertaining, test, not a good test.

  55. Re:Converting that article from English to Chinese by Rocketship Underpant · · Score: 1

    I recently did an evaluation for a translation agency on the state of current machine translation services. Since I translate Japanese to English for a living, that was the pair I was testing.

    Long story short, of the five services I tried that do Japanese-English MT, Google came out the worst. Yes, the worst. Mind you, all of them were terrible. None could produce grammatical English sentences, and most couldn't even translate basic things like Japanese dates properly.

    --
    He who lights his taper at mine, receives light without darkening me.
  56. Potential As A Learning Tool? by RavenousBlack · · Score: 1

    Lately I've been trying out using Google Translate to improve my German. Whenever I write a sentence that I'm not too sure about, I take my English version of it and translate it into German in Google to see how it compares. So far it's been useful in better understanding preposition usage and sentence structure. That is, if it's reliable enough.

  57. Re:Converting that article from English to Chinese by Anonymous Coward · · Score: 0

    It's not called Babblefish without a reason.

  58. Re:Converting that article from English to Chinese by Anonymous Coward · · Score: 0

    To be fair, your first post was just barely in English.

  59. Re:Converting that article from English to Chinese by Phoghat · · Score: 1
    I wrote to a grade school sweetheart. I'm of Polish descent and so is she, so I decided to write the letter in Polish. I can speak Polish if the occasion demands but this occasion demanded more than I have knowledge of.

    She wrote me back and commended me on keeping up with the language. Since I used Google Translate, I guess they do a decent job of it.

    --
    Think of how stupid the average person is, and realize half of them are stupider than that.
  60. If you're learning a language by badzilla · · Score: 1

    then Google Translate (or for some things wordreference.com) are fantastic resources. I don't mind that large chunks of text get translated in a stilted way - if you just need to get the meaning of a short phrase then Google is so much faster and easier than a paper dictionary.

    --
    "Don't belong. Never join. Think for yourself. Peace." V.Stone, Microsoft Corporation
  61. it's better english, but a better translation ? by Anonymous Coward · · Score: 0

    of course statistical translations look better, because they optimise
    exactly that: frequent word sequences in the target language (and related to
    the input of course).
    the real question is how well do they match the input meaning.
    can you tell ?
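
    A minimal sketch of that point, assuming a toy add-one-smoothed bigram model (the corpus, the function names and the smoothing are illustrative assumptions, not Google's system). The target-language model only scores how familiar the word sequences are and never sees the source text, so a fluent sentence with the meaning reversed still beats a scrambled one.

```python
import math
from collections import Counter

# Tiny illustrative "target language" corpus (an assumption for this sketch).
CORPUS = ["the dog bit the man", "the man fed the dog", "the dog ran"]


def train_bigram(corpus):
    """Count context words and adjacent word pairs, with sentence markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent pairs
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one-smoothed bigram log-probability of a sentence."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    vocab = len(unigrams) + 1  # rough vocabulary size, +1 for unseen words
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab))
    return total


uni, bi = train_bigram(CORPUS)
fluent = log_prob("the dog bit the man", uni, bi)
reversed_meaning = log_prob("the man bit the dog", uni, bi)
scrambled = log_prob("man the bit dog the", uni, bi)
```

    Here both grammatical candidates outscore the scrambled one, even though one of them says the opposite of the other. Whether the output matches the input meaning has to be judged separately from how natural it reads.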

  62. Re:Converting that article from English to Chinese by pszilard · · Score: 1

    I think this benchmark puts the bar a bit too high. First of all, a translator is not designed to produce invertible translations. Moreover, as the goal is to produce an understandable translation of a human-written text, the artifacts introduced by the machine-created translation are most probably magnified quite a bit with the second round of translation. Still, it's interesting to see that the Google algorithms actually do an OK job even in such an artificial benchmark.

  63. Re:Converting that article from English to Chinese by Anonymous Coward · · Score: 0

    I'm doing a language study,
    And I have to say, Google translation is the worst of all the free web translation sites.
    Why are they so proud of such a poor translator ??
    Most people who speak more than one language find it to be funny how Google translates.

    I think for a start it's good, has many languages available, now next work on the translation part.
    Improvements required on Spanish, French, Dutch, German and Chinese.. and I guess some more, like Scandinavian etc.

    Babelfish has a different origin (started as free to everyone) but later became commercial (thanks to all who helped translating).
    I wouldn't even consider using Babelfish these days.. there are much better sites.

  64. Not that different from AI in some ways by daemonenwind · · Score: 1

    There are 2 core problems with translating:
    1. Language requires a cultural frame of reference.
    You can see this in understanding humor in different societies. For example Monty Python is a product of a British perspective. The English language, as spoken in England, only works when you understand the culture behind it. For example, "daily bread" only works in western languages because of the shared Christian influence. In Japanese, for example, "daily rice" might bring up a similar understanding that "daily bread" doesn't carry.

    2. Language is a moving target.
    References keep changing, and a computer (or even a foreign-based translator) has a hard time keeping up. Think about what Tea Party meant just 5 years ago, as opposed to today.

    All this means that you're going to get a really good computer translator about the time you get a really good computer painter. Even the best of the given translations in the responses to this story aren't anything someone would want to publish as an example of good English usage - the only benefit is a moderate ability to get the point of the subject.

    Or to drive the point home, try passing Goethe through the translator and see if the English is as good as the German.
    That's the true test of a translator - can it retain excellence, beyond base meaning?

    example:
    Wenn ein Edler gegen dich fehlt, so tu als hättest dus nicht gezählt! ..... Er wird es in sein Schuldbuch schreiben und dir nicht lange im Gebet bleiben.

    becomes
    When the gentleman wanting against you, then do as you would not have counted's! ..... He will write it in his book of guilt and you do not stay long in prayer.

    Sure you get the idea, but the artistry is pure fail.

  65. Re:Converting that article from English to Chinese by Tubal-Cain · · Score: 1

    ...knot...

    I don't know of any text-to-text translation programs that mix up homophones.

  66. No, no, NO. by Estanislao Martínez · · Score: 1

    Also, a scary number of English idioms exist verbatim in French too. [...] This is because a) Our language is descended from theirs, and b) We tirelessly work to steal phrases from each other.

    English is not "descended from French," period. It has a large proportion of vocabulary from French. To you that may sound like it makes it "descended from French," but as it turns out, that's simply irrelevant to language classification, and precisely for the reason you state as (b): languages borrow from others very easily.

    English has a relatively recent common ancestor with German, called Proto-Germanic, and a much more distant one with French, called Proto-Indo-European. To show the link to German, however, you must compare English and German words that come from Proto-Germanic. English words that come from French are noise in this comparison.

    In addition, the fact that idioms are shared between English and French isn't particularly because English speakers borrowed the idioms from French. It's more because English and French are both European languages whose speakers have a long history of cultural exchange, a history that actually goes further back than the languages themselves. There are a lot of idioms that are common to both languages because they both got them from a third source. The big common sources are the Greco-Roman classics, the Bible, and Christianity. Also, every major work of prose from most West European countries since the Renaissance has been translated to the languages of the others and read extensively, so that yes, "I think, therefore I am" is as much of a stock phrase in English as "être ou ne pas être, telle est la question" is in French.

    1. Re:No, no, NO. by DavidShor · · Score: 1
      If you want to be pedantic about it, you missed the part where French is, roughly speaking, the end result of a bunch of Germans trying to speak Latin. And how the Normans ruled England for a while. And how, for a huge period of time, the ruling classes all over Europe spoke Latin or French exclusively. It's just absurd to bring up Proto-Indo-European to make it sound like French is as close to English as Hindi.

      But, if you want to be technical about it, the correct thing to say is that English is most closely related, chronologically, to Frisian, which is kind of like Dutch/German.

      Just looking on a word basis, we have more words with Germanic roots than French/Latin ones. But grammatically, it's a bit of a wash. We lack German's complex case system, its weird definite article system (30 definite articles!), and the whole agglutinative word thing. With French, the only real difference with English is the masculine/feminine and vous/tu distinction, both of which also exist in German. But that's just a hunch, I don't know how I could quantify that sort of thing (I'd love to hear your input).

    2. Re:No, no, NO. by Estanislao Martínez · · Score: 1

      Just looking on a word basis, we have more words with Germanic roots than French/Latin ones.

      And as I said, this is irrelevant, one way or the other. The comparative method works by finding systematic sound correspondences between a core vocabulary of words that come from the proto-language. Loanwords later than the protolanguage must be discarded from the comparison sets, because you want to find systematic correspondences that cannot be explained in terms of chance or borrowing.

      But grammatically, it's a bit of a wash. We lack German's complex case system, its weird definite article system (30 definite articles!), and the whole agglutinative word thing.

      ...and again, yes, English and German are significantly different, but their present-day differences are just not relevant, because they are the result of changes that happened after the proto-language diverged. The goal of the comparison, again, is to find systematic correspondences that cannot be explained in terms of chance or borrowing. Even though the present-day languages have changed substantially, you're looking for traces of the fact that they have the same origin.

      Also, your "agglutinative word" claim is very much off-base, for three reasons:

      1. You're getting hung up on the fact that German orthography doesn't use spaces between the parts of the compounds, whereas English orthography normally does. But really, the grammar of an English expression like "systems integration engineering process" is the same whether you write it with spaces between the parts or not ("Systemsintegrationengineeringprocess").
      2. The grammar for noun-noun and adjective-noun compounds in English and German is actually very similar--and at any rate, more similar to each other than either is to Romance languages like French or Spanish, where noun-noun compounds are very rare (the normal way to say that in Spanish would be something like "proceso de ingeniería de integración de sistemas," with prepositional phrases).
      3. In any case, neither German nor English is actually showing agglutination. Agglutination is a term that's used to refer to certain forms of inflectional morphology, whereas compounding (along with derivation) is a type of what linguists call word-formation.

      With French, the only real difference with English is the masculine/feminine and vous/tu distinction, both of which also exist in German.

      Irrelevant, once more. English and German both descend from Proto-Germanic, a language that underwent the sound changes described by Grimm's law. French did not.

      It is also very important to note that English actually used to have the same sort of T/V distinction that French and German do, with thou vs. you. Again, the fact that English does not have that distinction today doesn't prove anything about its relationship to either French or German--and in this case because it's a fact that's true today but was not true even 500 years ago, much less 2000 years ago.

      Not to mention that there are countless other grammatical differences between French and English that you're discounting. I already gave one example--English has extensive noun-noun compounding, whereas French doesn't.

      But that's just a hunch, I don't know how I could quantify that sort of thing (I'd love to hear your input).

      Indo-European historical linguistics was mostly figured out by 1930. I don't really have anything to say that you can't find in the ample, 200-year-old literature about the historical linguistics of Indo-European languages.

      I'll say one more thing: linguistics is a very subtle topic, and it takes many years to really learn it. You

    3. Re:No, no, NO. by AthanasiusKircher · · Score: 1

      English is not "descended from French," period. It has a large proportion of vocabulary from French. To you that may sound like it makes it "descended from French," but as it turns out, that's simply irrelevant to language classification, and precisely for the reason you state as (b): languages borrow from others very easily.

      Um, yeah, but did you forget about the freakin' Norman Conquest? Being ruled by a bunch of French speakers for a couple of centuries completely changed the native English language. When all of your upper class speaks a foreign language, eventually it trickles down, not only in vocabulary, but in some places in grammar as well.

      Old English, which is the core of the English language, is not descended from French, but much of our Latinate vocabulary came through French.

      English has a relatively recent common ancestor with German, called Proto-Germanic, and a much more distant one with French, called Proto-Indo-European.

      To quote you, "No, no, NO." You're talking about language trees, and you're right about the core grammar and vocabulary of Old English as it got transferred to modern English. But you're missing a whole segment of the development of the language -- in fact, MUCH more recent than Proto-Germanic, since it dates from after 1066. That's the part of the language development that's relevant to many transfers of idioms, which is what the GP was talking about.

      Only between 20 and 30 percent of our modern English vocabulary comes from the Proto-Germanic branch. So, when you start talking about "descent," it really depends on what you mean. If you mean the source of the lexicon, a greater percentage comes from Latinate roots, and a lot of that comes through French.

      English words that come from French are noise in this comparison.

      Dude, do you even realize what you're saying? You're annoyed about someone using the word "descended." Fine. But the transfer of idioms like the one he was talking about is often a much more recent aspect of language than Proto-Germanic. In this case, the recent (as in the last millennium) appropriation of a bunch of vocabulary and idioms into the upper class of "English" society is a hell of a lot more relevant than Proto-Germanic.

      To say that the words from French are "noise" is like a person trying to compare Windows Vista and OS X, pointing out some features that may have been borrowed into Vista from OS X, and you come along and say, "That's noise -- what's really going on is that early versions of MS Windows (2.0 through 3.0) borrowed from early versions of Mac OS." Who the hell cares when you're talking about comparing Vista with OS X in their modern incarnations? The languages changed a lot in the interim.

      In addition, the fact that idioms are shared between English and French isn't particularly because English speakers borrowed the idioms from French.

      For many idioms, it is. And yes, lots of major works were translated in the past few centuries, but the GP wasn't talking about how Shakespeare and Alexander Pope are the source for a bunch of modern English, but rather that there was a significant influence of French vocabulary, idioms, and ideas that got superimposed on top of the Old English core lexicon and syntax.

      In sum, you're not wrong about what you say in terms of the ultimate source of the core of the languages. But I think you completely missed the point of the GP's post in your pedantry about "descent" according to historical linguistic trees. Those trees are only the starting point for languages, and the influence of French on the changes to English in the past millennium was a lot greater than Proto-Germanic or some generic idea of pan-European cultural exchange.

    4. Re:No, no, NO. by AthanasiusKircher · · Score: 1
      One more thing...

      Also, every major work of prose from most West European countries since the Renaissance has been translated to the languages of the others and read extensively, so that yes, "I think, therefore I am" is as much of a stock phrase in English as "être ou ne pas être, telle est la question" is in French.

      Your use of the phrase "has been translated to the languages of the others" in the context of this argument implies that you're giving an example of French being translated to English when you quote Descartes. Of course, Descartes actually wrote his Meditations in Latin, so I'm not sure what your example has to do with anything other than that the intellectual language of Europe in the 17th century was Latin, and important works were translated into the vernacular.

      While you clearly know more about linguistics than the GP, your pedantic analysis misses the whole point. You couldn't even be having an argument about "descent" in the English language that you insist is so Proto-Germanic, since the word "descent" itself is an example of an English word that actually descends from medieval French -- which you have to admit even in your pedantic worldview of historical linguistics -- as do many vocabulary words and idioms, as the GP was asserting.

  67. Re:Converting that article from English to Chinese by BikeHelmet · · Score: 1

    Keep in mind that the translation algo is most suited for regular grammar. Not the gobbledeegook it outputs. Grammar -> Chinese gobbledeegook -> English gobbledeegook is a pretty decent translation.

  68. I'm not that impressed by Anonymous Coward · · Score: 0

    I don't believe an entirely stochastic approach is the solution to the problem of correctly translating text between languages. Sure, Google does well at translating Spanish to English, but its translation from Russian to English is god-awful, and its translation of Kafka from German to English is lacking. The problem with the stochastic approach is that it often neglects to include grammatical or morphological rules. The Russian translation example really brings this to light, as it translates perfective words poorly: "pokoncheno" (has ended) as "to end", and "ostanovleno" as "stopped" (rather than "has stopped").

  69. What I mean is average? by mestar · · Score: 1

    This one is fun:

    What is the word with most meanings?

    What words have any meaning?
    What is the meaning of the word?
    What is the meaning of a word you do?
    Meaning of words is what to do or what?
    What is the meaning of words can do something?
    Mean I can make any kind of words?
    I can be what the average of the word?
    Average word what I can?
    Average words What can I do?
    Average words I can do?
    What words do I average?
    What I mean is average?
    What is the average that I mean?
    What is the average of that mean?
    What do you mean average?
    I mean what I mean?
    What do you mean what I mean?
    I mean what you mean why?
    What do you mean I mean why?
    What do you mean mean why?
    What you mean why?
    How do you mean why?
    I mean why is that?
    What I mean why?
    How do you mean why?

    It is doubtful that this phrase will ever reach equilibrium.
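
    A minimal sketch of the experiment above, assuming a deterministic round trip. The `round_trip` callable and the `TOY` table are hypothetical stand-ins for a real translation service, replaying only the last few phrases of the listing. Since "How do you mean why?" appears twice in the listing, a deterministic translator would be looping through the same three phrases rather than settling on a single one.

```python
def round_trip_until_repeat(phrase, round_trip, max_steps=100):
    """Apply round_trip repeatedly and stop as soon as a phrase repeats.

    Returns (history, cycle_start): history lists the distinct phrases in
    order, and cycle_start is the index in history of the phrase that came
    back, or None if no repeat was seen within max_steps.
    """
    seen = {phrase: 0}
    history = [phrase]
    for _ in range(max_steps):
        phrase = round_trip(phrase)
        if phrase in seen:
            return history, seen[phrase]
        seen[phrase] = len(history)
        history.append(phrase)
    return history, None


# Hypothetical lookup table standing in for English -> X -> English.
TOY = {
    "What you mean why?": "How do you mean why?",
    "How do you mean why?": "I mean why is that?",
    "I mean why is that?": "What I mean why?",
    "What I mean why?": "How do you mean why?",
}

history, start = round_trip_until_repeat("What you mean why?", TOY.__getitem__)
cycle_length = len(history) - start  # 1 would mean a true fixed point
```

    A cycle length of 1 is the "equilibrium" case; anything longer means the phrase keeps rotating through the same few variants.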

  70. Re:Converting that article from English to Chinese by kennycoder · · Score: 1

    Where do you think "All your base are belong to us" came from?

    --
    Fucking a fat girl is like riding a scooter... it's fun 'til someone sees you.
  71. Re:Converting that article from English to Chinese by Virtual_Raider · · Score: 1

    This doesn't actually mean the translation is any better: all it means is that the Chinese generated by Babelfish is more easily translated back to English, perhaps because it makes even less sense in Chinese. A translation function could be conceived which is a strict, reversible bijection, so that playing this translation game would give you your original English back, word-for-word. Doesn't guarantee that the intermediate Chinese step is in any way comprehensible.
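
    A minimal sketch of the bijection argument in the quote, using a deliberately silly, hypothetical "translator" that just reverses the letters of each word. It round-trips perfectly, yet the intermediate text is no language at all.

```python
def to_fake_target(text):
    """Hypothetical 'translation': reverse the letters of every word."""
    return " ".join(word[::-1] for word in text.split(" "))


def to_source(text):
    """Inverse of to_fake_target (reversing twice restores each word)."""
    return " ".join(word[::-1] for word in text.split(" "))


original = "Google is using that machine"
intermediate = to_fake_target(original)
restored = to_source(intermediate)
```

    The round trip returns the original word-for-word, which says nothing about whether the intermediate step would be understood by anyone.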

    I thought your post was really interesting so I tried it myself. The following is the Spanish translation, with the bits that are off or don't make sense in italics and the way I would translate it in bold. The bits in bold parenthesis are omissions from the original translation...

    "...(el) Rápido ascenso de Google para a los escalones superiores de la traducción es un recordatorio de lo que puede suceder cuando Google libera su potencia de cálculo bruta vigor a contra/sobre problemas complejos. La red de centros de datos que se construyó para búsquedas en la Web puede ser ahora, anclados al suelo juntos conjuntamente, (el) equipo más grande del mundo. Google está utilizando la máquina para empujar los límites de la tecnología de traducción.

    Feeding it back its own translation:

    "... Google's rapid rise to the upper echelons of the translation is a reminder of what can happen when Google releases its brute force computing power to complex problems.'s Network data center that was built to search the web may be now, when lashed together, the world's largest computer. Google is using the machine to push the limits of translation technology

    Feeding it mine (removing the italics text and adding the bold)

    "... Google's rapid rise to the upper echelons of the translation is a reminder of what can happen when Google released its raw computing power against complex problems. The network of data centers that was built to search the web can now, together, (be)the world's largest computer. Google is using the machine to push the limits of translation technology.

    Either is way better than what comes out of Babelfish by a mile.

    --
    +Raider of the lost BBS
  72. Re:Converting that article from English to Chinese by spazdor · · Score: 1

    What's especially good is when it has Japanese words for the translation from English, but it doesn't have an English translation for those same words, so you get a random chunk of Romaji in the middle of an otherwise normal gibberish Engrish sentence.

    --
    DRM: Terminator crops for your mind!
  73. Ob. Lebowski by MrEd · · Score: 1

    Good times.

    "But sometimes, there's a man – and I'm talking about the Dude here – sometimes, there's a man, well, he's the man for his time and place. He fits right in there. And that's the Dude. In Los Angeles."

    "But sometimes, man - you can go anywhere - even in some cases, men, men of his time and place. He is the right fit. Order. In Los Angeles."

    --

    Wah!