Slashdot Mirror


DARPA Starts Ultimate Language Translation Project

An anonymous reader writes "Defense Advanced Research Projects Agency (DARPA) has launched the ultimate speech translation engine project that would be capable of real-time interpretation of television and radio programs as well as printed or online textual information in order to be summarized, abstracted, and presented to human analysts emphasizing points of particular interest." If combined with the tower of babel project we discussed earlier, it could only lead to awesomeness.

123 comments

  1. Wow that would be handy by 2.7182 · · Score: 1

    to understand DARPAese. (Try reading some of their PPT slides.)

    1. Re:Wow that would be handy by Zarniwoop_Editor · · Score: 1

      It certainly would be handy. Since even skilled human translators can mess things up at times, I suspect that getting any kind of automated system to produce good results will be an accomplishment indeed!

      --
      - F1 NEWS
    2. Re:Wow that would be handy by 2.7182 · · Score: 4, Interesting

      Seriously though, I just don't believe it. I've worked on a number of DARPA robot projects, and have heard a lot of their babble. They claim to be funding all these fantastic ideas, but none of them ever work except in a limited capacity. The robot projects I worked on were very lame in that DARPA created these really specific environments for the robots that were light years away from what they were saying they were really going to do. All of the universities involved failed to accomplish even the simplest tasks. So my experience with them is that they talk a big talk, and no one ever goes back to check "hey did you really ever do that?" Now granted, some of their work is supposed to be high risk, but they never emphasize which projects are expected to have a high failure rate. Largely because they don't care. It's really all about funding your academic buddies or whoever is going to be able to scratch your back in some way. It is very much an old boys network, with an emphasis on PR and not much on real science. Much like the MIT media lab. (Just thought I'd get another jab in there....)

    3. Re:Wow that would be handy by jotok · · Score: 1

      They claim to be funding all these fantastic ideas, but none of them ever work except in a limited capacity.

      Like that internet boondoggle?

      I keed, I keed :)

    4. Re:Wow that would be handy by 2.7182 · · Score: 1

      That was ARPA, not DARPA. And I have little doubt that the culture of ARPA in the '60s is different from the culture of DARPA now.

    5. Re:Wow that would be handy by Anonymous Coward · · Score: 0
      It would also be handy to be able to interpret idiots who post half their thought in a box labeled "Subject" and the other half in "Comment".

      That is so rude, dude. Learn how to use a subject field. Learn how to speak in full sentences. geez. Slashdot is full of you idiots.

    6. Re:Wow that would be handy by hkgroove · · Score: 1

      Same group, slightly different name over the years. Just like Prince.

  2. Autobots, Transform! by eldavojohn · · Score: 1
    Defense Advanced Research Projects Agency - DARPA - is working on the ultimate speech translation engine that would be capable of real-time interpretation of television and radio programs as well as printed or online textual information in order to be summarized, abstracted, and presented to human analysts emphasizing points of particular interest.
    In unrelated news, a user named DARPABOT has made the Slashdot Hall of Fame under most active submitters with over 1000 submissions in just a few weeks' time, crushing prostoalex.

    If combined with the tower of babel project we discussed earlier, it could only lead to awesomeness.
    By 'lead to awesomeness' do you mean 'lead to you not having to attempt to edit summaries and fail at both grammar and spelling'?
    --
    My work here is dung.
  3. Take us to your leaders... by Noryungi · · Score: 1

    ... Now in 54 flavours! :-)

    --
    The right to offend is far more important than the right not to be offended. (Rowan Atkinson)
  4. Awesome? WTF?? by BadAnalogyGuy · · Score: 4, Insightful

    If you consider that now the government will be able to spy on you in your native language to be awesome, then I suppose giving the Feds this sort of technology can only lead to awesomeness.

    Surveillance of civilian populations under the guise of "monitoring terrorists" is not something that I'd consider awesome. Irksome, yes. But not awesome.

    1. Re:Awesome? WTF?? by CRCulver · · Score: 4, Interesting

      As if the government doesn't already have legions of translators at the ready. Military linguists are trained at Defense Language Institute at the Presidio of Monterey. I studied Chinese there while serving in the Navy, and while most of my fellow enlisted servicemen were likewise studying languages of some clear strategic value, there are also courses in various other languages for officer exchange programs, as well as the occasional course in something really exotic. Combined with the simple possibility of the government paying a native speaker to work for them, this means that the government already has the language skills it needs even without a whizbang translation machine.

    2. Re:Awesome? WTF?? by Anonymous Coward · · Score: 0

      If you consider that now the government will be able to spy on you in your native language to be awesome, then I suppose giving the Feds this sort of technology can only lead to awesomeness.
      Uh, about 99% of the people to whom this is "the" government speak English as their native language. So, uh, it's not really changing anything, even if it works, which it won't, because machine translation is about as real as Star Trek.

      Just sayin'.

    3. Re:Awesome? WTF?? by b0s0z0ku · · Score: 1
      As if the government doesn't already have legions of translators at the ready.

      Assuming that this system can recognize voice well, and then convert it into text in preparation for translation, this is already saying a lot. This means that phone conversations can in theory be automatically logged as text, which requires much less storage space than audio.

      -b.

    4. Re:Awesome? WTF?? by TechnoBunny · · Score: 1

      Wouldnt the original audio need to be stored as well, for evidential reasons?

    5. Re:Awesome? WTF?? by b0s0z0ku · · Score: 1
      Wouldnt the original audio need to be stored as well, for evidential reasons?

      Depends what you want to do with it, and assuming that our court system is intact and more or less unchanged in 20 years. Besides, there's always the option of kidnapping and "disappearing" miscreants. I'd hate to see what would happen, with the full consent of the majority of the lumpenproletariat, if another 9/11-scale (or worse) terrorist attack occurred on US soil.

      -b.

    6. Re:Awesome? WTF?? by Anonymous Coward · · Score: 0

      > This means that phone conversations can in theory
      > be automatically logged as text

      Last I checked, the NSA filed patents a couple of years ago for just such a thing. IIRC the patents covered a speech-to-text method and subsequent Google-style search of the transcribed conversations. This may already be in full world-wide use via Echelon, etc.

    7. Re:Awesome? WTF?? by DragonWriter · · Score: 1
      Wouldnt the original audio need to be stored as well, for evidential reasons?


      The Department of Defense isn't particularly interested in evidence. Indeed, in many cases once they have the information they need to make a decision and the decision is made, it seems they'd be happier if the underlying original data was irretrievably lost to prevent any after-the-fact criticism of either their decisions or their methods.
    8. Re:Awesome? WTF?? by cyberon22 · · Score: 1

      Considering that I've personally handled documents that the US embassy has "outsourced" for Chinese-English translation in Beijing, I think your confidence that the US has enough skilled translators in-house is grossly misplaced. Chinese is my second language and I am good at it, but I am not even an American citizen. And although I can't speak for military training, I have met people who have been trained by the State Department and found that very few of them have pushed beyond middling Chinese despite having serious advantages in time and funding for training.

      That being said, let me praise as exceptions those people I've met who trained at IUP when it was in Taiwan, and less so in Beijing. With those few exceptions, everyone I've met who is remotely decent at Chinese has spent considerable time in China and mops the floor with those who have studied abroad.

      As far as the tech goes, I'd respectfully suggest you're wrong on the need for more and better tools. The problem is that translation is really a small and limited domain for the use of bilingual NLP systems. I run an educational project which is doing something the private sector simply is not: developing Chinese-English machine annotation and translation technology with a focus on providing educational annotations for students. The technology itself is quite cool and if you are still studying Chinese you should actually check it out. We are currently the only place on the web where you can get word-by-word explanations of everything from newspapers to excerpts from classic novels like Dream of the Red Chamber.

      Getting back to the point.... the market that is emerging for this sort of technology is happening in places like enterprise search, contextual analysis of Chinese documents for search and other areas where there is a need for massive data analysis and you simply cannot rely on human translators. The real tragedy is that the market is not here yet, and the money is all going into closed research programs like those cited in the article. Seeing the US government simply stuffing millions and millions of dollars into closed research consortiums does not help what we are doing one bit. Nor does the status quo, in which NIST holds "open" machine translation competitions whose results are not even made available for public comparison. More's the pity....

    9. Re:Awesome? WTF?? by 4D6963 · · Score: 1

      now the government will be able to spy on you in your native language

      I can imagine your typical terrorist conversation translated using this system :

      -After this the friends when it jumps the operation?
      -Not white I go seeing
      -Into the correspondence, and to know, you have?
      -I am caused
      -And on the other hand are their blond as?
      -It goes, or
      -Or he has
      -The God is large!
      -The God is large!

      --
      You just got troll'd!
  5. Ultimate Defense by Speare · · Score: 4, Funny

    Just feed this new system a few reruns of Japanese television game shows. After that, we will be safe from automated snooping for at least another decade. As a plus, all artificial intelligence projects at DARPA will be set back by another decade as well.

    --
    [ .sig file not found ]
  6. Humans??? by Anonymous Coward · · Score: 2, Insightful

    What's wrong with using humans? This is exactly what humans are good at. While there most certainly are fields where machines can replace humans, this is _not_ one of them.

    http://lyricslist.com/lyrics/artist_albums/16/ac-dc.php/

    1. Re:Humans??? by 3waygeek · · Score: 2, Funny

      Yes, but Hoshi Sato won't be born for another 100 years or so...

    2. Re:Humans??? by Protonk · · Score: 4, Interesting

      One of the problems with using humans is that they are expensive; the other is that they become bored easily. It isn't like the defense establishment isn't using human translators: the NSA is the largest employer of translators in the world. They use humans in every listening post out there, but for the same reasons that humans make lousy airport security screeners, they make poor translators AND intelligence analysts. This isn't saying that machine translators are a panacea, but they can solve a small section of the problem that we have been trying to solve with a very human-capital-intensive solution for years now.

    3. Re:Humans??? by JonTurner · · Score: 1

      >>What's wrong with using humans?

      They're slow, and scarce and don't work 24-7. *If* the software has progressed to the point that it's "good enough" (that's a big IF) then a massive farm of machines could simultaneously monitor all communications (tv, email, phone, IM, etc.), summarize, and filter out anything interesting, looking for trends. Think Really Big Brother.

    4. Re:Humans??? by DragonWriter · · Score: 2, Insightful
      What's wrong with using humans?


      The number of humans with adequate skill in the target languages that the Pentagon can afford to employ is inadequate to process all the channels of information it would like to filter for potentially interesting information. Further, the more humans know what information is being looked for (and what is flagged), the greater the security risk.

    5. Re:Humans??? by Scott7477 · · Score: 1

      I lived in Japan for two years and earned a minor in the language at a major private university in the US. Toward the end of my time in Japan, my skill with the language was sufficient that when I spoke to native Japanese on the phone, frequently they thought I was actually Japanese rather than a foreigner. I seriously considered working for the US government as a way of exploiting my Japanese skills, but I concluded that spending my days translating the kind of documents that the US government would be interested in would be incredibly, horrifically boring.

      --
      "Lack of technical competence coupled with the arrogance of power, as usual, leads to no good end."
  7. Cool by thejrwr · · Score: 1

    Dragon Speak is going out of business, I guess

    1. Re:Cool by aicrules · · Score: 1

      Naturally!

    2. Re:Cool by Kagura · · Score: 1

      Naturally?

    3. Re:Cool by aicrules · · Score: 1

      Naturally!!

  8. awesomeness, in terms of megatons by JonTurner · · Score: 1

    awesomeness, to be sure, if you consider the ultimate outcome. Remember, this is DARPA, so they're looking at potential military applications. I read it as: "translate (military) communications in real-time, ... then destroy one or both parties."

    1. Re:awesomeness, in terms of megatons by Milwaukee · · Score: 1

      DARPA research & innovation has had many positive impacts on our lives. The most obvious example is the internet. Presuming it is successful, this new translation technology would definitely shrink the globe again, bringing us all closer. Isn't that a good thing?

    2. Re:awesomeness, in terms of megatons by the+eric+conspiracy · · Score: 1

      this new translation technology would definitely shrink the globe again, bringing us all closer. Isn't that a good thing?

      I don't want to know what people really think of me.

    3. Re:awesomeness, in terms of megatons by Anonymous Coward · · Score: 0
      ...bringing us all closer. Isn't that a good thing?


      No, it isn't; it's fscking terrible. More and more people are moving closer and closer to my cave every year!

  9. When will it affect me? by jandrese · · Score: 2, Interesting

    All I really want is a free online translator for web pages (à la Babelfish and Google) that isn't terrible at it. Seriously, the quality of Babelfish translations has stayed constant since it came on the scene in the late 90s, even though machine translation in general has made some rather significant advances. I don't really use them enough to justify plopping down $500 on the professional packages, but the current systems are just terrible.

    --

    I read the internet for the articles.
    1. Re:When will it affect me? by Creepy+Crawler · · Score: 2, Funny

      You get what you paid for.

  10. w0t j00 s4y by Anonymous Coward · · Score: 0, Offtopic

    w0nT b3 phUn 1ph th3y d0nT d3w l337sp3ak

    owned

  11. May I be the first to say... by Anonymous Coward · · Score: 0

    Es una trampa!

  12. can't help it by Programmer_In_Traini · · Score: 1

    its lame, its old and yet i cannot help it

    i can see a translated japanese movie coming...

    "all your base are belong to us, make your time"

    --
    If you look like your passport photo, you're too ill to travel. - Will Kommen
  13. My Criteria For Success by Anonymous Coward · · Score: 0

    As long as it can handle swear words and a good variety of sex acts I'm interested in I say thumbs up!

  14. Too much too soon, or tackling wrong problem? by Salvance · · Score: 5, Interesting

    This project, along with CMU's Tower of Babel, certainly get props in the coolness category, but the practicality is still lacking. I believe DARPA is barking up the wrong tree for now, or at least biting off more than they can chew.

    Speech recognition is the hardest problem to tackle on the path to translation, and MUST be addressed before there is a viable real-time (or even delayed) translation engine. Currently, even the best speech recognition software can achieve at best ~80% accuracy when faced with a large vocabulary with no limits on speakers/dialects, and this level of accuracy is typically not achieved in real-time. While this 80% level is actually pretty good when transcribing to text (since the reader can typically decipher what the computer meant), it's downright awful if trying to translate the resulting text to another language.

    For example, if I say "I like ice cream" into voice recognition software and it 'hears' "I like, I scream", the reader might understand what this means, particularly if they read it in context and aloud. However, let's say we translate each sentence into Spanish ("Tengo gusto del helado" and "Tengo gusto, yo grito" respectively, according to Babel Fish); the listener would be completely lost, as the out-of-context phrases don't sound anything alike. Natural language translation, even under relatively accurate recognition scenarios, would be fraught with misunderstandings.

    Once speech recognition is tackled, it's just a matter of translation then voice synthesis. Fortunately these problems aren't nearly as difficult, and current solutions would suffice (with the only pitfall being poor grammar in the destination language, and a robotic sounding voice).

    --
    Crack - Free with every butt and set of boobs
    1. Re:Too much too soon, or tackling wrong problem? by Rocketship+Underpant · · Score: 1

      "Fortunately these problems aren't nearly as difficult, and current solutions would suffice (with the only pitfall being poor grammar in the destination language, and a robotic sounding voice)."

      *Good* translation is extremely difficult unless you stick to "see Dick run"-type sentences. Good translation between non-related languages (like Japanese and any Indo-European language) doubly so.

      Advanced machine translation on par with human translators will require nothing less than artificial intelligence, most likely.

      --
      He who lights his taper at mine, receives light without darkening me.
    2. Re:Too much too soon, or tackling wrong problem? by Anonymous Coward · · Score: 0

      State of the art commercial speech recognition such as Dragon Naturally Speaking 9 is better than you suggest - accuracy percentage is in the high 90's, and it has sufficient understanding of English to resolve phonetic ambiguities... it's not relying on words sounding different to pick the right one. Huge vocabulary, continuous speech, and close to real-time too (a second or so lag).

    3. Re:Too much too soon, or tackling wrong problem? by Red+Flayer · · Score: 1

      "I believe DARPA is barking up the wrong tree for now, or at least biting off more than they can chew."

      I think what you're trying to say is that DARPA isn't capable of developing speech recognition software equal to the task of real-time translation.

      I'm sure that DARPA is fully aware that the biggest block to real-time translation is speech recognition -- that's why they are funding this project -- because it is (1) beyond the scope of what private enterprise is currently capable of without cash influx and (2) because it would be extremely useful to have such a tool, and to have the IP rights to it.

      Speech recognition is a part of translation, not outside it.

      --
      "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
    4. Re:Too much too soon, or tackling wrong problem? by cyberon22 · · Score: 1

      Not sure about Arabic, which is probably where the money is right now, but speech recognition is much less of a problem in languages like Chinese than in Indo-European languages like English, which have conjugation, inflection and the like.

      On the other hand, text translation is much harder in Chinese than in Indo-European languages, in large part because of the lack of conjugation, etc.

    5. Re:Too much too soon, or tackling wrong problem? by Anonymous Coward · · Score: 0

      Your 80% estimate is inaccurate. Please cite your source.
      I am a developer for an LVCSR (large vocabulary continuous speech recognition) engine which is more towards the 90% accuracy mark.
      For a concrete example, the CMU sphinx 3.5 decoder can be tuned to achieve ~8% WER (word error rate) at about 3xRT.

      Translation is actually the hard part in this system. MT (machine translation) is a difficult task, but _a lot_ of money is being put into it right now (especially by Google). There should be some significant breakthroughs in the next decade. Projects like this one only help spread the academic knowledge and allow companies to implement concrete solutions for these broad tasks.

    6. Re:Too much too soon, or tackling wrong problem? by maxume · · Score: 1

      DARPA just wants a way to figure out what documents are most interesting to give to a human to translate. I don't think the 80% accuracy is going to be all that huge a problem for something like that.

      --
      Nerd rage is the funniest rage.
    7. Re:Too much too soon, or tackling wrong problem? by Bazouel · · Score: 1

      They should also start by making a translation engine that actually works...

      See, even for the most simple translation of "I like" in Spanish, Babelfish is wrong. The good translation is "Me gusto", not "Tengo gusto" which means "I have taste".

      I am trying to learn Cantonese and you have no idea just how "stupid" the current translators are...

      --
      Intelligence shared is intelligence squared.
    8. Re:Too much too soon, or tackling wrong problem? by Olivier+Galibert · · Score: 1

      It's broadcast news in Arabic and Mandarin. I don't have the latest numbers handy, but 80% (well, a 20-25% error rate; nobody in the ASR community uses accuracy) is pretty much correct for Arabic. I think Mandarin is better though. Arabic has some unique problems, the main one being that there is no such thing as Arabic in the first place.

          OG.

    9. Re:Too much too soon, or tackling wrong problem? by xtracto · · Score: 1

      And to make things worse, "I like" does not mean "Me gusto" but "Me gusta", since "gustar" is conjugated in the third person to agree with the thing liked.

      So, what the grandparent wrote as "I like ice cream" would be translated as "Me gusta el helado" and "I like, I scream" would be "Me gusta, yo grito".

      I agree with GP about speech recognition being one of the problems to solve before having a real-time translation tool. Some have said that the current technology (Dragon Naturally Speaking 9, etc.) achieves 90% accuracy, but the issue is that the 90% is under ideal conditions. Try the experiment of recording an episode of Friends (or The Simpsons or any other TV program) and feeding it to those programs. There is so much noise and variation in the speech that they can't make any sense of it.

      Of course the translation problem is another difficult problem. Now my question is, isn't it possible to avoid the speech-to-text conversion in translation? Wouldn't it be possible to translate directly from voice? It would mean translating a specific sound to an equivalent sound in another language, which would mean having huge sound databases, but hey, that might work and you would have to beat only one problem.

      --
      Ubuntu is an African word meaning 'I can't configure Debian'
    10. Re:Too much too soon, or tackling wrong problem? by avir · · Score: 1

      I beg to disagree with you on the relative difficulty between speech-to-text (STT) and machine translation (MT). The state-of-the-art in broadcast news transcription is currently over 90% accurate - using 100 minus word error rate (WER) - in English and close to 90% in both Arabic and Chinese. Also, English conversational telephone speech transcription reached over 85% accuracy during the DARPA EARS program. However, translation accuracy - using 100 minus human-mediated translation error rate (HTER) which is the official metric in DARPA GALE - is only around 80% on both Arabic-to-English and Chinese-to-English.

      To counter your last statement, experiments carried out before the GALE 2006 evaluation showed that the translation accuracy of STT output is pretty much the same as the translation accuracy of the STT reference transcripts. This is clearly due to the poor performance of the current state-of-the-art MT. Most of the research in GALE is currently tackling MT, and only when the MT is good enough will the STT errors begin to make a difference.
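
      For readers unfamiliar with the "100 minus word error rate" convention used above, here is a minimal sketch of how WER is commonly computed: word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. The function names are illustrative only, and the whitespace tokenization is an assumption; real evaluations normalize text much more carefully before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length."""
    ref = reference.split()   # assumption: simple whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def accuracy(reference: str, hypothesis: str) -> float:
    """The '100 minus WER' figure quoted in the thread."""
    return 100.0 - word_error_rate(reference, hypothesis)
```

      Using the example from earlier in the thread, scoring "i like i scream" against the reference "i like ice cream" counts two substituted words out of four. Note that insertions can push WER above 100%, so this "accuracy" can go negative, which is one reason ASR researchers prefer to quote the error rate.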

    11. Re:Too much too soon, or tackling wrong problem? by teneighty · · Score: 1

      I had no idea STT of conversational speech was so good now. This is of extreme interest to me because it would help me enormously to have access to software that could do real-time STT of telephone conversations (I have a hearing loss). What software is being used to get these kinds of results?

    12. Re:Too much too soon, or tackling wrong problem? by avir · · Score: 1

      The conversational telephone speech (CTS) results I quoted above were achieved using a state-of-the-art research system running under 10 times real time (10xRT); i.e., using less than 10 hours to transcribe an hour of speech. The winning system in the 2004 DARPA EARS evaluation achieved 15.2% WER. For a system description, see this paper (requires subscription to ieeexplore). In 2004, many EARS teams achieved the same level of performance in real time as their 10xRT systems did in 2003. Since the EARS program was killed after the 2004 evaluation and DARPA's focus has shifted to foreign languages (GALE), it is hard to predict the current state-of-the-art in English CTS transcription and when that level of performance will be available in commercial products.

      Just to correct my earlier post, Arabic broadcast news (BN) transcription error rates are still around 20%. Mandarin Chinese BN character error rate is close to 10%.
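
      The "xRT" figures quoted in this thread are real-time factors: processing time divided by audio duration. A small sketch of the arithmetic follows; the function name is made up for illustration.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration (10.0 means '10xRT')."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


# Ten hours of compute to transcribe one hour of speech is 10xRT.
ten_x = real_time_factor(10 * 3600, 3600)
```

      A factor of 1.0 or below is what a live captioning or monitoring application would need, since the recognizer must keep up with the incoming audio.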

  15. I read that as... by rHBa · · Score: 1

    I read that as "Defense Against Research Projects Agency"!

  16. This could be dangerous... by FuryG3 · · Score: 2, Funny

    Ultimate Language Translation researchers should probably compete along a stretch of the Mojave desert so as not to injure or offend nearby native speakers

  17. Civilian use of such a thing by cucucu · · Score: 1

    Obviously such a thing will not work well without an advertising filter (imagine an analyzer sifting through washing powder ads).

    So they will have to develop one.

    This will be integrated into VCRs to stop/start recording when advertising starts/stops.

    Great!

  18. Yeah... Billions of Dollars Later... by Dark+Leaper · · Score: 1

    Slashdot headline ten years from now... "Creators forgot to implement 1337 speak into translator matrix..."

  19. yeah.. awesomeness... by Churla · · Score: 1

    Or the WRATH OF AN ANGRY GOD! heh

    "Hey... didn't I make them all speak different languages to teach those uppity humans a lesson? Now they what? They end-routed me on that one? Oh I don't think so!"

    --
    I'm a fiscal conservative, it's a pity we don't have a political party anymore
  20. It's coming true! by saintory · · Score: 1

    The space station is being built again. India is planning manned missions into space. A shift in power in the US Government. Now we're creating a Universal Translator! How exciting these times we live in.

  21. Really I hate it by Feminist-Mom · · Score: 1

    when people do that.

    But seriously, I've basically made my living off of DARPA grants and I fully support the criticism leveled at them above. It is truly a classic government bureaucracy, very wasteful, not entirely straight about what they are doing, and you have to have personal connections to get money from them.

    1. Re:Really I hate it by Beyond_GoodandEvil · · Score: 1

      It is truly a classic government bureaucracy, very wasteful, not entirely straight about what they are doing, and you have to have personal connections to get money from them.

      In other words, just like 90% of the rest of the world. ;)

      --
      I laughed at the weak who considered themselves good because they lacked claws.
  22. But will it translate... by Billosaur · · Score: 1

    ...Romulan, Klingon, and Vulcan?

    --
    GetOuttaMySpace - The Anti-Social Network
  23. They need to start working on... by shirizaki · · Score: 1

    The Metal Gear project. I mean, honestly, the DARPA chief isn't going to be jailed in the secret arctic base for the ultimate language translator.

    --
    In Soviet Russia, dots slash you!
  24. Interesting, by DragonWriter · · Score: 2, Insightful

    Seems likely to be very useful for specifically what they suggest it is for (flagging potentially interesting material for further review by human analysts, a kind of time-saving filtering device for the limited pool of translators available.)

    But beyond that, I wouldn't put too much faith in any kind of mechanical translation as particularly reliable on its own except on narrow kinds of material. It conceivably might work for strictly literal usages, or for fairly stable idiomatic uses, but unless you have frequent collection and incorporation of usage data from every culture and subculture that may be a source of translated material, it's going to fail, sometimes subtly and sometimes spectacularly, for a lot of idiom. Similarly, even within the same language, different groups using it will have different idiomatic uses that sometimes will produce different or opposing meanings for similar usages, which will require accurate identification of the source at more than just the language level to get correct results. There's a lot of evolving cultural context that informs the use of language...

  25. exponential growth by Silver+Sloth · · Score: 1
    From TFA
    As you can see all these projects are a far cry from what DARPA wants. But given time and money something more advanced would surely come out and eventually would be available for civilian use as well.
    Well, err, yes, but, I have enough difficulty understanding Geordies and Glaswegians, and they're speaking the same language as me (nominally). Understanding 200 or so words when carefully spoken is a huge step from simultaneously interpreting random speech and I'm sure the problems will rise exponentially. File this one under the 'maybe, someday, but don't hold your breath' dept.
    --
    init 11 - for when you need that edge.
    1. Re:exponential growth by BertieBaggio · · Score: 1

      Actually, depending on who you talk to, Glaswegians might well be speaking a different language from you. Geordies still speak English though.

      --
      If all you have is a grenade, pretty soon every problem looks like a foxhole -- MightyYar
  26. Lots of reasons by phorm · · Score: 4, Insightful

    To properly translate all the nuances of some languages actually requires a lot of skill, and sometimes translating can be as much interpreting as anything. Granted, this is something a human could handle better than a machine, but the problem is that humans also have a bias. Yes, there have been cases wherein human translation has caused problems because of bias or even due to being outright wrong.

    It reminds me of the old joke:

    Guard: Now tell me where you hid the money, or you will suffer
    Translator: Tell him where the money is, or you will suffer
    Prisoner: I'll never speak
    Translator: He says he won't tell you
    Guard: *putting gun to prisoner's head* Tell him I will blow his brains out if he doesn't tell me immediately
    Translator: He will shoot you in the head unless you tell him now
    Prisoner: I buried a million dollars under the floorboards in the old woodshed
    Translator: *pauses* He says you don't have the guts to shoot him...

  27. The REAL universal translator by SethEaston · · Score: 0

    While being still quite far from an adaptive Star Trek-style universal translator, it is conceivable that one day with the help of portable devices (like Palm/cell phones/iPod) that we could indeed have an on-demand personal translator that would work anywhere. I think this is the beginning of such a capability.

    1. Re:The REAL universal translator by thewils · · Score: 1
      I think this is the beginning of such a capability
      ...in the same sort of way that flopping out of the oceans was the beginning of Homo Sapiens. I wouldn't hold your breath for one though.
      --
      Once I was a four stone apology. Now I am two separate gorillas.
  28. Seriously though, I just don't believe it. by Anonymous Coward · · Score: 3, Insightful

    The US department of Defense is openly claiming to be able to solve one of the world's hardest AI problems, and you don't believe it? Big surprise.

    If the US military had anything close to real A.I., you wouldn't hear about it. It would be classified information.

    The NSA would love to have anything close to a system capable of understanding language as well as a native speaker can; as would the CIA, or any other clandestine organization. Any system smart enough to understand and generate English probably also came with a breakthrough in CS theory that will give them better tanks, planes, and communications systems. And those would be classified, too.

    In short, this is just an excuse to spend money, and to hide the funding for any secret research projects that they really are working on.

  29. screaming "failure" by sohp · · Score: 1

    What is it about "ultimate, do-everything" projects sponsored by the government that sets off every alarm bell signaling imminent failure?

  30. Back in 2000 by Anonymous Coward · · Score: 0

    Back in 2000 or 2001, I saw IBM television ads about telephones that translate in real-time. In the ad an English-speaking woman was speaking to a Turkish person, with the telephone doing the translation. This was promised in the "near future" by IBM. Anyone know what became of it?

  31. "My hovercraft is full of eels." by Anonymous Coward · · Score: 0
    "My hovercraft is full of eels."

    By the way, mod parent up. "Offtopic" is not a good replacement for "I don't get the joke".

    1. Re:"My hovercraft is full of eels." by Jinky+Williams · · Score: 1

      Agreed.

      i /\/\34|\| i (4|\| |)0 t3|-| |-|4|2|)(o|23 1337 7|-|47 |-|4$|\|7 833|\| |_|$3|) $!|\|(3 \/\/4|23Z |)4'/$, 4|\||) \/\/|-|!13 i7$ |\|07 |)1|=|=!(|_||_7 70 7|24|\|$|_473 !7 |_4|293|_'/ |_|\|!|\|73|_|_!9!8|_3 70 |\/|0$7 |>30|>|_3 4|\||) 7|-|3|23|=o|23 4 900|) $|_|8_|3(7 |=0|2 7|24|\|$|_47!0|\|

  32. human vs machine translation by Anonymous Coward · · Score: 0

    that may be true for most languages, but just about every government defense/security agency i've heard of is desperate for speakers of arabic, persian, malay, javanese, pashtu, etc. And they much prefer native speakers, for obvious reasons -- very few people ever achieve native-like proficiency in a language they learn after 12 or so (the critical period for language acquisition is pretty brief, sadly).

    that said, a skilled non-native translator will no doubt beat the crap out of a computer. this DARPA project sounds pretty far-fetched, even for them. i doubt they'll end up with much more than the rudimentary string of words altavista creates. (seriously, if you want a good laugh, hook babelfish up to a chinese or japanese newspaper -- see if you can guess what the headlines are actually about).

  33. Defense Language Institute by Kozar_The_Malignant · · Score: 1

    >the government already has the language skills it needs even without a whizbang translation machine.

    Sadly, they don't. The FBI has something like two guys who speak Arabic, and there are numerous instances in the news recently where some fed is bewailing the lack of language skills in his department. On a diplomatic note, how many US Ambassadors actually speak the language of their host country? It might be useful if they had some way to understand the locals.
    --
    Some mornings it's hardly worth chewing through the restraints to get out of bed.
    1. Re:Defense Language Institute by zebul0n · · Score: 0

      Why would they need to understand them when they b0mb them?

    2. Re:Defense Language Institute by o2sd · · Score: 1

      Sadly, they don't. The FBI has something like two guys who speak Arabic, and there are numerous instances in the news recently where some fed is bewailing the lack of language skills in his department. On a diplomatic note, how many US Ambassadors actually speak the language of their host country? It might be useful if they had some way to understand the locals.

      At the time of the Islamic Revolution, the CIA had one employee who spoke Farsi, and they weren't listening to him anyway. I can't imagine much has changed, unless they have employed some expatriates, like what they did with that Ahmed Chalabi guy, and that was a fantastic success.

      --
      - Nothing to see hear.
  34. Better yet... by Anonymous Coward · · Score: 0

    Headline: "Lack of translator support kills "l337" and "IM" speak. World rejoices as collective intelligence level rises."

  35. Maybe not! by Anonymous Coward · · Score: 0

    fahqtut [fully automated high quality translation of unrestricted text] will never happen, boys and girls. Language is too diverse and expressions are often untranslatable. I work in this industry and all the technology I've seen so far is laughably inadequate. Yes, you can have accurate translations for a restricted range of text; straightforward, present tense stuff, sure, it'll work mostly. As soon as you start with the real language that people use, your system hits the bricks in a decidedly unflattering manner.
    When complex sentences with multiple clauses are used [and this is stuff you're translating, right?] it just cannot keep up. On top of that there's noise in the background, interruptions, people don't enunciate well, use crummy language, use the wrong word altogether, contradict themselves in the same sentence, start sentences they don't complete. This DARPA uber translator thingamajig is going to just come along and happily munch through all that gibberish and deliver crisp and pristine language? Ain't happenin'.

    They've been playing with this technology for decades now. Language is our most flexible, most diverse tool. The only way they're ever going to make that work with any measure of reliability is when they figure out how brains work and find a way to build one.

    Hey, somebody is going to be working with some exciting technology. I say: let them play.

  36. Language parsing impossible by current technology by master_p · · Score: 2, Insightful

    In order to recognize speech, one needs a context-sensitive parser. In order to make a context-sensitive parser which is fast enough to interpret the text, the computer should have the pattern-matching capacity of a grown up human. The human brain contains 500-1000 trillion synapses! even if one makes the assumption that one synapse equals one bit, in order to understand the context, one would need a computer with a tremendous amount of memory which could be searched in parallel.
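    For a rough sense of scale, here is a back-of-envelope sketch of that estimate. It only does the arithmetic for the one-bit-per-synapse assumption stated above; the function name and the use of decimal terabytes are my own choices, not anything from the article.

```python
# Back-of-envelope memory estimate, assuming one bit per synapse
# (the simplification made in the comment above, not a neuroscience claim).
BITS_PER_BYTE = 8
TERABYTE = 10**12  # decimal terabyte, in bytes


def synapse_memory_terabytes(synapses: int, bits_per_synapse: int = 1) -> float:
    """Return the storage needed, in decimal terabytes."""
    total_bytes = synapses * bits_per_synapse / BITS_PER_BYTE
    return total_bytes / TERABYTE


low = synapse_memory_terabytes(500 * 10**12)
high = synapse_memory_terabytes(1000 * 10**12)
print(f"{low:.1f} TB to {high:.1f} TB")
```

    Under that assumption the raw storage lands somewhere in the tens to low hundreds of terabytes. The harder part of the claim is searching all of it in parallel, which this arithmetic says nothing about.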

    Of course if you narrow the problem down to specific terms, then it is doable. But then it would not be 'ultimate' any more.

  37. The English solved this years ago by Colin+Smith · · Score: 1

    They just speak a little sslloowweerr and LOUDER! The natives usually catch on.

    --
    Deleted
    1. Re:The English solved this years ago by Velocir · · Score: 1

      And the Americans still do it...

  38. Actually... by technococcus · · Score: 3, Interesting

    I attend one of the many Universities where DOD research is currently being conducted. Portions of our graduate student body and faculty are working on the powered armor concept in conjunction with UC-Berkeley (they're doing the frame and kinematics, we're doing the control theory/system and power supply). We're actually making quite a bit of progress in the field of alternative batteries (the current iteration is a peroxide-fueled hydraulic hybrid-type system widget) and mechanical interface control theory application. So, while God knows we won't see cap' troopers in 'suits any time soon, we are at least progressing towards that end while developing widely applicable technologies along the way (this is, if I may remind you, the way many technologies we love dearly were spun off from the space program et al.).

  39. Politics of scarcity? by seven+of+five · · Score: 1

    You'd think the FBI would be a prime customer for something like this, but apparently keeping a huge backlog of documents to translate and a staff that's too small to handle it is more important to the mechanics of their bureaucracy.

    The point being, if this tech works, great, but will it be used?

  40. You don't know what you're talking about... by Olivier+Galibert · · Score: 1

    Translation *is* the hardest part in the Gale project. So much harder that, in the current evaluations, the translations are so bad that the impact of the ASR errors on the final result is not significantly detectable, it seems. We hope that the MT teams are going to make some massive progress fast (they may, they get a *lot* of new data from the project) so that working on the ASR actually means something, but more importantly so that the project goes on.

        OG.

  41. AHA! by paralaxcreations · · Score: 1

    Take that, Wikipedia! It's not just some plot device!

    Errr...I mean, soon, it won't be a plot device anymore!

    Crap, I mean, eventually it might not be just a plot device...

    I mean...oh, fuck it. This is DARPA after all.

  42. Don't be too afraid... by Olivier+Galibert · · Score: 1

    ... ASR on people talking to each other naturally Just Doesn't Work[tm]. As in 70-80% error rate or worse.

    Gale is about TV/Radio news, not random people conversations.

        OG.

  43. You have to walk before you can run by Zontar_Thing_From_Ve · · Score: 4, Insightful

    Seriously though, I just don't believe it. I've worked on a number of DARPA robot projects, and have heard a lot of their babble. They claim to be funding all these fantastic ideas, but none of them ever work except in a limited capacity.

    This is a big pipe dream that is extremely unlikely to work any time soon. How do I know that? Right now, I think it would be reasonable to conclude that computer technology today is good enough to do accurate text translation. Can it? Well, it depends on how picky you are. There are always mistakes, sometimes glaring ones, in text to text translation programs. I can speak Russian and for convenience (to get quick rough translations) at one time I owned what is probably the best Russian-English text translation program. It's much more accurate than Babelfish. It still left a lot to be desired. It would be about 80-90% accurate, but no more. I remember one time when it took a statement in Russian that said "I absolutely would not mind to tell you about ..." and translated it as "I absolutely would mind to tell you about ..." which is the exact opposite. Many languages, such as Russian, Spanish and Portuguese (and no doubt others) use double negatives to express negation. "I don't know nobody" is quite correct in Russian, Spanish and Portuguese although it is quite grammatically incorrect in English if your intention was to say "I don't know anybody". Programs that translate into English from languages that use double negatives often fail to correctly translate the negation. Maybe there are some that get it right, but I've never seen any. Text translation programs are very poor at distinguishing between words that have uses as different parts of speech. Here's an example:

    She sings like an angel.

    In this sentence, "like" is a preposition, but it can also be a verb ("She likes to go shopping."). A text translation program might fail to correctly understand that "like" is a preposition here and say something like:

    She sings and angel is pleasing to her.

    I could give a lot more examples, but these are enough. If we can't even do a better job right now at text translation, how on earth is DARPA going to get speech translation right? This is the kind of project that gets funded by idiots who have never studied foreign languages and believe that the Star Trek idea of a Universal Translator is only a few years away.
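    To make the ambiguity concrete, here is a toy sketch that just counts how many readings a program has to choose between when words can play more than one role. The lexicon and the role labels are made up for illustration; no real translation program is being described.

```python
from itertools import product

# Hypothetical toy lexicon: each word maps to the roles it can play.
# The labels are illustrative, not a real tagset.
LEXICON = {
    "she": ["pronoun"],
    "sings": ["verb"],
    "like": ["comparison", "verb"],
    "an": ["determiner"],
    "angel": ["noun"],
    "time": ["noun", "verb"],
    "flies": ["noun", "verb"],
    "arrow": ["noun"],
}


def candidate_readings(sentence: str) -> list[tuple[str, ...]]:
    """Enumerate every combination of roles the lexicon allows."""
    words = sentence.lower().rstrip(".").split()
    return list(product(*(LEXICON[w] for w in words)))


for text in ("She sings like an angel.", "Time flies like an arrow."):
    print(text, len(candidate_readings(text)), "candidate readings")
```

    Even with a handful of words the candidate readings multiply, and picking the right one takes context that a word-by-word lookup doesn't have.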

    1. Re:You have to walk before you can run by Venner · · Score: 1

      The time-honored CS example is:

      Time flies like an arrow.
      Fruit flies like a banana.

      Always liked that one.

      --
      A preposition is a terrible thing to end a sentence with.
    2. Re:You have to walk before you can run by zooblethorpe · · Score: 1

      So then, we have:

      Time flies like an arrow:
      Time-travelling winged insects prefer an arrow

      ...and...

      Fruit flies like a banana:
      The end product is airborne as if it were a banana

      Talk about name-mangling... :)

      --
      "What in the name of Fats Waller is that?"
      "A four-foot prune."
  44. Privacy, anyone? by Anonymous Coward · · Score: 0
    it could only lead to awesomeness.


    Yeah. Computers monitoring all "public" forms of communication - e.g. telephone, e-mail, radio, &c. Key phrase[s] pop up, computer flags it, sends a message to the Thought Police, who come, and bust down your door.


    While it may not be practical to have a living human listening in on e.g. every telephone conversation going on at the same time, with government money, one could throw a lot of computing power at this issue. This is DARPA, after all.


    I, for one, do not welcome our new, Digital Thought Police, Overlords.

  45. not enough of them by peter303 · · Score: 1

    We could probably collect as many cellphone and internet messages as we want, but there aren't enough people to sift through them.

    Several terrorists in Colorado Supermax prisons sent over a hundred unread Arabic letters overseas because they have just one part time guy reading them down there. Quite a scandal there.

  46. How to Wreck a Nice Beach by J.R.+Random · · Score: 2, Insightful

    Just say the title out loud to get some idea of why speech recognition is hard, never mind translation. Translation has long been regarded as "AI-complete" because to do it well you have to understand what is being said, which involves solving all the other difficult AI problems. The current translation systems are lousy because they don't understand what is being said and most of them don't even attempt to.

    So my guess is that this program will be a boondoggle for researchers with little practical result.

  47. Oh wow...AI translators!!! Who would've thunk it? by Anonymous Coward · · Score: 0

    Alright...anyone who is not aware that all these AI "translators" have been a complete failure should grab an AI history book. It might be incrementally better than what's out there; however, I'm almost certain it'll fail to accomplish what it set out to do...at least from any practical perspective.

  48. Primary language by Anonymous Coward · · Score: 0
    most of my fellow enlisted servicemen were likewise studying languages of some clear strategic value

    Now if we could just teach English to the Marines.
  49. You can't impeech his speach :) by Anonymous Coward · · Score: 1, Funny

    ... speach researchers at Carnegie Mellon University has recently demoed a prototype of a device ...

  50. Real-time translation huh? by gatesvp · · Score: 1

    I'm sure the multi-lingual people out there are laughing at the very concept of "Real-time translation". Unless you're doing something trivial (Italian to Spanish?), this just isn't possible.

    Some languages place verbs at the very end of the sentence. Assuming that the computer could understand each of the words, the entire sentence still has to be re-composed in English. For long sentences, the speaker has already moved on.

    Other languages, like French, use some crazy sentence structures that effectively do the same thing. A French sentence can be like a string of comma-separated pronouns with the object of the pronoun in the very last bit.

    Both of these cases will induce delays and still don't account for some problems of context. In my own personal experience, I've found spots where whole paragraphs really need to be translated if we want to keep the original meaning.

    Seriously, I think "Real-time" could be roughly translated :) as within a few seconds.

  51. Will the fansubs be available on torrent? by cylcyl · · Score: 1

    I look forward to them distributing all the translated content on torrent, as a part of the freedom of info act or something, so that we can get English subbed foreign TV programs. I think that the anime shares will be most popular

  52. Re:Awesome? WTF?? This... could... by davidsyes · · Score: 1

    be very suck. If you anything secret to say you better be hurry.

    --
    Previously: "Linux... Toward the Sunrise..." Now: "Linux... Toward the-- No, now, part of Every Sunrise"
  53. Babelfish: "There is a chisel in my dog." Me: WTF? by zooblethorpe · · Score: 1

    Seriously, try it. Input the sentence, "My dog has fleas." Go from English to Japanese, copy and paste the Japanese into the entry box, and translate back to English. "There is a chisel in my dog."

    Just one of many reasons that I'm not that worried about my career as a Japanese - English translator. :P

    --
    "What in the name of Fats Waller is that?"
    "A four-foot prune."
  54. If it could... by billdar · · Score: 1
    If it could translate what my 11 month old kid is babbling about, it would save both of us a lot of crying... And get me off the sauce.

    --
    I am billdar, and I approve this message.
  55. Poor grammar in target can be loss of meaning by zooblethorpe · · Score: 1

    As I just noted in another post, the current publicly available state of machine translation gives me little to fear as a professional translator. You note:

    ...with the only pitfall being poor grammer in the destination language...

    I'd like to point out that "poor grammar" can often have disastrous consequences for the meaning. Take my previous example, "My dog has fleas." Babelfish's Japanese output is "Watashi no inu ni nomi ga aru." This backtranslates to "There is a chisel in my dog." The final word here is the kicker -- aru means "is / has / to be" depending on context, but is reserved mostly for inanimate subjects. Nomi is the subject here, which could be either "flea(s)" or "chisel(s)". When using aru, it becomes a chisel, an inanimate. Using correct grammar, Babelfish should have used iru, "is / has / to be" for *animate* subjects.
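    The aru / iru choice can be sketched in a few lines. This is only a toy illustration of the animacy rule described above; the table and function names are invented for the example, and it says nothing about how Babelfish actually works.

```python
# Hypothetical toy data: the two senses of romanized "nomi" and whether
# each one is animate. Illustration only.
ANIMATE = {"flea": True, "chisel": False}


def existence_verb(sense: str) -> str:
    """Pick 'iru' for animate subjects and 'aru' for inanimate ones."""
    return "iru" if ANIMATE[sense] else "aru"


def render(sense: str) -> str:
    """Build the romanized sentence for the intended sense of 'nomi'."""
    return f"Watashi no inu ni nomi ga {existence_verb(sense)}."


def sense_implied_by(verb: str) -> str:
    """Guess which sense of 'nomi' a reader infers from the verb alone."""
    return "flea" if verb == "iru" else "chisel"


print(render("flea"))
print(sense_implied_by("aru"))
```

    The point of the sketch is that the verb carries the only clue to which "nomi" was meant, so getting the grammar wrong changes the meaning and not just the polish.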

    Let's take another example from Spanish. "I'm lost" should usually be translated as "Estoy perdido", using estoy to describe a transient state. With bad grammar, a translator (machine or otherwise) might use soy for permanent states, and produce "Soy perdido", which means something more like "I've lost my virginity." Woot!

    Granted, context can make a lot of this much clearer, but it's still awfully muddy when bad grammar is thrown into the mix. Your comment suggests that poor grammar is a minor problem, but I'd beg to differ -- bad grammar can radically alter the meaning on the one hand, and it requires a full understanding of the context and intent of the source text to produce an accurate target, which is no mean feat in programming terms.

    ...it's just a matter of translation...

    Which, happily enough given the complexity of the problem, leaves me pretty secure job-wise. :)

    Cheers,

    --
    "What in the name of Fats Waller is that?"
    "A four-foot prune."
  56. Re:Language parsing impossible by current technolo by inKubus · · Score: 1

    I think we need to sit down and think about analog computers again, using waves to add and subtract and create filters. Of course, this can be simulated in digital space with Fourier transforms and stuff.

    --
    Cool! Amazing Toys.
  57. Oblig. by Anonymous Coward · · Score: 0

    In 10 years the opening Wikipedia article sentence will be:

    The DARPA Ultimate Language Translation Project by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation.

  58. OR.... by Catbeller · · Score: 1

    We could stop canning translators because they are gay (no liberty should be exempted from removal in our eternal war against evil, except the liberty from having gays look at your crotch, I guess).

    OR

    We could start learning some foreign languages. Everyone who graduates high school should learn at least two. Fluently. And no, not Spanish. A language NOT spoken by your neighbors. A *foreign* language. Arabic would be damned helpful.

    1. Re:OR.... by Dr.+Eggman · · Score: 1

      American high schools have big problems without having to worry about teaching kids two foreign languages. Heck, we should at least get geography as a nationally required core curriculum subject first.

      --
      Demented But Determined.
  59. "Pie in the sky" E-J-E = "The fleeting desire" by zooblethorpe · · Score: 1

    I'd agree, as there are two fields here that are extremely difficult -- voice recognition and machine translation -- which makes this all seem like so much pie in the sky. Anyone who's ever used voice recognition knows how spotty it can be, and anyone who's ever played with Babelfish (like this guy) knows how much humour can result. Now imagine these two lovely examples put to use on the battlefield, or at intel HQ, and some very unhappy possibilities arise.

    I'm all a fan of research for learning's sake, but the article here makes it sound like we're going to get something Real Soon Now (TM), which I seriously doubt. All of which seems to lend credence to your statement:

    It's really all about funding your academic buddies or whoever is going to be able to scratch you back in some way. It is very much an old boys network, with an emphasise on PR and not much about real science.

    So yeah, "wow that would be handy", but don't anyone hold their breath. :p

    --
    "What in the name of Fats Waller is that?"
    "A four-foot prune."
  60. Accuracy? Complexity? by Anonymous Coward · · Score: 0

    This is an interesting concept, and I'd be really interested to work on it, given that I'm a business analyst with an undergrad degree in Linguistics. However, I have to question how accurate this system would be, and if they really understand the complexity of the problem they're trying to solve. Cited is a current project (elsewhere):

    The device is dubbed the "Tower of Babel". Currently the device can handle a small vocabulary of 100-200 words at about 80% accuracy, and accuracy drops off significantly beyond that vocabulary.

    With a vocabulary this small, 80% accuracy is VERY BAD. Imagine if 2/10 of the words I use to communicate are UNINTELLIGIBLE. Perhaps you might *&^% I am trying to %&@@# you. I would live in great fear of a government that trusted such a system for spying. It might be a decent first cut at translation that could then be fixed by a translator, but could never, ever be very useful as intended (that is: "capable of real-time interpretation of television and radio programs").
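    To put a number on how bad that is, here is a quick sketch of how per-word accuracy compounds over a sentence. It assumes each word is right or wrong independently of the others, which is a simplification of mine and not something the article states.

```python
def sentence_accuracy(word_accuracy: float, length: int) -> float:
    """Chance that every word in a sentence is correct, assuming
    independent per-word errors (a simplifying assumption)."""
    return word_accuracy ** length


for n in (5, 10, 20):
    chance = sentence_accuracy(0.8, n)
    print(f"{n:2d} words: {chance:.1%} chance of a fully correct sentence")
```

    Under that assumption, even a short sentence is more likely than not to contain at least one wrong word.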

    The problem is more solvable in "formal" speech (i.e. a news broadcast). The subjects and words are usually clear and idiom/jargon free. But imagine getting a computer to understand "Wayne's World" or any common sitcom, or a conversation between friends. How do you translate "Doood, that party was goin' OFF last night"?

    There are other linguistic and cultural hurdles to overcome. Polish, for example, does not have separate words for "hand" and "arm". So, a literal translation (and remember, computers are VERY literal) from Polish would probably be "I broke my arm", which if stated in English could have been "I broke my hand". How about emotions? Some languages have many words for "love", whereas English (generally) only has one. When I say I love my brother, this is not meant as a sexual statement, but I have to choose the correct word for "love" in the target language to keep it "brotherly".

    Which brings us to context. I may (in a twisted universe) actually mean that as a sexual statement. How do you ever expect a computer to pick up on the context (which often is not clear to a human) and immediately give a correct translation?

    To develop a literal, word-to-word translator is not a stunning achievement. Automating a language-to-language dictionary could be a first semester comp sci project. I suspect that this is basically what this project will produce. But if you want something better, it pays to delve into the world of Linguistics to understand how humans use language, starting from the phonemic level all the way up to semantic parsing and psycholinguistics. Take what you learn there, and apply it to a system like this and your results will be better. I'm not going to promise perfection, but certainly better than 80%.
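    Here is roughly what that first-semester project looks like, as a minimal sketch. The four-entry Spanish-to-English dictionary is a made-up toy, and the example sentence is just one I picked to show where dictionary lookup falls over.

```python
# Hypothetical toy dictionary, Spanish to English, for illustration only.
TOY_DICTIONARY = {
    "no": "not",
    "conozco": "I-know",
    "a": "to",
    "nadie": "nobody",
}


def word_for_word(sentence: str, dictionary: dict[str, str]) -> str:
    """Replace each word with its dictionary entry; unknown words are
    kept in square brackets so gaps stay visible."""
    words = sentence.lower().strip(".!?").split()
    return " ".join(dictionary.get(w, f"[{w}]") for w in words)


print(word_for_word("No conozco a nadie.", TOY_DICTIONARY))
```

    Every word gets looked up correctly and the result is still not English: the intended meaning is "I don't know anybody", and nothing in a word list tells the program how to restructure the negation.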

    --Jonathan

  61. Effective translation might be impossible by EmbeddedJanitor · · Score: 1
    Many languages have words and terms that have no translation into other languages. Often a single word might translate into many words to get the same idea across. As a result, real-time translation becomes very difficult. For example, some languages have only one preposition and don't have words like on, under, inside, ...

    Then, of course there are cultural differences too. A Xhosa girl would like to be told "You remind me of that fat cow over there", whereas your average American chick might not.

    --
    Engineering is the art of compromise.
  62. Can it handle the subtlety of English nuances? by Anonymous Coward · · Score: 0
    Let me know when it can translate "Fuck the fucking fuckers".

  63. Not Going To Happen by Master+of+Transhuman · · Score: 1

    Requires conceptual processing which no one has solved yet.

    Fergeddaboutit.

    Until conceptual processing is able to be performed, ANY form of human language translation will be inadequate. It might be usable in some respects, but not adequate for most real purposes.

    --
    Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
  64. That's going to be hard by Anonymous Coward · · Score: 0

    I thought the Universal Translators tech came after the hyperdrive one, or is this the one that only translates into an incomprehensible dead language? Joking aside I don't envy them the challenge, humans who speak the same language often have a hard enough time understanding each other.

  65. It does seem cool... by macmills · · Score: 1

    Although many may think this new technology is a bad idea, think about the communication unit in Star Trek. I know this is the real world and all, but advances like this can lead to a better understanding of each other. A unifying device like this can make views and beliefs from other cultures more understandable and somehow through this, we'll be able to make this world better in some small way.

    --
    If man must go to the moon then yes, he will go there....
  66. Can't Wait! by Anonymous Coward · · Score: 0

    I don't know if they'll pull this off, but if they do (and I can get access to it), I'll be able to pull out all the old albums, like some of Bob Dylan's, and actually enjoy the lyrics. "Ohhh! So that's what the song is about!"