Slashdot Mirror


DARPA Project Babylon: Universal Translator

silance writes "Take a look at this project from DARPA (Defense Advanced Research Projects Agency)! This time the boys are trying to hammer out a portable, two-way, real-time, multi-lingual audible speech translator proposed to be run on everything from PDA's to wearable military hardware to workstations (to replace their PRE-EXISTING ONE-WAY real-time hand-held audible translators, of course!). The site contains descriptions of technical approaches, a technical milestones timeline, and a nifty Power Point presentation for the executive-types ;) They should give William Shatner a beta model out of pure respect... Here's a link to Google's cached HTML version of the Power Point presentation just in case. (P.S. - get a load of that logo at the bottom of the page!)"

24 of 324 comments

  1. pattern recognition? by Telastyn · · Score: 4, Interesting

    The next big thing, I think, would be a "smart" translator that can do pattern recognition and "learn" as it gets more of the language. IIRC, this is how the Star Trek translators work.

    Kind of the difference between pattern checking, and anomaly detection in virus scanners.

    1. Re:pattern recognition? by NanoGator · · Score: 4, Insightful

      "The real problem is that semantics (meaning) is more difficult to translate between languages"

      Agreed. What will probably happen is that people will initially have to be trained to use these machines. "Instead of using the term 'kicks ass' (which will translate as abusing a donkey...), use the term 'defeat'."

      --
      "Derp de derp."
    2. Re:pattern recognition? by Phanatic1a · · Score: 5, Funny

      "Instead of using the term 'kicks ass' (which will translate as abusing a donkey...), use the term 'defeat'."


      Which will translate as "I am going to chop off both of your feet."

  2. Obligatory B5 reference by Eimi+Metamorphoumai · · Score: 4, Funny
    The Babylon Project was our last, best hope for peace.

    Well, someone had to say it.

    --

    Visit me on #weirdness on the Galaxynet.

  3. Re:DARPA Project Babylon by rmohr02 · · Score: 5, Funny
    Here's a link to Google's cached HTML version of the Power Point presentation just in case.
    Darn--I could've done some decent karmawhoring giving out that information.
  4. Re:Shall we unleash the Kitty?, YES! by martyn+s · · Score: 5, Insightful

    Yeah, wasting an exorbitant amount of tax dollars, sure. Like the internet.

    Be cynical as you want, but DARPA is the one government agency which is really flexible and has a vision. With the rise of corporate dependency on innovation, even in the academic world, DARPA is one of the last bastions of basic research. Get with it.

  5. Strikes me as fishy (pun unintended) by Seth+Finkelstein · · Score: 3, Insightful
    Look at this passage in http://www.darpa.mil/ipto/research/babylon/approach.html
    (emphasis added)

    The task goal is to produce ten working two-way prototypes from each of four teams by the end of 18-months. The languages that will be translated are Farsi, Dari, Arabic, Pashto, Mandarin, and Uzbeki.

    Does this set off alarm bells for anyone? Those are complicated languages, and I believe Mandarin in particular is EXTREMELY tonal (i.e., doesn't work well in speech recognition).

    Look, just imagine what you get out of Babelfish. Now take it a few levels up, to speech. Does this proposal in any way sound achievable? (again, pun unintended)

    Sig: What Happened To The Censorware Project (censorware.org)

    1. Re:Strikes me as fishy (pun unintended) by cpeterso · · Score: 4, Funny


      The task goal is to produce ten working two-way prototypes from each of four teams by the end of 18-months. The languages that will be translated are Farsi, Dari, Arabic, Pashto, Mandarin, and Uzbeki.


      DARPA might as well say:

      The task goal is to produce a working two-way prototype from each of four teams by the end of 18-months. The languages that will be translated are English and Godless Terrorist.

    2. Re:Strikes me as fishy (pun unintended) by gwernol · · Score: 4, Insightful
      The languages that will be translated are Farsi, Dari, Arabic, Pashto, Mandarin, and Uzbeki.

      Does this set off alarm bells for anyone? Those are complicated languages, and I believe Mandarin in particular is EXTREMELY tonal (i.e., doesn't work well in speech recognition).


      It is an interesting choice of languages for two reasons
      • As you note these are difficult languages to tackle. However this is the defense advanced research project agency. Their mission is to push the edge of what is technically possible and encourage new research. You do this by picking hard problems that haven't yet been solved.

      • Look at the countries involved. Twenty years ago this list would have been headed by Russian. For better or worse, it very much reflects (IMHO) the countries currently posing a threat or potential threat to the US.
      --
      Sailing over the event horizon
    3. Re:Strikes me as fishy (pun unintended) by MythosTraecer · · Score: 4, Insightful

      The task goal is to produce a working two-way prototype from each of four teams by the end of 18-months. The languages that will be translated are English and Godless Terrorist.

      Incorrect, and unfair. Many of the "Northern Alliance" spoke Pashto and/or Dari (which is a dialect of Farsi). Uzbekistan let us use their military bases during the invasion of Afghanistan. And several of our allies, both real and on paper, speak Arabic.

      This is not a "English vs. Godless Terrorist" issue, as you say. The simple fact is there is a dearth of US military personnel that speak these languages, and we have an urgent need, now more than ever, to communicate with people who speak these languages. We do indeed have to spy on our enemies that speak in these tongues, but we also have to accurately share information and intelligence with our allies.

      --

      --Mythos
  6. Concerns... by NanoGator · · Score: 3, Interesting

    It's cool that they're working on this and all, but their promises of building these into PDAs set off a flag in my mind. There's another company that, as of a couple of years ago, had developed a realtime program that allows one to speak English into a mic and have spoken Japanese come out.

    I remember reading that they needed serious processing power and RAM to make this work. (At least 512 megs...) It seems like if one language takes up this amount of resources, then it'll be a while before we have a multi-lingual PDA...

    Maybe their technique is different? I dunno. I know it's not the same company.

    I guess I'm just concerned about this being vaporware.

    --
    "Derp de derp."
  7. The USMC model by r_j_prahad · · Score: 3, Funny

    So... now every USMC ground-pounder will be able to say "die, motherfucker, die" in 32 different languages?

    Awesome.

  8. Nice by Auckerman · · Score: 3, Interesting

    Initial impression: boy are they in a hurry. Very aggressive time table for this project. 6 Months to "Emergency DARPA", 18 Months to 3 functional prototypes.

    Then I saw what languages it will have: Arabic, Mandarin (the part of China that borders Pakistan and India is mainly Islamic), Pashto (Pakistan/Afghanistan), Dari (Iran/Afghanistan/etc.)

    Oh. What I want to know is what those 8 other languages are that they want to have the ability to add to it later?

    --

    Burn Hollywood Burn
  9. Re:Feedback loop? by cscx · · Score: 5, Funny

    Naw, that never works. Here's an example:

    English: Help, I caught my penis in a blender.

    English -> German: Helfen Sie, ich sich verfing meinen Penis in einer Mischmaschine.
    German -> English: Help, I got caught my Penis in a mixing machine.

    English -> Spanish: Ayude, yo cogió mi pene en un mezclador.
    Spanish -> English: I help, took my penis in a mixer.

    English -> Italian: Aiuti, io ha interferito il mio penis in un miscelatore.
    Italian -> English: Aids, I have interfered with mine penis in a mixer.

    English -> Portuguese: Ajude, mim travou meu penis em um blender.
    Portuguese -> English: It helps, me stopped my penis in blender.

    Compounding it doesn't help either:

    English -> German: Helfen Sie, ich sich verfing meinen Penis in einer Mischmaschine.
    German -> French: aidez moi verfing mes Penis dans un appareil de mélange.
    French -> Spanish: me ayúde me verfing mi Penis en un aparato de mezcla.
    Spanish -> English: me ayúde me verfing my Penis in a mixture apparatus.

  10. Give it to Hoshi by darkonc · · Score: 4, Insightful
    They should give William Shatner a beta model out of pure respect...

    I'd much rather see them give it to Linda Park (Hoshi Sato on 'Enterprise'). She's the one who really made the universal translators famous. On TOS, the concept was mostly ignored ("They always worked perfectly -- Yeah! That's the story!"). On Enterprise, she does the translating almost as often as the translator does.

    Besides, I'd much rather see her receiving the thing in a newscast than Shatner (she's cuter!).

    --
    Sometimes boldness is in fashion. Sometimes only the brave will be bold.
    1. Re:Give it to Hoshi by uebernewby · · Score: 5, Interesting

      True. Hoshi's fiddling with the universal translator really made me think about that piece of equipment we've been taking for granted in previous Star Treks.

      Seems my university syntax and phonology courses weren't *that* useless after all...

      The way I see it: suppose Chomsky's Universal Syntax turns out to be not innate to human brain structure, but to the very essence of communication. Meaning: if you're going to communicate something, all the forms you're going to be able to do it in will conform to a fairly basic set of ground rules and all the intricacies of natural languages are simply icing on the cake, as it were. If you figure out what that Universal Syntax is (sorry, I forgot the exact term he used - it's been a while, and my university education was in Dutch), you can feed that into a computer and teach it to reduce all phonemes from a given language to it. Then you can have the computer expand the basic message back into coherent communication in another language using the same basic rules.

      It's late. And when it's late, this is the kind of stupid stuff I think about.

      Oh, and I don't think Hoshi's *that* cute.

      --

      News and bla for computer musicians: http://lomechanik.net/
    2. Re:Give it to Hoshi by darkonc · · Score: 3, Informative
      Oh, and I don't think Hoshi's *that* cute.

      Compared to Shatner?? Are you crazy?

      Actually, if you watch closely, she hasn't quite got the bust of T'Pol, and she rarely gets the sexy scenes, but she's still quite nice... and far more attractive (in my mind) than Shatner.

      If you remember the scene in (one of) the first episodes where Hoshi, T'Pol and Tucker are resting in decon, I thought both of the women were pretty nice.

      --
      Sometimes boldness is in fashion. Sometimes only the brave will be bold.
  11. interesting logo, but bad. by Artifex · · Score: 3, Interesting

    The "Babylonian" reference may at first seem apt: the towers were built 'to the heavens' (well, pretty high) and a lack of communication and understanding among peoples led to their downfall.

    However, the underlying, unspoken subtext of a comparison between us and Babylon is that we displeased God. Remember, in the Bible at least (there's other versions in other histories/religions), God was displeased, and the language confusion among the peoples was caused in order to bring us down.

    What this logo basically tells the world (or at least those who have an understanding of the mythos) isn't that we're a great nation and better communication would have helped us - it's that we went against God, and this is how we paid.

    This sounds a lot like those right-wing extremists who tried to blame the attack on 'communists' and homosexuals in our country making God upset.

    Now, I feel, like many people do, that our country has done a great many things wrong: setting policy based on oil needs and not human rights, keeping some smaller countries' governments (including some democracies) destabilized in order to serve our own interests, etc. However, just as I don't think that we can claim "God is on our side," neither do I think anyone can claim that God isn't.

    This logo is offensive. That it shows the half-thought-out mentality of some of the people in charge at our governmental agencies should be a cause for alarm, not applause. We have been called Babylon by many people with grievances against us, and it seems our leaders are reveling in the name.

    --
    Get off my launchpad!
  12. N-Gram by rufusdufus · · Score: 5, Interesting

    For those who don't know, an N-gram is a data structure which encodes the statistics of word order in a language. These are used to greatly improve the accuracy of language pattern matchers such as speech recognition.
    A typical speech recognizer might use a 3-word N-gram (trigram), which keeps track of all the probable words that follow a given pair of words, and their likelihoods. The probabilities are calculated by running terabytes of English text (books, magazines, internet chat boards) through a word-counting program.
    Thus, "green eggs and" will get a very high probability for "ham", but low for "jam", so it can bias a sound that seems to match "mam" acoustically to the more likely linguistic match "ham".
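
    A rough sketch of the idea in Python, for the curious. Everything here is made up for illustration: the three-sentence corpus, the acoustic scores, the function names, and the add-one smoothing are assumptions, not how any real recognizer (or the DARPA project) does it. It just counts which word follows each pair of words, then multiplies that probability by a pretend acoustic score to pick a winner.

```python
from collections import Counter, defaultdict

def train_trigrams(sentences):
    """Count how often each word follows each two-word context."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for w1, w2, w3 in zip(words, words[1:], words[2:]):
            counts[(w1, w2)][w3] += 1
    return counts

def next_word_prob(counts, context, word, vocab_size, alpha=1.0):
    """P(word | context), with add-alpha smoothing so unseen words keep a small probability."""
    followers = counts.get(context, Counter())
    total = sum(followers.values())
    return (followers[word] + alpha) / (total + alpha * vocab_size)

def rescore(counts, context, acoustic_scores, vocab_size):
    """Pick the candidate with the best (acoustic score * language-model probability)."""
    return max(
        acoustic_scores,
        key=lambda w: acoustic_scores[w] * next_word_prob(counts, context, w, vocab_size),
    )

# Toy corpus (hypothetical); a real system would count over a huge body of text.
corpus = [
    "I do not like green eggs and ham",
    "would you like green eggs and ham",
    "toast with butter and jam",
]
counts = train_trigrams(corpus)
vocab = {w for s in corpus for w in s.lower().split()}

# Pretend acoustic scores: the sound matches "mam" best on acoustics alone.
acoustic = {"mam": 0.5, "ham": 0.3, "jam": 0.2}
best = rescore(counts, ("eggs", "and"), acoustic, len(vocab))
print(best)
```

    With these toy numbers the word statistics outweigh the acoustic guess, so the context "eggs and" pulls the result away from "mam". A real recognizer would work over whole sentences rather than one word at a time, but the biasing step is the same in spirit.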

  13. Flexible? Basic Research? DARPA? by upper · · Score: 5, Informative
    Yeah, wasting an exorbitant amount of tax dollars, sure. Like the internet.

    Yes, DARPA had one really great hit -- about 34 years ago.

    Be cynical as you want, but DARPA is the one government agency which is really flexible and has a vision. With the rise of corporate dependency on innovation, even in the academic world, DARPA is one of the last bastions of basic research.

    I can be awfully cynical about DARPA. My former employer's bread and butter was DARPA research. Which is to say that our primary products were proposals and billable hours. Many of those billable hours were spent documenting our activities -- presentations, review meetings, progress reports, final reports. Sometimes we had time for actual research, the direction of which changed with the whims of the DARPA program manager and was at best loosely correlated with the work proposed in the proposal. I'm not accusing my former employer of wrongdoing; that's the flavor of pointy hair induced by DARPA policies.

    By the way, DARPA doesn't do basic research. In basic research, most of which is still done in universities, you give lip service to a vague area of applications, but the real goal is understanding. DARPA's research goals are always applied -- i.e. the goal is always to produce something useful, not simply to understand the world. But it's "early R&D", farther from being applicable than most R&D, and too much of a long shot for most R&D organizations. The rule of thumb is that if nobody else in the Dept. of Defense thinks they know how to solve the problem, DARPA works on it. (This translator work seems to be an exception).

    So most of DARPA's work is in the gap between basic research and typical R&D. Ideas seem to get stuck in this gap for decades, which is why DARPA was created. But there's been too much pressure for short-term results for too long, so the agency is badly broken.

  14. It's the end of the world as we know it by Darth_brooks · · Score: 5, Funny

    Doesn't anyone remember the addendum to the babelfish?


    ''Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation.''


    we're doomed. I'm taking names for a bus to mars.

    --
    There are some people that if they don't know, you can't tell 'em.
  15. Re:umm the logo by lostchicken · · Score: 5, Insightful

    I guess it's the most sickening yet use of the "terrorist" catch-word for getting public support.

    This is quite offensive.

    --
    -twb
  16. Re:mmm...... by AntiNorm · · Score: 3, Interesting

    or maybe haXor to newbie

    Google already does haXor, so maybe this isn't so far off.

    --

    I pledge allegiance to the flag...
    of the Corporate States of America...
  17. Translation, slang and learning the language by theolein · · Score: 5, Insightful

    I am impressed with the attempt to try to get a two way translator packed into a little box, but I don't think it's going to be much of a success. I gather the sudden need for computational translation is because the military simply has too few people who speak the languages of the areas that they cover. I also assume that this is in direct relation to the FBI/CIA etc requesting Pashto and Arabic speakers to come forward and help them after 9/11 last year and the difficulties in understanding a lot of the folk in Afghanistan who speak three major different languages (Pashto, Dari and Uzbek) with a whole bunch of dialects.

    Sadly I think that it will be a waste of time. I speak six languages and at least one of them, Swiss-German, is not even a written language, and here in Switzerland there are about three major dialects of the language, some of which are not 100% mutually intelligible, and this in a Swiss-German population of about 5 million. I think that this system will run into the same sort of problems with languages like Arabic, which has enormous dialectal variation, say, from Algeria to Syria, and people from the various areas often cannot understand one another well. No one speaks the classical Arabic of the Quran in day to day language use.

    My guess is that the Military/CIA etc would be better advised to simply get people to learn the languages and to train others in using day to day expressions. This would have, amongst other things, the positive side effect that soldiers (some of them at least) would be better able to understand the culture and the situation of the local people where they are stationed. Not only this but people in all the countries I've lived in have reacted much, much better to me when I've tried to learn their language instead of being the usual culturally ignorant Anglo Tourist who expects everyone to speak English. I would argue that the general western ignorance (especially amongst English speakers) is one of the causes of the perceived arrogance seen by many third worlders. Another positive effect of learning the languages would be that there would be someone who would understand slang, as I think there's nothing like a bit of slang to throw off any translation software.