DARPA Project Babylon: Universal Translator
silance writes "Take a look at this project from DARPA (Defense Advanced Research Projects Agency)! This time the boys are trying to hammer out a portable, two-way, real-time, multi-lingual audible speech translator proposed to be run on everything from PDA's to wearable military hardware to workstations (to replace their PRE-EXISTING ONE-WAY real-time hand-held audible translators, of course!). The site contains descriptions of technical approaches, a technical milestones timeline, and a nifty Power Point presentation for the executive-types ;) They should give William Shatner a beta model out of pure respect...
Here's a link to Google's cached HTML version of the Power Point presentation just in case. (P.S. - get a load of that logo at the bottom of the page!)"
The next big thing, I think, would be a "smart" translator that can do pattern recognition and "learn" as it gets more of the language. IIRC, this is how the Star Trek translators work.
Kind of like the difference between pattern checking and anomaly detection in virus scanners.
Well, someone had to say it.
Visit me on #weirdness on the Galaxynet.
Yeah, wasting an exorbitant amount of tax dollars, sure. Like the internet.
Be cynical as you want, but DARPA is the one government agency which is really flexible and has a vision. With the rise of corporate dependency on innovation, even in the academic world, DARPA is one of the last bastions of basic research. Get with it.
(emphasis added)
Does this set off alarm bells for anyone? Those are complicated languages, and I believe Mandarin in particular is EXTREMELY tonal (i.e., doesn't work well in speech recognition).
Look, just imagine what you get out of Babelfish. Now take it a few levels up, to speech. Does this proposal in any way sound achievable? (again, pun unintended)
Sig: What Happened To The Censorware Project (censorware.org)
It's cool that they're working on this and all, but their promises of building these into PDAs set off a flag in my mind. There's another company that, as of a couple of years ago, had developed a realtime program that allows one to speak English into a mic and have spoken Japanese come out.
I remember reading that they needed serious processing power and RAM to make this work. (At least 512 megs...) It seems like if one language takes up this amount of resources, then it'll be a while before we have a multi-lingual PDA...
Maybe their technique is different? I dunno. I know it's not the same company.
I guess I'm just concerned about this being vaporware.
"Derp de derp."
So... now every USMC ground-pounder will be able to say "die, motherfucker, die" in 32 different languages?
Awesome.
Initial impression: boy are they in a hurry. Very aggressive time table for this project. 6 Months to "Emergency DARPA", 18 Months to 3 functional prototypes.
Then I saw what languages it will have: Arabic, Mandarin (the part of China that borders Pakistan and India is mainly Islamic), Pashto (Pakistan/Afghanistan), Dari (Iran/Afghanistan/etc.)
Oh. What I want to know is what those 8 other languages are that they want to have the ability to add to it later?
Burn Hollywood Burn
Naw, that never works. Here's an example:
English: Help, I caught my penis in a blender.
English -> German: Helfen Sie, ich sich verfing meinen Penis in einer Mischmaschine.
German -> English: Help, I got caught my Penis in a mixing machine.
English -> Spanish: Ayude, yo cogió mi pene en un mezclador.
Spanish -> English: I help, took my penis in a mixer.
English -> Italian: Aiuti, io ha interferito il mio penis in un miscelatore.
Italian -> English: Aids, I have interfered with mine penis in a mixer.
English -> Portuguese: Ajude, mim travou meu penis em um blender.
Portuguese -> English: It helps, me stopped my penis in blender.
Compounding it doesn't help either:
English -> German: Helfen Sie, ich sich verfing meinen Penis in einer Mischmaschine.
German -> French: aidez moi verfing mes Penis dans un appareil de mélange.
French -> Spanish: me ayúde me verfing mi Penis en un aparato de mezcla.
Spanish -> English: me ayúde me verfing my Penis in a mixture apparatus.
I'd much rather see them give it to Linda Park (Hoshi Sato on 'Enterprise'). She's the one who really made the universal translators famous. On TOS, the concept was mostly ignored ("They always worked perfectly -- Yeah! That's the story!"). On Enterprise, she does the translating almost as often as the translator does.
Besides, I'd much rather see her receiving the thing in a newscast than Shatner (she's cuter!).
Sometimes boldness is in fashion. Sometimes only the brave will be bold.
The "Babylonian" reference may at first seem apt: the towers were built 'to the heavens' (well, pretty high) and a lack of communication and understanding among peoples led to their downfall.
However, the underlying, unspoken subtext of a comparison between us and Babylon is that we displeased God. Remember, in the Bible at least (there's other versions in other histories/religions), God was displeased, and the language confusion among the peoples was caused in order to bring us down.
What this logo basically tells the world (or at least those who have an understanding of the mythos) isn't that we're a great nation and better communication would have helped us - it's that we went against God, and this is how we paid.
This sounds a lot like those right-wing extremists who tried to blame the attack on 'communists' and homosexuals in our country making God upset.
Now, I feel, like many people do, that our country has done a great many things wrong: setting policy based on oil needs and not human rights, keeping some smaller countries' governments (including some democracies) destabilized in order to serve our own interests, etc. However, just as I don't think that we can claim "God is on our side," neither do I think anyone can claim that God isn't.
This logo is offensive. That it shows the half-thought-out mentality of some of the people in charge at our governmental agencies should be a cause for alarm, not applause. We have been called Babylon by many people with grievances against us, and it seems our leaders are reveling in the name.
Get off my launchpad!
For those who don't know, an N-gram is a data structure which encodes the statistics of word order in a language. These are used to greatly improve the accuracy of language pattern matchers such as speech recognizers.
A typical speech recognizer might use a 3-word N-gram (tri-gram), which, for each pair of preceding words, keeps track of the probable words that follow and their likelihoods. The probabilities are calculated by running terabytes of English text (books, magazines, internet chat boards) through a word counting program.
Thus, "green eggs and" will get a very high probability for "ham", but low for "jam", so it can bias a sound that seems to match "mam" acoustically to the more likely linguistic match "ham".
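For anyone who wants to poke at the idea, here is a rough Python sketch of the tri-gram trick described above. It is only an illustration under made-up assumptions: the toy corpus, the acoustic scores, the small smoothing floor, and the function names (train_trigrams, next_word_probability, pick_word) are all hypothetical, and a real recognizer would train on far more text and combine the scores in a more careful way.

```python
from collections import Counter, defaultdict


def train_trigrams(text):
    """Count how often each word follows each pair of preceding words."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        counts[(w1, w2)][w3] += 1
    return counts


def next_word_probability(counts, context, word):
    """Estimate P(word | two preceding words) from the raw counts."""
    followers = counts.get(tuple(context), Counter())
    total = sum(followers.values())
    return followers[word] / total if total else 0.0


def pick_word(counts, context, acoustic_scores, floor=1e-6):
    """Pick the candidate with the best acoustic score times tri-gram probability.

    The floor (an assumption of this sketch) keeps unseen words from being
    scored as exactly zero.
    """
    def combined(word):
        language_score = next_word_probability(counts, context, word) + floor
        return acoustic_scores[word] * language_score

    return max(acoustic_scores, key=combined)


# Toy corpus standing in for the terabytes of text a real system would use.
corpus = "i do not like green eggs and ham . i like toast and jam . green eggs and ham"
counts = train_trigrams(corpus)

# Hypothetical acoustic scores: the sound matches "mam" best on its own.
acoustic = {"mam": 0.5, "ham": 0.3, "jam": 0.2}
best = pick_word(counts, ("eggs", "and"), acoustic)
print(best)
```

In this toy setup the word-order statistics outweigh the acoustic preference for "mam", because "mam" never follows "eggs and" in the training text.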
Yes, DARPA had one really great hit -- about 34 years ago.
Be cynical as you want, but DARPA is the one government agency which is really flexible and has a vision. With the rise of corporate dependency on innovation, even in the academic world, DARPA is one of the last bastions of basic research.
I can be awfully cynical about DARPA. My former employer's bread and butter was DARPA research. Which is to say that our primary products were proposals and billable hours. Many of those billable hours were spent documenting our activities -- presentations, review meetings, progress reports, final reports. Sometimes we had time for actual research, the direction of which changed with the whims of the DARPA program manager and was at best loosely correlated with the work proposed in the proposal. I'm not accusing my former employer of wrongdoing; that's the flavor of pointy hair induced by DARPA policies.
By the way, DARPA doesn't do basic research. In basic research, most of which is still done in universities, you give lip service to a vague area of applications, but the real goal is understanding. DARPA's research goals are always applied -- i.e. the goal is always to produce something useful, not simply to understand the world. But it's "early R&D", farther from being applicable than most R&D, and too much of a long shot for most R&D organizations. The rule of thumb is that if nobody else in the Dept. of Defense thinks they know how to solve the problem, DARPA works on it. (This translator work seems to be an exception).
So most of DARPA's work is in the gap between basic research and typical R&D. Ideas seem to get stuck in this gap for decades, which is why DARPA was created. But there's been too much pressure for short-term results for too long, so the agency is badly broken.
Doesn't anyone remember the addendum to the babelfish?
''Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation.''
We're doomed. I'm taking names for a bus to Mars.
There are some people that if they don't know, you can't tell 'em.
I guess it's the most sickening use yet of the "terrorist" catch-word for getting public support.
This is quite offensive.
-twb
or maybe haXor to newbie
Google already does haXor, so maybe this isn't so far off.
I pledge allegiance to the flag...
of the Corporate States of America...
I am impressed with the attempt to try to get a two way translator packed into a little box, but I don't think it's going to be much of a success. I gather the sudden need for computational translation is because the military simply has too few people who speak the languages of the areas that they cover. I also assume that this is in direct relation to the FBI/CIA etc requesting Pashto and Arabic speakers to come forward and help them after 9/11 last year and the difficulties in understanding a lot of the folk in Afghanistan who speak three major different languages (Pashto, Dari and Uzbek) with a whole bunch of dialects.
Sadly I think that it will be a waste of time. I speak six languages and at least one of them, Swiss-German, is not even a written language. Here in Switzerland there are about three major dialects of the language, some of which are not 100% mutually intelligible, and this in a Swiss-German population of about 5 million. I think that this system will run into the same sort of problems with languages like Arabic, which has enormous dialectal variation, say from Algeria to Syria; people from the various areas often cannot understand one another well. No one speaks the classical Arabic of the Quran in day-to-day use.
My guess is that the Military/CIA etc. would be better advised to simply get people to learn the languages and to train others in using day-to-day expressions. This would have, amongst other things, the positive side effect that soldiers (some of them at least) would be better able to understand the culture and the situation of the local people where they are stationed. Not only this, but people in all the countries I've lived in have reacted much, much better to me when I've tried to learn their language instead of being the usual culturally ignorant Anglo Tourist who expects everyone to speak English. I would argue that the general western ignorance (especially amongst English speakers) is one of the causes of the arrogance perceived by many third worlders. Another positive effect of learning the languages would be that there would be someone who would understand slang, as I think there's nothing like a bit of slang to throw off any translation software.