DARPA Project Babylon: Universal Translator
silance writes "Take a look at this project from DARPA (Defense Advanced Research Projects Agency)! This time the boys are trying to hammer out a portable, two-way, real-time, multi-lingual audible speech translator proposed to be run on everything from PDAs to wearable military hardware to workstations (to replace their PRE-EXISTING ONE-WAY real-time hand-held audible translators, of course!). The site contains descriptions of technical approaches, a technical milestones timeline, and a nifty PowerPoint presentation for the executive-types ;) They should give William Shatner a beta model out of pure respect...
Here's a link to Google's cached HTML version of the PowerPoint presentation just in case. (P.S. - get a load of that logo at the bottom of the page!)"
The next big thing, I think, would be a "smart" translator that can do pattern recognition and "learn" as it hears more of the language. IIRC, this is how the Star Trek translators work.
Kind of like the difference between pattern checking and anomaly detection in virus scanners.
Well, someone had to say it.
Visit me on #weirdness on the Galaxynet.
Yeah, wasting an exorbitant amount of tax dollars, sure. Like the internet.
Be cynical as you want, but DARPA is the one government agency which is really flexible and has a vision. With the rise of corporate dependency on innovation, even in the academic world, DARPA is one of the last bastions of basic research. Get with it.
Naw, that never works. Here's an example:
English: Help, I caught my penis in a blender.
English -> German: Helfen Sie, ich sich verfing meinen Penis in einer Mischmaschine.
German -> English: Help, I got caught my Penis in a mixing machine.
English -> Spanish: Ayude, yo cogió mi pene en un mezclador.
Spanish -> English: I help, took my penis in a mixer.
English -> Italian: Aiuti, io ha interferito il mio penis in un miscelatore.
Italian -> English: Aids, I have interfered with mine penis in a mixer.
English -> Portuguese: Ajude, mim travou meu penis em um blender.
Portuguese -> English: It helps, me stopped my penis in blender.
Compounding it doesn't help either:
English -> German: Helfen Sie, ich sich verfing meinen Penis in einer Mischmaschine.
German -> French: aidez moi verfing mes Penis dans un appareil de mélange.
French -> Spanish: me ayúde me verfing mi Penis en un aparato de mezcla.
Spanish -> English: me ayúde me verfing my Penis in a mixture apparatus.
The task goal is to produce ten working two-way prototypes from each of four teams by the end of 18-months. The languages that will be translated are Farsi, Dari, Arabic, Pashto, Mandarin, and Uzbeki.
DARPA might as well say:
The task goal is to produce a working two-way prototype from each of four teams by the end of 18-months. The languages that will be translated are English and Godless Terrorist.
cpeterso
Does this set off alarm bells for anyone? Those are complicated languages, and I believe Mandarin in particular is EXTREMELY tonal (i.e., doesn't work well in speech recognition).
It is an interesting choice of languages for two reasons
Sailing over the event horizon
I'd much rather see them give it to Linda Park (Hoshi Sato on 'Enterprise'). She's the one who really made the universal translators famous. On TOS, the concept was mostly ignored ("They always worked perfectly -- Yeah! That's the story!"). On Enterprise, she does the translating almost as often as the translator does.
Besides, I'd much rather see her receiving the thing in a newscast than Shatner (she's cuter!).
Sometimes boldness is in fashion. Sometimes only the brave will be bold.
For those who don't know, an N-gram model is a data structure which encodes the statistics of word order in a language. These models are used to greatly improve the accuracy of language pattern matchers such as speech recognizers.
A typical speech recognizer might use a 3-word N-gram (trigram), which, for each pair of preceding words, keeps track of the words likely to follow and their likelihoods. The probabilities are calculated by running terabytes of English text (books, magazines, internet chat boards) through a word-counting program.
Thus, "green eggs and" will get a very high probability for "ham", but a low one for "jam", so the recognizer can bias a sound that seems to match "mam" acoustically toward the more likely linguistic match, "ham".
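To make the "green eggs and" example concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the tiny corpus, the function names (train_trigrams, next_word_prob, rescore), the add-one smoothing, and the made-up acoustic scores. A real recognizer trains on vastly more text and combines the two scores in a more careful way.

```python
from collections import Counter, defaultdict

def train_trigrams(text):
    """Count, for every pair of consecutive words, which words follow it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        counts[(w1, w2)][w3] += 1
    return counts

def next_word_prob(counts, w1, w2, candidate, vocab_size, alpha=1.0):
    """Estimate P(candidate | w1 w2) with add-alpha smoothing (an assumption),
    so words never seen after this pair still get a small nonzero probability."""
    followers = counts.get((w1, w2), Counter())
    total = sum(followers.values())
    return (followers[candidate] + alpha) / (total + alpha * vocab_size)

def rescore(counts, w1, w2, acoustic_scores, vocab_size):
    """Pick the candidate with the best acoustic score times trigram probability."""
    return max(
        acoustic_scores,
        key=lambda w: acoustic_scores[w] * next_word_prob(counts, w1, w2, w, vocab_size),
    )

# Toy stand-in for the "terabytes of English text".
corpus = (
    "i do not like green eggs and ham "
    "i do not like them sam i am "
    "would you like green eggs and ham "
    "toast and jam"
)
counts = train_trigrams(corpus)
vocab_size = len(set(corpus.split()))

# Hypothetical acoustic scores: the sound matched "mam" best on its own.
acoustic = {"mam": 0.5, "ham": 0.3, "jam": 0.2}
best = rescore(counts, "eggs", "and", acoustic, vocab_size)
print(best)
```

On its own the acoustic match favors "mam", but since "ham" is the only word the toy corpus ever shows after "eggs and", the combined score tips toward "ham".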
Yes, DARPA had one really great hit -- about 34 years ago.
Be cynical as you want, but DARPA is the one government agency which is really flexible and has a vision. With the rise of corporate dependency on innovation, even in the academic world, DARPA is one of the last bastions of basic research.
I can be awfully cynical about DARPA. My former employer's bread and butter was DARPA research. Which is to say that our primary products were proposals and billable hours. Many of those billable hours were spent documenting our activities -- presentations, review meetings, progress reports, final reports. Sometimes we had time for actual research, the direction of which changed with the whims of the DARPA program manager and was at best loosely correlated with the work proposed in the proposal. I'm not accusing my former employer of wrongdoing; that's the flavor of pointy hair induced by DARPA policies.
By the way, DARPA doesn't do basic research. In basic research, most of which is still done in universities, you give lip service to a vague area of applications, but the real goal is understanding. DARPA's research goals are always applied -- i.e. the goal is always to produce something useful, not simply to understand the world. But it's "early R&D", farther from being applicable than most R&D, and too much of a long shot for most R&D organizations. The rule of thumb is that if nobody else in the Dept. of Defense thinks they know how to solve the problem, DARPA works on it. (This translator work seems to be an exception.)
So most of DARPA's work is in the gap between basic research and typical R&D. Ideas seem to get stuck in this gap for decades, which is why DARPA was created. But there's been too much pressure for short-term results for too long, so the agency is badly broken.
Doesn't anyone remember the addendum to the babelfish?
''Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation.''
We're doomed. I'm taking names for a bus to Mars.
There are some people that if they don't know, you can't tell 'em.
I guess it's the most sickening use yet of the "terrorist" catch-word for getting public support.
This is quite offensive.
-twb
I am impressed with the attempt to try to get a two way translator packed into a little box, but I don't think it's going to be much of a success. I gather the sudden need for computational translation is because the military simply has too few people who speak the languages of the areas that they cover. I also assume that this is in direct relation to the FBI/CIA etc requesting Pashto and Arabic speakers to come forward and help them after 9/11 last year and the difficulties in understanding a lot of the folk in Afghanistan who speak three major different languages (Pashto, Dari and Uzbek) with a whole bunch of dialects.
Sadly I think that it will be a waste of time. I speak six languages and at least one of them, Swiss-German, is not even a written language. Here in Switzerland there are about three major dialects of the language, some of which are not 100% mutually intelligible, and this in a Swiss-German population of about 5 million. I think that this system will run into the same sort of problems with languages like Arabic, which has enormous dialectal variation from, say, Algeria to Syria; people from the various areas often cannot understand one another well. No one speaks the classical Arabic of the Quran in day-to-day use.
My guess is that the Military/CIA etc. would be better advised to simply get people to learn the languages and to train others in using day-to-day expressions. This would have, amongst other things, the positive side effect that soldiers (some of them at least) would be better able to understand the culture and the situation of the local people where they are stationed. Not only that, but people in all the countries I've lived in have reacted much, much better to me when I've tried to learn their language instead of being the usual culturally ignorant Anglo Tourist who expects everyone to speak English. I would argue that the general western ignorance (especially amongst English speakers) is one of the causes of the arrogance perceived by many third worlders. Another positive effect of learning the languages would be that there would be someone who understands slang, as I think there's nothing like a bit of slang to throw off any translation software.
The task goal is to produce a working two-way prototype from each of four teams by the end of 18-months. The languages that will be translated are English and Godless Terrorist.
Incorrect, and unfair. Many of the "Northern Alliance" spoke Pashto and/or Dari (which is a dialect of Farsi). Uzbekistan let us use their military bases during the invasion of Afghanistan. And several of our allies, both real and on paper, speak Arabic.
This is not an "English vs. Godless Terrorist" issue, as you say. The simple fact is there is a dearth of US military personnel who speak these languages, and we have an urgent need, now more than ever, to communicate with people who speak them. We do indeed have to spy on our enemies who speak in these tongues, but we also have to accurately share information and intelligence with our allies.
--Mythos