DARPA Project Babylon: Universal Translator
silance writes "Take a look at this project from DARPA (Defense Advanced Research Projects Agency)! This time the boys are trying to hammer out a portable, two-way, real-time, multi-lingual audible speech translator proposed to be run on everything from PDAs to wearable military hardware to workstations (to replace their PRE-EXISTING ONE-WAY real-time hand-held audible translators, of course!). The site contains descriptions of technical approaches, a technical milestones timeline, and a nifty PowerPoint presentation for the executive-types ;) They should give William Shatner a beta model out of pure respect...
Here's a link to Google's cached HTML version of the PowerPoint presentation just in case. (P.S. - get a load of that logo at the bottom of the page!)"
Yeah, wasting an exorbitant amount of tax dollars, sure. Like the internet.
Be cynical as you want, but DARPA is the one government agency which is really flexible and has a vision. With the rise of corporate dependency on innovation, even in the academic world, DARPA is one of the last bastions of basic research. Get with it.
Naw, that never works. Here's an example:
English: Help, I caught my penis in a blender.
English -> German: Helfen Sie, ich sich verfing meinen Penis in einer Mischmaschine.
German -> English: Help, I got caught my Penis in a mixing machine.
English -> Spanish: Ayude, yo cogió mi pene en un mezclador.
Spanish -> English: I help, took my penis in a mixer.
English -> Italian: Aiuti, io ha interferito il mio penis in un miscelatore.
Italian -> English: Aids, I have interfered with mine penis in a mixer.
English -> Portuguese: Ajude, mim travou meu penis em um blender.
Portuguese -> English: It helps, me stopped my penis in blender.
Compounding it doesn't help either:
English -> German: Helfen Sie, ich sich verfing meinen Penis in einer Mischmaschine.
German -> French: aidez moi verfing mes Penis dans un appareil de mélange.
French -> Spanish: me ayúde me verfing mi Penis en un aparato de mezcla.
Spanish -> English: me ayúde me verfing my Penis in a mixture apparatus.
"Instead of using the term 'kicks ass' (which will translate as abusing a donkey...), use the term 'defeat'."
Which will translate as "I am going to chop off both of your feet."
True. Hoshi's fiddling with the universal translator really made me think about that piece of equipment we've been taking for granted in previous Star Treks.
Seems my university syntax and phonology courses weren't *that* useless after all...
The way I see it: suppose Chomsky's Universal Syntax turns out to be innate not to the structure of the human brain, but to the very essence of communication. Meaning: if you're going to communicate something, all the forms you're going to be able to do it in will conform to a fairly basic set of ground rules, and all the intricacies of natural languages are simply icing on the cake, as it were. If you figure out what that Universal Syntax is (sorry, I forgot the exact term he used - it's been a while, and my university education was in Dutch), you can feed that into a computer and teach it to reduce all phonemes from a given language to it. Then you can have the computer expand the basic message back into coherent communication in another language using the same basic rules.
It's late. And when it's late, this is the kind of stupid stuff I think about.
Oh, and I don't think Hoshi's *that* cute.
News and bla for computer musicians: http://lomechanik.net/
For those who don't know, an N-gram model is a data structure that encodes the statistics of word order in a language. These are used to greatly improve the accuracy of language pattern matchers such as speech recognizers.
A typical speech recognizer might use a 3-word N-gram (trigram), which keeps track of which words are likely to follow a given pair of words, and their likelihoods. The probabilities are calculated by running terabytes of English text (books, magazines, internet chat boards) through a word-counting program.
Thus, "green eggs and" will get a very high probability for "ham", but low for "jam", so it can bias a sound that seems to match "mam" acoustically to the more likely linguistic match "ham".
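To make that concrete, here is a minimal sketch of the idea in Python. It is not how any particular recognizer does it: the toy corpus, the acoustic scores, the function names, and the add-one smoothing are all assumptions made up for illustration. A trigram model predicts the third word from the previous two, so the context for "green eggs and ___" is the pair ("eggs", "and").

```python
from collections import Counter, defaultdict

def train_trigrams(text):
    """Count how often each word follows each two-word context."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        counts[(w1, w2)][w3] += 1
    return counts

def next_word_prob(counts, context, word, vocab_size, alpha=1.0):
    """P(word | context) with add-alpha smoothing, so unseen words stay above zero."""
    followers = counts.get(context, Counter())
    total = sum(followers.values())
    return (followers[word] + alpha) / (total + alpha * vocab_size)

def rescore(counts, context, acoustic_scores, vocab_size):
    """Weight each acoustic score by the trigram probability and pick the best word."""
    combined = {
        word: score * next_word_prob(counts, context, word, vocab_size)
        for word, score in acoustic_scores.items()
    }
    return max(combined, key=combined.get)

# Toy corpus standing in for the huge text collection a real system would use.
corpus = (
    "i do not like green eggs and ham "
    "i do not like them sam i am "
    "would you like green eggs and ham "
    "bread and jam is nice "
    "green eggs and ham are fine"
)
counts = train_trigrams(corpus)
vocab_size = len(set(corpus.split()))

# Hypothetical acoustic scores: on sound alone, "mam" is the best match.
acoustic = {"mam": 0.5, "ham": 0.3, "jam": 0.2}
best = rescore(counts, ("eggs", "and"), acoustic, vocab_size)
print(best)
```

With these made-up numbers the language model outweighs the acoustic preference for "mam", and the recognizer settles on "ham". A real system would work with log-probabilities and a far better smoothing scheme, but the biasing step is the same in spirit.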
Yes, DARPA had one really great hit -- about 34 years ago.
Be cynical as you want, but DARPA is the one government agency which is really flexible and has a vision. With the rise of corporate dependency on innovation, even in the academic world, DARPA is one of the last bastions of basic research.
I can be awfully cynical about DARPA. My former employer's bread and butter was DARPA research. Which is to say that our primary products were proposals and billable hours. Many of those billable hours were spent documenting our activities -- presentations, review meetings, progress reports, final reports. Sometimes we had time for actual research, the direction of which changed with the whims of the DARPA program manager and was at best loosely correlated with the work proposed in the proposal. I'm not accusing my former employer of wrongdoing; that's the flavor of pointy hair induced by DARPA policies.
By the way, DARPA doesn't do basic research. In basic research, most of which is still done in universities, you give lip service to a vague area of applications, but the real goal is understanding. DARPA's research goals are always applied -- i.e. the goal is always to produce something useful, not simply to understand the world. But it's "early R&D", farther from being applicable than most R&D, and too much of a long shot for most R&D organizations. The rule of thumb is that if nobody else in the Dept. of Defense thinks they know how to solve the problem, DARPA works on it. (This translator work seems to be an exception).
So most of DARPA's work is in the gap between basic research and typical R&D. Ideas seem to get stuck in this gap for decades, which is why DARPA was created. But there's been too much pressure for short-term results for too long, so the agency is badly broken.
Doesn't anyone remember the addendum to the babelfish?
''Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation.''
We're doomed. I'm taking names for a bus to Mars.
There are some people that if they don't know, you can't tell 'em.
I guess it's the most sickening use yet of the "terrorist" catch-word for getting public support.
This is quite offensive.
-twb
I am impressed with the attempt to get a two-way translator packed into a little box, but I don't think it's going to be much of a success. I gather the sudden need for computational translation is because the military simply has too few people who speak the languages of the areas that they cover. I also assume that this is in direct relation to the FBI/CIA etc. requesting Pashto and Arabic speakers to come forward and help them after 9/11 last year, and to the difficulties in understanding a lot of the folk in Afghanistan, who speak three different major languages (Pashto, Dari and Uzbek) with a whole bunch of dialects.
Sadly I think that it will be a waste of time. I speak six languages and at least one of them, Swiss-German, is not even a written language. Here in Switzerland there are about three major dialects of the language, some of which are not 100% mutually intelligible, and this in a Swiss-German population of about 5 million. I think that this system will run into the same sort of problems with languages like Arabic, which has enormous dialectal variation, say, from Algeria to Syria, and people from the various areas often cannot understand one another well. No one speaks the classical Arabic of the Quran in day-to-day use.
My guess is that the military/CIA etc. would be better advised to simply get people to learn the languages and to train others in using day-to-day expressions. This would have, amongst other things, the positive side effect that soldiers (some of them at least) would be better able to understand the culture and the situation of the local people where they are stationed. Not only this, but people in all the countries I've lived in have reacted much, much better to me when I've tried to learn their language instead of being the usual culturally ignorant Anglo tourist who expects everyone to speak English. I would argue that the general western ignorance (especially amongst English speakers) is one of the causes of the perceived arrogance seen by many third worlders. Another positive effect of learning the languages would be that there would be someone who would understand slang, as I think there's nothing like a bit of slang to throw off any translation software.