Speaking in Tongues
Desert1 writes "Carnegie Mellon's renowned computer science department has developed a system which allows for conversation between two different languages called Tongues. Currently this has been used between Croatian and English, perhaps one day they will be able to develop one that will allow politicians to talk to normal folks and be understood." It's been in development for a while.
...is handwriting recognition that can handle Doctor's handwriting.
until it can allow h@x0r5 and non-"l33t"s to communicate?
I never spellcheck and I freely admit it. Save your karma for more worthwhile "lol erorrs" replies
Then again, you need to understand Holyspiritish before you can write the translator.
http://pcblues.com - Digits and Wood
Looks like a fascinating project --- I wonder if their Vision and Robotics boys are working on recognizing sign language which, for all intents and purposes, seems to be a very much more difficult problem (don't believe me --- see how well the facial recognition packages do in production environments :-P). I wonder if this is at the stage where it could be attached to something like a virtual {insert sign language of your choice here} "translator"... hrm, sounds like a summer project ;-)
I've long wondered why someone doesn't just brute force translation.
Create a human translated database of damn near EVERYTHING in two languages, like English and Spanish. Then, just do fast lookups.
Computing power is such that this would be possible.
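For what it's worth, the lookup half of this idea is easy to sketch in Python. Everything below is made up for illustration: the table name, the function names, and the three entries standing in for the "damn near EVERYTHING" database, which is the hard part.

```python
from typing import Optional

# Hypothetical toy database: each entry is assumed to be a human translation.
TRANSLATION_MEMORY = {
    "good morning": "buenos días",
    "thank you": "gracias",
    "where is the station?": "¿dónde está la estación?",
}


def normalize(sentence: str) -> str:
    """Lowercase and collapse whitespace so trivially different inputs match."""
    return " ".join(sentence.lower().split())


def lookup_translate(sentence: str) -> Optional[str]:
    """Return the stored translation, or None if the sentence was never seen."""
    return TRANSLATION_MEMORY.get(normalize(sentence))
```

The lookup itself is a single hash probe, so speed is not the issue. The catch is coverage: any sentence that is not already in the table comes back as None.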
Learning HOW to think is more important than learning WHAT to think.
If you aren't satisfied with the PC Magazine summary, you can read this
In the pre-computer days, some folks noticed that a neophyte (basic idea, needs dictionary) translation into Esperanto was much better comprehended at the other end than a neophyte translation to the destination language or a neophyte translation by the recipient.
The reasoning was that the process of translating into a more formal mechanical language clarified and codified ideas.
Once again, it's the dividing line between human and machine that's the problem. Millions of people train themselves to C or the shells. Fewer to assembly. But it takes some wetware work to push the human/computer boundary closer to the computer.
Just as most programming has a learning curve, usually gentler than ASM's, leaving language translation completely to the machine will be fraught and ambiguous. Good translation requires some push away from normal speech, but maybe not so far as mastering every other possible language...
So, basically, it's a lookup function, translating the incoming speech and then comparing in a database... So, while they could have a huge dictionary that could cover most situations, they aren't really doing a 'translation' per se...
Although, then again, for anyone who has taken language classes but is not fluent in the second language, isn't that what we do? I know that while I was taking French and Latin, to come up with phrases I would do phrase translations because I was still thinking in English. I wasn't fluent enough to think in those other languages, so I couldn't properly formulate phrases directly.
I suppose, in essence, this will work as a translator, but it is neither a babel-fish type universal translator nor is it any replacement for fluency.
Still cool, though. Now, can they get it to run on a Palm?
-T
Oh great, just what we need: a machine/program that makes it easier for us to snow crash. I'd like to play with this a bit, and find out where its rough edges are--especially running translated output back through, a la the Babelfish.
404 Error:
I dunno if computer translation is going to be up to par for a long time.
I speak both Spanish and English. English is native and Spanish is due to 3 years in South America. And my grandparents are from Spain. I did not really know anything until I lived in Colombia, and my granny, who has a PhD in her own language, was a pretty harsh mistress. I was 21 years old when I learned. Of course living with a Colombian sysadmin girl for two years was a big help. She liked the Penguin.
Languages differ too much from location to location. Just like English in regions in the US. I am from New Orleans and the English changes from neighborhood to neighborhood.
Word meanings and expressions might be exactly the same in spelling and sound but mean different things to different people.
To build these variables into software would be a *HUGE* task.
I think the best we could hope for is software that does a decent brute translation and then a human does the final edit.
The problem is one word might be ok to use in Puerto Rico (well, they are confused about which language they speak) but socially unacceptable in Colombia. Software cannot know the difference.
People will always do the translation gig better.
Puto
Course my handle is pretty bad in any Latin country.
The Revolution Will Not Be Televised
I think it is very interesting that it works by using phrases rather than individual words. Most translators in the past have used words, and that leaves room for error with idiomatic phrases such as "window shopping" (the French equivalent translates as "window licking").
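To make the phrase-versus-word point concrete, here is a rough Python sketch of a longest-match phrase lookup. This is not how Tongues actually works (the article doesn't say); the table, the function name, and the French glosses are all toy assumptions for illustration.

```python
# Hypothetical toy table keyed by word tuples; glosses are illustrative only.
PHRASE_TABLE = {
    ("window", "shopping"): "lèche-vitrines",
    ("window",): "fenêtre",
    ("shopping",): "courses",
}


def translate_words(words, table, max_len=3):
    """Greedy longest-match: try the longest phrase first, then shorter ones."""
    out = []
    i = 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + n])
            if chunk in table:
                out.append(table[chunk])
                i += n
                break
        else:
            # No entry of any length starts here: pass the word through as-is.
            out.append(words[i])
            i += 1
    return out
```

With the default `max_len`, `["window", "shopping"]` is matched as one idiom. With `max_len=1` the same function degrades to word-by-word lookup and the idiom is lost, which is exactly the kind of error described above.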
Maybe it would be a good idea to put something on the web and let us test it, at least without the speech components.
You call me a pedant? I prefer the term "correct"
One of the most useful ones, now with all the scrutiny in the business world, will be the translation from any kind of management speak/weaselese into English.
Corp officer: We are committed to stringent compliance with accounting rules and will not tolerate anything less than the pure truth.
Translation: We're covering our rears as fast as we can.
Or to steal one from Dilbert...
Management: Employees are our most valuable resource.
Translation: (nothing)
They have them in English-Russian and English-German at present, but apparently plan to add more languages all the time. Their unidirectional models ("UT-103") handle about eight languages currently.
For example,
becomes:
"All art is quite useless." -- Oscar Wilde
From what I have seen (spoken to?) speech recog pretty much sucks nowadays, unless you are one of the lucky ones to have one of those 'special voices' that computer speech recog likes. . . .
As I have stated before in these types of articles, until speech recog can get over 95% or so recog on untrained voices, (or heck, I would like it if it could get 90% recog on my voice /trained/. . . .) these sorts of applications of technology are going to be very limited in scope.
Need help treating your acne? Come here!
"I am looking for the tobacconist."
"I need some matches."
"How much do I own you?"
The entire dictionary can be found here.
Your reality is lies and balderdash and I'm delighted to say that I have no grasp of it whatsoever. - Baron Munchausen
Do you remember what the Guide says? I quote: "Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation".
Do we really want Pierre Parisian to be able to communicate his exact feelings to Lin Chinese?
And this time, I think I spelled his name right, dammit.
The Mongrel Dogs Who Teach
Two exist already...grand juries and impeachment hearings. Obviously, they don't work very well yet. Too much politician involvement in the translation matrix.
So long as this stuff stays on the receiving end, this is all a step in the right direction. You don't want to deprive the people you're sending information to of any information. Let them decide whether to use a human or a computer. Sending a computer based translation that you can't understand only increases the chance of offending someone/misrepresenting something.
Giving it to soldiers in the field so they can "speak" the foreign language is bad. Instead give one-way devices to both sides and let them use those to translate what's told to them. That way if they need a human translator to clarify that's still an option.
It would be terrible if information started flowing between countries that had been passed through a computer translator first. Please, let me use babelfish to translate that Spanish document, don't use it for me (heck, I have friends from South America who can help me clarify it if I need to but that's *no good* without the original Spanish)...
Translation through Tongues is a lossy process. Not translating it at least prevents compromising the information. It's all still there...just a wee bit harder to get at.
Brian
Better yet, how about one that lets normal folks talk to politicians and be understood?
Right...
Dialect output. Soon, you won't have to listen to some Croatian nun discussing free will translated into Bostonian English, you'll be able to listen to a Croatian nun discussing free will translated into Jive.
This reminds me of a story told to me long ago by a friend of the family. She was of Dutch descent, and the story is about a well bred Englishman who went on a working holiday to Holland. He got work on the docks, and that is where he learned to speak Dutch. The result was that in a refined English accent he spoke obscenity-laden gutter Dutch, apparently unaware that he was doing so.
Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
The prob I have with a dictionary translation is the whole is often greater than the sum of its parts. For instance, how do you say "How are you"? Que Tal? What's up? How's it hanging? Come stai? A dictionary can, literally, translate any of these into any language imaginable, but would the listener understand?
I'd rather you do it wrong, than for me to have to do it at all.
Your comment reminds me of one of the all time funniest moments from that classic of American TV, Married with Children. The scene: Al Bundy is at the DMV. He asks (in English) to take the written exam. The clerk asks him what language.
Al: I speak the language that everyone in this country speaks
Clerk: Ah, Spanish it is
Al: No. This is America. I speak American.
Clerk: American, eh? (Looks in the file cabinet)
Aha, here it is. American. Wow, I hope you know a lot about trucking.
To make laws that man cannot, and will not obey, serves to bring all law into contempt.
--E.C. Stanton
How are we /ever/ going to get it into a package that is small enough to fit in your ear and watertight enough to let it swim around in a bowl of water when you're not using it?
I'm majoring in computer linguistics, and currently we're examining different computer translation models; the one you're suggesting is called the interlingua approach.
The idea is, basically, that you need an "in-betweener" language that can carry all the meaning and connotations of both source and target language. Then you only need translation rules for both sets and can let it run.
The main drawback is that you always have some loss in both translation steps, which sometimes adds up to quite a difference in meaning. The main advantage is that you can modularize - once you have a working English-to-Interlingua module, you can use Interlingua-to-French, Interlingua-to-German, what have you. For further information, google for interlingua "machine translation"...
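A quick Python sketch of why the modular part is attractive. With N languages, direct translation needs a module for every ordered pair, while an interlingua needs only one module into it and one out of it per language. The concept IDs, tables, and function names below are hypothetical toys, nowhere near a real interlingua.

```python
def direct_modules(n: int) -> int:
    """Directed translators needed when every language pair gets its own module."""
    return n * (n - 1)


def interlingua_modules(n: int) -> int:
    """One module into the interlingua and one out of it, per language."""
    return 2 * n


# Hypothetical toy interlingua: words map to language-neutral concept IDs.
TO_INTERLINGUA = {
    "en": {"dog": "DOG", "water": "WATER"},
    "fr": {"chien": "DOG", "eau": "WATER"},
    "de": {"Hund": "DOG", "Wasser": "WATER"},
}

# The outbound tables are just the inbound ones inverted (fine for a toy).
FROM_INTERLINGUA = {
    lang: {concept: word for word, concept in table.items()}
    for lang, table in TO_INTERLINGUA.items()
}


def via_interlingua(word: str, src: str, dst: str):
    """Translate src -> interlingua -> dst; None if either step has no entry."""
    concept = TO_INTERLINGUA[src].get(word)
    if concept is None:
        return None
    return FROM_INTERLINGUA[dst].get(concept)
```

For ten languages that is 20 modules instead of 90. Note that French to German here never touches a French-German table, and that either step can fail or lose something, which is the drawback mentioned above.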
-- Language is a virus from outer space.
... involving extremely cunning linguists creating *brand new* languages, completely from scratch, for corporate clients who need to communicate freely and yet still keep something relatively secure.
A per-transaction language, in other words, with a complete new lexicon for each speaker. Of course, the individuals would have to learn the language quite quickly, so this would also be another service realm in this plan.
Sort of like Kings of old, who used to use language differences to obfuscate and control various parts of court, only in this case it would be a commercial service, and available to all.
Something like this would be a good tool in the modern corporate environment, I think.
Well, I'm off to register Babylon, Inc...
Oh, D'oh!
; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
... if only everyone learned to speak Klingon.
True warriors use the Klingon Google
Hell if we need ta hear from 'em we'll jus kick thier asses and make 'em learn ta talk American instead of all that gibberish!
Quemadmodum gladius neminem occidit, occidentis telum est
But can it beat Kramnik in chess? Ah, now *there* is the question!
www.HearMySoulSpeak.com
I love it when somebody invents something that isn't new.
The first thing you learn about in psycholinguistics is the concept of the pidgin -- a common language which develops between two or more peoples who must interact but share no lingua franca. These simple languages, which sound like baby talk bastardizations of both languages, eventually turn into what's called a creole, such as that sexy patois spoken by fortune tellers on cable.
All these chaps have done is build their own version, and as the case of Esperanto shows, it is very difficult for a manufactured language to gain acceptance and adoption. They'd have been better off locking a Croat and a Brit in a large office building with big gulps and no marked bathrooms. These guys would develop a pidgin pretty quick.
Hey freaks: now you're ju
Nobody cares if it doesn't turn out to be as good as a human translator, because not everyone can afford to retain a translator on staff. Or a decent butler for that matter.
Secondly, it matters not a jot if the creators are multilingual, since the problem is not that you don't know 'many' languages, but that you and one other person don't know a language in common. doh!
I'd rather have one for lawyers. I don't know anyone that can speak legalese.
I will know this tech is mature when they are able to translate a legal document into English.
It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
I'm British, and speak only poor schoolboy French. However since hooking up with my half-Russian, half-Serbian girlfriend, I've found that by learning a dozen or so basic words and phrases by rote, then trying to use them conversationally, I've been able to pick up a surprising amount. Serbo-Croat was always supposed to be a nightmare to learn, but it's waaay easier than English... for instance, a phoneme(?), a group of three or four letters, will always be pronounced the same way (cf eg "ain" in English.) I'm rather hoping the Babel Fish is never released; by learning the language you start to subconsciously pick up something of the target language's cognitive assumptions, and (in a small way) to "think like" a native speaker. Now /Russian/, there's a tricky language... but we both play chess which is a good middle-man ;)
Languages have very precise definitions, and it is possible to make programs that translate any language into logic, see Aristotle for an example.
No, languages do not have very precise definitions. Take this from a published translator: they do not. The definitions in the dictionary are at best approximations to a particular range of any given word's semantic field; precision with human languages is impossible. Read up on some linguistics before you start posting things about linguistics.
Actually, (believe it or not, I didn't read the story), CMU's had a system that sounds exactly like this (speech->computer metarepresentation->speech) that gets 95% accuracy.
Plus, research speech recognition is well ahead of most consumer-available speech recognition...but also requires custom hardware or more resources.
May we never see th
The real computer science work was in developing a metalanguage representation and a method of mapping the language to that metalanguage and back. Filling in the actual mappings is something that you can hand off to translators or native speakers.
Well, probably because CMU has done a lot of speech recognition stuff that was used in this. The translation table stuff is just dumped on top -- happens to be the latest work. You talk about how Festival is crucial -- Alan Black, of Festival fame, works at CMU. You mentioned Sphinx, I believe.
Don't knock CMU -- they're an international leader in this area.
I guess I ought to mention that I have a project on SourceForge called Linguaphile. It handles about 50 languages currently but only about 4 of them are remotely useful. The Spanish and Swedish are probably worth playing with. It's early days and needs lots of work but it does actually do something now. I'm really interested in finding people who would like to work on it. You can try it online or download it if you have Perl. Apologies in advance that there are no docs at all since I've had little interest:
Linguaphile online