PDA Speech Translator
jlowery writes "Not quite as good as a babelfish, but a PDA that does translation is probably better than resorting to hand gestures alone. I could see this as a boon to the tourist who travels to places where English speakers are uncommon."
The problem with every piece of software I have used that tries to decipher human language (like Zork or the game included with emacs for X) is that you have to know what words the software understands and in what context.
I have seen the same problems with automated phone systems that are supposed to recognize a generic voice and I can see the same thing happening here.
The main difference here though, is that when entering text, you know exactly what you input before pressing enter. With voice recognition software, how do you know that the software "hears" exactly what you say? If you say something like "What are my appointments for the thirteenth?" and it hears, "What are my appointments for the thirtieth?" you would be receiving the wrong information.
I hope this is a success but I don't have my hopes up.
--
7329756
The linux hacker
You know, I really admire your consistency.
"All your base are belong to us!"
Spoken like someone who has never taken a foreign language class. Suppose that thing is going to get the accent right? Emphasis on the right syllable? Not likely, mostly good for translating some text message into the PDA holder's tongue (and doing an Engrish job of it anyway.)
A feeling of having made the same mistake before: Deja Foobar
I thought that you only had to speak English slowly and loudly enough for anyone to understand. Silly me!
According to the article, it only works for medical terms so far, and is only 80% accurate. I don't know about the rest of you, but I don't think I'd want to trust any of my medical treatment to such a translation!
Doctor: "Well, we thought he said penicillin, not amoxicillin! I'm afraid the infection has run amok!"
Yeah, I could really use one of these when I go from Fort Lauderdale to Miami...
How come Slashdot never gets Slashdotted?
"It also works only when the speakers are talking about medical information, and it's only about 80 percent accurate in the lab."
Forgive my immediate misgivings, and you can call me chicken if you want, but I'm really not that keen on walking into a hospital and asking to have a medical procedure done with a 1 in 5 chance that instead of removing my appendix, they might remove my "appendage"...
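To put a number on that 1-in-5 worry, here is a back-of-the-envelope sketch. It assumes the quoted 80 percent figure applies per sentence and that errors are independent, neither of which the article actually states, and the function name is made up for illustration.

```python
def chance_all_correct(per_sentence_accuracy: float, sentences: int) -> float:
    """Probability that every sentence in an exchange is translated correctly.

    Assumption (not from the article): each sentence succeeds independently
    with the same probability.
    """
    return per_sentence_accuracy ** sentences


if __name__ == "__main__":
    # How quickly a whole conversation goes wrong at 80 percent per sentence.
    for n in (1, 5, 10):
        print(n, "sentences:", round(chance_all_correct(0.8, n), 3))
```

Under those assumptions a five-sentence exchange comes through entirely intact only about a third of the time, which is why a confirmation step matters more than the headline accuracy.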
"It is the prerogative of fools (or noobs) to utter truths that no one else will speak."
Now I can travel to other parts of the USA and be able to understand the locals!
English -> French -> German -> English:
Not necessarily also well as babelfish, but a PDA, which makes the translation, is probably better than falling back, in order to give only gestures. She could see this as the favour the tourist, who travels to the places, where the persons of the English speech are a little frequent.
...Stephen Hawking in Arabic.
The prototypes only said "All your base are belong to us" in Iraqi.
For when text-books are too cumbersome in the field. I thought these were being used to some degree by the military already.
Like the books they are not intrinsically intelligent.
Technology is at a point where all the software has been written to create a translator where a person speaks into a microphone, the speech is converted to text, the text is translated into a different language, and the result is played back in the same person's voice in that language. The problem is that this cannot be done in realtime. 4 years ago I worked on a project for AT&T to create an application that would train on a user's voice, break down their voice patterns and be able to rearrange those patterns to create other sounds which sound like they are coming from that real person. The problem is that with current processors the time to train and process is about 10 hours. So we can do voice recognition in realtime, we can translate text words in realtime, and in 10 hours we can reproduce a person's voice nearly flawlessly. Think of the possibilities!
There is or can be built a machine that can simulate any physical object. -Church-Turing principle
I realize that this software is supposed to be somewhat more powerful, but what I am saying is that even limited translation programs are useful for tourists.
As speech recognition technology gets better, and as handheld computers get more powerful, audio translators are becoming a more practical proposition.
Researchers from Carnegie Mellon University, Cepstral, LLC, Multimodal Technologies Inc. and Mobile Technologies Inc. have put together a two-way speech-to-speech system that translates medical information from Arabic to English and English to Arabic and runs on an iPaq handheld computer.
The prototype falls short of Star Trek's fictional universal translator in several ways. The system is not transparent -- it must be switched between Arabic-to-English and English-to-Arabic modes. It also works only when the speakers are talking about medical information, and it's only about 80 percent accurate in the lab.
The device shows that it's becoming possible, however, to provide automatic translation using a portable device. "It's good enough to make yourself understood," said Alex Waibel, a professor of computer science at Carnegie Mellon University and a founder of Mobile Technologies Inc.
The effort is one of a series of projects aimed at providing the armed forces with automatic translation for medical and force protection situations and making automatic translation in a wider set of subject areas available for tourists during the 2008 Olympics in Beijing, said Waibel.
The Speechalator prototype uses a built-in microphone and a language-selection button. "You push on the button on the iPaq and speak a sentence and then the translation comes out... in the other language," said Waibel. "You can switch it into the opposite mode when the other person answers and it translates back into your own language."
The software consists of three components: a speech recognizer, a translator, and a speech synthesis engine. "Each one of these components have slight twists to them... in order to work properly for speech translation," said Waibel.
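The three-stage design described above can be sketched roughly as follows. This is only an illustration of chaining a recognizer, a translator, and a synthesizer; every name here (`SpeechTranslator`, the `fake_*` functions, the phrase table) is hypothetical and has nothing to do with the actual Speechalator code. The stand-ins treat "audio" as UTF-8 bytes so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechTranslator:
    """One translation direction: audio in, audio out."""
    recognize: Callable[[bytes], str]    # audio -> source-language text
    translate: Callable[[str], str]      # source text -> target text
    synthesize: Callable[[str], bytes]   # target text -> audio

    def run(self, audio: bytes) -> bytes:
        text = self.recognize(audio)
        translated = self.translate(text)
        return self.synthesize(translated)


# Toy stand-ins (assumptions for illustration only).
def fake_recognizer(audio: bytes) -> str:
    return audio.decode("utf-8")


# Placeholder phrase table; the "[ar]" tag stands in for real Arabic output.
PHRASES = {"where does it hurt": "[ar] where does it hurt"}


def fake_translator(text: str) -> str:
    key = text.lower().strip("?!. ")
    return PHRASES.get(key, "[untranslated]")


def fake_synthesizer(text: str) -> bytes:
    return text.encode("utf-8")


english_to_arabic = SpeechTranslator(fake_recognizer, fake_translator, fake_synthesizer)

if __name__ == "__main__":
    print(english_to_arabic.run(b"Where does it hurt?"))
```

A two-way device would hold two such objects, one per direction, and the language-selection button would pick which one handles the next utterance.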
The researchers modified the speech recognition engine to optimize it for handling spontaneous speech.
The translation system has the biggest twist. It extracts the key meaning from the input sentence and translates it to an interlingual, or intermediate representation, and the process depends on the speech being contained in a certain domain, or context, like medical information. "It's just certain nuggets in the phrase that... you need to extract," said Waibel.
The process is akin to constructing a medical-context template that fits the key information, then filling in the template, said Waibel. This process makes it possible for the system to handle spontaneous speech. "We go fishing for the nuggets," he said. But it is also a limitation -- the system must know what domain a speaker is talking about.
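The "fishing for nuggets" idea can be illustrated with a toy slot-filling sketch. The word lists, the frame layout, and the function name `fill_template` are all assumptions made up for this example; the article does not say how the real templates are represented.

```python
import re

# Hypothetical vocabulary for a tiny medical domain.
SYMPTOMS = {"pain", "fever", "headache", "nausea"}
BODY_PARTS = {"head", "chest", "stomach", "arm", "leg"}
DURATION = re.compile(r"\b(\d+)\s+(hour|day|week)s?\b")


def fill_template(sentence: str) -> dict:
    """Pull the key facts out of a spontaneous sentence and ignore the rest."""
    lowered = sentence.lower()
    frame = {"symptom": None, "body_part": None, "duration": None}

    # Keep the first matching word for each slot; filler words are skipped.
    for word in re.findall(r"[a-z]+", lowered):
        if word in SYMPTOMS and frame["symptom"] is None:
            frame["symptom"] = word
        elif word in BODY_PARTS and frame["body_part"] is None:
            frame["body_part"] = word

    match = DURATION.search(lowered)
    if match:
        frame["duration"] = (int(match.group(1)), match.group(2))
    return frame


if __name__ == "__main__":
    print(fill_template("Um, well, I have had this pain in my chest for like 3 days now"))
```

The filled frame plays the role of the intermediate representation: a generator for the other language only needs to render the slots, not the hesitations and filler around them. It also shows the limitation mentioned above, since a sentence outside the medical vocabulary leaves every slot empty.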
The researchers are working on a system that can handle multiple contexts and automatically switch between them, said Waibel. "It can, for example, recognize 'now you're in the hotel reservation domain', or 'now you're in the conference registration mode', or 'now you're talking about medical problem'," he said.
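One simple way to picture automatic domain switching is keyword overlap, sketched below. This is not the researchers' method, which the article does not describe; the keyword sets and the name `detect_domain` are invented for illustration.

```python
import re
from typing import Optional

# Hypothetical keyword sets, one per domain named in the quote above.
DOMAIN_KEYWORDS = {
    "medical": {"doctor", "pain", "fever", "medicine", "hurts", "chest"},
    "hotel_reservation": {"hotel", "room", "book", "nights", "reservation"},
    "conference_registration": {"conference", "register", "registration", "badge"},
}


def detect_domain(sentence: str, current: Optional[str] = None) -> Optional[str]:
    """Return the domain whose keywords overlap the sentence most.

    If nothing matches, stay in the current domain. Ties go to whichever
    domain is listed first in DOMAIN_KEYWORDS.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {name: len(words & keywords) for name, keywords in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else current


if __name__ == "__main__":
    domain = None
    for utterance in (
        "I would like to book a room for two nights",
        "Thank you very much",
        "My chest hurts and I have a fever",
    ):
        domain = detect_domain(utterance, domain)
        print(utterance, "->", domain)
```

Keeping the previous domain when no keyword matches means small talk does not knock the system out of the context the conversation is already in.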
To come up with templates that handle different domains, the researchers collect a lot of data from people talking in those domains, said Waibel. "The more data we collect the better coverage of all the possible ways you could be saying [these things] becomes," he said.
The difficult part was fitting the software required to do two-way translation in the 64 megabytes of memory contained in the handheld computer, said Waibel. "You need two recognizers, two synthesizers and two translators to make [it] happen in both directions," he said.
The prototype also has a camera attachment that translates text like that on street signs, said Waibel. Snap a picture of a sign with the camera and it automatically extracts the text region, puts the text through a character recognition program, then translates it, he said. "What you then see on the screen is the picture of the scene with a sign and then underneath an English subtitle," he said.
"Are you speaking the english?"
"I speak to the English, it's the Americans I won't talk to..."
-Adam
First can we have a PDA that does decent text-to-speech or speech-to-text, preferably both.
A hardware babelfish will revolutionise human communication later this century, but right now you need both of the above before you can begin to contemplate speech-to-speech. I can't imagine any serious algorithm at this time would attempt direct translation, without an intermediate text translation phase.
Bit OT: Considering the interest in E-Books, I don't know why music players and PDAs force users to download wave forms when we could just download text and convert using a cheap text-to-speech synth.
Outstanding. This thing will finally make the common Ugly American practice of yelling actually useful:
*hold PDA to face* Ahem! "WHERE IS THE BATHROOM?!" *hold PDA to foreigner's ear*
RW
...but since that's way too obvious, I'll leave it to the casual slashdotter to fill in the joke.
Let's face it, language butchery is funny. To do so automatically is so much more amusing! I mean I installed festival on my machine just so I could hear the synth voice say stuff like "beeeeyotch" and "retaaard" -- imagine how well you could offend in different dialects!
I suppose it does have legitimate uses...but what fun is that? Then again with the quality of translation software nowadays, it should be amusing nonetheless. If nothing else, maybe we can use it to come up with retranslated English to use as virus subjects. Maybe we could come up with gems better than "I send this to ask opinion for you. Don't show anyone!"
If it can translate 'All Your Bases Are Belong To Us' correctly?
Use my link above, or to view my server, NeoThermic.com
You suck so much at getting FP that you can't even get First FAILED IT!!!
If it talks in and out, and uses an ear bud, it would be like being able to speak the language, albeit with a terrible accent, and occasionally offending the prime minister! That would be cool.
stuff |
I do not know the location of the hovercraft of which you speak, but the eels sound potentially appetizing.
The fun thing about travel is trying to understand people and them trying to understand you. Most people want to learn a little English and many Americans want to learn anything else (other than Spanish of course, which Mexicans have made them think is a peasant language). Anyway, with a machine I think it would be awkward, and it may make the local person feel a little inadequate (I got a PDA, you don't).
Text on screen: In 2004, the World Trade Center lay in ruins, and foreign nationalists frequented the streets - many of them Arabs (not the streets - the foreign nationals). Anyway, many of these Arabs went into tobacconist's shops to buy cigarettes....
An Arab tourist approaches the shopclerk. The tourist is talking haltingly into a PDA.
Arab: I will not buy this record, it is scratched.
Clerk: Sorry?
Arab: I will not buy this record, it is scratched.
Clerk: Uh, no, no, no. This is a tobacconist's.
Arab: Ah! I will not buy this *tobacconist's*, it is scratched.
Clerk: No, no, no, no. Tobacco...um...cigarettes (holds up a pack).
Arab: Ya! See-gar-ets! Ya! Uh...My hovercraft is full of eels.
Clerk: Sorry?
Arab: My hovercraft (pantomimes puffing a cigarette)...is full of eels (pretends to strike a match).
Clerk: Ahh, matches!
Arab: Ya! Ya! Ya! Ya! Do you waaaaant...do you waaaaaant...to come back to my place, bouncy bouncy?
Clerk: Here, I don't think you're using that thing right.
Arab: You great poof.
Clerk: That'll be six and six, please.
Arab: If I said you had a beautiful body, would you hold it against me? I...I am no longer infected.
Clerk: Uh, may I, uh...(takes PDA, talks to it)...Costs six and six...ah, here we are. (speaks weird Arabic-sounding words)
Arab punches the clerk.
Meanwhile, a cop on a quiet street cups his ear as if hearing a cry of distress. He sprints for many blocks and finally enters the tobacconist's.
Cop: What's up
Arab: Ah. You have beautiful thighs.
Cop: (looks down at himself) WHAT?!?
Clerk: He hit me!
Arab: Drop your panties, Sir William; I cannot wait 'til lunchtime. (points at clerk)
Cop: RIGHT!!! (drags Arab away by the arm)
Arab: (indignantly) My nipples explode with delight!
I believe PDAs are going to be tremendously transformed over the next few years.
1. Convergence is going to happen with a vengeance. The Treo 600 is just the start. More and more apps will make it to the PDA. Speech recognition is one, and that sets up for another dynamic...
PDAs don't really need screens and keyboards if you can talk to them and they can talk to you. If they don't need those components, they can get a whole lot smaller. The next generation PDAs will be like a hearing aid, and the ones after that will be built into your glasses or an implant. That means less power, so less battery. Besides, it will be able to run on your body heat if not tap into your own body's electrical system, so it won't need a battery. Every improvement along these lines dwindles the size even more. A heads-up display, made transparent or opaque, ought to handle those times when you need to really observe rather than consult.
A combination of AI and connectivity will mean your PDA is your first line of defense in many of life's situations. Get pulled over by a cop and it will tell you what to do, what NOT to do, and contact your lawyer. Need a cop and it will call them and know just how long it's going to take to get there.
Medicine: It will have a complete medical history of you, remind you to take your meds, and monitor your blood pressure and other vital signs. If you have a heart attack it will call 911 with your location and be the first thing the medics consult when they get to you.
Personality: You'll be able to choose its level of humor and sarcasm. Although clearly a machine, people will develop meaningful relationships with them, at least they'll think so.
Connectivity: Everything you can think of, including your own house, which you'll call up to turn the heat up since you're coming home early. All the Wi-Fi/cell connectivity you want will be built in.
Finances: It will know everything you do and provide access to your dough. If you get overdrawn it will be intentional because it will have real time access. It will have all the ATM/debit/credit stuff all on-hand. It will also be able to shop for you and tell you where the best deal is.
It will know all your friends and business associates and help remind you, "This is Joe. He's a Cougar. He knows you're a Husky, but don't rub it in. His kid just joined the Navy. He thinks LOTR sucks, and Rush is Right, so be careful. He drinks Guinness. His budget is 250K and he's looking to upgrade the Ciscos."
You'd never think of leaving home without this. Indeed, since it very well may be built-in, you won't have to worry about it. Just keep up the subscription.
How about a moderation of -1 pedantic.
...with a small supercomputer stored in an over-sized novelty hat. You could only wear it for 10 minutes at a time before suffering permanent neck damage.
My experience with voice recognition (yes, even your beloved Via-Voice) is that it blows and will for some time. We probably need better speech recognition before we get speech to speech.
Instead of having all sorts of gadgets with you when traveling, plus all sorts of voltage adapters/converters, here are some words that I think work fairly well around the world.
For food:
For water:
For car:
For a phone call:
If you need to send a message, forget the bottle and use the word FedEx.
After all, it is not that bad living in a world which brands everything but the air we breathe :).
Never forget the useful SOS and the word NO. It is also helpful to learn the local words for please, potty, taxi and thanks.
Now, all these words are for emergency situations; otherwise get a copy of the local travel guide from Mc Nally or any other source and you are more than set to go.
My final word is that we shouldn't be attached by umbilical cord to technologies left and right. Regards.
He's being demeaning to you moderators and questioning your judgement.
...for the United States to invade every country and impose the English language on each of them. We've been promised seamless voice recognition and translation for years and I don't see it happening anytime before Duke Nukem: Forever gets released.
History shows that no new technology really takes off until it becomes an effective distribution mechanism for porn...
Well. I have been to quite a few places where English was not exactly lingua franca. In most of these places semi-right pronunciation of foreign words would not have had a big impact. Hand gestures and my favourite dictionary (which contains pictures of just about anything one would ever need 'on the road') have always been sufficient to find a hotel, a train or bus ticket out and some food. For the latter: Just walking into a restaurant's kitchen and pointing at the visible ingredients (dead or alive ;) ) suffices, and can generate a lot of fun in the process :)
Not being content with translating humanoid speech, the Japanese have aimed their sights higher; dogs and cats. Cheaper than a PDA too, but they still need to work on the size and texture so it slithers nicely into the ear.
I could see this as a boon to the tourist who travels to places where English speakers are uncommon. You mean like those horribly backward places that consider "aluminium" to be a 3-syllable word, and think "getting pissed" has something to do with being angry?
"Freedom means freedom for everybody" -- Dick Cheney
Just don't forget your Protocol Droid
keep in mind that pdas are even more uncommon in those places, so i wouldn't want to spend too much on such a device.
also keep in mind, that it actually is possible to learn a language, which does not happen to be the most widespread on this earth (or at least in those parts of this world you happen to travel to)...
Common, Lettuce divet a chalice and quite bean so whole of negativity!
Babelfish is terrible at even translating Germanic and Latin languages and this thing is supposed to be worse than that?
I know that people want to solve everything with technology, but is it so much more difficult to learn another language or perhaps even a few phrases of the country where you are going to. Why does one even go to another country if one doesn't want to understand even the smallest part of that place?
Silly foreigner, don't you know everyone speaks English?
Business \Busi"ness\, n.;
A scam in which all people involved perceive as beneficial...
Like Miami???
"Oh look honey! A local! I wonder what he wants. Use your iPaq to find out what he's saying!"
"Umm... He says 'Give me your iPaq or I will be forced to kill you and take your wife back to my yurt.'"
So all you need is a mobile phone. You phone up the number for the language you need translated to, tell the translator what you want to say and hand the phone over to the person you want to talk to. Quite expensive per minute, but cheaper than a PDA and very very handy in an emergency.
Course, you could learn another language, it isn't remotely as difficult as school makes it out to be. English is one of the more difficult languages to learn. If you learn one of Italian, French, Spanish, or Portuguese you should be able to pick the others up fairly quickly. English is based on a Germanic language with a lot of the French and Roman influences chucked in on top, it's a real mishmash.
Government of the people, by corporate executives, for corporate profits.
I want to talk to my mare!
They should start a new reality show where Americans try to survive in various countries with only this device to translate for them. "How is your wife this evening" turns into "Where may I find a lady of the evening".
I Am My Own Worst Enemy
These things aren't going to get any better until handhelds get the cache sizes necessary to run an HMM search in reasonable time.
Can you give some tips/hints? I don't want to FAIL IT again :/
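For anyone wondering what an "HMM search" involves, the usual decoder is the Viterbi algorithm. Below is a minimal sketch on the well-known textbook toy model (two hidden states, three observations); the probabilities are illustrative, not from any speech system. A real recognizer runs the same recurrence over thousands of states per audio frame, which is where the memory pressure comes from.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path and its probability."""
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] * trans_p[p][s])
            scores[s] = best[-1][prev] * trans_p[prev][s] * emit_p[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace back from the most likely final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Toy model (assumed values for illustration).
STATES = ("Healthy", "Fever")
START = {"Healthy": 0.6, "Fever": 0.4}
TRANS = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
EMIT = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}

if __name__ == "__main__":
    print(viterbi(("normal", "cold", "dizzy"), STATES, START, TRANS, EMIT))
```

The work per observation grows with the square of the number of states, since every state checks every possible predecessor, so practical decoders prune unlikely paths rather than search exhaustively.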
insightful!
"PDAs don't really need screens and keyboards if you can talk to them and they can talk to you..."
Yes and no, depending on what people are using them for. Originally, the same was said about computers -- that the keyboard/mouse would become useless once voice-recognition became reality -- but people quickly discovered that even when the technology worked wonderfully, they didn't really want to be stuck *saying* everything. There's also a larger proportion, I believe, of "visual" people out there than there are "auditory" ones, plus a lot of people also hate the sound of their own voice, or don't like having others "listen in" on their plans, or have other objections to speech-based input.
On the other hand, I agree that *something* is going to supplant the stylus/keyboard combo. I'm just not so sure, especially after dealing with Sony's voice recognition tech support system recently, that it will be voice-based. (I have a Clie NX60 that I love which includes a fairly nice-sized built-in keyboard, voice recorder, and the usual PalmOS on-screen input options... I dislike the keyboard immensely, but between on-screen input and speech, on-screen works for me.)
Personally, I'm hoping for more handheld systems that just give users the option between a variety of well-done integrated input types, rather than the "let's ALL do it way X" mentality that a lot of people seem to champion. That way those of us that prefer or even need one type over another can use that... My concern is always that "one type" will win out, excluding anybody that doesn't have a brain wired for that form of interaction, so those people have to use second-rate "adaptive equipment."
With mobile translation devices, and even better translation servers for ubiquitous mobile phones, Europe's great disadvantage will now recede. The United States of America is possible due to shared language. With translation, the free travel and commerce in Europe will be bolstered by free speech across borders. Incidentally making the mobile phone as central to 21st Century European culture as the TV was to 20th Century American culture. I'd rather have the phones as my totem.
--
make install -not war
People learn foreign languages when we must, rarely for curiosity, and almost never out of "respect". Americans can get what we want by asking in English. When we sometimes can't, we learn the language, like I finally did after years of lifeless school Spanish was finally revived in my adventures in San Francisco's largely unilingual Meximerican Mission district. Or my bare competence in "French" after a month in West Africa. Or my drinking survival skills in "Y'at" after living in New Orleans for years.
Nonamericans learn English when they must, out of economic or entertainment (or other cultural) necessity. Likewise have we always learned other foreign languages than English. The 20th Century was structured with a vast population of hundreds of millions of Americans who had what so many others wanted, including specialists fluent in foreign languages, acting as agents for unilingual Americans. As the world becomes more decentralized in communications, with more opportunities (and necessities) for Americans to speak foreign languages, we'll learn more. With America's diversity of origins, America will likely have the most foreign speakers. It already might, even if they're so overwhelmed by the more numerous unilingual that it's hard to notice. The tendency for non-English speakers to learn English is also amplified by the ample opportunities to do so, from the immersive, ubiquitous American media, to the acceptance of Americans of foreign mangling of English as we all mutate it into American. There's nothing especially insular about American culture, compared to others, that will prevent the growing multilingualism, and translation devices will help us all learn more about each other.
--
make install -not war
Just a personal assessment here. My feel is that while text translation has moved forward some, the use of voice translators is a totally different story. Look at it this way: text translators have a hard time coping with the non-perfect manner in which language is written, so how much harder is it for voice translators, which have to cope with the nuances and tonal differences within even a single language, e.g. the different ways in which English is spoken in America and the rest of the world? The other thing which is tricky about language translation is coping with the idiosyncrasies of a given profession, e.g. the medical field targeted by the CM chaps. Certain terms or phrases just don't have an equivalent in the general language.
--- root@127.0.0.1
How about the opposite sex? Parents? Now those would be Nobel-prize-worthy accomplishments.
"This is not a sig." -- R.
There are places where English speakers are uncommon?
I doubt any translation gadget will be more useful in any context than a good old-fashioned dictionary. The reason is: when you want to communicate something, words are often the smallest part of the message. Would people listen to you if you shoved a tape player in their face, even if the pre-recorded message was in their language? I wouldn't! But if you can speak just a few words of a foreign language, and are willing to try to use them (plus gestures) face-to-face, the response is a lot more likely to be positive. Many people in other countries now assume Americans won't bother to learn their language (with good reason, apparently), and showing that you are interested enough to try makes a good impression...
Then, (and only then), maybe you can pull out the PDA with some hope that the person will have the patience to wait while it translates the details...
Just my opinion, of course.
"I could see this as a boon to the tourist who travels to places where English speakers are uncommon."
A device that should be used for situations like medical emergencies only. *Not* for tourists, business, etc.
We should be enticing people to learn other languages (and more about the cultures of people that speak them) rather than making it easy for some societies to become even more ethnocentric.
~Me love you long time.
The irony is that a computer will only become cognizant when it begins to fail like a human. Yet, as a society, we aren't ready for our machines to make mistakes. If I have problems understanding someone who speaks my native tongue with a "heavy" accent, how can a computer be expected to do better? Someday computers may do a better job, but it still won't be perfect. That's the problem with statistics. Any statistically based mathematical model will never provide absolute predictability (take a look at Quantum Mechanics).
What do you mean my sig is repetitive? What do you mean my sig is repetitive? What do you mean....
I wish that as children we would be taught 3-5 languages in grade school.
Children of that age can easily pick up multiple languages.
It is sooo hard to learn foreign languages by the time you are a teenager and, worse yet, our education system has mostly given up on teaching grammatical constructs more complicated than subject-predicate until middle/high school.
I know more Latin grammar than I do English because they actually teach it in Latin class.
And once you get accustomed to learning languages, others come to you more easily.
If it were up to me, everyone in America would learn English, Latin, and German at least, and possibly Greek. Then we would know all the roots where our words come from. Once you know Latin, the other Romance languages become relatively easy.
Give it a few hundred years and there won't be an English, French, Spanish, or German. There will be just one language once people absorb the best constructs of each.
I'm afraid your prediction will only apply to a wealthier minority. Particularly for Medicine: Until we get a Nationalized health system, there is virtually no way Standards would be put in place for digital exchange of medical information with PDAs. Maybe specific medical devices for specific conditions, but not PDAs. Perhaps wealthy baby boomers will receive benefits of the new PDA tech at the expense of everyone else. Most people will not be able to afford the hardware, software, or the training to use advanced PDA technology for quite a long time.