Slashdot Mirror


PDA Speech Translator

jlowery writes "Not quite as good as a babelfish, but a PDA that does translation is probably better than resorting to hand gestures alone. I could see this as a boon to the tourist who travels to places where English speakers are uncommon."

33 of 161 comments

  1. The problem with these things by the+man+with+the+pla · · Score: 3, Insightful

    The problem with every piece of software I have used that tries to decipher human language (like Zork or the game included with emacs for X) is that you have to know what words the software understands and in what context.

    I have seen the same problems with automated phone systems that are supposed to recognize a generic voice and I can see the same thing happening here.

    The main difference here, though, is that when entering text, you know exactly what you input before pressing enter. With voice recognition software, how do you know that the software "hears" exactly what you say? If you say something like "What are my appointments for the thirteenth?" and it hears, "What are my appointments for the thirtieth?" you would be receiving the wrong information.

    I hope this is a success but I don't have my hopes up.

    --
    7329756

    --
    The linux hacker
    1. Re:The problem with these things by garcia · · Score: 2, Insightful

      My father has a 2004 Acura TL with Bluetooth cellphone stuff... He was trying to get it to dial a number. What a pain in the ass. It was seriously almost as distracting as hand entering the number. I believe he had to ask it to dial XXX-XXXX 5 or 6 times before it stopped adding in two random zeros.

      Until the machines can be 100% accurate without frustration they are next to useless.

    2. Re:The problem with these things by Angus+Prune · · Score: 3, Insightful

      It all boils down to confidence. I have to be confident that what I'm doing will work.
      I use a wireless keyboard, but I'm having to switch back because I find I have to check what I am typing; it doesn't always pick up every keypress.
      Voice-to-text is only of limited use while you have to re-read and correct any mistakes.
      While this is only 80% accurate it can never be trusted. When this works at 95% it won't be trusted. I won't trust that this won't mistake Renal for Venal.

      While this is a great step forward, I can't see it being trustworthy for 2006, and I think the same problems apply to this as have always applied.

    3. Re:The problem with these things by fastidious+edward · · Score: 3, Insightful

      It was seriously almost as distracting as hand entering the number.

      Are you being sarcastic? I can type a number on a numeric keypad much faster than I can say it. The 5-6 times much more than compensates for the time of getting the phone out of my pocket.

      Voice recognition is great, but tactile recognition is also great, as is body movement.

      Until the machines can be 100% accurate without frustration they are next to useless.

      I know I have trouble understanding someone with a heavy Southern-USA accent, just as someone else may have trouble with a heavy Scottish accent (as friends have) or a heavy London accent (as I can revert to). People are not perfect at understanding people, let alone machines understanding people.

      Voice recognition is not a great saviour and IMHO is years away; in the meantime I'm happy with a numeric keypad.

      --

      karma karma karma karma karma chameleon, you come and go, you come and go.
  2. Had to be said by Anonymous Coward · · Score: 5, Funny

    "All your base are belong to us!"

  3. Good Grief... by ackthpt · · Score: 3, Insightful
    "I could see this as a boon to the tourist who travels to places where English speakers are uncommon."

    Spoken like someone who has never taken a foreign language class. Suppose that thing is going to get the accent right? Emphasis on the right syllable? Not likely, mostly good for translating some text message into the PDA holder's tongue (and doing an Engrish job of it anyway.)

    --

    A feeling of having made the same mistake before: Deja Foobar
    1. Re:Good Grief... by UrgleHoth · · Score: 2, Funny

      Reminds me of the joke:

      What do you call someone who speaks three languages? A polyglot.
      What do you call someone who speaks two languages? A bilingual.
      What do you call someone who speaks one language? An American.

      --

      Dogma - "let's just say we'd like to avoid any empirical entanglements."
    2. Re:Good Grief... by NanoGator · · Score: 2, Insightful

      "What do you call someone who speaks one language? An American."

      I know it's a joke, but it's a common complaint about Americans. Unfortunately, nobody seems to think about the United States' geography and why most of us are uni-lingual. To the north, we have Canada, which is mostly English speaking. To the south, we have Mexico, which is Spanish speaking, but there's not all that much travelling back and forth like there is with Canada. Worse, they're very accommodating down there, so there isn't a big huge need to speak Spanish. Go much further south than that, and you're spending a great deal of money to get on a flight to do this. (I should know, I've traveled to Brazil twice.)

      This is very different from Europe where you can drive across countries like we can drive across states here. Even if we were bilingual, there wouldn't be a huge screaming need to speak in other languages. It's hard to feel the need to speak other languages when you have to travel overseas to encounter somebody speaking that language.

      Sadly, this factor is never considered. Nope, it's assumed we're just stupid.

      --
      "Derp de derp."
    3. Re:Good Grief... by ackthpt · · Score: 2, Insightful
      I suppose it boils down to, whatever country you're in:

      1. Are you happy getting by?
      2. Are you interested in the challenge a language can bring?

      As you say 1 can lead to learning a second language. This can lead to 2. But we may never know.

      IMHO Americans not learning Spanish is damn insular and imperialistic. They are your neighbour, not your slave, so why not put in some effort and try rather than assuming they are accommodating?

      Allow me to be cynical here. People kowtow to the language of commerce. If a lot of German people with a lot of money are visiting your town, you can bet people are learning German to be accommodating. This has much to do with why Japanese, Chinese, Germans, Belgians, Dutch, Italians, Indians (Asian) and even French learn the language. The question is, with the blossoming of China's economy, will people turn to learn the business language of China?

      --

      A feeling of having made the same mistake before: Deja Foobar
    4. Re:Good Grief... by kfg · · Score: 2, Insightful

      . . .it's assumed we're just stupid.

      No, not stupid, insular and parochial, an opinion which your own post supports.

      Bear in mind though, that it is the behaviour of Americans in other countries that has engendered this reputation, most of whom don't even bother to take the trouble to learn how to say "please" and "thank you" in the language of the nation they're in at the moment.

      My stepfather is in Mexico right now. He spends a minimum of three contiguous months a year there, a practice he has maintained for the past 30 years. One year he stayed there for nearly half of the year. He avoids the tourist places, staying in out-of-the-way local cities and villages of the interior. He is not a stupid man. He is a professional writer with a Masters in English from Harvard.

      He knows maybe a dozen words in Spanish.

      This is pure cultural arrogance.

      It is also typical of American behaviour.

      KFG

  4. And here by Anonymous Coward · · Score: 3, Funny

    I thought that you only had to speak English slowly and loudly enough for anyone to understand. Silly me!

  5. Hmmm by dreamchaser · · Score: 2, Insightful

    According to the article, it only works for medical terms so far, and is only 80% accurate. I don't know about the rest of you, but I don't think I'd want to trust any of my medical treatment to such a translation!

    Doctor: "Well, we thought he said penicillin, not amoxicillin! I'm afraid the infection has run amok!"

    1. Re:Hmmm by Spam.B.gone · · Score: 2, Funny

      oh no.. he said 'I want a full bottle in front of me'...

  6. Good Idea... by avgjoe62 · · Score: 5, Funny
    > I could see this as a boon to the tourist who travels to places where English speakers are uncommon.

    Yeah, I could really use one of these when I go from Fort Lauderdale to Miami...

    --

    How come Slashdot never gets Slashdotted?

  7. Yeah, thanks, but I'll wait for a bit... by dejinshathe · · Score: 3, Funny

    "It also works only when the speakers are talking about medical information, and it's only about 80 percent accurate in the lab."

    Forgive my immediate misgivings, and you can call me chicken if you want, but I'm really not that keen on walking into a hospital and asking to have a medical procedure done with a 1 in 5 chance that instead of removing my appendix, they might remove my "appendage"...

    --


    "It is the prerogative of fools (or noobs) to utter truths that no one else will speak."
    1. Re:Yeah, thanks, but I'll wait for a bit... by shuz · · Score: 2, Insightful

      If you're willing to not have your speech translated in realtime, say you're willing to wait 5 minutes or so, a 95% or better return can be expected. The main reason why these translators aren't accurate a lot of the time is that the algorithm used can only make a limited number of passes on each word so that each word is translated in near realtime.

      --
      There is or can be built a machine that can simulate any physical object. -Church-Turing principle
  8. I've always wanted to sound like... by Anonymous Coward · · Score: 2, Funny

    ...Stephen Hawking in Arabic.

  9. a complete translator could be possible by shuz · · Score: 4, Interesting

    Technology is at a point where all the software has been written to create a translator where a person speaks into a microphone, the speech is translated into text, the text is translated into a different language, and the result is played back verbally in the same person's voice in a different language. The problem is that this cannot be done in realtime. 4 years ago I worked on a project for AT&T to create an application that would train on a user's voice, break down their voice patterns and be able to rearrange those patterns to create other sounds which sound like they are coming from that real person. The problem is that with current processors the time to train and process is about 10 hours. So we can do voice recognition in realtime, we can translate text words in realtime, and in 10 hours we can reproduce a person's voice nearly flawlessly. Think of the possibilities!

    --
    There is or can be built a machine that can simulate any physical object. -Church-Turing principle
  10. Could work, in a limited sense.. by iantri · · Score: 2, Informative
    There is a program that already exists for the Palm (unfortunately I do not remember the name) that allows you rudimentary communication with one who speaks a foreign language by translating common phrases, selected by tapping on the screen.

    I realize that this software is supposed to be somewhat more powerful, but what I am saying is that even limited translation programs are useful for tourists.

  11. text by Anonymous Coward · · Score: 2, Informative



    As speech recognition technology gets better, and as handheld computers get more powerful, audio translators are becoming a more practical proposition.

    Researchers from Carnegie Mellon University, Cepstral, LLC, Multimodal Technologies Inc. and Mobile Technologies Inc. have put together a two-way speech-to-speech system that translates medical information from Arabic to English and English to Arabic and runs on an iPaq handheld computer.

    The prototype falls short of Star Trek's fictional universal translator in several ways. The system is not transparent -- it must be switched between Arabic-to-English and English-to-Arabic modes. It also works only when the speakers are talking about medical information, and it's only about 80 percent accurate in the lab.

    The device shows that it's becoming possible, however, to provide automatic translation using a portable device. "It's good enough to make yourself understood," said Alex Waibel, a professor of computer science at Carnegie Mellon University and a founder of Mobile Technologies Inc.

    The effort is one of a series of projects aimed at providing the armed forces with automatic translation for medical and force protection situations and making automatic translation in a wider set of subject areas available for tourists during the 2008 Olympics in Beijing, said Waibel.

    The Speechalator prototype uses a built-in microphone and a language-selection button. "You push on the button on the iPaq and speak a sentence and then the translation comes out... in the other language," said Waibel. "You can switch it into the opposite mode when the other person answers and it translates back into your own language."

    The software consists of three components: a speech recognizer, a translator, and a speech synthesis engine. "Each one of these components have slight twists to them... in order to work properly for speech translation," said Waibel.

    The researchers modified the speech recognition engine to optimize it for handling spontaneous speech.

    The translation system has the biggest twist. It extracts the key meaning from the input sentence and translates it to an interlingual, or intermediate representation, and the process depends on the speech being contained in a certain domain, or context, like medical information. "It's just certain nuggets in the phrase that... you need to extract," said Waibel.

    The process is akin to constructing a medical-context template that fits the key information, then filling in the template, said Waibel. This process makes it possible for the system to handle spontaneous speech. "We go fishing for the nuggets," he said. But it is also a limitation -- the system must know what domain a speaker is talking about.
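    The "fishing for nuggets" idea above can be sketched roughly as slot filling against domain templates. This is only an illustration and not the researchers' actual method: the regular expressions, the `TEMPLATES` table, the `extract_frame` function and the frame field names are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical medical-domain templates (assumption, not from the article).
# Each pairs a pattern over recognized text with an intent label for an
# intermediate, language-neutral frame.
TEMPLATES = [
    (re.compile(r"\bpain in (?:my |the )?(?P<body_part>\w+)"), "report_pain"),
    (re.compile(r"\ballergic to (?P<substance>\w+)"), "report_allergy"),
]


def extract_frame(utterance: str) -> Optional[dict]:
    """Return the first matching frame, ignoring words outside the template."""
    text = utterance.lower()
    for pattern, intent in TEMPLATES:
        match = pattern.search(text)
        if match:
            # Keep only the key pieces of meaning; filler words are dropped.
            return {"intent": intent, **match.groupdict()}
    # Nothing matched: the utterance is outside the domain the templates cover.
    return None


print(extract_frame("um, well, I have a really bad pain in my stomach"))
print(extract_frame("Where is the train station?"))
```

    The second call returns nothing, which mirrors the limitation described above: a sentence outside the known domain has no template to fill.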

    The researchers are working on a system that can handle multiple contexts and automatically switch between them, said Waibel. "It can, for example, recognize 'now you're in the hotel reservation domain', or 'now you're in the conference registration mode', or 'now you're talking about medical problem'," he said.

    To come up with templates that handle different domains, the researchers collect a lot of data from people talking in those domains, said Waibel. "The more data we collect the better coverage of all the possible ways you could be saying [these things] becomes," he said.

    The difficult part was fitting the software required to do two-way translation in the 64 megabytes of memory contained in the handheld computer, said Waibel. "You need two recognizers, two synthesizers and two translators to make [it] happen in both directions," he said.
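    One way to picture the two-way arrangement is as two independent three-stage chains plus a mode switch. The sketch below only shows that wiring under stated assumptions: `Direction`, `TwoWayTranslator`, `stub_direction` and the mode strings are hypothetical names, and the stages are placeholders that do no real recognition, translation or synthesis.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Direction:
    """One translation direction: its own recognizer, translator, synthesizer."""
    recognize: Callable[[bytes], str]    # audio -> source-language text
    translate: Callable[[str], str]      # source text -> target text
    synthesize: Callable[[str], bytes]   # target text -> audio

    def run(self, audio: bytes) -> bytes:
        return self.synthesize(self.translate(self.recognize(audio)))


class TwoWayTranslator:
    """Holds both directions and a manual switch, like the language button."""

    def __init__(self, forward: Direction, backward: Direction) -> None:
        self._directions = {"en->ar": forward, "ar->en": backward}
        self.mode = "en->ar"

    def switch(self) -> None:
        self.mode = "ar->en" if self.mode == "en->ar" else "en->ar"

    def speak(self, audio: bytes) -> bytes:
        return self._directions[self.mode].run(audio)


def stub_direction(tag: str) -> Direction:
    # Placeholder stages: bytes stand in for audio, a tag stands in for
    # translation. A real system would plug in actual engines here.
    return Direction(
        recognize=lambda audio: audio.decode("utf-8"),
        translate=lambda text: f"[{tag}] {text}",
        synthesize=lambda text: text.encode("utf-8"),
    )


device = TwoWayTranslator(stub_direction("en->ar"), stub_direction("ar->en"))
print(device.speak(b"hello"))
device.switch()
print(device.speak(b"marhaba"))
```

    Because each direction carries its own three components, six engines have to be resident at once, which is where the memory pressure described above comes from.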

    The prototype also has a camera attachment that translates text like that on street signs, said Waibel. Snap a picture of a sign with the camera and it automatically extracts the text region, puts the text through a character recognition program, then translates it, he said. "What you then see on the screen is the picture of the scene with a sign and then underneath an English subtitle," he said.

  12. I can see it now... by stienman · · Score: 4, Funny

    "Are you speaking the english?"

    "I speak to the English, it's the Americans I won't talk to..."

    -Adam

  13. That's great, but ... by fastdecade · · Score: 2, Insightful

    First can we have a PDA that does decent text-to-speech or speech-to-text, preferably both.

    A hardware babelfish will revolutionise human communication later this century, but right now you need both of the above before you can begin to contemplate speech-to-speech. I can't imagine any serious algorithm at this time would attempt direct translation, without an intermediate text translation phase.

    Bit OT: Considering the interest in E-Books, I don't know why music players and PDAs force users to download wave forms when we could just download text and convert using a cheap text-to-speech synth.

    1. Re:That's great, but ... by cavebear42 · · Score: 2, Insightful

      Forget PDA, I would like any software that can do a decent speech-to-text. Every year or so I try all the latest stuff. Every year I keep typing. It is more likely that the rest of the world will learn English than that we will have an effective translator in real time.

  14. Yelling Helps by aredubya74 · · Score: 3, Funny

    Outstanding. This thing will finally make the common Ugly American practice of yelling actually useful:

    *hold PDA to face* Ahem! "WHERE IS THE BATHROOM?!" *hold PDA to foreigner's ear*

    --

    RW

  15. "My hovercraft is full of eels" by NZheretic · · Score: 5, Funny
    With apologies to the python crew...

    Text on screen: In 2004, the World Trade Center lay in ruins, and foreign nationalists frequented the streets - many of them Arabs (not the streets - the foreign nationals). Anyway, many of these Arabs went into tobacconist's shops to buy cigarettes....

    An Arab tourist approaches the shopclerk. The tourist is talking haltingly into a PDA.

    Arab: I will not buy this record, it is scratched.
    Clerk: Sorry?
    Arab: I will not buy this record, it is scratched.
    Clerk: Uh, no, no, no. This is a tobacconist's.
    Arab: Ah! I will not buy this *tobacconist's*, it is scratched.
    Clerk: No, no, no, no. Tobacco...um...cigarettes (holds up a pack).
    Arab: Ya! See-gar-ets! Ya! Uh...My hovercraft is full of eels.
    Clerk: Sorry?
    Arab: My hovercraft (pantomimes puffing a cigarette)...is full of eels (pretends to strike a match).
    Clerk: Ahh, matches!
    Arab: Ya! Ya! Ya! Ya! Do you waaaaant...do you waaaaaant...to come back to my place, bouncy bouncy?
    Clerk: Here, I don't think you're using that thing right.
    Arab: You great poof.
    Clerk: That'll be six and six, please.
    Arab: If I said you had a beautiful body, would you hold it against me? I...I am no longer infected.
    Clerk: Uh, may I, uh...(takes PDA, talks to it)...Costs six and six...ah, here we are. (speaks weird Arabic-sounding words)
    Arab punches the clerk.

    Meanwhile, a cop on a quiet street cups his ear as if hearing a cry of distress. He sprints for many blocks and finally enters the tobacconist's.

    Cop: What's up
    Arab: Ah. You have beautiful thighs.
    Cop: (looks down at himself) WHAT?!?
    Clerk: He hit me!
    Arab: Drop your panties, Sir William; I cannot wait 'til lunchtime. (points at clerk)
    Cop: RIGHT!!! (drags Arab away by the arm)
    Arab: (indignantly) My nipples explode with delight!

  16. Just wait ten years by mschuyler · · Score: 5, Interesting

    I believe PDAs are going to be tremendously transformed over the next few years.

    1. Convergence is going to happen with a vengeance. The Treo 600 is just the start. More and more apps will make it to the PDA. Speech recognition is one, and that sets up for another dynamic...

    PDAs don't really need screens and keyboards if you can talk to them and they can talk to you. If they don't need those components, they can get a whole lot smaller. The next generation PDAs will be like a hearing aid, and the ones after that will be built into your glasses or an implant. That means less power, so less battery. Besides, it will be able to run on your body heat if not tap into your own body's electrical system, so it won't need a battery. Every improvement along these lines dwindles the size even more. A heads-up display, made transparent or opaque, ought to handle those times when you need to really observe rather than consult.

    A combination of AI and connectivity will mean your PDA is your first line of defense in many of life's situations. Get pulled over by a cop and it will tell you what to do, what NOT to do, and contact your lawyer. Need a cop and it will call them and know just how long it's going to take to get there.

    Medicine: It will have a complete medical history of you, remind you to take your meds, and monitor your blood pressure and other vital signs. If you have a heart attack it will call 911 with your location and be the first thing the medics consult when they get to you.

    Personality: You'll be able to choose its level of humor and sarcasm. Although clearly a machine, people will develop meaningful relationships with them, at least they'll think so.

    Connectivity: Everything you can think of, including your own house, which you'll call up to turn the heat up since you're coming home early. All the Wi-Fi/cell connectivity you want will be built in.

    Finances: It will know everything you do and provide access to your dough. If you get overdrawn it will be intentional because it will have real time access. It will have all the ATM/debit/credit stuff all on-hand. It will also be able to shop for you and tell you where the best deal is.

    It will know all your friends and business associates and help remind you, "This is Joe. He's a Cougar. He knows you're a Husky, but don't rub it in. His kid just joined the Navy. He thinks LOTR sucks, and Rush is Right, so be careful. He drinks Guinness. His budget is 250K and he's looking to upgrade the Ciscos."

    You'd never think of leaving home without this. Indeed, since it very well may be built-in, you won't have to worry about it. Just keep up the subscription.

    --
    How about a moderation of -1 pedantic.
  17. don't we need actual voice recog first by netsavior · · Score: 2, Insightful

    my experience with voice recognition (yes even your beloved Via-Voice) is that it blows and will for some time. We probably need better speech recognition before we get speech to speech.

  18. Travelling by elf-fire · · Score: 2, Informative

    Well. I have been to quite a few places where English was not exactly lingua franca. In most of these places semi-right pronunciation of foreign words would not have had a big impact. Hand gestures and my favourite dictionary (which contains pictures of just about anything one would ever need 'on the road') have always been sufficient to find a hotel, a train or bus ticket out and some food. For the latter: Just walking into a restaurant's kitchen and pointing at the visible ingredients (dead or alive ;) ) suffices, and can generate a lot of fun in the process :)

  19. Bah! by MoeMoe · · Score: 2, Funny

    Silly foreigner, don't you know everyone speaks English?

    --
    Business \Busi"ness\, n.;
    A scam in which all people involved perceive as beneficial...
  20. ...where English speakers are uncommon by JGag21 · · Score: 2, Funny

    Like Miami???

  21. There are telephone translation services. by Moderation+abuser · · Score: 2, Interesting

    So all you need is a mobile phone. You phone up the number for the language you need translated to, tell the translator what you want to say and hand the phone over to the person you want to talk to. Quite expensive per minute, but cheaper than a PDA and very very handy in an emergency.

    Course, you could learn another language; it isn't remotely as difficult as school makes it out to be. English is one of the more difficult languages to learn. If you learn one of Italian, French, Spanish, or Portuguese you should be able to pick the others up fairly quickly. English is based on a Germanic language with a lot of the French and Roman influences chucked in on top; it's a real mishmash.

    --
    Government of the people, by corporate executives, for corporate profits.
    1. Re:There are telephone translation services. by belmolis · · Score: 3, Insightful

      Aptitude testing is useful, but two other major factors in the success of the US government language schools (there are actually four: The Defense Language Institute in Monterey, the Foreign Service Institute, the CIA Language School, and the NSA Language School) are time and focus. In most other situations, such as high-school or college, people studying a language study it a small fraction of the time. It's one of four or more courses. Class time is 3-5 hours per week. On a typical university schedule, that's a maximum of 130 hours a year in class. In contrast, in the government language schools, language study is the whole show. Students spend 8 hours a day or more on the language (not all in class). That comes to much more time devoted to the language, and there are fewer distractions.
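      To put rough numbers on that gap, here is a small back-of-the-envelope calculation. Only the 5 hours per week, the 130-hour maximum and the 8 hours per day come from the comment above; the 26 teaching weeks (which is what makes 5 hours a week come to 130) and the 5-day week are assumptions added for illustration.

```python
# Figures from the comment above.
UNIVERSITY_HOURS_PER_WEEK = 5   # upper end of "3-5 hours per week"
INTENSIVE_HOURS_PER_DAY = 8     # "8 hours a day or more"

# Assumptions for illustration only.
TEACHING_WEEKS_PER_YEAR = 26    # implied by 5 h/week reaching 130 h/year
STUDY_DAYS_PER_WEEK = 5

university_hours = UNIVERSITY_HOURS_PER_WEEK * TEACHING_WEEKS_PER_YEAR
intensive_hours = (
    INTENSIVE_HOURS_PER_DAY * STUDY_DAYS_PER_WEEK * TEACHING_WEEKS_PER_YEAR
)
ratio = intensive_hours / university_hours

print(university_hours, intensive_hours, ratio)
```

      Over the same number of weeks, the intensive schedule works out to several times the exposure, before even counting the effect of having fewer distractions.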

  22. Will it work on politicians? by leery · · Score: 2, Funny

    How about the opposite sex? Parents? Now those would be Nobel-prize-worthy accomplishments.

    --
    "This is not a sig." -- R.