Google Shooting For Smartphone Universal Translator
nikki4 writes to tell us that, while making major improvements to its existing voice recognition tool for smartphones, Google is aiming for new translator software that will provide instant translation of foreign languages. "The company has already created an automatic system for translating text on computers, which is being honed by scanning millions of multi-lingual websites and documents. So far it covers 52 languages, adding Haitian Creole last week. Google also has a voice recognition system that enables phone users to conduct web searches by speaking commands into their phones rather than typing them in. Now it is working on combining the two technologies to produce software capable of understanding a caller’s voice and translating it into a synthetic equivalent in a foreign language."
Maybe my experience is atypical, but Google doesn't seem to translate pages very well. I can only imagine how bad it will be having a phone do this. "Did that guy's phone just call me what I think it did?"
Or you could just stick this Babel fish in your ear.
...this is a recipe for universal worldwide hilarity.
Caution: not for use with Hungarian Tobacconists.
Does anyone use voice recognition software? Here are a couple of my voicemails transcribed by Google Voice:
Hey man, Hello, this is gonna ask you about Stockton uncle in a missed your call, so, so give well. Okay bye.
Hey it's me and I for me. Long, My of the day. So Hey Jared, Here doing. If you come for another anti, gimme a call before you go to sleep and stuff, so give me a favor you familiar with it. I love you bye.
Whale
Google is in the business of collecting data and applying it to practical problems. I imagine the voice-to-text will be vastly improved over its generations by users accepting/rejecting the VTT result and then pooling the results data. The same thing could be done for translation from one language to another.
I see it as crowdsourcing the algorithm accuracy checks among millions of people, allowing them to improve the algo at a much faster rate than they (or their competitors) would otherwise be able to do in a closed testing environment.
This is all speculation on whether Google pools the results of translations or VTT along with whether the user accepts or declines them. I wouldn't be surprised in the slightest if they did.
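For what it's worth, the pooling idea above could be sketched roughly like this. To be clear, this is only a toy illustration of tallying accept/reject votes, not anything Google is known to do; the class name `FeedbackPool`, its methods, and the utterance IDs are all made up for the example.

```python
from collections import defaultdict


class FeedbackPool:
    """Hypothetical pool of user accept/reject votes on candidate transcripts."""

    def __init__(self):
        # Maps (utterance_id, transcript) -> [accepted_count, rejected_count]
        self._votes = defaultdict(lambda: [0, 0])

    def record(self, utterance_id, transcript, accepted):
        """Store one user's verdict on one candidate transcript."""
        counts = self._votes[(utterance_id, transcript)]
        counts[0 if accepted else 1] += 1

    def acceptance_rate(self, utterance_id, transcript):
        """Fraction of users who accepted this candidate, or None if unseen."""
        accepted, rejected = self._votes.get((utterance_id, transcript), (0, 0))
        total = accepted + rejected
        return accepted / total if total else None

    def best_transcript(self, utterance_id):
        """Candidate with the highest acceptance rate; ties go to more votes."""
        candidates = [
            (acc / (acc + rej), acc + rej, text)
            for (uid, text), (acc, rej) in self._votes.items()
            if uid == utterance_id
        ]
        return max(candidates)[2] if candidates else None


pool = FeedbackPool()
for verdict in (True, True, True):
    pool.record("vm1", "give me a call", verdict)
for verdict in (False, False, True):
    pool.record("vm1", "gimme a whale", verdict)
print(pool.best_transcript("vm1"))
```

A real system would presumably feed these tallies back into model training rather than just picking a winner, but the bookkeeping would look something like this.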
Tablespoons, by an Apple Newton
or [allegedly] what happens when you run Jabberwocky through a handwriting recognition program.... :-)
-----------
Teas Willis, and the sticky tours
Did gym and Gibbs in the wake.
All mimes were the borrowers,
And the moderate Belgrade.
'Beware the tablespoon my son,
The jaws that bite, the Claus that catch.
Beware the Subjects bird, and shred
The serious Bandwidth!'
He took his Verbal sword in hand:
Long time the monitors fog he sought,
So rested he by the Tumbled tree,
Long time the monitors fog he sought,
And as in selfish thought he stood,
The tablespoon, with eyes of Flame,
Came stifling through the trigger wood,
And troubled as it came!
One, two! One, two! And through and through,
The Verbal blade went thicker shade.
He left it dead, and with its head,
He went gambling back.
'And host Thai slash the tablespoon?
Come to my arms my bearish boy.
Oh various day! Cartoon! Cathay!'
He charted in his joy.
Teas Willis, and the sticky tours
Did gym and Gibbs in the wake.
All mimes were the borrowers,
And the moderate Belgrade.
"Shaka, when the walls fell"
"Gorbachev Sings! Tractor! Buttocks!"
Why, without your clothes, you're naked, Miss Dudley!
The problem is primarily things like diction. You can "train" someone sitting in front of a computer to speak slowly and clearly with good diction. Fine.
The problem is that the most useful use case for a cell phone translator would be getting a cab or walking into a store. You talk into your phone and it says something to the other person in their language - wonderful, because you have "trained" yourself to speak clearly and slowly with good diction.
Then the other person mumbles something back at you in their language that neither you nor the cell phone can make heads or tails of. You can't "train" them so it will never work for that.
From my limited experience, English has its share of strange accents and such but in large measure people can speak with good diction and pronunciation. Lots of non-English languages seem to promote far less clarity and human-to-human it doesn't really impair communication that much. Human-to-machine is a whole different story and we are very far away from being able to do speech recognition with poor pronunciation and poor diction.
Stephen Fry offers...
"Hi, Stephen, it’s Natasha from BBC Newsnight in London. Just to say I’ve sent you two texts. One is to say that we could do it at eleven am your time after the launch, or any time sooner after the launch, or we could do it at midday as we suggested earlier. I, er, if you could text me back about that, and I’ve sent you the details of Skype that you need to do too. If you could give me a call back. Enjoy the launch and I’ll speak to you after that. Thank you Bye."
I’ve transcribed it from the voicemail sound file that resides online in my inbox on the Google Voice site. All fine. I have also ticked the option for Google Voice to send me a text transcript of any voicemail. Below is their interpretation of Natasha’s message; it’s rather endearing how hopelessly wrong the largest company on earth gets it.
"Hi Stephen. It’s Jeff from BBC needs in nuns. And just to say I sent 80 tax, one, if to say we could do it. I left in i a m your time off to go into any time soon, or the court and full we could grab me today as we suggested at. A. F. I. If you could text me back byebye. I’ve sent you the details of skylights that you need to 3 T if you could give me a call. Bye. Enjoy the loans. I’ll speak to you after that. Thank you. Bye"
On a more serious note, such transcripts at least allow you to get an idea of the rough content and tone of a message without having to stop and listen to it, a much more concentration-intensive task.
No kidding!!! What do you say at this point?
This seems like something that the NSA is probably salivating over. Imagine being able to translate intercepts in near real time with accurate voice recognition. I'm sure they already have imagined it. That technology is nothing short of a Manhattan Project for the SIGINT community.
A better example would be, say, Dutch. Translate the OP from English to Dutch and back to English (i.e. a worst case scenario), and you end up with this:
"The company has an automatic system for translating texts on computers, sweetened by scanning millions of multilingual websites and documents. Until now includes 52 languages, adding Haitian Creole last week. Google has a system telephone speech recognition that allows users to query websites by speaking commands into their phones instead of typing them in. Now it is working on combining the two technologies to software to understand voice of a caller and translating it into a synthetic equivalent in a foreign language to produce. "
This is perfectly legible to me, and vastly better than what you got when Babel Fish was introduced 11 years ago. There is a good TechTalk about the topic at http://www.youtube.com/watch?v=y_PzPDRPwlA which should be required viewing before making fun of Google's machine translation efforts.
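If you want a rough number rather than an eyeball judgment, one crude way to score a round trip like the one above is to compare the word sequences of the original and the back-translated text. The sketch below uses Python's standard `difflib.SequenceMatcher` on one sentence from the OP and its Dutch round trip; the helper names `words` and `round_trip_similarity` are invented for this example, and word overlap is only a blunt proxy for legibility, not a real translation-quality metric.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def round_trip_similarity(original, round_tripped):
    """Similarity in [0, 1] between the two word sequences (1.0 = identical)."""
    return SequenceMatcher(None, words(original), words(round_tripped)).ratio()


# One sentence from the OP and the same sentence after English -> Dutch -> English.
original = "So far it covers 52 languages, adding Haitian Creole last week."
round_tripped = "Until now includes 52 languages, adding Haitian Creole last week."

score = round_trip_similarity(original, round_tripped)
print(f"round-trip word similarity: {score:.2f}")
```

A paraphrase that keeps the meaning but changes the wording scores low here, so treat the number as a sanity check only.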
Voice recognition is harder, but for continuous untrained speech recognition Google Voice is pretty cool. I've gotten some barely intelligible voice messages on my Google Voice number: where Google Voice is sure (i.e. black text) it is 95%+ correct, and where it is not sure it is maybe 30% correct, while for another 30% it is not possible to figure out what was said except by taking context into consideration. Google Voice transcribing a call from a mobile phone is better than what you got with Dragon Dictate 5 years ago even with a good microphone, so it is not unlikely that in a few years it will be better than naive human transcription. Humans will be better at guessing based on context, though.
Basically, in 5 years the kind of system Google is talking about will work well enough to successfully flirt with a French girl (see http://www.youtube.com/user/searchstories) :P
[*] This is why you should always bring a mobile phone, and have the number for the place you're going.
This would be great if all calls are translated and spoken by a sexy female voice!
The purpose of writing is to inflate weak ideas, obscure poor reasoning, and inhibit clarity....Calvin