Google Is Taking Spoken Questions
The New York Times is reporting that Google has added a voice interface to its iPhone search software. The feature is expected to make its debut as early as Friday. Users will be able to speak into their phone and ask any question they could type into Google's search engine. The audio will be digitized and results will be returned via the normal search interface. "Google is by no means the only company working toward more advanced speech recognition capabilities. So-called voice response technology is now routinely used in telephone answering systems and in other consumer services and products. These systems, however, often have trouble with the complexities of free-form language and usually offer only a limited range of responses to queries."
If I could get good voice transcription on my computer by installing Google Desktop, THAT would make it worthwhile.
Something for iPhone users? I couldn't care less.
Electronics are on a 50+ year run of continuously shrinking, but you can't do much more shrinking with our current user interface schemes. You could make an iPhone as thin as a credit card, but that's only a few "doublings" away from its current state, and then what? Making it smaller in either of the other two dimensions is just going to decrease its functionality.
But eyeglasses can have a heads-up display. And in the magical world of tomorrow, maybe contact lenses can have a wireless interface and a high-resolution superimposed display. And of course there are wireless headphones. So those are potential conduits of information from machines to us, but how do we talk to the machines? If we're walking around, how do we dial the phone or ask for directions or tell the computer what YouTube clip we want to watch, if the heart of the thing can be the size of a penny?
I think it's going to be verbal. Short of the development of neural interface implants or that sort of thing, I think verbal's going to become a primary interface for mobile electronics. I think the chip that stores everything and wirelessly talks to the outside world and "makes it go" can be anywhere: wristwatch, glasses, tie clip, belt buckle. Whatever. But the thing's going to have to listen to you, so it can understand when you say "Show me the closest three book stores. Do any of them have a physical copy of Into The Nano Era, Moore's Law Beyond Planar Silicon?" Or "play my running playlist on random," or that sort of thing. Not strong AI, but good voice recognition coupled with really dramatically improved ability to parse and interpret commands from speech. I'm sure we'd need a couple of buttons or a knob or slider or such somewhere for things like volume that you just don't want to do with voice. But it strikes me that most of what people do on their iPhones, except for playing games, could be done quite well via voice, and then you don't need to lug around and pull out some physical gadget and stare at its screen and peck at it with your fingers.
Can anyone tell me how to set my sig on Slashdot?
Representative, Representative, Representative ... Operator, Operator, Operator ... Help, Help, Help ... (hangup in disgust)
Hope Google has better luck with this than others have.
Ron
Just curious, but why only support iPhone? Why not Nokia/WinMo/Blackberry, i.e., the other 99% of cell phones out there with voice recognition capabilities? Why single out one phone?
I've been waiting for Google to come out with this.
This is the first step to true and accurate voice recognition and translation:
1) Google user speaks search string into phone.
2) Google gets it wrong, user corrects Google
3) Multiply by millions of searches daily with constant correction and feedback from users
4) Perfect voice rec, major profit
There will be a few issues with voice recognition to begin with, but as it gets better and more people use the service, add their corrections to the database, and widen the pool of accents, the accuracy will improve at an exponential rate.
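For anyone who wants the loop in steps 1 to 3 spelled out, here is a rough Python sketch of the idea. To be clear, this is my own toy illustration and not anything Google has described: the `CorrectionLog` class, the `min_votes` threshold, and the sample phrases are all made up. It just tallies what users say the recognizer should have heard and starts trusting a correction once enough people agree.

```python
from collections import Counter, defaultdict


class CorrectionLog:
    """Tallies user corrections per recognized phrase (hypothetical helper)."""

    def __init__(self, min_votes=3):
        # min_votes is an assumed threshold chosen for illustration only.
        self.min_votes = min_votes
        self.votes = defaultdict(Counter)

    def record(self, heard, corrected):
        """Step 2: a user fixes what the recognizer thought it heard."""
        self.votes[heard.lower()][corrected.lower()] += 1

    def best_guess(self, heard):
        """Step 3: use the most popular correction once enough users agree."""
        tally = self.votes.get(heard.lower())
        if not tally:
            return heard  # nobody has corrected this phrase yet
        top, count = tally.most_common(1)[0]
        return top if count >= self.min_votes else heard


log = CorrectionLog(min_votes=3)
for _ in range(3):
    log.record("wreck a nice beach", "recognize speech")
log.record("wreck a nice beach", "wreck a nice peach")  # one "disruptive" user

print(log.best_guess("wreck a nice beach"))
print(log.best_guess("play my running playlist"))
```

The vote threshold is the part that keeps a single bad correction from winning, which is the same reason a handful of vandals don't wreck a heavily edited wiki page. A real system would presumably feed the corrections back into the acoustic and language models rather than keep a lookup table like this.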
A similar concept could apply to translations. Once voice recognition is perfected and becomes the primary search input of choice, more people will be able to use their phones as direct voice-to-voice translators. Obvious translation mistakes will become apparent through mass use. At every turn users could flag apparent mistranslations, and through the help of the Google Borg accurate translations would evolve, much the same way that Wikipedia pages tend toward accuracy over time even with the input of a subset of "disruptive" users.
My 2 cents.
ogglelog