Sophisticated Voice Commands the Next Big Step For Smartphones, Says Woz
splitenz writes "Sophisticated voice commands will be the next big step for the development of smartphones, according to Apple co-founder Steve Wozniak. 'We have gotten down to such tiny devices with amazing computers inside with all the human senses; vision, hearing, touch, location acceleration and movement. I don't want to click buttons anymore and I just want to do things without having to think about which buttons to click.' He was speaking at the Australian Chamber Business Congress. Wozniak also sees a continuing place for touchpads."
I don't want to have to yell at my phone in public, I don't want to have to remember which keywords to say.
I have Google voice search and the other crap it does, and I never use it. It is far easier, faster and less annoying to myself and others to type in what I want.
We can have more people yelling into their phones. "Call Frank. No, not Balls Sank! CALL FRANK!"
I'm a good cook. I'm a fantastic eater. - Steven Brust
NOISE REJECTION... the iPhone voice control works great with my Bluetooth helmet below 40mph, but as soon as I hit highway speeds it stops responding to my commands.
It's great to be able to ride and change songs and make and receive calls, but I'd love to be able to select podcasts too. Right now that does not work: it handles only playlists, or you need to manually start your podcast, and then it will play everything in that podcast folder.
Do not look at laser with remaining good eye.
There are things I would love to do with voice on a mobile device. Playlists, nav, texting, dialing. What I don't want is to live in a world full of people talking to their phones or themselves. Can you imagine a mall full of people using voice to text?
Or more simply hell.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Good idea, but too late for Woz. He could have tried "Phone, find me some proper dance steps for the Argentine Tango."
Trolling is a art,
+1 concur.
My Mac had voice commands in the 90s, but it didn't include useful stuff like being able to choose which file was selected etc, despite being able to open and close files/windows with it. Even if it could do everything that your mouse and keyboard can do, it's still faster to just use your hands for the most part. Voice command is great for people with disabilities, but on a smartphone in a busy environment, what's the point? It's either not going to work because of ambient noise, or you're just going to piss everyone off.
Bonus clip.
which is totally what she said
Exactly my experience as well. Some limited voice commands I have found useful (play song, call this person, etc) but the others are often too much broadcasting of my activities
Presumably this is just part of some prank. We'll see who he gets to think their cell phone is out shopping for groceries.
This cannot be a replacement for tactile input. While using voice commands to accomplish a task can be very useful in some situations, there are many places where it would be inappropriate or inconvenient. Not to mention that typing is faster than talking, and using a mouse allows for precision when it is needed.
A truly well developed voice command system would be a boon to accessibility and convenience in many situations but cannot fully replace the older styles of input.
When we have a useful thought interface THEN we can probably ditch the keyboards and mice once and for all.
I'll meet you at the intersection of "Should be" and "Reality"
Shut up friends. My internet browser heard us saying the word Fry and it found a movie about Philip J. Fry for us. It also opened my calendar to Friday and ordered me some french fries.
I just invaded Grammar Czechoslovakia and duped Grammar Neville Chamberlain; now it's on to Grammar Poland.
The situation with people yelling into their phones is already way too annoying when sitting in a restaurant, bus, or other public place without adding more vocal static to the background.
I propose working on a Cone of Silence add-on for cell phones, or maybe a neural headset transmitter-to-speech accessory.
Join the Slashcott! Feb 10 thru Feb 17!
irrelevant commentator is irrelevant
Even among humans, how many times must one ask someone to say it again?
Giving voice commands to a computer will get you a row boat when you ask for a robot and tickets for a nudist play when you want a new display.
Let me guess:
iVOICE?
Yours In Miami,
K. Trout
This pretty much sums up my experiences with voice recognition technology.
If I seem short sighted, it is because I stand on the shoulders of midgets
Kinda surprising that they will be first to have a fully voice activated phone. Either way, it's coming to everybody.
The price is always right if someone else is paying.
use it every day when driving, it just works...
"hey vlingo, call steve"
http://www.vlingo.com/
http://slashdot.org/~GuyFawkes/journal
Some of the basics are just fine - Like when I'm driving it's convenient to say "Call ***** mobile" and have it ring my girlfriend. But for most other applications (games, calendar, movies), I'm going to have to interact with my phone anyway just because of the nature of the activity. What use is it to say "Play Fight Club" if I'm not going to be holding and viewing the device?
He's getting rather old, but he's a good mouse.
If you have to remember keywords, it's not the sort of system I think Woz is ultimately talking about. How many years until IBM's Watson will fit entirely within your cell phone? Imagine something that you could chat with under your breath as if it were a person, not something like the voice command software of the 1990s.
Imagine being able to mutter under your breath "now how do I get to the doctor's office?", and your phone presenting a little notification that it heard you and has an answer if you're interested, and then you tap on that and see a map with three routes plotted out and ranked.
Some day.
I know "The Woz" is a geek favourite, and he certainly has technical prowess, BUT the man is also a self-aggrandising fool who has a bad habit of exaggerating things. He wrote a book "How I Invented the Personal Computer And Had Fun Doing It" for feck sake. The man, for all his prowess, did NOT invent the PC. I'm sure I'll be modded into oblivion but it has to be said. I wouldn't take any of his predictions seriously.
These posts express my own personal views, not those of my employer
While it seems we have many people here that would make good slapstick writers, I've never had to comically yell at Google search to make it work. I simply say:
"Call Eve"
"Text Eve, I'll be home in 20 minutes"
"Map of Walmarts"
"Directions to Walmart"
"Navigate to Walmart"
"Note to self, post something on Slashdot tonight"
"Listen to Beethoven"
"Go to Wikipedia"
And other very useful commands. Yes, most of the time I use my fingers (especially in public), but there are still many times that the voice commands are invaluable.
I was actually thinking about this earlier today. Admittedly, I do love how accurate speech-to-text is on my Android phone; typing out a text message *is* a lot quicker that way. However, my thoughts were more focused on how "For sales, press 1. For support, press 2. For billing, press 3." has been replaced with an automated voice (invariably female) who says "tell me what you're calling about", and then hasn't the slightest idea what to do when I say "technical support" or "representative"...or tries to evade actually sending me to one. Here's the epiphany I had:
A voice-activated interface, independent of an AI, is a command line.
The commands are different, and generally optimized for things people are likely to say rather than minimizing typing, but when you boil it down the user must know the commands the computer will recognize. Until a viable means of comprehension is paired with voice activation, all we've got is a different means of doing the very thing that GUIs were designed to move us away from.
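To make the command-line comparison concrete, here is a minimal sketch, assuming a hypothetical fixed grammar of phrases. The names `COMMANDS` and `parse_utterance` and the patterns themselves are made up for illustration and are not taken from any real phone. Like a shell, the matcher only acts on phrasings it already knows; anything else falls through.

```python
import re

# Hypothetical command grammar: each entry maps a fixed phrase pattern
# to an action name. These patterns are illustrative assumptions only.
COMMANDS = [
    (re.compile(r"^call (?P<contact>.+)$"), "dial"),
    (re.compile(r"^text (?P<contact>\w+),? (?P<message>.+)$"), "send_text"),
    (re.compile(r"^directions to (?P<place>.+)$"), "directions"),
]


def parse_utterance(utterance):
    """Return (action, arguments) for a recognized phrase, else None."""
    text = utterance.strip().lower()
    for pattern, action in COMMANDS:
        match = pattern.match(text)
        if match:
            return action, match.groupdict()
    # No pattern matched: the user said something outside the grammar.
    return None
```

With this grammar, "Call Eve" is understood, while "Could you ring Eve for me?" returns `None`, which is the "user must know the commands" problem in miniature.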
Now I'm not saying that moving in this direction is a bad idea, like I said dictating text messages is a good thing. However, I think that there are other reasons why 100% voice interactivity will never be fully actualized, and they're not technical. If a person is texting in a movie theater, sitting toward the back with their phone on silent and the brightness low, it's not disturbing a whole lot of people. When I'm on the train, everyone is reading m.cnn.com, catching up on e-mail, or watching a video, and it's mostly quiet. Imagine half the train trying to dictate a URL? How do you play Angry Birds with your voice? Have you ever texted the person sitting next to you for the very reason that what you're saying isn't appropriate to be said aloud? The list continues like that.
Voice commands work pretty well on my Windows phone. "Call xxx Home" never fails. Well, actually it did fail when I first set up the phone for English UK, (I want my U in colour) but it was expecting an English accent, and wouldn't respond to my Canadian. I'm wondering how it works in the Southern US. "Y'all Call Home"
I call my girlfriend ***** too.
is Scotty talking into the Mac's mouse.
I tell my iphone - Call. Jane. Bonner.
iphone says - Calling Dave Norwood.
iPhone voice dialling is totally useless at the moment. Maybe they should fix that first.
None of them can see the clouds; The polished wings don't care.
Higher, higher, a tad higher. No, lower. Just a smidge higher. There! That's it! Now pull back, more, more, a little less, good. OK, shoot.
First recognized command
"Find Wankage Material"
Second recognized command
"fap...fap...fap...uhhhuhhhhhhh"
Third recognized command
"zzzzzzzzzzzzzzzzzzzzzzz"
Sig Follows: "Suppose you were an idiot. And suppose you were a member of Congress. But I repeat myself." -- Mark Twain
Even if it were as good as a human, the failure rate would still be too high. Instead of whispering, typing is more accurate and much faster.
http://dilbert.com/strips/comic/1994-04-24/
Welcome to the Panopticon. Used to be a prison, now it's your home.
FORMAT C ENTER!
Bow before me, for I am root.
Watch, Apple will force this to be the only way to do things on their devices, and their customers will buy their crap anyway instead of taking a stand. It's what they do.
Any more such brilliant insights, Mr. Woz? And who do you suggest Apple should try to rip-off this time to get the technology?
Of course, if Apple's foray into handwriting is any indication, Apple will "solve" this problem by having us speak in Morse code, just like they didn't manage to get a decent handwriting system together.
We need the Woz to tell us this? Google's Voice Commands on the Android have been out for a while now, and GOOG-411 was out for years before that. This was a long term initiative for them and they're way ahead of anyone else. - www.awkwardengineer.com
Agreed. Have you ever noticed how in Trek, computer voice interactions were generally limited to single actors in a given scene, generally either the ranking officer or a technical expert? Something like that might work.
The communicator was also voice-commanded, of course. But they never tried it in a bar. :)
-- IANAL, this isn't legal advice, and definitely isn't legal advice for you. Also, Squee!
Over a decade ago, there was a really good voice-controlled phone system called Wildfire (audio demo). It took a lot of computer power for the time, it was an expensive service to provide (racks of machines in the central office) and originally cost about $5 to $10 a day. It let you juggle multiple calls and callers through a very fast-responding voice interface.
Orange, the European mobile provider, offered Wildfire as an extra-cost service from 2000 to 2005, then discontinued it over customer objections. Then Microsoft bought the company behind Wildfire, did nothing with the technology, and closed it.
Today, it should be possible to put Wildfire capability in a phone. So what Woz is proposing is really 1998 technology.
My phone smells funny and tastes like eww.
Not notable, IMO a rubbish article.
Nothing like trying to call a friend late at night on a Saturday, and your phone thinks it's a great idea to call your landlord instead.
We all need to forget Woz and somehow resist the urge to rent multiple copies of 'Sphere' this weekend.
Some of the basics are just fine - Like when I'm driving it's convenient to say "Call ***** mobile" and have it ring my girlfriend. But for most other applications (games, calendar, movies), I'm going to have to interact with my phone anyway just because of the nature of the activity. What use is it to say "Play Fight Club" if I'm not going to be holding and viewing the device?
Thankfully it would be optional for those instances. I actually see quite a bit of use for it, and was disappointed with my iPhone's meager set of commands, which I never use because activating them takes too much effort and is not smooth. If I could treat it more like a personal assistant than an input device it would be nice, especially when driving.
À la Star Trek: "Computer, captain's log, make appointment on 15th of June, 3pm for tennis lessons."
"Computer, text Dave, 'Hey Dave, want to play a game?'"
"Computer, directions to business XYZ."
"Computer, what is 37.56 divided by 7?"
"Computer, send email to friend B, attach spreadsheet 123, send." Etc.
Why would we use speech with our phone?
(Returns to typing his blog on his iPhone...)
It's pronounced ASTRID, Walter!
Eloi are stupid, throw morlocks at them!
Slashdot comments will be filled with those who believe it must be an all-or-nothing concept, who can't imagine talking to their phone all the time, and who make up outlandish scenarios that like-minded editors with karma then mod up. ...
While the rest of us see it and think yeah, I had that same idea about 5 years ago and would find it very useful in certain situations I run into frequently. At least I'd have it as an option when I couldn't use my hands or was multitasking. Nice article and then move on to the next
Android's voice commands do three of those:
send text to
directions to
send email to
God is imaginary
"Well, we can always hope that adequate social pressure will prevent most people from doing stupid/annoying/obnoxious things, as with anything in life. "
You mean like having their ringer on in the movie theater? Having loud private conversations on their cell phone while standing in line at the store? Playing loud, profanity-filled music from their car while parked at the 7-Eleven at 10am on a Saturday morning?
Where do you live? Because I want to move there.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Agreed. Until computers reach the point where they can process and understand natural language like we see happen in Star Trek, I just don't see how voice can replace touch for all but a handful of situations (such as when hands aren't available while driving, exercising, etc.). If natural language processing can become capable enough that we can speak to computers normally and get the output we expect, then sure, it'll have a place. But until then, cell phone use has shown us that talking to someone or something that isn't present is inappropriate, and that others have a low tolerance for it.
An episode of 30 Rock had a voice-controlled TV and it started responding to voice commands from actors on the episode of Law and Order it was currently playing. Then Jack, in disgust, said "Garbage!" and the TV switched the channel to Keeping Up with the Kardashians.
The idea is to use speech to do complex stuff or answer questions that would take multiple screen-based input steps...
"order me a pizza" (phone leverages location data, payment data, etc. to order)
"how long will it take me to drive to Frank's house from here?" (phone responds with time/mileage/cost)
"When was the last time Mike called?" (phone responds with date/time/call length etc.)
"play me some Jay-Z music"
"add an appointment for next Wednesday to see the doctor at 8am, alert me the day before"
This is the kind of stuff that Apple is hopefully aiming for.
My God can beat up your God. Just kidding...don't take offense. I know there's no God.
I remember reading about an interesting concept a loong time ago in Dave Duncan's book "Strings".
Basically, the computer took a sample of your normal talking voice, then a sample of what they called 'command voice' or something. When characters were communicating to the central computer they'd simply use their command voice instead of their regular voice. The computer was able to tell which user was requesting what action based on voice identification, and would ignore regular speech unless instructed not to (e.g., "Start dictation It was a dark and stormy night...")
It seemed a pretty elegant solution to the whole 'how does the device know you're talking to it' issue, as long as the computer is able to a) positively identify your voice when compared against others and b) positively identify the difference between your 'normal' voice and your 'command voice'. I have no idea if such realtime discrimination will ever be possible, I just thought it was a cool idea.
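For what it's worth, the gating logic itself is the easy part. Here is a rough sketch that assumes the two hard problems (a) and (b) are already solved upstream: every utterance arrives tagged with a speaker identity and a "command" or "normal" voice label. The `Utterance` and `CommandGate` names and the label values are hypothetical, invented for this illustration and not taken from the book or any real system.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # assumed output of a speaker-identification step
    mode: str     # assumed label from a voice classifier: "command" or "normal"
    text: str


class CommandGate:
    """Decide whether the computer should react to an utterance."""

    def __init__(self, enrolled_speakers):
        self.enrolled = set(enrolled_speakers)
        self.dictating = set()  # speakers who asked for dictation

    def handle(self, utt):
        # Voices the system has not enrolled are ignored entirely.
        if utt.speaker not in self.enrolled:
            return None
        if utt.mode == "command":
            lowered = utt.text.lower()
            if lowered.startswith("start dictation"):
                self.dictating.add(utt.speaker)
            elif lowered.startswith("stop dictation"):
                self.dictating.discard(utt.speaker)
            return ("command", utt.text)
        # Normal speech is only captured while dictation is switched on.
        if utt.speaker in self.dictating:
            return ("dictation", utt.text)
        return None
```

Normal-voice chatter returns `None` until that speaker says "Start dictation" in command voice, after which normal speech is captured as dictation until "Stop dictation". Whether the tagging this relies on can be done reliably in real time is exactly the open question.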
Of course, looking at the current fail rate of simple voice command recognition, I'd suspect that the software has a longer way to go than does the hardware...
"I love animals! Some are cute, others are tasty, what's not to like?" - Betsy Schroeder, Jeopardy contestant
I am really hoping for something to log my daily actions; using the keyboard for that seems silly and unnecessary. For example, I just want to say to my phone, "Log this, the gardener started working today." And one year later it can tell me, "Increase his fee." Does this seem silly to you? I am so f. in need of something like this. The Nexus S seemed like the sh. but it wasn't what I was hoping for. But some good devs can extend this usage. Oh shit, I just gave an excellent idea to someone!
The environmental inputs change unpredictably, the data input rate is a snail's pace, there's too much variety in users' voices, accents, dialects, and vocabulary, and it requires massively more processing power than a keyboard or touch interface. It's as if your kid asked for a mouse for a pet and you gave them a schizophrenic incontinent three-legged elephant.