The Future of Speech Technologies
prostoalex writes "PC Magazine is running an interview with two of the research leaders in IBM's speech recognition group, Dr. David Nahamoo, manager of Human Language Technologies, and Dr. Roberto Sicconi, manager of Multimodal Conversational Solutions. They mainly discuss the status quo of speech technologies, which prototypes exist in IBM Labs today, and where the industry is headed." From the article: "There has to be a good reason to use speech, maybe your hands are full [like in the case of driving a car]. ... Speech has to be important enough to justify the adoption. I'd like to go back to one of your original questions. You were saying, 'What's wrong with speech recognition today?' One of the things I see missing is feedback. In most cases, conversations are one-way. When you talk to a device, it's like talking to a 1 or 2 year old child. He can't tell you what's wrong, and you just wait for the time when he can tell you what he wants or what he needs."
I've been waiting for years for speech recognition technology to get to an acceptable standard, and over that time I've used a couple of products. The one I got lately (dragonsoft I think) was OK, but they need to come quite a bit further before I'll be adopting it all the way.
I'm looking forward to when I can say "computer, open openoffice for me mate" and it'll go "sure"... That'll be sweet.
*''I can't believe it's not a hyperlink.''
Dragon NaturallySpeaking is a baby step in that direction, but it is pretty much limited to single nouns or verbs.
Human being (n.): A genetically human, genetically distinct, functioning organism.
I'm convinced speech technologies have a fantastic future when they are used for improving human communications, like providing an electronic Babel fish. However, it looks like most are concentrating on using speech as a way to interact with machines.
Which is so terribly inefficient and cumbersome. You really don't want to spend the time to socially interact with your coffee machine at 7am.
Unless it's able to go to the shop, put in exactly the right amount of coffee, and turn itself on once it hears you stumbling out of bed. It's next to useless if the only added value is that it switches itself on after you grunt "on" at it.
I think a mouse and keyboard with a screen are far faster than audio recognition/feedback will ever be.
Something that has not been mentioned, because, evidently, no one has actually worked with it, is that it is seriously annoying to work in the proximity of someone USING speech recognition. I worked with a fellow who had speech recognition on his machine and used it for programming. YOU try working on YOUR own code when someone is droning in the background: "for left paren int i equals zero semicolon i less than mumble mumble delete word delete word ..." ALL DAY LONG! Even with headphones on, it sometimes seemed like he was asking a question, and I'd remove the headphones and say "What was that?" "Nothing delete word". ARGGHHH. Keep me the heck away from people with speech recognition.
Tom.
I know nothing about the particular details of this deal, but wouldn't it make sense if IBM's sale of the patents also included a reciprocal agreement that Scansoft would not sue IBM in the future for use of its IP?
It just seems like IBM, seemingly a company obsessed with creating and preserving intellectual capital, wouldn't so hastily sell off patents that they might ever be able to use or need unless there was a catch, like getting access to Scansoft's portfolio as part of the bargain.
Just speculation, based on what I've read about how Big Blue operates.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."