TuVox Voice Interface
pablos writes: "NYTimes has an article about TuVox, which set up Handspring and Activision with voice interfaces for tech support. Apparently they can do away with the annoying 'press # now' menus. I've used things like TellMe, which played an ad every time it didn't understand you, but I'm wondering if this sort of thing is starting to work anywhere. Anybody called Handspring for tech support lately?"
Last year the company I work for got into a project that used a Cisco 2600 with a VoIP module and a product from IBM that let you interact with a website via the phone. We used PHP to create the VoiceXML documents that drive the voice menus, and we scraped the data from a local site that had weather and traffic info on it. It worked pretty well, considering that it was also done in German :)
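For anyone curious what "generating VoiceXML from a script" looks like, here is a minimal sketch. It uses Python rather than the PHP mentioned above, and the menu wording, form ids, and report text are made-up placeholders, not the actual project's documents. It only builds the XML; serving it to a voice gateway is left out.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def build_menu_vxml(prompt_text, choices):
    """Build a <vxml> document with one spoken/touch-tone menu.

    choices is a list of (spoken phrase, DTMF digit, target form id).
    All of these values are hypothetical examples.
    """
    vxml = ET.Element("vxml", version="2.0", xmlns=VXML_NS)
    menu = ET.SubElement(vxml, "menu", id="main")
    ET.SubElement(menu, "prompt").text = prompt_text
    for phrase, digit, target in choices:
        # Each choice can be picked by saying the phrase or pressing the digit.
        choice = ET.SubElement(menu, "choice", dtmf=digit, next="#" + target)
        choice.text = phrase
    return vxml


def add_report_form(vxml, form_id, report_text):
    """Add a form that simply reads out text (e.g. scraped weather info)."""
    form = ET.SubElement(vxml, "form", id=form_id)
    block = ET.SubElement(form, "block")
    ET.SubElement(block, "prompt").text = report_text
    return form


def render(vxml):
    """Serialize the document to a string."""
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    doc = build_menu_vxml(
        "Say weather or traffic.",
        [("weather", "1", "weather"), ("traffic", "2", "traffic")],
    )
    add_report_form(doc, "weather", "Placeholder weather report.")
    add_report_form(doc, "traffic", "Placeholder traffic report.")
    print(render(doc))
```

In a setup like the one described, the report text would come from the scraper at request time, so the document is rebuilt on every call.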
;)
I would have to agree that the technology is getting closer to replacing human beings...maybe I should go check my retirement plan now
..exactly new, is it?
:)
It's been a while since there was really much media hype about voice recognition technologies. Sure, the whole voice-activated menu thing ("1, 2, etc.") has been around for quite a few years, but I suppose there is a huge difference between repeating a few numbers and describing technical problems. I mean, is this literally a flowchart menu with various diagnostic paths, or does it actually try to understand a sentence? If it's the former, then that is nothing more advanced than what is currently available and probably in use elsewhere.
I wonder what would be more frustrating: repeating yourself twenty times to a computer to battle through a menu, or sitting for twenty minutes trying to explain your problem to an ex-Kmart first-line support engineer. The choice is yours.
"Never let the truth get in the way of a good story..."
Anybody called Handspring for tech support lately?
:)
That's the good thing about a Handspring: you have no need to call tech support.
I assume you mean Wildfire? Never have I laughed so much as when a friend of mine in a pub attempted to manage his voicemail. It must have taken 10 minutes to delete a message 'THROW IT AWAY!' Oh god, he looked such a fool.
And voice activated dialing - same person (this time at a club) tried to voice dial another friend - ended up calling his parents at 2:00am. They were not happy bunnies.
In the club this could be expected, but the pub was not too loud. The technology that Orange is using for Wildfire is just not up to scratch for normal use.
PS. There are some interesting 'features' in Wildfire (these phrases will not be exact, but play around with them): 'Do me a favour' gets the response 'What kind of favour?' you can then say 'I'm feeling depressed' which gets the response 'Why don't you tell someone who cares' or 'What does a cow say?' which gets the response 'MOOOOO!'
A friend of mine (from Australia) went to the US a year or two ago, and found himself needing to call a service which used such a system. When he did, he found that it could not understand his accent; after three unsuccessful attempts at doing an "American" accent, he gave up.
The moral of this story: make sure that there's a touch-tone menu to fall back on.
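A rough sketch of that fallback logic, in Python. The `recognize` and `read_digit` callables are hypothetical stand-ins for whatever the telephony platform provides, and the attempt limits are arbitrary assumptions.

```python
def choose_option(recognize, read_digit, options,
                  max_speech_attempts=3, max_digit_attempts=3):
    """Pick a menu option by voice, falling back to touch-tone.

    recognize():  returns the recognized phrase, or None if not understood.
    read_digit(): returns the key the caller pressed, as a string.
    options:      dict mapping spoken phrase -> touch-tone digit.

    Returns (phrase, method), where method is "speech", "touch-tone",
    or "operator" if neither worked (phrase is None in that case).
    """
    # First give the caller a few tries at speaking the option.
    for _ in range(max_speech_attempts):
        heard = recognize()
        if heard in options:
            return heard, "speech"

    # Speech failed (bad line, strong accent, noisy pub): use the keypad.
    by_digit = {digit: phrase for phrase, digit in options.items()}
    for _ in range(max_digit_attempts):
        digit = read_digit()
        if digit in by_digit:
            return by_digit[digit], "touch-tone"

    # Nothing worked; hand the caller to a human.
    return None, "operator"
```

With three misrecognized attempts followed by the caller pressing 2, this returns the option mapped to digit 2 along with "touch-tone".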
I've been using Sprint's voice dial service since it came out and it's pretty effective. I occasionally have to repeat myself, but I've uploaded my whole address book and it is very good at figuring out names.
For those of you unfamiliar with the system it works like this.
User: *
SPCS: Ready
User: Call John Smith at Home
SPCS: Calling John Smith at Home Correct
User: Yes
Done
It's super convenient when you are in the car or running through an airport and don't have the time to look down at the phone. The reason I'm impressed with it is that you don't have to "train it" to your voice.
I've used TellMe's service quite a lot in the past: driving directions, movie listings, and just generally wasting time on the phone. It is a great service. I even played around with VXML, where I came up against the greatest current limitation of non-speech-to-text voice recognition systems:
They seem to be pretty much exclusively based on grammar files. Basically, you write out a grammar that lists all the possible things you think the person speaking would utter and then match them up to different branches in your system. Unfortunately, you can't easily take free-form speech and store it as anything other than a sound file. This makes it difficult to do something such as allowing the user to speak a message to send as an e-mail. The VXML engines have a great many heuristics to handle differences in speech style and tone, but without the grammar, you pretty much need to go through voice profile training to get decent results.
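To make the grammar idea concrete, here is a toy sketch in Python. It matches already-transcribed text against a phrase list, which is a simplification: a real VXML engine matches audio against the grammar, and the phrases and branch names below are invented for illustration.

```python
import re

# Hypothetical grammar: each branch lists every phrase we expect to hear.
GRAMMAR = {
    "directions": ["driving directions", "directions", "how do i get there"],
    "movies": ["movie listings", "movies", "what is playing"],
}


def normalize(utterance):
    """Lowercase the utterance and drop punctuation."""
    return " ".join(re.findall(r"[a-z']+", utterance.lower()))


def match_branch(utterance, grammar=GRAMMAR):
    """Return the branch whose phrase list contains the utterance, else None."""
    spoken = normalize(utterance)
    for branch, phrases in grammar.items():
        if spoken in phrases:
            return branch
    # Out-of-grammar input: the system has nothing to map it to.
    return None


if __name__ == "__main__":
    print(match_branch("Movie listings!"))
    print(match_branch("Tell Bob I'll be twenty minutes late"))
```

The second call comes back empty, which is the limitation in a nutshell: anything the grammar author did not anticipate, such as a dictated e-mail, has no branch to land in.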
If anyone knows of kewl advances in this particular area, I'd love to hear them!
All I wanted was a rock to wind a piece of string around, and I ended up with the biggest ball of twine in Minnesota
SpeechWorks' OpenVXI, originally promoted as an open source VXML interpreter, has turned out not to be a good one. SpeechWorks developers maintain the code and refuse to incorporate the patches and requests of the open source community, in favor of keeping OpenVXI tied to SpeechWorks products. The codebase could be forked, but it's really not worth investing the effort in such a brittle product tied to proprietary solutions.
Bayonne, the GNU telephony server, is great and getting better all the time. It currently supports a strong scripting language for DTMF applications, and Bayonne's XML plugin structure and built-in support for multiple telephony cards make it the logical choice for open source VXML.
All that's needed at this point is to finish integrating Bayonne with an open source Text-To-Speech engine (the most likely candidates are Flite or Festival) and an Automatic Speech Recognition engine (in this case, Sphinx), and to write the XML plugin. But there is a shortage of coders with the skill and time to do this.
I really think small business and the average Slashdotter could benefit from an open source VXML solution. Small businesses could create professional telephony apps that could make them much more competitive (from accepting credit cards securely over the phone to providing dedicated 24-hr support numbers for their products), while creative coders could use it for everything from Eliza-style chatbot answering machines to having your boxen call you up and describe a hack attempt as it's being made.
I'd love to see a VXML-enabled Bayonne blow TellMe and others out of the water. If you're intrigued and you'd like to get involved, check out Bayonne's SourceForge site and sign up for the mailing list.
He who refuses to do arithmetic is doomed to talk nonsense.