IBM Develops Technology To Talk To Web
ProgramErgoSum writes to tell us that IBM's India-based research arm is trying to bring a new dimension to web interaction through voice interaction on your mobile phone. By developing a new protocol, Hyperspeech Transfer Protocol (HSTP), IBM hopes to allow users to talk to the web and get a response. Without more explanation I'm hoping this goes about as far as the gopher web. "The spoken web is a network of voice sites or interconnected voice and the response the company got in some pilot projects in Andhra Pradesh and Gujarat and the kind of innovations that people came up with were just mind-boggling, Gupta said."
but unnecessary. Instead of trying to create a new standard, what's wrong with sending an HTTP request and receiving an RTP response? Let the device do the text-to-speech conversion, like they do already.
I just can't imagine an entirely new protocol being adopted when it is already very possible using existing technologies...
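For what it's worth, here is a minimal sketch of the device-side half of that idea: take HTML that came back from an ordinary HTTP request and boil it down to plain text that could be handed to whatever text-to-speech engine the phone already has. The function name `speakable_text` and the sample markup are made up for illustration, the fetch itself (for example with `urllib.request.urlopen`) and the RTP audio path are left out, and nothing here reflects how HSTP actually works.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and ignores the contents of script/style tags."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped element
        self.chunks = []       # pieces of visible text, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def speakable_text(html):
    """Return the page's visible text as one whitespace-normalized string.

    Hypothetical helper: the result is what a device would pass to its
    own text-to-speech engine.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


if __name__ == "__main__":
    page = "<html><body><h1>Movie listings</h1><p>Now showing.</p></body></html>"
    print(speakable_text(page))
```

The point is only that the text extraction needs nothing beyond the standard library; the hard part, as others in the thread note, is the speech recognition going the other way.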
Sewage Treatment Facilities - "Our duty is clear."
Microsoft bought TellMe (1-800-555-TELL), which does some of that. (Call it from a cell phone; the behavior on land lines is entirely different. From a cell phone, you can get movie listings, driving directions, etc.; on a land line, all it does is phone directories.)
Talk to the animals? Talk to the Web? Same difference.
Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
Find me some porn. KTHX.
Do you know what I hate more than calling a phone number and talking to semi-incomprehensible Indians?
Calling a number and having a machine INSIST I speak to it like a person.
I just repeatedly jab "0" until I reach something that can pass a Turing test.
Whether Indians who don't understand English beyond what's written down for them counts is debatable.
Voice tech has an Achilles' heel: It's called accents. Most voice software works great for English-speaking people in the midwestern United States. But if you have an accent and have ever tried to "interact" with one of those voice mail systems that are speech-activated rather than touch-tone, the words "unholy rage" don't begin to describe the frustration of listening to a soothing voice repeatedly saying "I'm sorry, I do not understand your request" and then endlessly repeating the menus. Pressing '0', if you're wondering, will only make the system remind you that it (a) only speaks English and (b) while it can process touch tones, it won't -- because it hates you.
And IBM wants to bring this unique hell to the web? What kind of sadists are these people? As if websites that require Flash and the horrors that server-side Java unleashed weren't enough...
#fuckbeta #iamslashdot #dicemustdie
User: fap fap fap fap fap
Web: Oh Yea baby!
User: fap fap fap fap fap
Web: Wow that's it yea!
We are the people our parents warned us about.
If you have ever been bounced from one customer service number to another, press 1.
Sorry, I did not understand your answer.
How is this in any way superior to voice recognition that happens on the phone and is translated into text or actions there instead of at a remote web server?
Breaker breaker, good buddy! Thanks for visiting my online speakin' site! My handle is: The Delta Lady! If ya'll wanna visit my cousin Watts' site, just say "bacon." If'n'ya wanna hear a special Christmas story about varmints pullin' Santa's sleigh, say "Merry Chris'mas, ya'll!"
(-1, Raw and Uncut is the only way to read)
That's interesting...I'm guessing you work on small-scale web applications. J2EE isn't for everything, but sometimes it is the only tool for the job.
April first is coming soon. That sounds pretty much like an April Fools' joke to me.
People have been working on this sort of thing for a while now:
http://en.wikipedia.org/wiki/SpeechWeb
Most voice software works great for English-speaking people in the midwestern United States. But if you have an accent ...
I have a southeastern Michigan accent - essentially the same as the "standard radio/TV accent" (Cincinnati OH). It was chosen for that service because it makes ALL the American English phonetic distinctions (vs. for example an east-coast accent which merges "l" and "r" making Kennedys sound like they're saying Fidel heads "Cuber") and because it's intelligible to speakers of ALL the American English accents.
You'd think that a modern voice recognition system should be able to handle THAT, at least? Especially if it came in a vehicle manufactured in Detroit, right?
Just bought a Ford F-150 Lariat. Great truck. Came with the "link" system by Microsoft. Does navigation, cellphone hands-free by bluetooth, ... Has voice recognition for control to keep hands on the wheel.
Darn thing has a horrible time recognizing my voice, even when I'm speaking carefully and clearly. (For instance: Tried to call home yesterday and it called the "identify a piece of music" number that Sony-Ericsson threw into my cellphone's phone directory. Doesn't ask for confirmation before making a call, either.)
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
From the RTFA,
Andhra Pradesh and Gujarat and the kind of innovations that people came up with were just mind-boggling, Gupta said.
Is IBM saying that these people from Andhra Pradesh and Gujarat were mind-boggled when they were introduced to the "Phone Mazes From Hell" that the rest of us have had to endure from the Faceless Ones for years? Or is Gupta saying that these noble folk were mind-boggled when they heard voices respond on a cell phone?
Isn't the purpose of the internet to AVOID having to talk to people?
Why is an entire protocol needed for this?
Speech recognition + API + web = profit. This looks like quite an overblown effort, even from IBM.
IBM had an addon or something for the Opera browser which was shipped with the Sharp Zaurus 5600 which took in speech and did recognition against web page stuff. I remember their demo having the ability to take in spoken orders for Pizza and flight reservations right into the browser. It worked pretty good but background noise was an issue from my experience.
It never went anywhere on the Zaurus mostly because the Zaurus didn't take off. Sharp attempted to build an open source software platform but didn't think those developing for the Zaurus would also want to use the Zaurus with their Linux computers.
But it sounds like IBM is digging this stuff back out and with Linux on more and more phones it makes it easier to do.
configure --target-platform Android; make multimodal
LoB
"Anyone who stands out in the middle of a road looks like roadkill to me." --Linus
What good is any of this HSTP tech if the computer still can't parse speech into text or symbols? Speech recognition doesn't really work, not accurately enough for mass use on Web PCs or mobile phones. Even speech synthesis, a much easier problem, isn't really that great.
I smell another IBM submarine patent farm, not an actual "innovation factory".
--
make install -not war
We are experiencing unusually high call volume. Your estimated hold time is 345987 minutes.
-- I was raised on the command line, bitch
I remember that IBM shipped a voice recognition system built into OS/2 v3, which worked with the Workplace Shell and with most applications. You had to speak to it in a slightly clipped way (words just separated), but it worked quite well. It did not need training (at least for my lousy accent) unless I used specialized vocabulary, but with training it could even cope with really horrible enunciation (my drunken buddies). That was in the days of the i386 and primitive SoundBlaster digitization, so I would hope that techniques have improved since, such as being able to parse words which are run together.
Those who can make you believe absurdities can make you commit atrocities. - Voltaire
...wait until you get older and have to try and see that crap on those tiny screens with your old quadfocal eyes and try to type on them teeny designed for Japanese kids keyboards with your stiff fingers, then you *might* get a clue why a spoken way to interact with the web on those devices might be useful. I'd like that on my desktop, let alone some Lilliputian cellphone.
Now, don't get off my lawn, see that mower? Yank that cord and start pushing it and work off some of those cheetos!
It's Dr. Sbaitso. As in "Sound Blaster Acting Intelligent Text to Speech Operator". Wikipedia
Headline:
"IBM Develops Technology To Talk To Web "
Following-Up Story Headline:
"Web Talks Dirty To IBM"
Knowing Google's lust for data collection, the Soviet Union is still alive and well inside the psyche of Sergey Brin....
since August 2008: World Wide Telecom Web
VoiceXML is old news.
http://slashdot.org/articles/01/03/14/1622217.shtml
HTTP works great with them, so why do we need a new protocol anyway?
I saw a presentation at SLT'08 (http://slt2008.org/Papers/viewpapers.asp?papernum=1191) about that voice web. Contrary to what Slashdot readers seem to think, this is not an extension of the current WWW. They want to start over with speech only (input and output). It is designed so that people who can't read can use it, and they don't need the latest smartphone, just a regular phone. URLs would be replaced by phone numbers that you would dial. You would then listen to a kind of "podcast", which would give you other numbers to dial. That is a big cultural difference from the WWW. They also intend to do speech-controlled basic navigation. However, they didn't give many details about the kind of technology involved, nor about how to handle basic navigation problems like the equivalent of scrolling through a page, managing bookmarks, marking already visited sites, or "downloading" stuff.
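To make the "phone numbers instead of URLs" idea concrete, here is a rough sketch of how such a web of voice sites could be modeled: each site is a number with a spoken greeting and a set of spoken keywords that lead to other numbers, and the "browser" keeps a history and a visited set. Everything in it (the `VoiceSite` and `VoiceBrowser` names, the 555 numbers, the "back" keyword) is an assumption made up for illustration; the presentation gave no such details, and real speech recognition is replaced here by plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceSite:
    """Hypothetical voice site: a phone number, a greeting, and spoken links."""
    number: str
    greeting: str
    links: dict = field(default_factory=dict)  # spoken keyword -> phone number


class VoiceBrowser:
    """Toy navigator over voice sites; speech is simulated with strings."""

    NOT_UNDERSTOOD = "I'm sorry, I do not understand your request."

    def __init__(self, sites):
        self.sites = {site.number: site for site in sites}
        self.history = []     # numbers dialed, most recent last
        self.visited = set()  # every number reached so far

    def dial(self, number):
        """Dial a number directly and return the greeting that would be played."""
        site = self.sites[number]
        self.history.append(number)
        self.visited.add(number)
        return site.greeting

    def say(self, keyword):
        """Follow a spoken keyword from the current site, or go back one site."""
        if not self.history:
            raise RuntimeError("dial a number before speaking")
        if keyword == "back" and len(self.history) > 1:
            self.history.pop()
            return self.sites[self.history[-1]].greeting
        target = self.sites[self.history[-1]].links.get(keyword)
        if target is None:
            return self.NOT_UNDERSTOOD
        return self.dial(target)


if __name__ == "__main__":
    browser = VoiceBrowser([
        VoiceSite("5550100", "Farm report. Say 'weather' for the forecast.",
                  {"weather": "5550101"}),
        VoiceSite("5550101", "Weather: sunny."),
    ])
    print(browser.dial("5550100"))
    print(browser.say("weather"))
    print(browser.say("back"))
```

Even this toy version shows where the open questions are: there is no obvious equivalent of scrolling within a greeting, and bookmarks would have to be spoken names mapped to numbers.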