IBM Develops Technology To Talk To Web

Posted by ryuzaki0 on Monday March 16, 2009 @07:22AM from the dr-spaetzo-gets-a-job dept.

ProgramErgoSum writes to tell us that IBM's India-based research arm is trying to bring a new dimension to web interaction through voice on your mobile phone. By developing a new protocol, the Hyperspeech Transfer Protocol (HSTP), the team hopes to let users talk to the web and get a response. Without more explanation I'm hoping this goes about as far as the Gopher web. "The spoken web is a network of voice sites or interconnected voice and the response the company got in some pilot projects in Andhra Pradesh and Gujarat and the kind of innovations that people came up with were just mind-boggling, Gupta said."

Interesting... by rockNme2349 · 2009-03-16 07:28 · Score: 2, Informative

but unnecessary. Instead of trying to create a new standard, what's wrong with sending an HTTP request and receiving an RTP response? Let the device do the text-to-speech conversion, like devices already do.

I just can't imagine an entirely new protocol being adopted when it is already very possible using existing technologies...

--
Sewage Treatment Facilities - "Our duty is clear."
1. Re:Interesting... by Deag · 2009-03-16 07:37 · Score: 2, Interesting
  
  The text-to-speech bit could do with some sort of markup though. Despite the Authors Guild's claim to the contrary, text-to-speech is very machine-like and monotonous; it could do with some tags like <scared> or <angry> to get some emotion going.
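  For what it's worth, something close to this can already be approximated with SSML, the W3C speech markup, which has a prosody element with rate, pitch and volume attributes. Below is a minimal sketch in Python, assuming a made-up mapping from emotion names to prosody settings. The `EMOTION_PROSODY` table and the `to_ssml` helper are hypothetical, and the output leaves out the namespace and language attributes a real SSML document would carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from emotion names to SSML prosody attributes.
# The attribute names and values are valid SSML; which values suit
# which emotion is an illustrative guess, not part of any standard.
EMOTION_PROSODY = {
    "scared": {"rate": "fast", "pitch": "high", "volume": "soft"},
    "angry": {"rate": "medium", "pitch": "low", "volume": "loud"},
}


def to_ssml(segments):
    """Build a simplified SSML string from (emotion_or_None, text) pairs."""
    speak = ET.Element("speak", {"version": "1.0"})
    for emotion, text in segments:
        if emotion in EMOTION_PROSODY:
            # Wrap emotional lines in <prosody> with the mapped settings.
            node = ET.SubElement(speak, "prosody", EMOTION_PROSODY[emotion])
        else:
            # Plain sentence, left to the synthesizer's default delivery.
            node = ET.SubElement(speak, "s")
        node.text = text
    return ET.tostring(speak, encoding="unicode")


ssml = to_ssml([
    (None, "The door creaked open."),
    ("scared", "Who is there?"),
    ("angry", "Get out of my house!"),
])
print(ssml)
```

  Whether a given synthesizer makes "fast, high, soft" sound scared is another matter; the markup only carries the hint.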
2. Re:Interesting... by CarpetShark · 2009-03-16 07:49 · Score: 3, Informative
  
  Agreed. Especially since CSS has supported aural media (including multiple voices or generic speaker categories like "child", "male", "female" for different speakers in a story, for instance) for quite a while now.
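  To make that concrete, here is a rough sketch that prints the kind of aural stylesheet being described, assuming the CSS 2 aural properties (`voice-family`, `speech-rate`, `pitch`) and the `aural` media type. The class names and the `aural_stylesheet` helper are invented for illustration, and how much of this any given browser actually honors is a separate question.

```python
# Hypothetical speaker classes for a story; the class names are made up.
# The property names and values come from the CSS 2 aural style sheet module.
SPEAKERS = {
    "narrator": {"voice-family": "male", "speech-rate": "medium"},
    "heroine": {"voice-family": "female", "pitch": "high"},
    "kid": {"voice-family": "child", "speech-rate": "fast"},
}


def aural_stylesheet(speakers):
    """Render one rule per speaker class inside an @media aural block."""
    rules = []
    for cls, props in speakers.items():
        body = "\n".join(f"    {name}: {value};" for name, value in props.items())
        rules.append(f"  .{cls} {{\n{body}\n  }}")
    return "@media aural {\n" + "\n".join(rules) + "\n}\n"


css = aural_stylesheet(SPEAKERS)
print(css)
```

  The page markup then only needs the matching class on each speaker's lines, which is why this can be layered onto an existing page.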
3. Re:Interesting... by CarpetShark · 2009-03-16 08:06 · Score: 3, Informative
  
  There's a good (and recent) summary of the situation here:
  http://lab.dotjay.co.uk/notes/css/aural-speech/
  If you want an open source solution, you should probably look to the Fire Vox (as opposed to Firefox etc.) community. Otherwise, Opera is probably your best bet. As far as usage goes, I think it's still pretty limited, but it is definitely worth considering over some proprietary solution for future projects that need (or can benefit from) such features, especially since it's a relatively small amount of extra work that can be overlaid onto existing web pages.
4. Re:Interesting... by Deag · 2009-03-16 08:09 · Score: 2, Funny
  
  ah easy, blink would be said extremely quickly, whole sentence in one second. Marquee would be a loud street salesman sort of tone.
5. Re:Interesting... by CarpetShark · 2009-03-16 08:51 · Score: 2, Funny
  
  text to speech is very machine like and monotonous, it could do with some tags like <scared> or <angry> to get some emotion going.
  I believe the tag names, respectively, are going to be <enron> and <ballmer>.
6. Re:Interesting... by Hurricane78 · 2009-03-16 21:41 · Score: 2, Informative
  
  Which is a rule, but which is very very stupid, and just looks wrong to every human I have ever asked anyway. So fix your rules already. I have.
  
  --
  Any sufficiently advanced intelligence is indistinguishable from stupidity.
Dr. Doolittle claims prior art . . . by PolygamousRanchKid+ · 2009-03-16 07:30 · Score: 2, Funny

Talk to the animals? Talk to the Web? Same difference.

--
Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
Achilles says "No." by girlintraining · 2009-03-16 07:33 · Score: 3, Interesting

Voice tech has an Achilles heel: It's called accents. Most voice software works great for English-speaking people in the midwestern United States. But if you have an accent and have ever tried to "interact" with one of those voice mail systems that are speech-activated rather than touch-tone, the words "unholy rage" don't begin to describe the frustration of listening to a soothing voice repeatedly saying "I'm sorry, I do not understand your request" and then endlessly repeating the menus. Pressing '0', if you're wondering, will only make the system remind you that it (a) only speaks English and (b) while it can process touch tones, it won't -- because it hates you.
And IBM wants to bring this unique hell to the web? What kind of sadists are these people? As if websites that require Flash and the horrors that server-side Java unleashed weren't enough...

--
#fuckbeta #iamslashdot #dicemustdie
1. Re:Achilles says "No." by DragonWriter · 2009-03-16 07:40 · Score: 4, Insightful
  
  Voice tech has an Achilles heel: It's called accents. Most voice software works great for English-speaking people in the midwestern United States.
  If that's true of this software developed by IBM's Indian research arm and pilot-tested in Andhra Pradesh and Gujarat, then I suspect it will also handle a lot of other English-speaking people.
  
  But if you have an accent
  As if English-speaking people from the midwestern United States don't.
2. Re:Achilles says "No." by lennier · 2009-03-16 08:56 · Score: 2, Insightful
  
  "English-speaking with a midwestern accent is generally viewed [BY AMERICANS] as the most easily understood amongst all english accents; And this accent is the one used for many (if not most) [AMERICAN] television reporters, voice recordings intended for mass [AMERICAN] audience, etc. Most other accents are defined [BY AMERICANS] by how they mangle certain syllables."
  Fexed thaht fah yah.
  
  --
  You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
I wonder by rootnl · 2009-03-16 07:34 · Score: 5, Funny

User: fap fap fap fap fap
Web: Oh Yea baby!
User: fap fap fap fap fap
Web: Wow that's it yea!

--

We are the people our parents warned us about.
Is this gonna be like CB radio? by Gizzmonic · 2009-03-16 07:43 · Score: 4, Funny

Breaker breaker, good buddy! Thanks for visiting my online speakin' site! My handle is: The Delta Lady! If ya'll wanna visit my cousin Watts' site, just say "bacon." If'n'ya wanna hear a special Christmas story about varmints pullin' Santa's sleigh, say "Merry Chris'mas, ya'll!"

--
(-1, Raw and Uncut is the only way to read)
Re:NO by revlayle · 2009-03-16 08:05 · Score: 2, Funny

NOTE TO SELF: code future IVR system to respond to "*6", instead, for operator requests.
Re:Waste of Bandwidth by CarpetShark · 2009-03-16 08:15 · Score: 3, Insightful

When you're talking about millions of terminals vs. relatively few servers, the "dumb" terminals are cheap. Also, doing good voice recognition requires beefy hardware -- probably, ideally, DSP/GPU accelerator boards or a Google-style huge cluster of commodity PCs. Finally, for blind users, but also for others, listening to even the best synthesized voice gets tiring/grating after a while. It's much nicer to listen to good speech from a professional narrator than to even a normal human speaker, much less a "good" voice synth.
I still think it'd be better for everyone if they worked on supporting a globally usable standard that could be applied on any machine, like CSS aural media, though. TTS and voice recognition are probably the future anyway, so we might as well start taking them seriously now.