I completely understand and agree with you; hopefully as you research my background, you'll notice that I've always been an advocate of "multimodal" interaction, from the standpoint of giving users a choice based on their personal preferences, operating environment, device capabilities, etc.
Yap has been architected from the ground up to be perfectly usable with manual input, voice input, or a combination of both (and other methods that we can't reveal just yet). You decide what's best for you (we're not so arrogant as to think we know that up front for every potential case).
That said, we sincerely appreciate the community's feedback. There'll be some exciting things we can do with a free-form platform to support expanding this capability for all you developers out there, so watch this space!
i.
(yap's ceo)
...the point of our multimodal work is that you can have a two-way dialog with the device, as well as visual feedback on the interaction. See http://ibm.com/pvc/multimodal for some examples.
So, in my capacity as an official spokesperson, I'll clarify:
"...IBM technology *is* being used to control computers and devices..."
...by customers buying Hondas, GMs, XM Radios, and in quite a number of enterprises within banking, healthcare, etc.
No conditional statement, is that better? ;-)
...our speech-enabled Web browsers for mobile devices and set top boxes. More info on them here: http://ibm.com/pvc/multimodal
;-)
Not only do they allow you to navigate by voice, but using X+V (a blend of XHTML and VoiceXML), you could have fully speech-enabled Web apps. Example: "show me nearby sushi restaurants" or "movie schedules in my area".
We also released our Multimodal Tools Project for Eclipse a couple weeks ago: http://alphaworks.ibm.com/tech/mmtp
Go ahead and play.
Was this demoed on stage during the keynote? Anyone know where to download from? Thx! i.
Typo...the correct link is http://www.ibm.com/pvc/multimodal/
This technology is also useful for mobile field/sales forces, technicians that need ready access to schematics, really anywhere you need handsfree access to information (think a police officer running your plate before pulling you over for speeding :-P).
This is the first public release that includes voice functionality as well! Using XHTML+Voice (X+V), you can create multimodal Web pages (i.e. pages that you can speak to and that can respond back as well). Examples include asking webmail for urgent email and having it displayed, or asking for tomorrow's calendar. The true power of this comes when you combine X+V with the Google or Yahoo! APIs. Imagine just asking your Web browser for movie listings or the latest IBM news. More info, plus an Eclipse-based toolkit for creating X+V content, is available here: http://www.ibm.com/pvc/multimodal
We're working on it...be patient. ;-)
http://www.opera.com/pressreleases/en/2005/02/21/
Not because we'll force you to but because you'll *want* to, big difference! :) I think we've suffered with WAP long enough; it's clear "normal people" aren't willing to trudge through 10+ screens to find the nearest [Starbucks|UPS|etc], get order status, etc.
I can't give out metrics per se, but we're able to get our speaker-independent technology up near 100% accuracy in many trials (obviously that depends on the accent of the speaker, the grammars, etc.). This is also due to the quality of the acoustic model used. Also, you have to realize that while this may share the same branding as previous products, this incarnation of Embedded ViaVoice has been rewritten from the ground up to be the most efficient and advanced micro-ASR/TTS available on any platform (it's used in Hondas & GMs, tough customers that demand performance).
We're big fans of that commercial! We consider it a neat challenge, especially when the cellular device makers realize what their competitors are up to! ;) As far as microphones are concerned, the ones currently shipped w/ cells should be sufficient, since they have built in noise cancelling. You can also download a mini browser for Pocket PC to test your mobile apps today from our site (also one for the Sharp Zaurus Linux PDA).
Not sure about Star Trek, but we've made some great strides recently. Can the average developer help? Absolutely! We're trying to deliver an end-to-end ecosystem here, between devices, tools+browsers, and content; all 3 of those are important. Download the toolkit and multimodal enable your sites, especially the content that will be relevant to mobile users! As far as your last question goes, I can't comment on unannounced products but... ;) The next year should be very exciting in this space!
Warm regards,
Igor Jablokov
Here's a link to a recent VoiceXML Review article:
http://www.voicexmlreview.org/Nov2004/features/deepblue.html
Another at Speech Technology Mag:
http://www.speechtechmag.com/pub/industry/10777-1.html
Both describe how Extreme Blue interns used X+V & Opera to create some sample mobile Web apps.
Igor Jablokov
Forgot to include feedback links. :)
Be sure to post your questions to Opera here:
http://my.opera.com/forums/forumdisplay.php?s=e47b2a7f796603541134f9feaae4a8e1&forumid=95
or to IBM here:
nntp://news.software.ibm.com/ibm.software.speech.multimodal/
Thanks!
Igor Jablokov
We knew you guys would make 2001 references, so under Tools->Preferences->Voice edit the Opera Standard profile to start commands w/ Hal. :)
Igor Jablokov
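For anyone curious what a command prefix like that amounts to, here is a minimal Python sketch of prefix-gated command parsing: an utterance only counts as a command if it starts with the chosen word. The function name `parse_command`, the default prefix, and the sample phrases are hypothetical and purely illustrative; this is not a description of how Opera's voice profile works internally.

```python
from typing import Optional


def parse_command(utterance: str, prefix: str = "hal") -> Optional[str]:
    """Return the command text if the utterance starts with the prefix word, else None.

    Hypothetical helper for illustration only; it does not reflect Opera's internals.
    """
    words = utterance.strip().lower().split()
    # Ignore utterances that are empty or do not begin with the prefix word.
    if not words or words[0].strip(",.!?") != prefix.lower():
        return None
    command = " ".join(words[1:])
    # A bare prefix with nothing after it is not a command.
    return command or None


print(parse_command("Hal, open the pod bay doors"))
print(parse_command("open the pod bay doors"))
```

Requiring a prefix is one simple way to keep ordinary conversation near the microphone from being treated as a command.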
We cannot comment on unannounced products but... ;)
CTTS output, which is almost human-like in its quality, will be used shortly, so you can banish those robots for good (the best example of that we have today currently ships w/ certain models of Honda autos). :) Some very impressive things will be announced around these initiatives soon, since it's clear the market will move aggressively to adopt this in the embedded & mobile spaces.
Igor Jablokov
Hi all,
I'm the IBM program director over this product, working in partnership w/ Opera. Some quick comments: The X+V spec unifies HTML & VoiceXML and is currently undergoing the W3C process for standardization. We wrote it together w/ Motorola & Opera and have made it open. We also have an Eclipse-based SDK available at http://www.ibm.com/pvc/multimodal and a prototype one at http://www.alphaworks.ibm.com/tech/mmtplus that allows you to visually build these multimodal apps.
Some of you may wonder why you should voice-enable your Web content. First of all, one of my lead researchers is blind, and it's quite amazing to see how much he can accomplish today. Given that, in the future, I'm hoping a lot more content will be open to people with various disabilities.
Secondly, how useful is your cellphone for accessing the Web? It has a small screen & limited input. Now imagine just speaking into a multimodal portal: "weather forecast", "my portfolio", "eBay bids", "any high priority mail?", "am I free tomorrow at noon?", etc. The portal understands your input & fetches relevant info, which may also be tied into location-based services. 50% of you will use multimodal services by 2010; this is intended as the replacement for WAP.
Warm regards!
Igor Jablokov
http://www.ibm.com/pvc/multimodal
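As a rough illustration of the portal idea above, here is a minimal Python sketch that routes an already-recognized utterance to a handler. It assumes speech recognition has already produced text; the names `normalize`, `HANDLERS`, and `route`, and the canned responses, are hypothetical stand-ins and not part of X+V, Embedded ViaVoice, or any IBM or Opera API.

```python
from typing import Callable, Dict


def normalize(utterance: str) -> str:
    """Lowercase the utterance and drop punctuation so phrases match loosely."""
    kept = "".join(ch for ch in utterance.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


# Hypothetical handlers: a real portal would call mail, calendar, or weather services.
HANDLERS: Dict[str, Callable[[], str]] = {
    "weather forecast": lambda: "fetching weather forecast",
    "my portfolio": lambda: "fetching portfolio",
    "ebay bids": lambda: "fetching eBay bids",
    "any high priority mail": lambda: "fetching high-priority mail",
    "am i free tomorrow at noon": lambda: "checking tomorrow's calendar",
}


def route(utterance: str) -> str:
    """Dispatch a recognized utterance to its handler, or report no match."""
    handler = HANDLERS.get(normalize(utterance))
    return handler() if handler else "sorry, I did not understand that"


print(route("Any high priority mail?"))
print(route("eBay bids"))
```

A fixed phrase table like this behaves much like a small grammar: the fewer phrases the recognizer has to choose between, the easier it is to match them reliably.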
This is part of a larger effort to "speech-ify" the entire web using existing W3C standards such as XHTML and VoiceXML, which have been combined into one called X+V.
Which XScale does it have, the PXA250 or 255? Big difference in speed, as the 255 has a 200 MHz bus compared to the 100 MHz bus of the 250. Anyone know? The specs aren't specific.
Thanks!
Uh, let's keep certain comments within the family, OK? If you have a problem with it, use internal channels to route it to the appropriate contact who can help with your issues. If you have never tried providing feedback, please do so; it'll alleviate your concerns.
Why not, they already come with a laser sight ;-)
http://www.microsoft.com/hardware/mouse/wie_info.asp
1. Leak pictures ...
2. ???
3. Profit
It'll be at least a couple years before we'll start seeing the pervasive device market (3G phones/wireless PDAs) pick up again.
...to get those extra repeat customer discounts? And here I thought all those soccer moms were just being lewd. ;-)
Had one for a time as well but found it too sluggish. It was too much of a pain to dial in to IM someone. And the lack of a keyboard was frustrating (I prefer BlackBerry-style input) but YMMV.
Decided to return it and wait for always-on 2.5G and a convergence device that included a built-in keyboard (Treo or Danger).
They make their $ margins on software... Just try to stay away from the sublime Halo, PGR, and DOA3.