Posted by ryuzaki0, from the voice-command dept.
malacai writes "IBM has announced that ViaVoice will be available for Linux." Excellent: IBM does another good thing. Has anyone played around with ViaVoice much? I'm interested in potentially using it once my wrists fall apart.
I have seen this mentioned, but I want to ask a direct question. Does the design of GTK facilitate speech recognition integration?
If we could get ViaVoice (or any other speech recognition software) to interface well with the GTK toolkit, a huge number of applications would suddenly be speech enabled, instead of every application having to be made compliant individually (or having to be made compliant just to work WELL).
Integration into the window manager was one of the criteria discussed in some essay a while back about creating a flexible UI for the future.
ViaVoice is an excellent product (at least under Win32). Sometimes it amazes me how well it understands what I dictate; of course, other times it plainly has no clue. In general it's very good if you have time to go back and correct whatever it has written. It is not suitable as a complete replacement for typing, since it expects you to be dictating in a natural voice (e.g. infrequent stops/pauses between words). Telepathic speech isn't understood clearly by the engine. You would not be able to use this efficiently at a bash prompt or for coding.
I suppose if you wanted to write your own grammar (which is possible with Win32 tools right now), you might be able to make a C or a Perl grammar, but moving around the code would be painful.
Hopefully ViaVoice will integrate with most applications easily, as it does under Win32. Currently, you can speak to whatever textbox has focus under Win32, and if developers use the free SDK, more functionality (e.g. FONT BOLD ON) could be added to programs.
I don't expect WordPerfect to support ViaVoice, since they already seem to have a contract with Dragon Systems.
-- / \
\ / ASCII ribbon campaign for peace
x
/ \
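To make the "write your own grammar" idea above a bit more concrete, here is a minimal sketch of a command grammar. It has nothing to do with the actual ViaVoice SDK: the pattern table, the action names, and `parse_command` are all made up for illustration, and it assumes the engine hands the application a plain text string for each recognized phrase.

```python
import re

# Hypothetical command grammar (NOT the ViaVoice SDK): each pattern maps a
# recognized phrase to an application-defined action name.
COMMANDS = [
    (re.compile(r"^font (bold|italic) (on|off)$"), "set_font_style"),
    (re.compile(r"^go to line (\d+)$"), "goto_line"),
]


def parse_command(utterance):
    """Return (action, args) if the phrase is a command, else None.

    A None result means the caller should treat the text as plain dictation.
    """
    # Normalize case and whitespace so "FONT  BOLD ON" matches the grammar.
    text = " ".join(utterance.lower().split())
    for pattern, action in COMMANDS:
        match = pattern.match(text)
        if match:
            return action, match.groups()
    return None


if __name__ == "__main__":
    print(parse_command("FONT BOLD ON"))
    print(parse_command("go to line 42"))
    print(parse_command("this is ordinary dictation"))
```

Anything that does not match the table falls through as dictation, which is roughly the split between "speak into the focused textbox" and application-specific commands described above.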
ViaVoice: depends on the implementation
by CodeShark · Score: 4
While I am very interested in this announcement, the IBM voice technology I've worked with in Win32 (95 and NT) thus far is not yet sufficient for full-time use. I have used ViaVoice Gold for a couple of years now, and even with IBM's longest voice template "training", ViaVoice occasionally goes loopy and acts like it's dictating to itself rather than translating from my voice. Thus I have not yet been able to recommend the technology to my clients.
However, the state of the art will obviously advance. Optical Character Recognition (OCR) technology four years ago was a "probable buy". Since then the accuracy has gone up and the cost has come down so much that it is now a "should buy", and any company requiring significant amounts of document translation is behind the times if it does not have at least one employee competently using OCR.
In voice recognition, IBM is definitely one of the "to market" leaders, especially in the consumer area. My thought is that the cleaner OS code in Linux may actually help IBM develop code that is much more powerful than the Win32 versions. IMHO the number one thing IBM can do to help ViaVoice succeed in the Linux arena (other than GPL'ing the code, which they probably will not do) is to provide crystal clear documentation of the API and a powerful SDK that lets other programmers develop "voice-drivable" applications. This would be similar to how IN-CUBE can be used to drive various applications from small voice commands. BTW, IN-CUBE is already available on Solaris, so maybe the Linux community can persuade CommandCorp to port their product (?)
The faster this technology develops, the better for all of us, especially the motion disabled who can use this technology as a true window to the world. The same group which produces ViaVoice also has a screen reader for the visually impaired which I would like to see in Linux as well.
Let IBM know of your interest, offer to act as a BETA tester, etc. The more we get involved in projects like these, the more quickly Linux will succeed in breaking the M$ stranglehold on the industry.
-- ...Open Source isn't the only answer -- but it's almost always a better value than the alternatives...
could be used for home automation
by ianna · Score: 3
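Linux has all the potential to be the core of a home automation system... Voice control could be just one part of it...
Lots of software is already available to control X-10 devices:
Heyu - http://www.prado.com/~dbs/
Xtend - http://www.jabberwocky.com/software/xtend/
TKx10 - http://www.houseofhack.com/tkx10/
WebX10 - http://members.tripod.com/~famewolf/webx10/
IR control is available using
http://members.home.net:80/ncherry/common/lirc-0.5.3.tgz
Now we just need someone to integrate some functions in PHP and we can control the house via the web.
Well, if ViaVoice provides voice control and KDE a desktop interface, what will stop world domination even in this area? :)
Marco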
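As a rough sketch of how voice control could plug into such a setup: the snippet below turns a recognized phrase into an X-10 address and action. The device table and `phrase_to_x10` are hypothetical, and it deliberately stops short of calling Heyu or any other real tool, since their command lines are not covered here. The only X-10 fact it relies on is that devices are addressed by a house code letter plus a unit number (e.g. A1).

```python
# Hypothetical phrase-to-device table; the names and addresses are made up.
DEVICES = {
    "living room lamp": "A1",
    "porch light": "A2",
}
ACTIONS = {"on", "off"}


def phrase_to_x10(phrase):
    """Map 'turn on the porch light' or 'turn the porch light off'
    to an (address, action) pair, or None if the phrase is not understood."""
    words = phrase.lower().split()
    if len(words) < 3 or words[0] != "turn":
        return None
    # The action word may come right after "turn" or at the end of the phrase.
    if words[1] in ACTIONS:
        action, rest = words[1], words[2:]
    elif words[-1] in ACTIONS:
        action, rest = words[-1], words[1:-1]
    else:
        return None
    # Drop a leading article before looking up the device name.
    if rest and rest[0] == "the":
        rest = rest[1:]
    address = DEVICES.get(" ".join(rest))
    if address is None:
        return None
    return address, action


if __name__ == "__main__":
    print(phrase_to_x10("Turn on the porch light"))
    print(phrase_to_x10("turn the living room lamp off"))
```

The returned pair could then be handed to whichever X-10 controller program is installed, or exposed through a web front end, as suggested above.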
The PROBLEM with Voice Recognition
by Silex · Score: 3
I purchased ViaVoice from IBM (for Win32) a while ago. The IDEA behind the technology is a good one. But the problem is, it's not only slower than typing, it's twice as frustrating, and may take as much as 3x the time it takes to type the same document.
WHY?
(a) Accuracy: My copy of IBM VoiceType came with a speaker-mic combination headset (made by Andrea). The documentation says that this is ideal for use with VoiceType, so I'm not going to blame my hardware for the inaccuracy of this product. It has trouble recognizing a lot of words. I don't have an accent, so that's not the problem. There are many technical reasons why this happens... but they don't matter to the end user.
(b) Method of Speech: You can't just talk into the mic like you normally talk. You have to pause between EACH word. But you MUST NOT pause or slow down while saying A WORD. This.. is.. a.. very.. unnatural way of speaking. Sometimes you forget to pause, or sometimes you accidentally pause in the middle of a multi-syllable word. This is one of the major causes of errors.
(c) Editing: Although this product DOES have support for editing the text through voice, it's quite impractical. If you want to edit text that has already been typed, or you want to format text in a certain way, you're still going to have to use the keyboard, and possibly the mouse. You will find yourself trying to work with the mouse and keyboard while (now) also trying to speak to the computer in a very unnatural way. It's not a matter of being HARD to do, it just doesn't make sense. It's easier to just type.
I think this application is not very useful for typing large documents. What it IS useful for is giving commands to the system through voice. I'm not sure how IBM plans on integrating this with Linux, because Linux systems vary greatly from each other (unlike Windows, which has very centralized control over the system, making it easy to make calls to all kinds of programs without knowing what the program really is). But if they can pull it off... maybe get it working with xterm or something, that would be great. And if they could get it working with an IRC and/or an ICQ client, that would certainly make life easier for many of us (it would be kind of like a low-bandwidth alternative to audioconferencing... especially if you could get the IRC client to 'say' all the text as it scrolls by).
This is a good application, but the whole voice recognition deal is really over-hyped. I hope IBM plans on porting some REAL software to Linux as well.
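Point (b) is easier to see with a toy example. The sketch below is NOT how VoiceType or ViaVoice actually works internally; it only assumes that a discrete-speech engine finds word boundaries by looking for short runs of silence. The function name, the energy threshold, and the gap length are all made up for illustration.

```python
def split_on_silence(energies, threshold=0.1, min_gap=3):
    """Split a list of per-frame energy values into word segments.

    A segment ends once at least `min_gap` consecutive frames fall below
    `threshold`. Returns (start, end) frame indices with `end` exclusive.
    Assumption: words are delimited purely by silence, as a toy model.
    """
    segments = []
    start = None   # index where the current segment began, if any
    quiet = 0      # number of consecutive quiet frames seen so far
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_gap:
                # Close the segment at the first quiet frame of the gap.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Input ended mid-segment; trim any trailing quiet frames.
        segments.append((start, len(energies) - quiet))
    return segments


if __name__ == "__main__":
    # Two words with a proper pause between them.
    print(split_on_silence([0, 1, 1, 0, 0, 0, 1, 1, 0]))
    # Two words run together with too short a pause.
    print(split_on_silence([1, 1, 0, 1, 1]))
```

Under this model, forgetting to pause merges two words into one segment, and pausing in the middle of a long word splits it into two, which matches the two kinds of errors described in (b).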