Slashdot Mirror


Voice-to-Text Options for Unix?

fingerLess asks: "I recently got pushed over the edge in keyboard use. I use Linux and wanted to find a good voice-to-text solution I can use on Linux on my laptop. It seems the IBM ViaVoice I found was still at 1.0, and there were even some questions about whether it was still available. But if it isn't being worked on, is it worth it? Has anyone tried any voice products running on top of one of the win-virtual machines and had any success? My experience with those indicates too much of a performance hit in the AV department (AV seems not to be a real high priority with such products aimed at business or 'Office' productivity). Ideas?" For a while there, it looked like speech recognition was progressing at a pretty good clip, especially with Big Blue leading the charge. However, I haven't heard of anything revolutionary happening with this technology for the past 2 years. Did I miss something, or has voice recognition on the desktop lagged?

20 comments

  1. ViaVoice still at 1.0 on Linux by Stone+Rhino · · Score: 0, Troll

    Just another instance of Big Blue's underwhelming support. But seriously, they likely consider it more of a server and mainframe OS, as they don't offer it preinstalled on any of their desktops.

    --


    Remember, there were no nuclear weapons before women were allowed to vote.
    1. Re:ViaVoice still at 1.0 on Linux by Stone+Rhino · · Score: 1

      This is not meant as a troll. I am merely stating what has already been put across many times on here, at a place where it is relevant. IBM, for all its claims about supporting Linux, still holds fast to Windows on its desktops for business. I believe that its commitment to Linux is limited to the server market, which is why they do not put out Linux versions of much of their desktop software, e.g. ViaVoice.

      --


      Remember, there were no nuclear weapons before women were allowed to vote.
    2. Re:ViaVoice still at 1.0 on Linux by Anonymous Coward · · Score: 0

      OK, I'll bite, but this has got to be done anonymously, for obvious reasons...

      Yes, our (IBM's) main aim is the server market; that's our main line of business, and it's where we make our money. The desktop just loses money, and we all but abandoned it when OS/2 passed on (RIP).

      However, that doesn't mean IBM employees aren't playing around with stuff in the skunkworks. Just as Linux on OS/390 came out of someone scratching an itch and hiding it from management until it was marketable, so people are playing with the desktop. Some of it is running stuff under our version of WINE, e.g. Notes and SmartSuite. Some is working on *nix apps to get them to the stage where they are usable. The ultimate aim for these projects is to replace the officially sanctioned Windows desktops we use internally with Linux. Once we have the goods, management may follow...

      And as for the app that started this thread, I'm not sure of ViaVoice's current status, but we are very active in the research labs with voice recognition. If it's any indication, the global IBM work force has recently been asked to read test scripts over the phone to polish the recognition engine's coping with non-American accents. Give it a few months and see what happens...

  2. Not sure by __past__ · · Score: 2, Informative

    Try to have a look at Emacspeak. Perhaps that's what you want.

    1. Re:Not sure by Phork · · Score: 2

      No, that would be completely wrong. Emacspeak is for output, not input.

      --
      -- free as in swatantryam - not soujanyam.
  3. New cool thing by perljon · · Score: 0, Redundant

    I just found this cool new thing called Google. It is a 'search engine', whatever that is. Anyway, it allows you to type in keywords that you're interested in, and it spits back some web pages with related information. Try it out: http://www.google.com

    --
    This isn't the sig you are looking for... Carry on...
  4. There is no Voice Recognition for Linux by Sam+Lowry · · Score: 3, Interesting

    There is a large-vocabulary recognition system, CMU Sphinx, at http://www.speech.cs.cmu.edu/sphinx/

    However, this is probably not exactly what you are looking for, as it is not (yet?) suitable for Voice Recognition tasks.

    The problem with Voice Recognition is that it has always been a toy for most users, and very few of those who buy Voice Recognition software succeed in getting a productivity boost from it. If you are one of them, you are a lucky guy: you have good, distinguishable pronunciation, you work in a silent environment, and you use the mike shipped with the software. Since the Unix world has a very practical view on things, I doubt there are many Unix people out there who think Voice Recognition can be of use to them.

    Given the laziness of users and the lack of training facilities, Voice Recognition is expected to stay an unprofitable business for a long time. You can't even imagine how expensive it is to write Voice Recognition software and collect the speech data for it...

    1. Re:There is no Voice Recognition for Linux by Anonymous Coward · · Score: 0

      I worked on a project with Sphinx two summers ago at CMU (as an undergrad, but anyway). It worked fairly well if you put the work into it, doing sufficient training and voice model building. It is an extremely flexible package (we had built a recognizer for Croatian): it can handle whatever language/set of phonemes you wish to use, and it can do a WONDERFUL job (>99% for a specific person under good conditions). BUT it takes a HUGE amount of work to make it good, or even to make it usable, and the work is much more than point and click; it is very involved. If you have a situation where you *must* have voice recognition, then the effort is probably worth it. I think. :-)

  5. IBM ViaVoice X by helixblue · · Score: 2

    The title said UNIX, and then later he said he was using Linux, so this may not be as applicable as he wanted.

    This morning I saw a review of IBM ViaVoice for MacOS X that piqued my interest. Overall, it looks like a pretty solid product for doing voice input into any program... but can you imagine using vi without a keyboard?

    As a recent MacOS X convert -- it's good to have a UNIX with supported commercial apps.

    1. Re:IBM ViaVoice X by Anonymous Coward · · Score: 0

      but can you imagine using vi without a keyboard?

      You say "Escape Colon Q Exclamation Mark" as "Escape colon, queue exclamation mark." is added to your buffer.

    2. Re:IBM ViaVoice X by /dev/trash · · Score: 1

      That's just Cliff doing his editing. I once submitted a story (that got accepted) that asked if there was GPS for Linux. It was posted as "Is there GPS for Unix?"

  6. Specialized Vocabulary... by NOT-2-QUICK · · Score: 3, Insightful

    In my view, one of the primary obstacles that has yet to be overcome in the wonderful world of voice recognition (regardless of OS) is the specialized vocabulary that is required by the recognition software. By this, I am specifically referring to the word syntax that the interface requires to achieve optimum performance.

    While we have all seen the world quite capably adapt to the Palm-Graffiti style of handwriting recognition, many vendors have found it a much more formidable task to modify the manner in which people speak. Beyond the several language variations (languages, accents, lisps, etc...), developers must also take into consideration much more subtle disparities in speech, such as separate dialects within a given language. This has caused quite an immense dilemma, one that has prevented the mainstreaming of such technology!!!

    Even in the case of software such as ViaVoice, the user is still given the quite arduous task of creating a "dictionary" of sorts that recognizes their specific speech patterns and verbiage tendencies.

    All of these factors lead to complications and idiosyncrasies that the average Joe User is unwilling or unable to accept!!!

    --
    Beer is proof that God loves us and wants us to be happy. -- Benjamin Franklin
    1. Re:Specialized Vocabulary... by mdaniel · · Score: 2, Interesting

      I have followed Lojban for quite some time now and I think that it, or something like it, represents the future of human-computer voice interaction. It is parseable and phonetically spelled, making it very computer friendly.

      This does not beat the problem you brought up about Joe User, but for someone whose profession depends upon interaction with computers, learning a new way of typing (Dvorak), writing (Graffiti) or speaking (Lojban) is a small price to pay. It even lends itself more toward the model of Shadowrunish futures where computer professionals are almost a separate race. :-)

  7. No good free solutions... by RadioheadKid · · Score: 1

    This stuff is pretty complex, and I doubt you will find anything that good for free...

    --
    "Karma can only be portioned out by the cosmos." -Homer Simpson
  8. Voice Recognition Win/UNIX by Anonymous Coward · · Score: 0


    Voice recognition hit an 'okay, nothing great' wall approximately three years ago.


    Dragon NaturallySpeaking is the Win state-of-the-art. It converts speech to text, you can create macros, and you can also speech-activate the keyboard buttons. That does not necessarily mean it is right for you. It is good, requires training, requires a BEEFY CPU and RAM, and 'should' get better over time. The problem is that it is slow if you are a fast speaker. This becomes distracting when you are 100 words along and it starts converting words 1-10. And the dictionaries needed serious additions. I've used versions three through five. Five is good, but it could be much better. Dragon Systems was purchased and rehashed about two years ago. [BEWARE Dragon-authorized service providers who charge extra for what comes in the box!]


    IBM ViaVoice-Win is not bad. There were two versions: one trainable, the other stand-alone. The stand-alone was crap for me. The trainable version wasn't too bad. [I have NOT used ViaVoice-Mac.]


    Hark is the best of the best ... and it 'was' only available on UNIX. Hark recognizes speech and converts it, and you can script based upon the recognition [If HEAR:Help SPEAK:"Activating help function" AND {FLASH:lights for 20 seconds} AND {SHOCK:student}]. Hark was available on SGI, but they pulled the plug and I think they are Solaris only, now. [There was a rumor they were converting to Win.] The problem is that the price tag is astronomical. And, if you have poor diction, you have just wasted a lot of money. Several companies have built innovative systems around Hark. However, if you have a certain American dialect, it will absolutely, positively not recognize a word you say. [Okay: less than 25% recognition rate was what we achieved.] It will recognize heavily-accented German and heavily-accented Malay; however ...


    All in all, Voice Recognition is reasonably good but it is not where it should be, yet.


  9. Two great choices by Anonymous Coward · · Score: 0


    iListen for Mac OS X (MacSpeech, Inc.)

    ViaVoice for Mac OS X (IBM)

  10. Mandrake by leviramsey · · Score: 1

    I seem to remember that commercial versions of Mandrake up to 8.0 had IBM ViaVoice... I haven't bought a full commercial edition of 8.1, so I don't know if they still do that, though.

  11. another speech package by jiminim · · Score: 1

    ISIP has a pretty good speech-to-text system that should work on most Linux/Unix boxes.

    Takes a little intelligence to set up though.