Good Cross-Platform Speech-Recognition Programs?
CryoStasis writes "I am a graduate student getting my degree in biomedical sciences. Because my work often requires me to maintain a local sterile environment (under a biological hood) I find that I am unable to physically touch my computer, which sits beside me, in order to open my notes, protocols, etc. while I'm working. As a result, I have begun to search for a voice-recognition program that will allow me to tell the computer what files/programs to launch. I know that the general field of voice recognition has come a long way, but I find that the built-in speech recognition systems in both OS X and Vista are clunky and difficult to use. Are there any good, cross-platform speech-recognition programs available that might fit the bill?"
Dragon NaturallySpeaking is as close as it gets. Even then, it's only really good for dictation; it's not that good for controlling your computer. I believe it works in both Vista and OS X.
There used to be ViaVoice that also worked in Linux IIRC - but it basically stopped working on it circa 2001/2002.
Perhaps another input device is called for, because voice recognition right now will only frustrate you more than anything for what you want to use it for.
BTW, I believe OS X has built-in voice recognition that you may want to check out for controlling your computer, but it's been ages since I used it. It's actually geared toward controlling your computer rather than replacing typing.
Kaiser MDs use Dragon.
We mainly use OS X Macs in the lab, but if possible I would also like to install the program on some of our other Vista machines for hands-free use.
I used Sphinx4 in my final-year project at uni. It's free and Java based, with open source code, so it is fully customisable by those who want to spend a little effort doing so. As it is written in Java, it works on any operating system with a Java Runtime Environment.
Before settling on Sphinx4 I spent a lot of time trying other multi-platform software, but, partly because it is open source, I found Sphinx4 to be the best (that actually worked).
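For what it's worth, once a recognizer hands you a line of text, the "launch my notes" part is the easy bit. Here is a minimal sketch in Python, not tied to Sphinx4 or any other engine: it assumes the recognizer's output arrives as a plain string. The phrases and file names in COMMANDS are made up for illustration, and the 0.6 fuzzy-match cutoff is just a guess to absorb small misrecognitions.

```python
import difflib
import subprocess
import sys

# Hypothetical phrase -> file mapping; these paths are invented for illustration.
COMMANDS = {
    "open notes": "notes.txt",
    "open protocol": "protocol.pdf",
    "open buffer recipes": "buffers.xls",
}


def match_command(heard, commands=COMMANDS, cutoff=0.6):
    """Return the known phrase closest to the recognizer's text, or None."""
    # Lowercase and collapse whitespace so "Open   Notes" still matches.
    normalized = " ".join(heard.lower().split())
    hits = difflib.get_close_matches(normalized, list(commands), n=1, cutoff=cutoff)
    return hits[0] if hits else None


def launcher_argv(path, platform=sys.platform):
    """Build the OS-specific command that opens a file in its default app."""
    if platform == "darwin":          # OS X
        return ["open", path]
    if platform.startswith("win"):    # Windows, e.g. Vista
        return ["cmd", "/c", "start", "", path]
    return ["xdg-open", path]         # most Linux desktops


def handle(heard):
    """Open the file mapped to the spoken phrase; return False if nothing matched."""
    phrase = match_command(heard)
    if phrase is None:
        return False
    subprocess.Popen(launcher_argv(COMMANDS[phrase]))
    return True
```

Keeping the vocabulary down to a handful of fixed phrases like this is also the usual way to get tolerable accuracy out of a weaker recognizer, since it only has to tell a few commands apart rather than transcribe free speech.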
For most of the work I do, that's not entirely correct. I work with a laminar flow hood similar to this one. You may have cells growing in an incubator in a sterile dish. You have to take out those cells, manipulate them in some way, and then keep them growing, all without contaminating the culture. The simplest thing to do is to spray down the hood with ethanol and spray anything that goes into the hood with ethanol as well. Any liquids you use need to be passed through a sterile filter to remove any contaminating organisms. The problem is that doing all of this with a computer nearby is awkward. You sit there with dozens of tubes inside the hood, all sorts of liquids and measuring equipment outside the hood, and you have to carefully add or remove a precise amount of a specific substance to or from the culture. The simplest way to do this is to take a piece of paper that tells you what to do and tape it to the glass. Unless you're working with an organism that can infect people, you don't need to destroy the paper afterwards because you're not trying to keep the paper sterile; you're trying to keep your tissue culture dish and the cells inside sterile.
Period, end of report. In the PC world there essentially is no other general purpose voice interface tech that is even worth bothering with.
That being said, there are much better ones for very specific vertical markets, but not for general use.
Note that this means you ARE restricted to Windows. The stuff built into OS X and Vista is not even worth messing around with. It might in theory meet some very casual or narrow need of particular users, but it is literally an order of magnitude slower and less reliable than NaturallySpeaking.
If you MUST use a Mac or Linux etc., then basically the answer is: you're SOL, there's nothing. Yeah, there are a few OSS bits out there, but frankly they aren't even at the level of being really functional software, let alone meeting the speed or accuracy required of this type of software. It would be AWESOME if there were something open, but the fact is this area is so technically demanding that it appears to be beyond the reach of non-commercial effort.