Good Cross-Platform Speech-Recognition Programs?

Posted by timothy on Saturday November 8, 2008 @08:30AM from the get-back-in-your-hood-worm dept.

CryoStasis writes "I am a graduate student getting my degree in biomedical sciences. Because my work often requires me to maintain a local sterile environment (under a biological hood) I find that I am unable to physically touch my computer, which sits beside me, in order to open my notes, protocols, etc. while I'm working. As a result, I have begun to search for a voice-recognition program that will allow me to tell the computer what files/programs to launch. I know that the general field of voice recognition has come a long way, but I find that the built-in speech recognition systems in both OS X and Vista are clunky and difficult to use. Are there any good, cross-platform speech-recognition programs available that might fit the bill?"

14 of 175 comments

Use PocketSphinx by Anonymous Coward · 2008-11-08 08:39 · Score: 5, Interesting

We have PocketSphinx working on Windows, Mac and Linux in FreeSWITCH. http://www.freeswitch.org/
1. Re:Use PocketSphinx by Anonymous Coward · 2008-11-08 11:21 · Score: 4, Informative
  
  I used Sphinx4 in my final year project at uni. It's free and Java based, with open source code so is fully customisable to those who want to spend a little effort doing so. As it is written in Java, it works on any operating system with a Java Runtime Environment.
  In the process of finding Sphinx4 I spent a lot of time trying other multi-platform software, but due to its open source nature found this to be the best (that actually worked).
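
For the original question (saying a phrase to open a file), most of the work after recognition is mapping whatever text the recognizer returns onto a known command. A minimal sketch of that step follows. It assumes the recognizer, whether Sphinx or anything else, hands back a plain string; the phrases, file paths and function names are hypothetical, and no Sphinx API is called.

```python
import difflib
import os
import subprocess
import sys

# Hypothetical spoken phrases mapped to files; the paths are placeholders.
COMMANDS = {
    "open notes": "notes.txt",
    "open pcr protocol": "protocols/pcr.pdf",
    "open plating protocol": "protocols/plating.pdf",
}


def match_command(heard, commands=COMMANDS, cutoff=0.6):
    """Return the known phrase closest to the recognized text, or None."""
    # Normalize case and whitespace before comparing.
    normalized = " ".join(heard.lower().split())
    close = difflib.get_close_matches(normalized, list(commands), n=1, cutoff=cutoff)
    return close[0] if close else None


def launch(path):
    """Open a file with the platform's default application."""
    if sys.platform.startswith("win"):
        os.startfile(path)  # only available on Windows
    elif sys.platform == "darwin":
        subprocess.run(["open", path], check=False)
    else:
        subprocess.run(["xdg-open", path], check=False)


def handle(heard):
    """Launch the file for the best-matching phrase; return the phrase or None."""
    phrase = match_command(heard)
    if phrase is None:
        return None
    launch(COMMANDS[phrase])
    return phrase
```

Keeping the vocabulary to a handful of fixed phrases and matching fuzzily means a slightly misheard command can still land on the right file, while anything too far from the list is ignored rather than guessed at.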
Nope, there isn't. by Anonymous Coward · 2008-11-08 08:42 · Score: 5, Informative

Dragon NaturallySpeaking is as close as it gets. And it's only really good for basically writing down your voice, it's not really that good for controlling your computer. I believe it works in both Vista and OS X.
There used to be ViaVoice that also worked in Linux IIRC - but it basically stopped working on it circa 2001/2002.
Perhaps another input device is called for, because voice recognition right now will only frustrate you more than anything for what you want to use it for.
BTW, I believe OS X has voice recognition built in you may want to check out for controlling your computer - but it's been ages since I used it. It's actually geared toward controlling your computer, and not to replace typing.
1. Re:Nope, there isn't. by Anonymous Coward · 2008-11-08 09:42 · Score: 5, Interesting
  
  I, for one, read the summary, but would like to contradict it.
  I got RSI and finished a 100-page document using Vista voice recognition only. Just train it properly with a good mike and it's perfectly ok. Apart from dictation, you can say a word in any link or button in properly coded apps, and spell stuff out using the radio alphabet. Alternatively, you can use the commands "mousegrid" and "show numbers" to move the mouse directly or label every control with numbers, respectively.
  Oh, and if you get RSI, don't even think about trying to configure anything in Linux until you recover. Ditch it for Vista on day 1. Your hands and sanity will thank you.
Paper by DebateG · 2008-11-08 08:42 · Score: 5, Insightful

I work in a biological lab and have a similar problem. I find that paper is much simpler for most things. I have a notebook containing only printouts of protocols with little tabs denoting where each one is. I remove whatever protocol I'm using and carry it over to wherever I'm working. Anything else I need from my notes, I write on paper and carry. Yes, it's a bit wasteful, but I've found that in the preparation of gathering all the relevant pieces of paper, it really forces you to adequately prepare for an experiment instead of trying to figure it out on the fly.
1. Re:Paper by girlintraining · 2008-11-08 09:00 · Score: 5, Funny
  
  I can't say I've ever been in a biolab, but the idea of someone working in one, with their hands in a sealed box manipulating god-only-knows-what... and then trying to talk/use a computer at the same time gives me the heebie-jeebies. I can think of at least four Hollywood horror movies that started with similar premises. Sometimes a simple low-tech solution really is the best... and it saves on zombie attacks.
  
  --
  #fuckbeta #iamslashdot #dicemustdie
2. Re:Paper by girlintraining · 2008-11-08 09:25 · Score: 4, Funny
  
  Yes, but like most geeks... he wants to do everything himself. God forbid a man ask for help...
  
  --
  #fuckbeta #iamslashdot #dicemustdie
depends... by girlintraining · 2008-11-08 08:46 · Score: 4, Insightful

Yes, software exists. But unless the program you want to control performs only simple operations with dialog boxes and can function with limited keyboard input, you will probably find it inadequate or clunky, even if the speech recognition is perfect (it never is). Instead of asking whether speech recognition software is right for you, the better question would be: is your software a good fit for speech recognition?

--
#fuckbeta #iamslashdot #dicemustdie
Alternatives by ustolemyname · 2008-11-08 09:00 · Score: 5, Insightful

Wireless keyboard much?
Three words by ceoyoyo · 2008-11-08 09:09 · Score: 5, Funny

Cute summer student.
Try a Laser Keyboard by gambolt · 2008-11-08 09:36 · Score: 4, Interesting

They are awkward but pretty cool. It's a virtual keyboard projected onto a flat surface which could be sterile. There's zero tactile feedback but you can use it for simple stuff.
Example
http://www.virtual-laser-keyboard.com/
More on Dragon Systems by bdwoolman · 2008-11-08 10:02 · Score: 4, Insightful

Dragon Systems is by far the best speech to text resource. I use 9.0, but 10.0 is out. And by all accounts it is better. Like all good tools that have power and flexibility Dragon takes some time to master. But it is intelligent and repays hard work by improving. Suggest you get Dragon Preferred or, at a minimum, Pro. With these you can also make audio notes on a stand-alone recorder which may be fed in to the program later for transcription. If the audio is good (use a headset) the results are very good. Of course it needs an editing treatment, but what draft does not? So, you could make notes in addition to controlling the computer.
I suggest you practice at some time when your hands are not busy playing with the Andromeda Strain. And if you get skilled with Dragon you can swap modes; that is, speech to text or control mode.
The hard truth is this: Speech to text is something you have to learn how to do. Even if the program is perfect there is a learning curve for verbally inserting punctuation. And for writing with your voice. Nine has a feature to do punctuation automatically, but it works as poorly as most stenographers. In another life I used to dictate to a secretary who took shorthand. Even with her I interposed punctuation. And I can tell you...It really took me some time to learn her curves. Drum Roll Please

--
"No fear. No envy. No meanness." Liam Clancy
Wrong question by Vornzog · 2008-11-08 12:08 · Score: 4, Insightful

You don't want voice recognition. You want basic planning and lab book management skills.
You should be asking "Why didn't I get all of my protocols, reagents, samples, and equipment set up before I started my experiments for the day?"
I did quite a bit of biochemical benchwork to get my PhD, involving flu. Touching almost anything was either a bad idea for your health, or a worse idea for your experiment.
Instead, you laid out a plan for what experiments you were going to do for the day. You wrote it up in your notebook before you started. If you were doing a standard experiment, you probably had an easy Excel template where you typed in the number of replicate experiments you wanted to run, and it did all of your calculations for you. Print it out, tape it in your notebook, grab all your samples and reagents from the freezer, and then (and *only* then) did you put on your gloves and go into the sterile hood.
My old lab book is *full* of these little protocols, usually with a typed note at the bottom about which samples I wanted to run, and a few handwritten notes from after I took my gloves off.
For long, complex protocols, lay out a protocol book with step by step instructions. For really sensitive experiments, don't be afraid to change gloves after you flip the page. Gloves are cheap, compared to the reagents needed to run even a single PCR reaction.
A good craftsman has laid out all of his tools, plans and materials before he starts work. Good chefs have all their ingredients measured and utensils easily accessible before they start cooking. Either one *could* use a computer to track their project. But they don't, because it just makes everything more complicated.
Use a computer for planning, data storage, analysis, etc. Once you put the gloves on, good notebook skills put the computer to shame every time.
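
The Excel template mentioned above (type in the number of replicates, get the volumes back) boils down to scaling a per-reaction recipe. A rough sketch, assuming a made-up three-reagent recipe: the reagent names, volumes and the optional `overage` margin are all illustrative assumptions, not anything from a real protocol.

```python
from dataclasses import dataclass


@dataclass
class Reagent:
    name: str
    ul_per_reaction: float  # microlitres needed for one reaction


# Hypothetical recipe; real volumes come from your own protocol.
RECIPE = [
    Reagent("buffer", 5.0),
    Reagent("primer mix", 1.0),
    Reagent("water", 14.0),
]


def master_mix(recipe, replicates, overage=0.0):
    """Scale each per-reaction volume by the number of replicates.

    `overage` is an optional fractional margin for pipetting loss
    (0.25 means 25% extra); it is an assumption added for illustration.
    """
    if replicates < 1:
        raise ValueError("replicates must be at least 1")
    scale = replicates * (1 + overage)
    return {r.name: round(r.ul_per_reaction * scale, 1) for r in recipe}


def worksheet(recipe, replicates, overage=0.0):
    """Format the totals as plain text to print and tape into a notebook."""
    totals = master_mix(recipe, replicates, overage)
    lines = [f"Replicates: {replicates}"]
    lines += [f"{name:<12}{volume:>8.1f} uL" for name, volume in totals.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    print(worksheet(RECIPE, replicates=4))
```

Printing the worksheet before gloving up keeps the computer on the planning side of the hood, which is the point of the advice above.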

--
-V-

Who can decide a priori? Nobody.
-Sartre
Good speech recognition doesn't exist anywhere by Theovon · 2008-11-08 14:46 · Score: 4, Interesting

I'm a grad student in computer science, specializing in AI. Although it is not my forte, I have studied speech recognition a fair amount, and I am friends with professors and grad students who are on the cutting edge of ASR.
Unfortunately, the real answer is that, at least by my standards, there is no good speech recognition anywhere.
One of the most challenging things about human speech is what we call "lack of invariance". The same word can be said by the same person two times in a row, within exactly the same context, and the signals will differ to an amazing degree.
At this point, if you have a hand-segmented acoustic signal, where the phone boundaries (such as there are any) are already marked, we have recognition rates exceeding 90%. But if the signal is not already marked, where the ASR machine has to segment automatically, the rate goes down dramatically. Then you have to recognize words, where the realization of any given word in any given context is not necessarily consistent with how you would typically describe the word phonemically. We see it all the time where what's in the acoustic signal is actually quite different from what the listener hears. It's really quite frustrating.
In my opinion, the accuracy of even cutting edge speech recognition software is pretty miserable.
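
For anyone who wants to put a number on "accuracy" when trying these programs, one common approach is to compare what the recognizer produced against what was actually said, token by token, using edit distance. The sketch below is a generic illustration of that idea, not necessarily the measure behind the 90% figure above; the function names and sample phrases are made up.

```python
def edit_distance(reference, hypothesis):
    """Minimum number of insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference` (both are token lists)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_token in enumerate(reference, start=1):
        current = [i]
        for j, hyp_token in enumerate(hypothesis, start=1):
            current.append(min(
                previous[j] + 1,                               # token missing from hypothesis
                current[j - 1] + 1,                            # extra token in hypothesis
                previous[j - 1] + (ref_token != hyp_token),    # substitution, or free match
            ))
        previous = current
    return previous[-1]


def accuracy(reference, hypothesis):
    """1 minus the error rate; can go negative if the hypothesis is much longer."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 1 - edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    said = "open the notes file".split()
    heard = "open a notes file".split()
    print(f"accuracy: {accuracy(said, heard):.2f}")
```

The same function works on words or on phone labels, since it only compares list elements for equality.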