Good Cross-Platform Speech-Recognition Programs?
CryoStasis writes "I am a graduate student getting my degree in biomedical sciences. Because my work often requires me to maintain a local sterile environment (under a biological hood) I find that I am unable to physically touch my computer, which sits beside me, in order to open my notes, protocols, etc. while I'm working. As a result, I have begun to search for a voice-recognition program that will allow me to tell the computer what files/programs to launch. I know that the general field of voice recognition has come a long way, but I find that the built-in speech recognition systems in both OS X and Vista are clunky and difficult to use. Are there any good, cross-platform speech-recognition programs available that might fit the bill?"
We have PocketSphinx working on Windows, Mac and Linux in FreeSWITCH. http://www.freeswitch.org/ /b
Dragon NaturallySpeaking is as close as it gets. And it's only really good for transcribing what you say; it's not really that good for controlling your computer. I believe it works in both Vista and OS X.
There used to be ViaVoice that also worked in Linux IIRC - but it basically stopped working on it circa 2001/2002.
Perhaps another input device is called for, because voice recognition right now will only frustrate you more than anything for what you want to use it for.
BTW, I believe OS X has voice recognition built in you may want to check out for controlling your computer - but it's been ages since I used it. It's actually geared toward controlling your computer, and not to replace typing.
I work in a biological lab and have a similar problem. I find that paper is much simpler for most things. I have a notebook containing only printouts of protocols with little tabs denoting where each one is. I remove whatever protocol I'm using and carry it over to wherever I'm working. Anything else I need from my notes, I write on paper and carry. Yes, it's a bit wasteful, but I've found that in the preparation of gathering all the relevant pieces of paper, it really forces you to adequately prepare for an experiment instead of trying to figure it out on the fly.
You could always use Vista's speech recognition.
Here's a Video.
Yes, software exists. But unless the program you want to control performs only simple operations with dialog boxes and can function with limited keyboard input, you will probably find it inadequate or clunky, even if the speech recognition is perfect (it never is). Instead of asking whether speech recognition software is right for you, the better question is whether your software is a good fit for speech recognition.
#fuckbeta #iamslashdot #dicemustdie
Kaiser MDs use Dragon.
I'm thinking you're only using one computer for most of your work anyway.
How important is cross platform - or is that just what the cool kids say these days?
Wireless keyboard much?
Cute summer student.
The current state of voice control is, unfortunately, rather clunky. On the plus side, there are slightly nonstandard peripherals that might do the job instead.
For some years now, there have been pointing devices for the disabled that essentially involve an IR webcam and a reflector or LED stuck to whatever part of the body the user can still move. http://www.naturalpoint.com/ makes some such; I suspect that they also have competitors. On the cheap side, there has been a fair bit of buzz lately about using video processing software with ordinary webcams. A bit of googling should turn up stuff for Win, Mac, and Linux.
On the keyboard side, silicone rubber flexible keyboards have proliferated alarmingly of late. The keyfeel is bloody awful; but they are cheap, fully sealed against moisture, and can survive cleaning with various moderately horrible solvents.
With a simple USB hub, you should be able to leave the keyboard and webcam in the hood, never having to touch the webcam, and dousing the keyboard in whatever horrible substances are necessary to keep it sterile, and just plug in the one USB cable to your laptop before you begin work. Not wildly elegant; but it should provide you with a standard keyboard and pointing device that fulfill your requirements.
There is no substitute for teamwork. I don't work in a biologically clean environment, but I do sometimes work in a vacuum clean environment which requires that I avoid touching anything that isn't cleaned to go into a UHV chamber. Having a teammate to work in the "dirty" environment in the rest of the lab makes things much, much easier.
The progress of research is never perfectly predictable, and you're always going to find some surprise which needs immediate attention. Having another person there means you don't have to prepare in advance every possible command you may need a computer to run, plus a person can do things like answer the phone and sign for deliveries. It's also good practice for later in your scientific career when you'll have to train and trust your own students/interns/employees.
Kind of a clunky idea, but here goes.
Get a numeric keypad, and pop off every other button cap. Map the remaining keys to whatever actions you want to control on the computer. Tape the keypad to the window on your hood, perhaps with blue masking tape (removes cleanly). Hit the buttons with your nose.
On Windows, I would get all the files opened, and have a key for Alt-Tab, and then keys for left, right, up and down.
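The key-to-action mapping could be sketched in Python roughly as below. This is only a sketch under stated assumptions: the key names (`KP_1`, `KP_3`) and file paths are hypothetical placeholders, and capturing the actual key presses is left to whatever hotkey tool you already use. It shows just the cross-platform part, which is choosing the command that opens a file with its default application.

```python
import sys

# Hypothetical mapping from keypad key names to files. Both the key
# names and the paths are placeholders, not anything from this thread.
KEY_ACTIONS = {
    "KP_1": "protocols/pcr.pdf",
    "KP_3": "notes/today.txt",
}


def launch_command(path, platform=sys.platform):
    """Return the argv that opens `path` with its default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumes a Linux desktop with xdg-utils installed.
    return ["xdg-open", path]


def handle_key(key, platform=sys.platform):
    """Look up a key and return the command to run, or None if unmapped."""
    path = KEY_ACTIONS.get(key)
    return None if path is None else launch_command(path, platform)
```

Passing the list returned by `handle_key("KP_1")` to `subprocess.run` would then open the file; unmapped keys return `None` and can simply be ignored.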
Good Luck!
Autoclave - will leave just a pile of melted plastic in place of kb+mouse.
Gamma rays - not sure of dose, but may play havoc with the electronics inside.
Ethylene oxide - yes, but how common is that? I used to work in a lab in a big university/major city and we didn't have ethylene oxide facilities. Only autoclaves.
I would suggest: seal kb and/or mouse in a plastic pouch, and use a chemical method to sterilize the outside of the pouch (bleach, etc). To change batteries, cut pouch open, put new batteries in, place in new pouch and repeat.
I hadn't known there were so many idiots in the world until I started using the Internet -Stanislaw Lem
So you're a grad student in the sciences and write "build in" instead of "built-in".
Don't rag on him, it was his software. He originally said "included."
Take the cheese to sickbay, the doctor should see it as soon as possible - B'Elanna Torres, "Learning Curve"
You can get a mouse that you can operate with your feet. Would that work?
Virtual laser keyboards are awkward but pretty cool. It's a virtual keyboard projected onto a flat surface, which could be sterile. There's zero tactile feedback but you can use it for simple stuff.
Example: http://www.virtual-laser-keyboard.com/
Shame you're sitting unseen. There are foot controls for the simple stuff he's asking for. Now if he wants to do something more complex then the voice option is the viable one.
Shai Schticks:"You don't make peace with friends, you make peace with enemies"
Dragon Systems is by far the best speech-to-text resource. I use 9.0, but 10.0 is out. And by all accounts it is better. Like all good tools that have power and flexibility, Dragon takes some time to master. But it is intelligent and repays hard work by improving. Suggest you get Dragon Preferred or, at a minimum, Pro. With these you can also make audio notes on a stand-alone recorder which may be fed into the program later for transcription. If the audio is good (use a headset) the results are very good. Of course it needs an editing treatment, but what draft does not? So, you could make notes in addition to controlling the computer.
I suggest you practice at some time when your hands are not busy playing with the Andromeda Strain. And if you get skilled with Dragon you can swap modes; that is, speech to text or control mode.
The hard truth is this: Speech to text is something you have to learn how to do. Even if the program is perfect there is a learning curve for verbally inserting punctuation. And for writing with your voice. Nine has a feature to do punctuation automatically, but it works as poorly as most stenographers. In another life I used to dictate to a secretary who took shorthand. Even with her I interposed punctuation. And I can tell you...It really took me some time to learn her curves. Drum Roll Please
"No fear. No envy. No meanness." Liam Clancy
I work in healthcare, and know a man paralyzed from the neck down who uses dragonspeak to do everything on his computer.
He has a laptop, and needs someone to turn his computer off and on. But he seems to do pretty well from there, at least for searching the internet. He also buys and trades stocks with it.
He had to hire an expert to customize his laptop. So, while it's currently possible to do, it's probably not something that you can do easily.
Is it cross platform? No idea. He uses Windows XP.
You don't want voice recognition. You want basic planning and lab book management skills.
You should be asking "Why didn't I get all of my protocols, reagents, samples, and equipment set up before I started my experiments for the day?"
I did quite a bit of biochemical benchwork to get my PhD, involving flu. Touching almost anything was either a bad idea for your health, or a worse idea for your experiment.
Instead, you laid out a plan for what experiments you were going to do for the day. You wrote it up in your notebook before you started. If you were doing a standard experiment, you probably had an easy Excel template where you typed in the number of replicate experiments you wanted to run, and it did all of your calculations for you. Print it out, tape it in your notebook, grab all your samples and reagents from the freezer, and then (and *only* then) did you put on your gloves and go into the sterile hood.
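For anyone who would rather script that kind of template than keep a spreadsheet, a minimal Python sketch follows. The reagent names, the per-reaction volumes and the 10% overage are illustrative placeholders I made up for the example, not a real protocol; substitute your own numbers.

```python
# Illustrative per-reaction volumes in microliters. These numbers are
# placeholders for the example, not a validated protocol.
PER_REACTION_UL = {
    "buffer": 5.0,
    "primer mix": 1.0,
    "enzyme": 0.5,
    "water": 13.5,
}


def master_mix(replicates, per_reaction=PER_REACTION_UL, overage=0.1):
    """Scale each per-reaction volume to `replicates` reactions plus overage."""
    if replicates < 1:
        raise ValueError("need at least one replicate")
    factor = replicates * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_reaction.items()}


def print_sheet(replicates, overage=0.1):
    """Print a bench sheet to tape into the notebook."""
    mix = master_mix(replicates, overage=overage)
    print(f"Master mix for {replicates} reactions ({overage:.0%} overage)")
    for name, vol in mix.items():
        print(f"  {name:<12}{vol:>8.2f} uL")
    print(f"  {'total':<12}{sum(mix.values()):>8.2f} uL")
```

Calling `print_sheet(10)` before you glove up gives you the sheet to print; the overage is there because pipetting losses mean you never recover the full nominal volume.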
My old lab book is *full* of these little protocols, usually with a typed note at the bottom about which samples I wanted to run, and a few hand written notes from after I took my gloves off.
For long, complex protocols, lay out a protocol book with step by step instructions. For really sensitive experiments, don't be afraid to change gloves after you flip the page. Gloves are cheap, compared to the reagents needed to run even a single PCR reaction.
A good craftsman has laid out all of his tools, plans and materials before he starts work. Good chefs have all their ingredients measured and utensils easily accessible before they start cooking. Either one *could* use a computer to track their project. But they don't, because it just makes everything more complicated.
Use a computer for planning, data storage, analysis, etc. Once you put the gloves on, good notebook skills put the computer to shame every time.
-V-
Who can decide a priori? Nobody.
-Sartre
I'm a grad student in computer science, specializing in AI. Although it is not my forte, I have studied speech recognition a fair amount, and I am friends with professors and grad students who are on the cutting edge of ASR.
Unfortunately, the real answer is that, at least by my standards, there is no good speech recognition anywhere.
One of the most challenging things about human speech is what we call "lack of invariance". The same word can be said by the same person two times in a row, within exactly the same context, and the signals will differ to an amazing degree.
At this point, if you have a hand-segmented acoustic signal, where the phone boundaries (such that there are any) are already marked, we have recognition rates exceeding 90%. But if the signal is not already marked, where the ASR machine has to segment automatically, the rate goes down dramatically. Then you have to recognize words, where the realization of any given word in any given context is not necessarily consistent with how you would typically describe the word phonemically. We see it all the time where what's in the acoustic signal is actually quite different from what the listener hears. It's really quite frustrating.
In my opinion, the accuracy of even cutting edge speech recognition software is pretty miserable.
You: "Wow. This virus interferes with T-Cells, even reanimating dead tissue. That's really wild!"
Computer: "Command accepted. Releasing virus into the wild."
Period, end of report. In the PC world there essentially is no other general purpose voice interface tech that is even worth bothering with.
That being said, there are much better ones for very specific vertical markets, but not for general use.
Note that this means you ARE restricted to Windows. The stuff built into OS X and Vista is not even worth messing around with. It might in theory meet some very casual or narrow specific need of particular users, but it is literally an order of magnitude slower and less reliable than NaturallySpeaking.
If you MUST use a Mac or Linux etc then basically the answer is, you're SOL, there's nothing. Yeah, there are a few OSS bits out there, but frankly they aren't even at the level of being really functional software, let alone meeting speed or accuracy required from this type of software. It would be AWESOME if there was something open, but the fact is this area is just so technically demanding it appears to be beyond the reach of non-commercial effort.
"Malo periculosam, libertatem quam quietam servitutem." -- Jefferson