Recorded Speech to Text Software?
shfted! asks: "Recently, I've been given the task of transcribing several dozen audio tapes of interviews to typed word, that is, listening for 10 seconds, write what was said, repeat. At around 4 hours per hour long tape, I would like to automate the process somehow. Recording the tape into the computer is no problem, but I need some software that will do the speech recognition accurately more than quickly -- several hours per tape is not an issue (I have access to several machines running 24/7). I will still have to go over the computer's work to correct any mistakes. A free solution for Linux would be best, non-free and Windows solutions are okay, but a working solution is highest priority. Can anyone point me in the right direction(s)?"
The people who do lots of these (such as transcription services for doctors) outsource the work to India. You need educated transcribers, by the way; they have to know the vocabulary being used.
Years ago, I improved my own typing speed and accuracy by transcribing phone conversations with friends. It just takes some practice.
Of course, if you are listening to this guy, you can disregard my advice.
Give Sphinx a try. It's pretty accurate, especially Sphinx-3. I've used v2 before for a live test, and it works great -- even with different voices.
Basically, rather than typing in characters to form words, they are typing in syllables to form words. Sometime later they transcribe the shorthand into full text. So while they record speech in real time, they are not transcribing it into full text.
And somewhere back in my brain ISTR that pretty much all US court proceedings have been recorded on audio tape for decades. I know for a fact that the local courthouses (Halifax, Nova Scotia, Canada) have over the last decade or so invested huge amounts in real-time, computer-based audio recording gear. So, in addition to having the shorthand version, the reporter would be able to listen to the audio again when transcribing into full text.
If you can get a copy of SuSE 7.3 Professional, it comes with IBM's ViaVoice for Linux. It can take audio and turn it into text. The trick is that 7.3 came out about 2 years ago, I think. Most stores would have the newer 9.0 version, which doesn't have ViaVoice.
I guess it is possible that IBM still sells ViaVoice for newer distros. I've never looked.
A friend was in a similar situation -- she had recorded a phone interview [1], and needed to transcribe it. To make certain there were no technical glitches, the interview was recorded to cassette and as a WAV file on her PC.
When the time came to transcribe the interview, she found the version on her PC more helpful -- her hands never had to leave the keyboard in order to pause or "rewind" the audio.
If you go this route, remember that you'll need about 600 MB per hour of uncompressed audio. If space is an issue and you need to compress, don't max out the compression; saving a few megabytes here and there could result in hours of extra work due to artifacts.
[1] With explicit permission given.
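For anyone wondering where the "about 600 MB per hour" figure comes from, here is a rough sketch of the arithmetic. The post doesn't say what format the WAV was in, so this assumes CD-quality PCM (44.1 kHz, 16-bit, stereo); the 16 kHz mono line is only there for comparison and is my own assumption, not something the poster recommended.

```python
def wav_bytes_per_hour(sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Size in bytes of one hour of uncompressed PCM audio (header ignored)."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return bytes_per_second * 3600


# Assumed formats: CD-quality stereo vs. a lower-rate mono recording.
cd_quality = wav_bytes_per_hour()
speech_mono = wav_bytes_per_hour(sample_rate_hz=16_000, channels=1)

print(f"CD-quality stereo: {cd_quality / 1e6:.0f} MB per hour")  # 635 MB
print(f"16 kHz mono:       {speech_mono / 1e6:.0f} MB per hour")  # 115 MB
```

So a few dozen hour-long tapes at CD quality would come to a few tens of gigabytes uncompressed, which is worth knowing before you start recording.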
http://download.com.com/3000-7239-10251419.html?tag=lst-0-12