Open Source Transcription Software?
sshirley writes "I am beginning to do some interviews with family members and will do some audio journals for genealogy purposes. I would really love to be able to run the resulting MP3 or WAV files through some software a get a text file out. I know that software like this exists commercially. But does this exist in the open source world?"
I spent several month searching for something like this. Open-source voice recognition is in really infant stages, and there does not seem to be much interested in improving the few things we have.
Warning, knife is sharp. Please keep out of children.
This is one of the cases where journey matters as much if not more then destination :)
Ironically, I have a family member he runs a business doing transcription for doctors ... because every time the try voice recognition software they get pissed off and go back to real people.
Being a fan of Dragon Dictate myself, I know its not that great and I know it has a fit when you start throwing accents at it, training or not.
I call bullshit on your claims of using Dragon for everything.
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
Sphinx by itself is a terrible answer to this problem, unfortunately. The code is free, but good luck finding an appropriate model. Worse, you'll need to train a speaker-dependent model to get any usable results, and this is a VERY non-trivial task with Sphinx tools in the state that they are. I spent several years getting paid to adapt Sphinx for commercial purposes and while it's great for some things, I can say with confidence that it is not the tool you're looking for.
You know what works? Dragon. Hate to say it, but the commercial products here have a gigantic edge on the competition.
That said, I'd love to see someone come up with an open source speaker-dependent model training system that's friendly enough for app developers (not speech researchers) to roll into projects. I think this is a big open door for contribution to the community. Sphinx isn't the best thing going, but it's certainly usable, and if a real product came into being I'm sure all the speech wonks would start coming out of the woodwork to improve the algorithms.