Slashdot Mirror


Open Source Transcription Software?

sshirley writes "I am beginning to do some interviews with family members and will do some audio journals for genealogy purposes. I would really love to be able to run the resulting MP3 or WAV files through some software and get a text file out. I know that software like this exists commercially. But does this exist in the open source world?"

4 of 221 comments

  1. Unfortunately... by dmneoblade · · Score: 3, Interesting

    I spent several months searching for something like this. Open-source voice recognition is still in its infancy, and there does not seem to be much interest in improving the few things we have.

    --
    Warning, knife is sharp. Please keep out of children.
  2. Re:Got kids? by Luckyo · · Score: 4, Interesting

    This is one of the cases where the journey matters as much as, if not more than, the destination :)

  3. Re:Dear aunt, by BitZtream · · Score: 4, Interesting

    Ironically, I have a family member who runs a business doing transcription for doctors ... because every time they try voice recognition software they get pissed off and go back to real people.

    Being a fan of Dragon Dictate myself, I know it's not that great and I know it has a fit when you start throwing accents at it, training or not.

    I call bullshit on your claims of using Dragon for everything.

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  4. Re:CMU Sphinx by inkyblue2 · · Score: 3, Interesting

    Sphinx by itself is a terrible answer to this problem, unfortunately. The code is free, but good luck finding an appropriate model. Worse, you'll need to train a speaker-dependent model to get any usable results, and this is a VERY non-trivial task with the Sphinx tools in their current state. I spent several years getting paid to adapt Sphinx for commercial purposes, and while it's great for some things, I can say with confidence that it is not the tool you're looking for.

    You know what works? Dragon. Hate to say it, but the commercial products here have a gigantic edge on the competition.

    That said, I'd love to see someone come up with an open source speaker-dependent model training system that's friendly enough for app developers (not speech researchers) to roll into projects. I think this is a big open door for contribution to the community. Sphinx isn't the best thing going, but it's certainly usable, and if a real product came into being I'm sure all the speech wonks would start coming out of the woodwork to improve the algorithms.