Mozilla's New Open Source Voice-Recognition Project Wants Your Voice (mashable.com)
An anonymous reader quotes Mashable:
Mozilla is building a massive repository of voice recordings for the voice apps of the future -- and it wants you to add yours to the collection. The organization behind the Firefox browser is launching Common Voice, a project to crowdsource audio samples from the public. The goal is to collect about 10,000 hours of audio in various accents and make it publicly available for everyone... Mozilla hopes to hand over the public dataset to independent developers so they can harness the crowdsourced audio to build the next generation of voice-powered apps and speech-to-text programs... You can also help train the speech-to-text capabilities by validating the recordings already submitted to the project. Just listen to a short clip, and report back if text on the screen matches what you heard... Mozilla says its aim is to expand the tech beyond just a standard voice recognition experience, including multiple accents, demographics and eventually languages for more accessible programs.
Past open source voice-recognition projects have included Sphinx 4 and VoxForge, but unfortunately most of today's systems are still "locked up behind proprietary code at various companies, such as Amazon, Apple, and Microsoft."
Sounds good if they make the corpus freely available. Having lots of free, high-quality audio recorded from modern digital microphones would be useful. VoxForge recordings tend to be poor quality, TIMIT is still proprietary despite being over 30 years old now, and the TEDLIUM corpus recordings seem to have a horrible amount of reverb/echo in them.
"I bless every day that I continue to live, for every day is pure profit."
If there was a "+1 Tried really hard to sound insightful" I'd mod you up.
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Thanks to Nuance, the voice recognition industry is effectively dead. If Mozilla can make this work in offline mode it would be awesome. Not requiring your every word to be recorded and shipped off to third parties would be very useful.
At one extreme, TiESR https://gforge.ti.com/gf/proje... is fairly simple to use. Not state of the art, but it does use Hidden Markov Models (HMMs) and has some noise compensation built in. It comes with word and language models, so it's fairly easy to get going - for US English at least. I haven't been ambitious enough to figure out how to build new models.
At the other extreme, Kaldi http://kaldi-asr.org/ is the most advanced open source recognizer that I'm aware of. Neural nets and all the other goodies researchers have been working on over the last few years. Definitely not easy to compile or use, though. And don't even think about trying to design a neural net without a graphics card to use as a math accelerator: one of the examples ran for days and wasn't even close to finishing when I gave up.
Anybody else have suggestions for another toolkit?
I'd say everything for the future. Google Home and Alexa are the new web browsers. The web browser is growing beyond its traditional interface to become a full-blown virtual secretary. It is getting to the point where it drives me nuts that I keep having to go to the keyboard when using the browser on my PC instead of just asking, like I do with the assistant on my phone.
And I for one always saw this day as one in which the assistant would be running on my machine, not on some cloud server. Having this extension of my mind reside outside of my home has huge implications, since it will likely soon be able to act as an autonomous proxy for my desires and will.
Any effort to get assistant-level AI to run on local resources from open source is a good effort.
In soviet Russia the domain extension autopilots you.
This perpetual motion machine Lisa made is a joke, it just keeps getting faster and faster. - Homer
I didn't mod, but rather commented instead as I said. Allow me to laugh my ass off for a while that you are actually so stupid you couldn't figure that out.
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
I don't think he's a troll; there's a point to be extracted from that.
I love Mozilla because of how much they've done for the web, from fighting for standardisation, HTML5, JavaScript, and building up one of the most complex applications around, to fighting a little for users' privacy, etc, but they deserve all the abuse they get for getting rid of the most natural leader (creator of JavaScript, no less, from the early days of Netscape) - and yes, it well and truly was a witch-hunt against him.
Without him as a leader, these days, it appears to most that Mozilla is just following in the wake of Google's Chrome and copying everything from the outward design to the extensions/addons system, etc.
Parent should be modded up.