Slashdot Mirror


Mozilla Releases Open Source Speech Recognition Model, Massive Voice Dataset (mozilla.org)

Mozilla's VP of Technology Strategy, Sean White, writes: I'm excited to announce the initial release of Mozilla's open source speech recognition model that has an accuracy approaching what humans can perceive when listening to the same recordings... There are only a few commercial quality speech recognition services available, dominated by a small number of large companies. This reduces user choice and available features for startups, researchers or even larger companies that want to speech-enable their products and services. This is why we started DeepSpeech as an open source project.

Together with a community of likeminded developers, companies and researchers, we have applied sophisticated machine learning techniques and a variety of innovations to build a speech-to-text engine that has a word error rate of just 6.5% on LibriSpeech's test-clean dataset. In our initial release today, we have included pre-built packages for Python, NodeJS and a command-line binary that developers can use right away to experiment with speech recognition.

The announcement also touts the release of nearly 400,000 recordings -- downloadable by anyone -- as the first offering from Project Common Voice, "the world's second largest publicly available voice dataset." It launched in July "to make it easy for people to donate their voices to a publicly available database, and in doing so build a voice dataset that everyone can use to train new voice-enabled applications." And while they've started with English-language recordings, "we are working hard to ensure that Common Voice will support voice donations in multiple languages beginning in the first half of 2018."

"We at Mozilla believe technology should be open and accessible to all, and that includes voice... As the web expands beyond the 2D page, into the myriad ways where we connect to the Internet through new means like VR, AR, Speech, and languages, we'll continue our mission to ensure the Internet is a global public resource, open and accessible to all."

58 comments

  1. This changes the game entirely! by Anonymous Coward · · Score: 2, Funny

    I had a problem with voice assistants because it was not being done inside my circle of trust (but closer to a sworn enemy).

    If I can run this on my home server, it will completely change the game.

    I could now actually implement Star-Trek-style home automation, if I needed it.
    "Computer, switch mood to 'evening'." (campfire color scheme lights, shutters down, adjust screen warmth, play some relaxing music, mix me some nice drink [I'm building a drink mixer] and connect me to a tight slut)

    1. Re:This changes the game entirely! by Anonymous Coward · · Score: 0

      You sound like a class act.

    2. Re:This changes the game entirely! by Aighearach · · Score: 2

      My first question, is 6.5% error rate considered good? Surely it has uses, but it seems to be good for "things that aren't important enough to write down right now, but I want a half-ass transcription just in case."

      Doesn't seem to be quite what an assistant would be writing down, one would hope.

    3. Re: This changes the game entirely! by Anonymous Coward · · Score: 0

      Just like you can run an entirely local Firefox account server? Oh, wait.

    4. Re:This changes the game entirely! by Anonymous Coward · · Score: 0

      6.5% doesn't sound particularly good to me. I am a regular Dragon user, and on my own speech it certainly does better than that. In this message Dragon gave me 100% recognition.

    5. Re:This changes the game entirely! by roca · · Score: 2

      I believe this is a speaker-independent test so you shouldn't expect results to be as good as a system that has learned your speech.

      In the article they say that humans have about a 6% error rate on the same test.

    6. Re:This changes the game entirely! by Anonymous Coward · · Score: 1

      I know this is hard for you to understand, but some people like knowing that their actions and communications aren't being recorded or analyzed by anyone. The only way to guarantee this is by hosting your services on your own hardware.

      Hello? This is a geek website. Well, it used to be.

    7. Re:This changes the game entirely! by Anonymous Coward · · Score: 0

      "Tight slut" is an oxymoron.

    8. Re:This changes the game entirely! by thinkwaitfast · · Score: 1

      99% of people already have this.

      I am the 1%.

    9. Re:This changes the game entirely! by slashrio · · Score: 1

      Tight?
      Sounds like your willy isn't that large...

      --
      "Trump!!", the new Godwin.
    10. Re:This changes the game entirely! by Anonymous Coward · · Score: 1

      Error rates for speech recognition vary greatly. 6.5% is very poor if it was trained explicitly on your voice; it is stellar performance if you are to interpret a random person with a foreign accent.

      While the article is sparse on details, the most common benchmark for speech recognition accuracy is the "Switchboard conversational speech recognition benchmark". I assume they are referring to this test. The Switchboard benchmark test is made up of 40 phone conversations between native English speakers. In this test the estimated human error rate is 5-6%.

      Assuming they are referring to a Switchboard rating, 6.5% is considered very good by today's standards. However, the improvements in this area over the last few years have been tremendous. Here is a graph showing the best performing speech recognisers over the last five years: https://awni.github.io/images/speech-recognition/wer.svg

      There are a lot of techniques that can be used for improving speech recognition. To do research in this area, however, you need both a very large transcribed dataset of voice recordings and a state-of-the-art system to build on top of. This release addresses both concerns. Making it open source is a fantastic boost for research in the area.
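
      For readers unfamiliar with the metric discussed above: word error rate is conventionally the word-level edit distance (substitutions, insertions and deletions) between a reference transcript and the recognizer's output, divided by the number of reference words. The sketch below is a minimal illustration of that definition only. The function name is hypothetical, the whitespace tokenization is a simplifying assumption, and it is not taken from DeepSpeech or from any benchmark's official scoring tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()   # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    rate = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"WER: {rate:.1%}")
```

      Under this definition a 6.5% rate corresponds to roughly one wrong word in every fifteen, which is why the same figure can look poor for a speaker-adapted system and good for a speaker-independent one.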

  2. The ADD of Mozilla continues by Anonymous Coward · · Score: 0, Insightful

    I guess with Firefox OS, Thunderbird, etc now dead and buried Mozilla needs something else to do instead of working on Firefox. I mean with all the time they've saved by transforming Firefox into Chrome those 1200 people need something to say that they're working on.

    1. Re:The ADD of Mozilla continues by ShanghaiBill · · Score: 1

      Mozilla needs something else to do instead of working on Firefox.

      At least they are no longer spending donor dollars on sponsoring surfing contests.

    2. Re:The ADD of Mozilla continues by roca · · Score: 4, Insightful

      Actually Web browsers need to implement a standardized speech recognition API (WebSpeech --- https://developer.mozilla.org/...), so this work could and probably will become part of Firefox. We wouldn't want speech-dependent Web applications to suck in Firefox on Linux because Firefox doesn't have access to a quality recognizer on free operating systems, would we?

      This sort of thing is why building and maintaining Firefox is tremendously expensive. http://robert.ocallahan.org/20...

    3. Re:The ADD of Mozilla continues by omnichad · · Score: 1

      While I agree, the recognition should be weighted by context - so that keywords are recognized more readily. At least until natural language processing progresses by leaps and bounds from where we are now. And if there isn't a way for the server side to provide context hints, you'll want to process the voice server-side anyway.

  3. Common Voice? by Anonymous Coward · · Score: 0

    Is that site black or is it just some javascript bogosity? Who cares?

  4. OK Mozilla by bobstreo · · Score: 1

    Disconnect Alexa, SIRI and Google from the network, and shut them down.

  5. Re: I'm not interested in giving Mozilla money by Anonymous Coward · · Score: 0

    Huh? Their browser is fine -- what do you want improved in it? I would like to see an open source voice assistant run by non-profit organizations. Think of it as a browser feature. One way of improving the browser is voice control.

  6. Re:I'm not interested in giving Mozilla money by roca · · Score: 3, Informative

    The vast majority of Mozilla's money is spent on Firefox.
    You just made up "most of their money is spent on community projects" out of thin air, didn't you?

  7. Re: I'm not interested in giving Mozilla money by Anonymous Coward · · Score: 0

    What kind of cheap hardware are you people running that you can't handle an application using 4 gigs of memory? Throw away that decade old shitbox and drive around on trash day and find something newer.

  8. I would like to buy a hamburger by mspohr · · Score: 1

    They really need to train it on this...
    https://www.youtube.com/watch?...

    --
    I don't read your sig. Why are you reading mine?
    1. Re:I would like to buy a hamburger by Anonymous Coward · · Score: 0

      If you think this is funny, I have news for you. Their model isn't even this good yet.

  9. Re:I'm not interested in giving Mozilla money by Anonymous Coward · · Score: 0

    Not sure how many people tested out the model yet. I did a few days ago. It wasn't so great. Your WAV file has to be in an exact format. Once you get past that, it wasn't able to understand the clip of audio I fed it at all. They have a lot of work to do. Sorry guys, I will be paying Google the 2 cents to do speech to text still.
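
    The comment above does not say which format was required. As a rough sketch, a WAV file's header can be inspected with Python's standard `wave` module before feeding it to a recognizer. The defaults used here (mono, 16-bit samples, 16 kHz) are an assumption for illustration and should be checked against the release's own documentation; the function names are hypothetical.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read the basic format fields from a WAV file header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }


def format_mismatches(path: str,
                      channels: int = 1,             # assumed: mono
                      sample_width_bytes: int = 2,   # assumed: 16-bit samples
                      sample_rate_hz: int = 16000):  # assumed: 16 kHz
    """Return {field: (actual, expected)} for every field that differs."""
    expected = {
        "channels": channels,
        "sample_width_bytes": sample_width_bytes,
        "sample_rate_hz": sample_rate_hz,
    }
    actual = describe_wav(path)
    return {
        field: (actual[field], expected[field])
        for field in expected
        if actual[field] != expected[field]
    }
```

    An empty result means the header matches the expected values; anything else lists what would need converting first, for example with a resampling tool.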

  10. Technology does not include voice. by Anonymous Coward · · Score: 0

    Mozilla is fucking up lately. Firefox is more and more annoying.

    eg. Why so many OK OK's to install an add-on? Why break old good ones? Why uncheck 5 boxes to get a blank new tab? All dick moves.

    get fukt. I'll go portable old version on all of it pricks.

    1. Re:Technology does not include voice. by higuita · · Score: 3, Insightful

      > Why so many OK OK's to install an add-on?

      Because I want to install add-ons myself, without random sites, apps or other add-ons being able to install add-ons silently, like the old ActiveX in IE.

      > Why break old good ones?

      Because the old ones could touch and replace ANYTHING in the browser, so they were a huge security problem and performance problem, and they blocked Mozilla from making big changes, since those would break many extensions. They finally decided to break everything and define a proper add-on API that can be stable, run in separate, locked-down processes and use multiple CPUs. They didn't decide to break the add-ons just to annoy you; they had very good reasons.

      > Why uncheck 5 boxes to get a blank new tab?

      I do like the new start page... but if you do not, then unchecking the 5 boxes to disable all the start page features is not hard at all; you just need to do it once. Notice that all of that info is local info. It's flexible enough to please most people... and for those that really want an empty page, it's there too. There is no default config that will make everyone happy.

      --
      Higuita
    2. Re:Technology does not include voice. by Anonymous Coward · · Score: 0

      still a dick move to make this BS default again when the users have told them many times that they don't want their disgusting suggestion tab.

    3. Re:Technology does not include voice. by higuita · · Score: 1

      some users != all users

      Again, it's not easy to make everyone happy

      --
      Higuita
  11. 1st thing that made me WANT to give them $ !!! by fyngyrz · · Score: 4, Insightful

    Things like these are the reason why I'm not donating money to Mozilla.

    If - and I don't yet know if this is the case, they don't actually seem to say - this represents a stand-alone, does-not-go-to-the-LAN-or-WAN speech-to-text system... with an error rate of 6.5% on English speech as claimed... then it's way more important than Yet Another Web Browser.

    This is precisely the kind of thing projects like Mycroft need to become not just another way to send your activity out on the net, which inherently decreases both reliability and security.

    If indeed this is what this is, then the door opens for all manner of sophisticated home advances we can actually trust and depend on.

    They claim around 1:1 [decode rate : normal speech rate] with a reasonably modern CPU/GPU. That needs considerable improvement. Reference quote from here:

    On a MacBook Pro, using the GPU, the model can do inference at a real-time factor of around 0.3x, and around 1.4x on the CPU alone. (A real-time factor of 1x means you can transcribe 1 second of audio in 1 second.)

    That's a lot of computing power to hand off, particularly in a laptop. Using just the CPU, you'll be pegging it the whole time you're talking, and then some. For a decent desktop, it's at least doable, but it's still a very heavy compute load.

    Though... saying "MacBook Pro" doesn't really tell us enough... I have a MacBook Pro that is a dual-core Intel machine... it's not what you'd call quick. There are a lot of different hardware configs that could be described by "MacBook Pro."

    Seems like a pretty big deal to have to dedicate a server to the STT task (but then again, if I could get my STT tasks out from under the cloud... I'd probably do it. I have a spare 3 GHz 8-core hanging around, so...) but I think for general use, they have to do better. This isn't going to fly well on a Raspberry pi, for instance, it'll just get way behind.
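
    The real-time factor arithmetic behind "it'll just get way behind" can be sketched as follows. This is only an illustration: the function names are made up for this example, the 0.3x and 1.4x figures are the ones in the quote above rather than new measurements, and the backlog estimate assumes decoding starts with the audio and proceeds at a constant rate.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Seconds of compute spent per second of audio (1.0 is exactly real time)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def backlog_seconds(audio_seconds: float, rtf: float) -> float:
    """How far behind the recognizer is at the moment the speaker stops.

    Assumes decoding begins when the audio begins and runs at a constant rate.
    """
    processing_seconds = audio_seconds * rtf
    return max(0.0, processing_seconds - audio_seconds)


if __name__ == "__main__":
    # Figures taken from the quoted MacBook Pro numbers.
    for label, rtf in (("GPU", 0.3), ("CPU", 1.4)):
        lag = backlog_seconds(60.0, rtf)
        print(f"{label} at {rtf}x: {lag:.1f} s behind after 60 s of speech")
```

    Any factor above 1x means the lag grows for as long as the person keeps talking, which is the concern with slower hardware.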

    Still. IMHO, this may be important. Very.

    --
    I've fallen off your lawn, and I can't get up.
    1. Re:1st thing that made me WANT to give them $ !!! by HatofPig · · Score: 1

      This isn't going to fly well on a Raspberry pi, for instance, it'll just get way behind.

      Just make it sound like Majel Barrett-Roddenberry and have it go "Working....kjunk kjunk kjunk kjunk... " while it works through the buffered speech data.

      --
      Silicon & Charybdis McLuhan Kildall Papert Kay
    2. Re:1st thing that made me WANT to give them $ !!! by roca · · Score: 1

      I don't think they've put much/any effort into optimizing the recognizer yet.

  12. Re:The vast majority of their money.... by scdeimos · · Score: 1

    They have been neglecting the security aspect of their browser for 20+ years while continually throwing in new projects like this utilizing developers whose caliber of work would not qualify them for jobs in the professional sector.

    20+ years? Really? Mozilla is only 19 years old. Firefox is only 15 years old.

  13. Innovation by Bruce+Perens · · Score: 1

    Certainly this will be part of the browser.

    Obviously, the Mozilla folks see that direct text input devices may not play a big role in our future, or indeed our present, and they don't want to use the Internet services of another browser maker (Apple, Google, Microsoft, Amazon) to enable non-text on their browser. This would be slower than local recognition and not under their control. Or yours.

    It's really easy to stifle innovation by requiring an over-tight focus. Many companies fail by doing just this.

  14. Re:I'm not interested in giving Mozilla money by higuita · · Score: 1

    Things like this will make the browser better in the future! They have vision and want to be a leader in future tech solutions.

    That future is not that far away: you currently have Siri, Alexa and friends growing up; you have mobile phones, where writing is hard and speaking is easier; and you may have automatic voice translation, where the first step is, of course, voice recognition.

    When MS, Apple and Google release their browsers with built-in screen readers, automatic translation and speech recognition, Firefox also needs something. They cannot start researching and developing only after the others release it; that is a catch-up game you can never win.
    On the other hand, if you prepare solutions and fine-tune them, you can import them into the browser, sometimes even before the others. If they are first to market in developing a speech recognition W3C standard, they can block closed and patented solutions.

    --
    Higuita
  15. Why is Firefox CPU use and memory use unstable? by Futurepower(R) · · Score: 1, Insightful

    This is a HUGE issue: Firefox continually increases the CPU power and memory it uses, even when you aren't looking at a Firefox window. Why? What is Firefox doing? Bitcoin mining?

    Why does Firefox use so much memory when there are only a few tabs open? Why does Firefox increase memory use when it is not being viewed?

    1. Re:Why is Firefox CPU use and memory use unstable? by sysrammer · · Score: 1

      Are you new to FF or something? It's done that before there were bitmines.

      --
      His ignorance covered the whole earth like a blanket, and there was hardly a hole in it anywhere. - Mark Twain
    2. Re:Why is Firefox CPU use and memory use unstable? by roca · · Score: 1

      1) It could be the fault of a Web site that you have open. A browser can't easily distinguish "valid" from "invalid" Web site resource usage.

      2) It could be a Firefox bug, but if it is, it certainly doesn't happen for everyone or even most people. If it did, it would have been fixed already.

    3. Re:Why is Firefox CPU use and memory use unstable? by Anonymous Coward · · Score: 0

      no it doesn't you goofy fucking windows users...

  16. Open Search by Botnet-of-People · · Score: 1

    Another area Mozilla needs to look into is web search, which at the moment is a more prominent part of the browser experience (besides Facebook ;). Maybe a collaboration between the Wikipedia and Mozilla foundations and other like-minded outfits is in order. Search is dominated by a similarly small set of major players (Google, Baidu, Yandex, perhaps even Bing) who may all be using the same basic algorithms but tweaked with parameters that only their paymasters know of and can control. So while search is no longer rocket science, we often get subtly biased results.

    1. Re:Open Search by roca · · Score: 1

      I'm sure it has been discussed, but it's a very difficult business to get into. Look at how Microsoft search quality has struggled for years with a much greater investment than Mozilla could ever afford.

    2. Re:Open Search by Botnet-of-People · · Score: 1

      LOL. Mozilla isn't a business, at least not in the sense in which MS, Apple, Google or even Elon's companies are. Mozilla looks like it's designed to lose Money. So another shirt-losing investment won't hurt.

  17. Re: I'm not interested in giving Mozilla money by Anonymous Coward · · Score: 0

    It's a 15" HP laptop and yes, it is the cheapest thing I could buy, because I ONLY use it for the internet and occasional web browsing. It's less than a year old. My other, non-internet-connected computers have uptimes approaching a year.

  18. I do. by Anonymous Coward · · Score: 0

    The server is publicly available. Yes it is a bit of a hassle to set up, and quickly reverse engineering the protocol to replace it with a small script would probably be better, but you can, and I do.

  19. 99% of the people are complete morons. by Anonymous Coward · · Score: 0

    Like, for example, those, who use "argument from popularity" fallacies. *nudge, nudge* *wink, wink*

  20. Re:I'm not interested in giving Mozilla money by Anonymous Coward · · Score: 0

    They do spend most of their time and money on Firefox

  21. Re:The vast majority of their money.... by tepples · · Score: 1

    I read it as "Incompetence in web browser security has been the rule since they were called Netscape."

  22. Stop playing politics, Mozilla by Anonymous Coward · · Score: 0

    I would be fine giving Mozilla money if I knew it would all go back into promoting FOSS. But that is not the case, because Mozilla always wants to run around playing politics and alienating half the country.

  23. Re:The vast majority of their money.... by roca · · Score: 2

    Another example of just making stuff up.

    Actually it's quite irksome to read trolling like this given I spent most of my long, paid employment at Mozilla fixing bugs, including security issues and worked with hundreds of dedicated colleagues also doing that.

    Firefox's security record is not much different from other Web browsers, and better than some. And it's getting even better now that the latest Firefox releases have quite good content-process sandboxing.

  24. Yes, Firefox has always been unstable. by Futurepower(R) · · Score: 1

    I reported the instability in the early days of Firefox. Lately, however, the instability seems to have become worse. By far the worst problem with Firefox is that it sometimes makes the Windows OS unstable.

  25. Re:The vast majority of their money.... by Anonymous Coward · · Score: 0

    "Firefox's security record is not much different from other Web browsers" - this might be the case for a naked browser, but it neglects Firefox's greatest strength - addons. I'd bet that an old version of Firefox running cookie, javascript and cross-site request blockers along with a traditional adblocker is far more secure than the current version of whatever MS, Google or Apple push out.

    For me, the vast array of addons are what push Firefox above the other browsers and why I continue to use it. It's also why I see no need to run on the update treadmill and why I don't care that much if the current Firefox breaks some addons - my older version works perfectly fine right now, I'm fairly secure, so why mess with it?

    But in any case - thank you for working on Firefox. I appreciate you keeping it alive, even if others may not.

  26. Firefox is unstable with many windows and tabs. by Futurepower(R) · · Score: 1

    "... it certainly doesn't happen for everyone or even most people."

    I need to do a LOT of research. I often open windows and tabs in Firefox and then need to think about what I've seen, so I leave the windows and tabs open.

    Then I do other research. That often results in having many windows and tabs open. Soon Firefox begins grabbing CPU power and memory. Eventually the Windows 7 Ultimate OS becomes slow. Sometimes it appears that Firefox has made Windows unstable.

    Pale Moon 64-bits seems more stable than Firefox 56.0.2, so I use Pale Moon.

    Waterfox sometimes brings up a message from anti-malware software I use, "Waterfox wants to act as a server." Scary.

    It seems to me that Microsoft's payments to Mozilla Foundation, through Yahoo, have been successful at doing something Microsoft wanted, apparently. During Microsoft's involvement, Firefox has been degraded by making it impossible to use popular Firefox add-ons. Yes, I accept that there have been improvements in Firefox. However, it seems to me that the transition was handled badly. Maybe that was the intention of someone wanting to lower the usage of Firefox.

    1. Re:Firefox is unstable with many windows and tabs. by Anonymous Coward · · Score: 0

      try running firefox with the umatrix plugin and only allow the javascript that's actually needed

      it'll promptly stop having ever increasing memory usage
      in other words this is a bad javascript issue, not a firefox issue per se