Slashdot Mirror


Microsoft Speech Recognition Now As Accurate As Professional Transcribers (techcrunch.com)

An anonymous reader quotes TechCrunch: Microsoft announced today that its conversational speech recognition system has reached a 5.1% error rate, its lowest so far. This surpasses the 5.9% error rate reached last year by a group of researchers from Microsoft Artificial Intelligence and Research and puts its accuracy on par with professional human transcribers who have advantages like the ability to listen to text several times. Both studies transcribed recordings from the Switchboard corpus, a collection of about 2,400 telephone conversations that have been used by researchers to test speech recognition systems since the early 1990s. The new study was performed by a group of researchers at Microsoft AI and Research with the goal of achieving the same level of accuracy as a group of human transcribers who were able to listen to what they were transcribing several times, access its conversational context and work with other transcribers.
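For readers wondering what a figure like 5.1% measures: the summary says only "error rate", but Switchboard results are conventionally reported as word error rate, meaning the number of word substitutions, insertions and deletions needed to turn the system's output into the reference transcript, divided by the number of words in the reference. A minimal sketch, assuming that metric and simple whitespace tokenization; the function name and sample sentences are made up for illustration and are not Microsoft's scoring code:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of six in the reference gives a rate of 1/6.
rate = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"{rate:.1%}")
```

Under this definition every wrong word counts the same, whether or not a reader could still work out what was meant. Going from 5.9% to 5.1% is a relative reduction of about 13.6%.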

8 of 176 comments

  1. Errors are not Errors by idji · · Score: 5, Insightful

    When a human transcriptionist makes a mistake you can usually work out what they meant. When Speech-to-text (STT) makes a mistake it is often gibberish. So objectively it is "better" at transcribing, but subjectively much worse.

    1. Re:Errors are not Errors by K. S. Kyosuke · · Score: 3, Insightful

      Hey, it's going to cost $700 per minute but at least there will be no errors!

      So it's about three times cheaper than the lawyer that you'd need if you get sued for a bad transcription?

      --
      Ezekiel 23:20
    2. Re:Errors are not Errors by gnick · · Score: 4, Insightful

      A legal transcriptionist requires different training than a medical transcriptionist.

      And sometimes even that training falls short. Does anyone remember the explosion at WIPP when the tech transcribed "an organic kitty litter" instead of "inorganic kitty litter"?
      Kitty litter explosion.

      --
      He's getting rather old, but he's a good mouse.
  2. Re:Laughable Hype by bobstreo · · Score: 3, Insightful

    You should start talking with people who don't speak gibberish.

    Yeah, but Mumbai is on the phone with us again...

  3. Re: Laughable Hype by Anonymous Coward · · Score: 2, Insightful

    We have an up-to-date Microsoft service doing this at my work. Accuracy is a running joke, and I regularly forward people their transcriptions so we all get a good laugh. This might be lab-quality recordings with limitations on language complexity used to cut down on errors. The error rate of a closed-set test isn't really a great indicator. Now, a year-long comparison against several call centers in multiple industries would be quite compelling.

  4. Re:Bad experiences on this front by Anonymous Coward · · Score: 1, Insightful

    Words don't make a language, and C does not become English just by using some English words.
    Doing what you want is a completely different thing and would use a completely different algorithm, so at the very least it is rather off-topic to this article (mostly because things like phrases, grammar, context in general etc. don't apply, but are very important to creating good natural language recognition).
    You are being rather arrogant about it considering you very much didn't seem to understand the poster or why his criticism is valid.

  5. Re:Laughable Hype by Luthair · · Score: 3, Insightful

    3) How much background noise? Are these from people calling from cell phones? Or a land line?

    Why does it matter? If it doesn't function in a standard operating environment then it isn't doing as claimed. What would you say to a watch maker who claimed their product was unscratchable but testing consisted of rubbing it with microfibre cloth?

  6. Re:Laughable Hype by pr0fessor · · Score: 3, Insightful

    3.... I've tried various voice recognition software over the years and can say they are getting much better but if there is any background noise forget it.

    I quit trying to use Siri because when I get in the car and ask Siri for directions, if my wife is with me I get Siri saying, "I couldn't find 102 why the fuck street don't you type in the address like a regular shut up person damn it."