Microsoft Claims Its Speech Transcription AI is Now Better Than Human Professionals (qz.com)

Posted by msmash on Tuesday October 18, 2016 @06:50AM from the breakthrough dept.

Microsoft announced today a system that can transcribe the content of a phone call with "the same or fewer errors" than human professionals trained in transcription -- even when the human transcript is double-checked by a second human for accuracy. As you can imagine, this is a huge milestone for speech recognition. From a Quartz report: The team doesn't attribute this achievement to any breakthrough in algorithm or data, but to the careful tuning of existing AI architectures. To test how their algorithm stacked up against humans, first researchers had to get a baseline. Microsoft hired a third-party service to tackle a piece of audio for which they had a confirmed 100 percent accurate transcription. The service worked in two stages: one person types up the audio, and then a second person listens to the audio and corrects any errors on the transcript. Based on the correct transcript for the standardized tests, the professionals had 5.9 percent and 11.3 percent error rates. After learning from 2,000 hours of human speech, Microsoft's system went after the same audio file -- and scored 5.9 percent and 11.1 percent error rates. That minute difference ends up being about a dozen fewer errors. Microsoft's next challenge is making this level of speech recognition work in noisier environments, like in a car or at a party. This implementation is crucial for Microsoft, and goes well beyond just transcription.
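
The summary does not say how the error rates were computed. A common metric for transcription is word error rate: the word-level edit distance between the reference transcript and the candidate, divided by the number of reference words. The sketch below assumes that metric; the function name and the whitespace tokenization are illustrative, not taken from the Microsoft or Quartz material.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are split on whitespace and compared exactly,
    with no normalization of case or punctuation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
print(word_error_rate("that is a fast car", "that is a fat car"))
```

Under this definition every substituted, dropped, or inserted word counts the same, regardless of how much it changes the meaning.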

98 comments

Microsoft? by QuietLagoon · 2016-10-18 06:53 · Score: 0, Troll

Isn't that the dying PC company?
1. Re:Microsoft? by Anonymous Coward · 2016-10-18 07:08 · Score: 0
  
  Isn't that the dying PC company?
  The one that has to use malware tricks to get people to upgrade? Doesn't seem like that's the tactic of a healthy company, but that's none of my business....
2. Re:Microsoft? by QuietLagoon · 2016-10-18 07:13 · Score: 0
  
  ... has just wasted all of our time...
  My apologies for my error. I should have asked: Isn't that the dying operating system company?
3. Re:Microsoft? by Anonymous Coward · 2016-10-18 07:20 · Score: 0
  
  They are a relic from yesteryear when lots of people still used desktop computers.
4. Re:Microsoft? by Opportunist · 2016-10-18 07:45 · Score: 3, Funny
  
  Hush! As long as MS exists, I have total job security!
  
  --
  We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
5. Re:Microsoft? by Anonymous Coward · 2016-10-18 07:46 · Score: 0
  
  I prefer the moniker "Defunct Phone Company" considering their OS does continue to still limp along.
6. Re:Microsoft? by InsertCleverUsername · 2016-10-18 09:42 · Score: 0
  
  This daily MS-bashing circle jerk, the nauseating fanboyism, and politically motivated anti-science goons are the three reasons I stopped using /.
  
  --
  Ask me about my sig!
7. Re: Microsoft? by Anonymous Coward · 2016-10-18 10:58 · Score: 0
  
  But wait, you just posted
8. Re: Microsoft? by InsertCleverUsername · 2016-10-18 11:11 · Score: 0
  
  Gotta love irony.
  Yeah, I drop in once or twice a year. Nothing's changed. Other tech sites have more straightforward news and Quora's comments have a much better signal to noise ratio.
  
  --
  Ask me about my sig!
9. Re: Microsoft? by Anonymous Coward · 2016-10-18 15:28 · Score: 0
  
  Is that because you don't post on other sites?
  You're certainly adding more noise than signal here. So complaining about it is rather hypocritical.
10. Re: Microsoft? by Anonymous Coward · 2016-10-18 15:31 · Score: 0
  
  You're right, most office buildings are full of cubes where people sit and use tablets all day.
11. Re: Microsoft? by Anonymous Coward · 2016-10-18 17:33 · Score: 0
  
  Well, tablets are still more productive than their virus-infested and bot-ridden Excel machines.
12. Re: Microsoft? by vivian · 2016-10-18 18:07 · Score: 1
  
  I thought it was bad the day I had to train some foreign workers up to replace me.
  At least they were human. It'd be worse having to train up an AI to take your job...
Obligatory (i.e. doing the needful) by Anonymous Coward · 2016-10-18 06:53 · Score: 0

Better than Indian professionals. - FTFY
1. Re:Obligatory (i.e. doing the needful) by Opportunist · 2016-10-18 07:46 · Score: 1
  
  It's less them having trouble understanding me, it's more me having trouble understanding them. If MS built a speech recognition software that can translate the output of an Indian call center, my hat is off to them!
  
  --
  We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
11.1 vs. 11.3 percent by Anonymous Coward · 2016-10-18 06:56 · Score: 1

That minute difference ends up being about a dozen fewer errors.
If 0.2% is a dozen, then 1% is sixty, so 100% is six thousand errors.
Yikes.
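
The arithmetic above can be checked directly. The sketch assumes the error rates are per-word rates and that "about a dozen" means exactly 12; under those assumptions the 6,000 figure is the approximate number of words in the test audio, not the number of errors. The names are illustrative.

```python
from fractions import Fraction


def implied_word_count(error_gap: int, rate_gap_percent: str) -> Fraction:
    """Words in the test set, given an error gap and a rate gap in percent."""
    return Fraction(error_gap) / (Fraction(rate_gap_percent) / 100)


# Assumption: "about a dozen" is exactly 12, and 11.3 - 11.1 = 0.2 points.
words = implied_word_count(12, "0.2")

# Error counts implied by the two reported rates on a test set of that size.
human_errors = words * Fraction("11.3") / 100
machine_errors = words * Fraction("11.1") / 100

print(words, human_errors, machine_errors)
```

Exact fractions are used instead of floats so the percentages do not pick up rounding noise.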
No by Anonymous Coward · 2016-10-18 06:59 · Score: 0

that is all
This implementation is crucial for Microsoft.... by Anonymous Coward · 2016-10-18 07:00 · Score: 0

and the NSA.
Right ... by scunc · 2016-10-18 07:01 · Score: 4, Funny

I'll believe that when I ducking see it.
--
This comment was transcribed by Microsoft's new AI transcription software.
1. Re:Right ... by Bongo · 2016-10-18 18:19 · Score: 1
  
  I've taken to typing and saying "ducking" all the time anyway. Soon to be added as a new meaning in the dictionaries.
  Those ducks, always up to something nasty.
2. Re:Right ... by Quirkz · 2016-10-19 05:08 · Score: 1
  
  I've taken to typing and saying "ducking" all the time anyway. Soon to be added as a new meaning in the dictionaries.
  Those ducks, always up to something nasty.
  I used to have an office that overlooked a river. I can't speak for all ducks, but the resident mallards ... yes, they were almost always up to those types of things.
  
  --
  The Quirkz Handbook of Self-Improvement for People Who Are Already Pretty Okay
3. Re:Right ... by RockDoctor · 2016-10-26 11:22 · Score: 1
  
  Those ducks, always up to something nasty.
  Homosexual necrophiliac rape, if I recall correctly.
  
  Moeliker, C.W., 2001 - The first case of homosexual necrophilia in the mallard Anas platyrhynchos (Aves: Anatidae) - DEINSEA 8: 243-247 [ISSN 0932-9308]. Published 9 November 2001
  Yes, I do remember correctly, and it was indeed a Mallard doing the deed (and being done-unto, too).
  Almost unremarkable that it was a Dutch report, and was considered so remarkable that it took 6 years from event to publication.
  I'd not actually read TFP on this - though I knew of it. For future reference, the journal is "DEINSEA- ANNUAL OF THE NATURAL HISTORY MUSEUM, ROTTERDAM P.O. Box 23452, NL-3001 KL, Rotterdam, The Netherlands" and they keep the paper here.
  
  --
  Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
Voice Control by Rockoon · 2016-10-18 07:02 · Score: 5, Insightful

If you want voice input to be more than just a toy, then getting near flawless accuracy here seems to be a required first step.

If your mouse occasionally sent an erroneous input to the computer no matter how careful you were, you wouldn't use it so much.

--
"His name was James Damore."
1. Re:Voice Control by TFlan91 · 2016-10-18 07:08 · Score: 2
  
  Agreed, however people down south don't move their mouse with "the typical hospitallllity of us folk 'round here" as opposed to the people up north who couldn't give a rats ass.
  Speech is incredibly dense to parse. Where a near perfect operation is required for a mouse, voice control can have a couple bumps in its road before (and while) being highly adopted.
2. Re:Voice Control by stephanruby · 2016-10-18 07:41 · Score: 2
  
  If your mouse occasionally sent an erroneous input to the computer no matter how careful you were, you wouldnt use it so much.
  Wrong example. Mouse usability requires constant visual feedback and almost constant human correction. That is the reason why we can't really use a mouse without looking directly at the screen.
  In any case, flawless transcription accuracy of one single human voice out of 7.5 billion voices already happens with Google Voice. The problem occurs when Google Voice is not tuned to the voices of the other 7.49999 billion people. Do you think that's what Microsoft is using in the backend this second time around?
3. Re:Voice Control by Opportunist · 2016-10-18 07:55 · Score: 1
  
  This!
  We have input today that is perfect. More important, we sometimes have to do input that can break hours if not days of work if executed wrongly. Hitting the wrong key at the wrong time can at least be chalked up to human error. Saying "down" to scroll and having it interpreted as "shutdown" (along with the frustrated "NO, dammit" being interpreted as the answer to "save work (y/n)?") is more a problem of the input parser than the human in front of the screen.
  Unless it is AT LEAST at par with other means of input, there is very little reason for anyone to switch.
  
  --
  We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
4. Re:Voice Control by GameboyRMH · 2016-10-18 07:55 · Score: 1
  
  If your mouse occasionally sent an erroneous input to the computer no matter how careful you were, you wouldnt use it so much.
  And yet touchpads are still vastly more common on laptops than trackpoints...
  
  --
  "When information is power, privacy is freedom" - Jah-Wren Ryel
5. Re: Voice Control by Anonymous Coward · 2016-10-18 08:29 · Score: 0
  
  Must be a patent issue.
any better than "Show me to buy milk"? by itsme1234 · 2016-10-18 07:02 · Score: 2

Like any human would think about milk or "open reminders" when hearing "Show me my most at-risk opportunities".
1. Re:any better than "Show me to buy milk"? by LynnwoodRooster · 2016-10-18 08:08 · Score: 1
  
  "Show me my most at-risk opportunities".
  Huh, you mean Xiaomi is coming out with moist asterisks? How very interesting!
  
  --
  Browsing at +1 - no ACs, I ignore their posts. So refreshing!
Strange success criteria by DoofusOfDeath · 2016-10-18 07:02 · Score: 1, Interesting

Dialog windows: "Do you want to register for your FREE Windows 10 Upgrade?"
Me (vocally): "No, no... oh for the love of all that's sacred, NO!"
Windows: "This may take a while. Please do not power down your computer ..."
1. Re:Strange success criteria by iggymanz · 2016-10-18 07:04 · Score: 1
  
  customer relations record: the customer loves windows as if it's the most sacred thing to him
2. Re:Strange success criteria by Anonymous Coward · 2016-10-19 05:44 · Score: 0
  
  So, just as reliable as the traditional mouse click then (in this specific circumstance?)
Now put it to good use! by cmiller173 · 2016-10-18 07:04 · Score: 4, Informative

Automated closed captioning for the hearing impaired would be one. I'm not hearing impaired, but I use the CC system with the volume low when I am watching TV while everyone else in the house is sleeping. I also use it when everyone is awake and noisy. It is amazing how awful some CC can be.
1. Re:Now put it to good use! by yagu · 2016-10-18 07:44 · Score: 1
  
  It is amazing how awful some CC can be.
  At first I thought, based on your post you'd really meant to say: "It is amazing how awesome CC can be."
  Interestingly, both are true.
2. Re:Now put it to good use! by pipingguy · 2016-10-18 08:17 · Score: 1
  
  Yes, I've noticed this too. I've often wondered if some CC is done by machine or just illiterates.
3. Re:Now put it to good use! by Anonymous Coward · 2016-10-18 08:33 · Score: 0
  
  Turn on Youtube's autocaptions to see how much worse machine CC is compared to CART reporter/ Palantypist closed captioning.
4. Re:Now put it to good use! by Anonymous Coward · 2016-10-18 08:34 · Score: 0
  
  I've often wondered if some CC is done by machine or just illiterates.
  Based on some of the awful CC I've seen, I've always assumed machines, but you make a good point; I shouldn't assume.
5. Re:Now put it to good use! by Anonymous Coward · 2016-10-18 08:42 · Score: 0
  
  Both cases are in common use. The CC data is delivered to the broadcaster with the video, so it is up to the video producer to get it done. Some shows don't bother.
6. Re:Now put it to good use! by CODiNE · 2016-10-18 09:07 · Score: 1
  
  Oh yes, my body is ready.
  And please make an API for all those horrible podcast and audioblog sites out there that make me miss out on industry trends.
  And maybe... talk to Google about YouTube CC.
  *blech!*
  
  --
  Cwm, fjord-bank glyphs vext quiz
7. Re:Now put it to good use! by Anne+Thwacks · 2016-10-18 09:18 · Score: 1
  
  by machine or just illiterates
  No. This is a whole new technology: artificial stupidity. It's going to change the world, I tell you. (Mostly for the worse, I suspect!)
  
  --
  Sent from my ASR33 using ASCII
8. Re:Now put it to good use! by somenickname · 2016-10-18 09:36 · Score: 1
  
  I'd love to see a YouTube feature that allows you to get the automatically generated transcript of a video without having to actually watch the video. For videos that are intended to be informative, having the transcript and grepping it for keywords and the context they are used in would help you determine if it's worth watching a lengthy video. It might even just outright give you the information you want without having to sit through a half hour video.
9. Re:Now put it to good use! by iczer1 · 2016-10-18 09:55 · Score: 1
  
  Caption fails (old but funny):
  Make a short skit, act it out, take the CC output and redo the skit with the new words.
  https://www.youtube.com/watch?...
10. Re:Now put it to good use! by Anonymous Coward · 2016-10-18 14:14 · Score: 0
  
  Microsoft has been doing this for almost a decade with augmented CC for all corporate events. It can be quite good, but has to be manually corrected once in a while.
11. Re:Now put it to good use! by antdude · 2016-10-18 16:27 · Score: 1
  
  I wished more of those CCs were manually typed out by humans.
  
  --
  Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
12. Re:Now put it to good use! by RabidReindeer · 2016-10-18 23:11 · Score: 1
  
  Based on Spanish-language soundtracks, there's no doubt that some CC is human-generated. On my Stargate discs, the foreign-language captions aren't even saying the same sentences as the alternate-language voices.
13. Re:Now put it to good use! by RabidReindeer · 2016-10-18 23:13 · Score: 1
  
  I've begun to suspect that YouTube is often used by the lazy and illiterate to avoid actually taking the effort to type and format what should realistically have been text articles.
14. Re:Now put it to good use! by AmiMoJo · 2016-10-18 23:36 · Score: 1
  
  It should really improve YouTube too. Having an accurate transcription of a video means it becomes much more searchable than if all you have is the title and summary text. The current automatic transcription on YouTube is nearly useless.
  
  --
  const int one = 65536; (Silvermoon, Texture.cs)
  SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
15. Re:Now put it to good use! by Anonymous Coward · 2016-10-19 02:37 · Score: 0
  
  Some CC is done by machine, but it's completely unusable (Youtube autocaptions). Even human-generated captions are often unusable because they are transcribed out of context. E.g. Coursera English captions always misspell names because apparently they don't bother to look at the fucking video or other course material while transcribing.
16. Re:Now put it to good use! by Anonymous Coward · 2016-10-19 02:43 · Score: 0
  
  Automated closed captioning would also be great for language learners, for subbers and for searching text in videos. Unfortunately the existing speech-to-text technology is completely unusable. If they really got this to work that would be admirable, but I still doubt it. (How did they choose the videos they tested their system on? Not random [whatever the Microsoft equivalent for Youtube is] videos I guess?)
17. Re:Now put it to good use! by EndlessNameless · 2016-10-19 03:23 · Score: 1
  
  It will be decades before artificial stupidity is anywhere near natural stupidity on any metric.
  Natural stupidity is surprisingly flexible and resilient---it can crop up anywhere and is almost impossible to stop.
  Artificial stupidity requires significant investment and evolutionary design before it can approach the persistence and impact we see naturally.
  
  --
  
  ---
  According to the latest ruleset, this post should be modded as Vorpal Flamebait +5.
18. Re:Now put it to good use! by Quirkz · 2016-10-19 05:13 · Score: 1
  
  Yeah, came here to say that. We usually have ours on, and I can't seem to resist reading it. The frequency of errors and quirks is such that I've nearly started making a list of the worst ones. Any show from England tends to have "[indecipherable]" stuck in repeatedly, even when I would have said the language was perfectly clear.
  One of my favorites was "read my copy of At Last Shrub" which turned out to be "Atlas Shrugged".
  
  --
  The Quirkz Handbook of Self-Improvement for People Who Are Already Pretty Okay
19. Re:Now put it to good use! by Anonymous Coward · 2016-10-19 05:48 · Score: 0
  
  Thank you MS CC Bot.
Microsoft Claims by Anonymous Coward · 2016-10-18 07:06 · Score: 0

Will they use it on their own hyped up marketing?
"content of a phone call" by Anonymous Coward · 2016-10-18 07:11 · Score: 0

Who makes voice calls anymore?
I remember the good old vista days... by Anonymous Coward · 2016-10-18 07:13 · Score: 0

Dear Aunt, let's set so double the killer delete select all
Not superior by Anonymous Coward · 2016-10-18 07:15 · Score: 0

Microsoft is incorrectly interpreting the results. A more accurate conclusion would be that they achieved equivalent performance. Superior performance would require that the error rates for the AI be substantially lower than those of humans. They're not, they're nearly identical.
C'mon guys by diesalesmandie · 2016-10-18 07:21 · Score: 1

Say what you want about Microsoft (and some of it is true) but this is progress, even if they (maybe) cherry picked the one trial that had the lowest difference in error rate between the algorithm and a human...

--
This is my sig, there are many like it but this one is mine
HAL by Anonymous Coward · 2016-10-18 07:22 · Score: 0

Can it read lips?
Cherry picked by Anonymous Coward · 2016-10-18 07:28 · Score: 0

Speaking as a professional who has worked in this field, this screams of a cherry picked scenario. The margin of success falls well within the bounds of the statistically insignificant variability I would expect to see in SR systems (human or otherwise). In the article they admit to eliminating audio which would favor humans over machines (noisy environments, etc). This kind of PR release produces good short term feelings but in the long term makes Microsoft (and computer science people in general) look like myopic, self-important ignorant twits.
Question: how did they find the errors that the two-human team missed? Presumably with a third human. Does this mean a three-person team can beat out both a two-person team and ASR? Or was there a script that was used to generate the audio? That would raise other questions, such as the accuracy of the speakers.
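
The "statistically insignificant" point can be illustrated with a rough two-proportion z-test on the reported 11.3 and 11.1 percent rates. This is a sketch under stated assumptions: the test-set size of about 6,000 words is inferred from the "about a dozen fewer errors" figure, not reported in the summary, and the test treats word errors as independent samples. Both transcribed the same audio, so a matched-pairs test would be more appropriate in practice.

```python
import math


def two_proportion_z(p1: float, p2: float, n1: int, n2: int) -> float:
    """z statistic for the difference between two error rates (pooled variance)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Assumption: roughly 6,000 words were scored for each transcript.
z = two_proportion_z(0.113, 0.111, 6000, 6000)

# |z| below about 1.96 means the gap is not significant at the 5% level.
print(f"z = {z:.2f}, significant at 5%: {abs(z) > 1.96}")
```

Under these assumptions the gap is well inside ordinary sampling variation, which fits a reading of "equivalent" rather than "better".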
1. Re:Cherry picked by dpidcoe · 2016-10-18 07:35 · Score: 1
  
  Question: how did they find the errors that the two-human team missed? Presumably with a third human. Does this mean a three-person team can beat out both a two-person team and ASR? Or was there a script that was used to generate the audio? That would raise other questions, such as the accuracy of the speakers.
  I had the same question. We ran into a similar problem in a school project making an AI that interpreted results from a polysomnogram. In theory we got over ~90% accuracy, but different humans would score the same sleep study differently, which basically meant that humans got 90% accuracy compared to each other too.
2. Re: Cherry picked by Anonymous Coward · 2016-10-18 07:43 · Score: 0
  
  You could have the original speakers do the transcription, or have them read from prepared texts.
They have a 100% accurate translation? by ewibble · 2016-10-18 07:31 · Score: 1

How do you get 100% accurate translation anyway? These things are up to interpretation, not all words have an exact translation, meaning is more important than the actual correct words. Language is ambiguous.
Also every error is not necessarily equal, some errors are irrelevant, while some are more important, e.g. the quick car, or the fast car, mean the same thing.
1. Re:They have a 100% accurate translation? by saider · 2016-10-18 09:25 · Score: 1
  
  Quick and fast are easier to discriminate than "fast" and "fat".
  Consider the following iterative algorithm...
  "That is a fast car" - is translated to
  That is a fat car *Context filter - strict vs slang - replace fat with phat*
  *Context filter - apply ghetto style - replace "That" with "Dat"*
  *Context filter - apply ghetto style - replace "is" with "be"*
  *Context filter - apply ghetto style - replace "a" with "one"*
  That be one phat car.
  
  --
  
  Remember, You are unique...just like everyone else.
2. Re:They have a 100% accurate translation? by RabidReindeer · 2016-10-18 23:15 · Score: 1
  
  How do you get 100% accurate translation anyway? These things are up to interpretation, not all words have an exact translation, meaning is more important than the actual correct words. Language is ambiguous.
  Also every error is not necessarily equal, some errors are irrelevant, while some are more important, e,g. the quick car, or the fast car, mean the same thing.
  Who gave you free reign to make such assertions? You need to tow the line or we'll see to it that you loose your posting privileges here!
3. Re:They have a 100% accurate translation? by Anonymous Coward · 2016-10-19 05:52 · Score: 0
  
  And a "quick" woman (smart, witty) would be very different than a "fast" woman.
The subject is "transcription" not "translation". by JMZero · 2016-10-18 07:35 · Score: 1

Transcription is obviously a lot more straightforward, and the goalposts should be pretty easy to set.

--
Let's not stir that bag of worms...
Microsoft Lies. Case Closed. by Tablizer · 2016-10-18 07:37 · Score: 1

Why should anyone trust Microsoft? They lie about surveillance, they lie about being "open", they lied about Windows 10 install options, they lied about Windows Tablet having a bigger screen than iPad, they lied to Steve Jobs about their GUI plans in the 80s, etc.
640 lies oughtta be enough for anyone. Ignore them by now.

--
Table-ized A.I.
1. Re:Microsoft Lies. Case Closed. by David_Hart · 2016-10-18 08:31 · Score: 1
  
  Why should anyone trust Microsoft? They lie about surveillance, they lie about being "open", they lied about Windows 10 install options, they lied about Windows Tablet having a bigger screen than iPad, they lied to Steve Jobs about their GUI plans in the 80s, etc.
  640 lies oughtta be enough for anyone. Ignore them by now.
  Just to begin with, they have been working on this for a while...
  https://www.youtube.com/watch?...
Govt Surveillance by mcolgin · 2016-10-18 07:40 · Score: 3, Insightful

I assume this is so the Govt agencies can transcribe cell-phone communications to text and then perform analysis to find all the "bad guys" ?

--
I made this: http://www.bpftpserver.com
How do you measure "success" man vs machine by Anonymous Coward · 2016-10-18 07:43 · Score: 0

Will man or machine determine? Stay tuned...
Captcha: revered
Not really... by Anonymous Coward · 2016-10-18 07:50 · Score: 0

Wake me up when they buy Nuance/eScription. (Used by Siri and Samsung S Voice)
Finally by Sperbels · 2016-10-18 07:53 · Score: 1

The machines can finally interpret our speech. Next step: launch all the missiles.
1. Re:Finally by John.Banister · 2016-10-18 10:21 · Score: 1
  
  I'm sorry, the missus can't do launch today. It's laundry day. Clippy says she might have time to come round at 11:45 tomorrow. Would that work?
Is it.... by downright · 2016-10-18 08:00 · Score: 1

based on that twitter chat-bot that turned racist and trollish in a matter of hours? I have been looking for a way to UTF-TRUMP encode my documents!
"Humans" by pipingguy · 2016-10-18 08:16 · Score: 1

"even when the human transcript is double-checked by a second human for accuracy"

Everything depends on how dumb the transcriber and/or checker is.
Defused by John+Jorsett · 2016-10-18 08:22 · Score: 3, Interesting

The acid test for transcription for me is if the transcriptionist gets the word "defuse" right, as in "He defused the tense situation." Every, and I mean EVERY, closed caption I've seen transcribes it as, "He diffused the tense situation." It seems to be the universal mistake.
1. Re:Defused by Anonymous Coward · 2016-10-18 08:39 · Score: 0
  
  Every, and I mean EVERY, closed caption I've seen transcribes it as, "He diffused the tense situation." It seems to be the universal mistake.
  I just watched the new Ghostbusters movie with subtitles turned on. You would think that they would have taken the time to have a human check the captions on something that relatively high-profile, yet whenever a reference was made to holo-lasers, the subs referred to "hollow lasers". They must have used the same technical reference as Back to the Future's 1.21 jigga-watts came from.
2. Re:Defused by gustygolf · 2016-10-18 19:34 · Score: 1
  
  My test goes like this:
  Dear Aunt
  Let's set so double the killer delete select all.
  
  --
  "Slow Down Cowboy! It's been 58 minutes since you last successfully posted a comment" -- slashdot, driving users away.
3. Re:Defused by ChrisMaple · 2016-10-19 01:14 · Score: 1
  
  FWIW, before common usage overwhelmed the original creation, "jigga" was the correct pronunciation of "giga".
  
  --
  Contribute to civilization: ari.aynrand.org/donate
4. Re:Defused by well_in_theory · 2016-10-19 12:00 · Score: 1
  
  What gets to me more is the choice of how to pronounce the value...
  No self-respecting scientist would ever say "one point twenty one". That's "one point two one." Or is 1.201 "one point two hundred and one" and thus more?
Transcription? by vizbones · 2016-10-18 08:24 · Score: 0

We use a M$ system here at work -- some of the funniest writing I've ever read was a simple, serious phone message "transcribed" by their software. I'd say, "I welcome our new M$ AI overlords" but it's better to read their transcription. "I will come are knew micro soft overloads"
I'm sure the NSA will be happy by Dunbal · 2016-10-18 08:25 · Score: 1

Now the NSA can store text transcripts of your conversations instead of having to store the audio files. This will leave so much more room for video! Hey - why did you put tape on your webcam, citizen?

--
Seven puppies were harmed during the making of this post.
Say what? by Anonymous Coward · 2016-10-18 08:26 · Score: 1

Like any human would think about milk or "open reminders" when hearing "Show me my most at-risk opportunities".
Eye thin queue meant two say:
Lie canny hue man wood thin cab out mill core "owe pen reminders" when he ring "Show meme I Moe stat risk copper tune it ease.
1. Re:Say what? by Anne+Thwacks · 2016-10-18 09:15 · Score: 1
  
  Eye thin queue meant two say: Lie canny hue man wood thin cab out mill core "owe pen reminders" when he ring "Show meme I Moe stat risk copper tune it ease.
  Hey! It looks like you have obtained illegal access to the system used to caption news broadcasts!
  
  --
  Sent from my ASR33 using ASCII
That's nice. by Anonymous Coward · 2016-10-18 08:57 · Score: 0

Now give me your accuracy rate when following a 28 year old physician from Northern China who has a tendency to turn his head away from the mic at random while dictating, and a 42 year old doctor from Southern India who always speaks too softly and a 36 year old from Georgia who tends to have speech patterns that the transcriptionist has to interpret what he actually meant.
Sorry, been told one too many times that voice recognition can mean that I can do away with transcriptionists to accept anything less than a 100% money back offer within the first year.
Well, there's a whole bunch by rsilvergun · 2016-10-18 09:01 · Score: 1

Of middle class jobs about to go kaput.

--
Hi! I make Firefox Plug-ins. Check 'em out @ https://addons.mozilla.org/en-US/firefox/addon/youtube-mp3-podcaster/
What humans were these? by Anonymous Coward · 2016-10-18 09:36 · Score: 1

The humans had a 5.9% error rate AFTER proofreading by another person? That's either a lousy speaker, a terrible recording, or really bad transcription. That's not something to brag about, frankly. I used to get an error rate of under 2% with IBM ViaVoice back in 1994. This doesn't seem like progress to me.
Classic speech recognition failure by iczer1 · 2016-10-18 09:50 · Score: 1

Dear aunt, let's set so double the killer delete select all
https://www.youtube.com/watch?...
Wireless Headphones vs CC smh by Anonymous Coward · 2016-10-18 10:12 · Score: 0

Cheap $20 rechargeable wireless RF headphones work pretty well. I have 3 pairs in my house connected to different devices like TV, PC, home receiver. Someone in my house is always wearing wireless headphones. Spend the $20 and enjoy TV with the sound. Watch TV, PC, HTPC at any hour in the family room without waking anyone up.
1. Re:Wireless Headphones vs CC smh by Hognoxious · 2016-10-18 19:27 · Score: 1
  
  No good when you live with a nutter who thinks they cause cancer.
  
  --
  Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Maybe in a "lab environment"... by Anonymous Coward · 2016-10-18 11:02 · Score: 0

I might believe those numbers if it's after 2000 hours of the same individual speaking in monotone at a consistent pace and that that individual is speaking identically to what was done during the "learning phase", using already "learned words".
If that's not the case, there's no chance of those numbers being even close to accurate.... I've seen some of the crap Cortana comes up with when I speak to it.
China by Anonymous Coward · 2016-10-18 14:06 · Score: 0

Didn't we have a story about this and a Chinese company the other month? Seems Microsoft is late to the party again.
Skype for Business...right.... by Anonymous Coward · 2016-10-18 22:23 · Score: 0

If it's anything like the speech transcription in their latest bugfix for Skype, it's a total joke.
Speech Injection by gizmod · 2016-10-18 23:29 · Score: 1

Jim: Hey there
Bot: Good day sir.
Jim: Semi colon drop table language
Bot:???????????
Re: Self-Reflection by InsertCleverUsername · 2016-10-19 01:29 · Score: 1

Have my criticism and observations upset you AC? Struck a nerve?

--
Ask me about my sig!
ROTFLMAO! by whitroth · 2016-10-19 05:40 · Score: 1

I have just this to say about that: folks, I wouldn't let alpha software out to users.
They brought in "hybrid" phones here last year (VOIP). For voicemail, it sends an mp3, and a "transcription". Frequently, the "transcription", "powered by Microsoft speech technology", resembles early "computer poetry". And by "early", I'm talking 1960s or '70s.... with significant portions bearing zero resemblance to what was said.
mark