Slashdot Mirror


Vista Speech Recognition Goes Awry

An anonymous reader writes "It seems even MSNBC is willing to take a jab on those rare occasions when Microsoft products don't work. During a demo of Vista's speech recognition technology, Vista couldn't differentiate between mom and aunt, and all attempts to rectify the problem just made it worse. Wait until you see what it spat out; I think we have a new 'All your base.' Don't you just love Microsoft's live demonstrations?"

11 of 418 comments

  1. The Voice of Experience by dacap · · Score: 5, Insightful

    Yes, once again Microsoft S/W Engineers learn that the more public the demo or the more important the audience, the more likely something will go wrong. It's one of Murphy's laws. Been there. Did that. Barely survived.

    Experience is the human quality that enables you to recognize a mistake immediately when you make it again.

    Dacap

    --
    English -- gotta love it! / The engineers refuse to refuse the rocket until the refuse is removed from the launch pad.
  2. Re:Awww...c'mon guys.... by kripkenstein · · Score: 4, Insightful

    Nothing to worry about, I'm sure they'll get all the kinks out by the time Vista is released - sometime in 2008 or so, it seems, based on this video.

    This was really a dreadful presentation. There was no ambient noise (as the commentators say later, and despite what Microsoft says), and there was no echo as the demonstrator claims during the actual test. It seems to have been done under really good test conditions, but still it failed miserably.

  3. Re:Awww...c'mon guys.... by tomstdenis · · Score: 5, Insightful

    Most likely the system was trained by an engineer and handed off to the ass in marketing. He was probably supposed to train it to his voice too but decided to hit the bar instead.

    Voice recognition requires some training regardless of who provides it. We're not Star Trek here... Prep work and rehearsal, people. If Mr. Sales Guy had tried the demo before the presentation he would have noticed it wasn't working and avoided the embarrassment.

    This is why sales people are asshats. They're unprofessional non-technical people who sap back the high life while the rest of us have to put up with the mess they create through their daily barrage of verbal diarrhea.

    Tom

    --
    Someday, I'll have a real sig.
  4. Re:are u serious? by tomstdenis · · Score: 4, Insightful

    Microsoft routinely touts their excellence over everyone else, including OSS. Hear them talk about Office w.r.t. OpenOffice. They talk down about it, mock it, dismiss it, etc...

    It's called modesty. If MSFT had any [and some humility] they wouldn't get laughed at so hard for this. I mean, look at Linux. Find a bug in the kernel, fix it, post notice that it's fixed. You don't see anyone saying "Oh hahaha, Linus is at it again!" That's because you also don't see Linus on CNN mocking the rest of the world.

    Microsoft deserves all the negative press and humiliation they get because they are shameless, deceitful, greedy monopolistic bastards.

    Tom

    --
    Someday, I'll have a real sig.
  5. Re:Is SR ever going to be good enough? by Skater · · Score: 4, Insightful

    The computer in Star Trek (at least in the Next Generation) was WAY too smart. For it to do what it supposedly did in the show, it would have to be sitting there, monitoring the conversation all the time, and be totally able to understand the context of what was being said to know what to do. Not only when people directly asked the computer a question, but also when people wanted to converse with someone.

    For example, how does the computer know that Picard wants to call Riker and isn't just talking about him? Oh and keep in mind the computer never misinterpreted something. In other examples, people would carry on intelligent conversations with the computer - all those holodeck scenes, Troi ordering chocolate, etc.

    Star Trek-style of SR I think would be the holy grail and is probably always going to be out of reach. Barring some amazing breakthrough in AI algorithms, the computer power required just for the situations above would be incredible - and that's computer time that probably could be put to better use elsewhere, even if it was found to be possible.

    I think the computer in the original Star Trek was more realistic - but even there the voice-recognition was far beyond what we're capable of today, as Microsoft has demonstrated so well. Plus all the blinkenlights that seemed to have no useful purpose were cool. ;)

  6. Re:Awww...c'mon guys.... by tomstdenis · · Score: 5, Insightful

    Generally, from what I've seen, you need to train it a bit on the way you speak. There are thousands of distinct English accents and pronunciation variations.

    For instance, the word "patent" is pronounced differently in the UK from North America. In the UK it is "pay-tent" and over here it's "pah-tent". That's just one example.

    Point is [to paraphrase Ballmer]:

    Preparation (clap), preparation (clap), preparation (clap), preparation (clap), preparation (clap), [pitch of voice higher], preparation (clap), preparation (clap), [wheeze out of breath, pitch even higher], preparation (clap), preparation (clap), yeah!!!

    Something tells me this sales guy will neither get punished nor lose his x-mas bonus. Some poor schmuck in engineering will take the fall for not making the demo "people ready".

    Tom

    --
    Someday, I'll have a real sig.
  7. removing ambient noise by sh0rtie · · Score: 4, Insightful



    Why not just use two mics: one to record the ambient noise (positioned away from the voice mic), the other to record the voice (headset)? Then, as you have two signals, just subtract the ambient noise signal from the headset signal. Voila, clean headset mic audio.

    Works for music too: you could control your music player by voice even when it's playing loud (at a party) by removing the music signal from the mic signal.

    -AJS
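
    For what it's worth, the two-mic idea above can be sketched in a few lines. This is only a toy illustration under generous assumptions: both signals are already sample-aligned lists of floats, and the ambient mic picks up the same noise as the headset, differing only by a constant gain. The function names `estimate_gain` and `subtract_ambient` are made up for this sketch. Real microphones also differ in delay and room response, so plain subtraction like this would generally need adaptive filtering on top.

```python
import math


def estimate_gain(headset, ambient):
    """Least-squares gain g that minimises sum((h - g * a) ** 2).

    Assumption: the noise in the headset signal is a scaled copy of
    the ambient signal, with no delay between the two microphones.
    """
    energy = sum(a * a for a in ambient)
    if energy == 0.0:
        return 0.0  # silent reference mic: nothing to subtract
    return sum(h * a for h, a in zip(headset, ambient)) / energy


def subtract_ambient(headset, ambient, gain=None):
    """Return the headset signal with the scaled ambient signal removed."""
    if len(headset) != len(ambient):
        raise ValueError("signals must have the same length")
    if gain is None:
        gain = estimate_gain(headset, ambient)
    return [h - gain * a for h, a in zip(headset, ambient)]


# Synthetic demo: a 5-cycle "voice" tone plus leaked 37-cycle "noise".
n = 1000
voice = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
ambient = [math.sin(2 * math.pi * 37 * k / n) for k in range(n)]
headset = [v + 0.5 * a for v, a in zip(voice, ambient)]

cleaned = subtract_ambient(headset, ambient)
worst_error = max(abs(c - v) for c, v in zip(cleaned, voice))
```

    In this synthetic case `worst_error` comes out tiny because the two tones are uncorrelated and perfectly aligned, which is exactly the part that is hard to get with real mics in a real room.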

  8. Re:are u serious? by Udo Schmitz · · Score: 4, Insightful
    "As if MS is the only one who has problems with demonstrations."

    Hmmm, no. Maybe it's the way they deal with failures. Remember Bill Gates trying hard to demonstrate the Media Center? Some time after that, Steve Jobs was giving his regular Macworld keynote when his Mac stopped responding. He moved a monitor switch to continue the presentation on another Mac and said: "Well, that's why we have backup systems here."

  9. Re:Awww...c'mon guys.... by jc42 · · Score: 4, Insightful

    There are thousands of distinct English accents and pronunciation variations.

    Aw, c'mon; how many English dialects pronounce "mom" and "aunt" similarly?

    Even to someone who's worked with voice recognition, that mistake simply isn't credible. If the software were anywhere near usable, it wouldn't confuse those words from anyone, especially not in a low-noise, no-echo demo.

    This is a "No excuses" situation. That demo was simply a dismal failure due to some major bug(s).

    Of course, the speech recognition field has a long history of staying in such a state forever. It's hard to find a product that, even with extensive training, doesn't produce howlers like this.

    I did like the "killer" part ...

    --
    Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  10. Re:Awww...c'mon guys.... by tomstdenis · · Score: 4, Insightful

    I never said training was the only cause of the failure. I said it's likely that he didn't train it. Because most high powered sales people are just cocaine snorting asshats that make people's lives miserable.

    Chances are he never even did a walk through of the presentation before the press was there.

    Tom

    --
    Someday, I'll have a real sig.
  11. Re:Oh Please by tomstdenis · · Score: 4, Insightful

    Who knows how the algorithm they implemented works. Chances are the computer scientists behind it are not total asshats and they assumed the sales guy would follow the same procedure they did [e.g. to train it].

    Point is, if the sales guy had tried the system out beforehand he would have noticed it not working.

    That is, suppose the code is total shit [I know, big stretch for MSFT]. Then isn't it likely it would have failed during the preparation stage? If you are saying "mom" and it always comes back "aunt" you may want to cancel the presentation.

    That's why I think he didn't do any prep work for the presentation.

    Tom

    --
    Someday, I'll have a real sig.