Slashdot Mirror


Remote Exploit of Vista Speech Control

An anonymous reader writes "George Ou writes in his blog that he found a remote exploit for the new and shiny Vista Speech Control. Specifically, websites playing sound files can trigger arbitrary commands. Ou reports that Microsoft confirmed the bug and suggested as workarounds that either 'A user can turn off their computer speakers and/or microphone'; or, 'If a user does run an audio file that attempts to execute commands on their system, they should close the Windows Media Player, turn off speech recognition, and restart their computer.' Well, who didn't see that coming?"

82 of 372 comments

  1. Most Important Part of the Announcement by eldavojohn · · Score: 5, Funny

    Microsoft cautioned everyone not to play the song "Hit Me Baby One More Time" by Britney Spears on or near your computer while the mic is on.

    Several lawsuits already involve brutal crimes by computers against annoying young teeny bopper women. Although we can't act like we didn't see this coming, tension has been steadily rising.

    --
    My work here is dung.
    1. Re:Most Important Part of the Announcement by kannibal_klown · · Score: 5, Funny
      Worse yet!!!

      One of the computer geeks at the Pentagon better not be watching any Star Trek episodes.

      Computer. Initiate self destruct sequence. Authorization 1A 2B 3C
    2. Re:Most Important Part of the Announcement by joshetc · · Score: 5, Funny

      Microsoft cautioned everyone not to play the song "Hit Me Baby One More Time" by Britney Spears on or near your computer while the mic is on.

      Several lawsuits already involve brutal crimes by computers against annoying young teeny bopper women. Although we can't act like we didn't see this coming, tension has been steadily rising [theonion.com]. You should see what happened to the guy who played the Nirvana song "Rape Me".
    3. Re:Most Important Part of the Announcement by Anonymous Coward · · Score: 5, Funny

      Authorization 1A 2B 3C
      Hey! That's the authorization code on my luggage!
    4. Re:Most Important Part of the Announcement by plopez · · Score: 2, Funny

      who wants Vista?
      billg, ballmer, hardware manufacturers, virus writers, anti-virus vendors, spam bot operators, antispam software writers.... oh, you meant *humans*... in that case, none.

      --
      putting the 'B' in LGBTQ+
    5. Re:Most Important Part of the Announcement by darthnoodles · · Score: 3, Funny

      I'm guessing they were already raped when Vista was installed.

    6. Re:Most Important Part of the Announcement by Opportunist · · Score: 2, Funny

      Anti-Virus vendors certainly don't want Vista. You have NO idea what headache that system means to you if you have to include anything remotely resembling a driver in your product.

      Personally, I'd be VERY happy if it vanished faster than it appeared. Erh... ok, considering the development time that isn't such a strong statement, but I'd be happy if it vanished faster than it installs. Erh... if it vanished faster than it boots. Erh...

      Damn, can someone come up with a suitable analogy?

      --
      We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
    7. Re:Most Important Part of the Announcement by asCii88 · · Score: 2, Funny

      You meant, if it vanished faster than its first bug is found?

    8. Re:Most Important Part of the Announcement by Zonnald · · Score: 2, Funny
      And of course you precede that with:

      "Hey, Colin, check out my new 'Start, Run, CMD, Enter'" (wtf) "Oh, I like Format C, Colin." (turns to the doorway where Bob has just arrived) 'Enter, Yes, Bob'.

      Really it would seem a little bit more complicated than just throwing (or Squirting (tm)) a random phrase at the computer. I would imagine that the application with focus has to be able to interpret the phrase.

    9. Re:Most Important Part of the Announcement by netsharc · · Score: 4, Funny

      Anyway, typing "format C:" in a running Windows doesn't work, because it will say "The volume is in use." (assuming Windows is on C:)...

      Don't believe me? Try it yourself. ;-)

      --
      What time is it/will be over there? Check with my iPhone app!
  2. Yell Commands Across the Room by ehaggis · · Score: 5, Funny

    Is that a remote exploit?

    --
    One ring to bind them - should probably have more fiber and less rings in their diet.
  3. That's hardly an exploit by kahei · · Score: 4, Insightful


    Taking a computer that obeys audio instructions, and playing it some audio instructions, is more of a 'duh' than an 'exploit'. But this problem is a very Good Thing. It can only mean:

    -- EITHER people stop yakking on about voice computing, which has been the Way Of The Future since about 1935 or something
    -- OR pressure is exerted on web designers to NOT make sites that start making noise the moment the page appears!

    Either of these, but especially the latter, would be a big win. So here's to you, Mr. Exploit Finding Man!

    --
    Whence? Hence. Whither? Thither.
    1. Re:That's hardly an exploit by just_another_sean · · Score: 5, Funny

      So here's to you, Mr. Exploit Finding Man!

      Now there's a Bud commercial I'd like to hear.

      --
      Creationist Textbook Stickers Declared Unconstitutional by CowboyNeal
    2. Re:That's hardly an exploit by Anonymous Coward · · Score: 2, Insightful

      Even so, with Vista's new software audio stack, this is inexcusable. It should have been trivial to compare the input and output signals and filter out most of this automatically.

    3. Re:That's hardly an exploit by gstoddart · · Score: 4, Insightful

      -- EITHER people stop yakking on about voice computing, which has been the Way Of The Future since about 1935 or something
      -- OR pressure is exerted on web designers to NOT make sites that start making noise the moment the page appears!
      Or, we make browsers so they don't run every damned audio file, flash frigging plugin, executable, movie, or whatever that the idiot who made the site thinks I should hear/see/play with/click/download/execute or whatever.

      There has never been any sound from a webpage that didn't make me want to immediately beat the person who wrote it with his own leg. I don't want to listen to your stupid MIDI file of whatever the fsck you think is cool on your web page.

      There was never any good reason to embed sounds in web pages unless you have to click a button to specifically play it.

      Cheers
      --
      Lost at C:>. Found at C.
    4. Re:That's hardly an exploit by morgan_greywolf · · Score: 2, Insightful

      or default to If playing audio then audio instructions listener = off
      Yes: for all of you fanbois out there saying "Oh, that's not an exploit!" pay attention to what the parent is saying! You gotta admit, it was a huge oversight on Microsoft's part to not include any mechanism for turning off the accepting of audio instructions while playing audio, or at least to have a user-configurable option for protection against this exploit, defaulted to "On".

      This is yet another case of Microsoft putting ease-of-use ahead of security and reliability. We've all heard this song before. Same story, different Windows version.

    5. Re:That's hardly an exploit by VertigoAce · · Score: 5, Informative

      The audio mixer in Vista is no longer based on different audio types (MIDI, CD Audio, WAV, etc). Instead, there is a volume slider and mute button for each application that makes sounds. So you can mute IE, AIM (those annoying video ads), and Windows itself, while still playing your music in WinAmp or WMP.

    6. Re:That's hardly an exploit by bloobloo · · Score: 4, Funny

      Never? Not even Bananaphone?

    7. Re:That's hardly an exploit by xero314 · · Score: 3, Interesting

      Couldn't the system simply have a filter that removes the wave signature of what it is outputting before processing input as a command? This is relatively simple technology, as compared to voice recognition itself. You might have to re-calibrate if you move your speakers but I would think that is a small price to pay to not leave open the ability for a web site to control your system through an auto-playing wave file.

      Mind you this won't stop your roommate from yelling "Shut Down...Yes" just to piss you off. Or worse yet the guy you just fired yelling something more destructive on his way out of the office.

    8. Re:That's hardly an exploit by Wannabe+Code+Monkey · · Score: 2, Interesting

      The audio mixer in Vista is no longer based on different audio types (MIDI, CD Audio, WAV, etc). Instead, there is a volume slider and mute button for each application that makes sounds. So you can mute IE, AIM (those annoying video ads), and Windows itself, while still playing your music in WinAmp or WMP.

      If that's true, then that's awesome. I remember a couple years ago reading a story on slashdot about various experimental usability projects going on at Microsoft and this was one of them. I think they even put together a mock desktop in flash where they implemented this volume system that you could play with. From a usability standpoint it was way better. I had assumed that this was something that just got lost along the way, but I'm glad to see they went through with it.

      --
      We always knew Comcast was corrupt, here's the proof: http://tech.slashdot.org/comments.pl?sid=1909890&cid=34545432
    9. Re:That's hardly an exploit by Lanoitarus · · Score: 5, Funny

      Bud Light Presents...
      Real American Heroes (reaaalllll american heroooessss...)
      Today we salute you, Mr Computer Software Exploit Finder (computer software exploit fii-inder)
      While others are wasting away their lives drinking, dating, and having fun, you're hunched over a screen, plowing through code.(hunch plow hunchie plow)
      You may not have seen the sun in days, but that's ok- you do this for the greater good.(greaaater goooo-ooodd)
      Only YOU could realize that a carefully crafted web favorites icon could potentially bring the world to its knees.(Down on its kneeee--eesss)
      So crack open an Ice Cold Bud Light, Oh Overload of Overflow, because without you, CmdrTaco would have to get a real job.

    10. Re:That's hardly an exploit by DrSkwid · · Score: 3, Informative

      here's how to do something similar in plan9

      mkdir /n/mute
      bind /n/mute /dev/audio
      run_noisy_application

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    11. Re:That's hardly an exploit by complete+loony · · Score: 2, Funny

      badger badger badger ...

      --
      09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
    12. Re:That's hardly an exploit by pclminion · · Score: 2, Informative

      Couldn't the system simply have a filter that removes the wave signature of what it is outputting before processing input as a command? This is relatively simple technology, as compared to voice recognition itself. You might have to re-calibrate if you move your speakers but I would think that is a small price to pay to not leave open the ability for a web site to control your system through an auto-playing wave file.

      The quick answer is "no." Even though the computer knows what waveform it is playing, it has no idea what waveform will actually emerge from the speakers, or arrive at the microphone.

      The problem is that the audio system taken as a whole (Sound card DAC -> speaker wire -> speaker driver -> air in the room -> microphone pickup -> microphone wire -> sound card ADC) introduces small but significant spectral distortion into the sound by the time it runs through the entire system. Even if we ignore the nonlinearities of the amplifiers, the finite resolution of the digital-to-analog converters, and everything else, we still run into the problem of objects MOVING in the room (like you, leaning 2 inches forward in your chair), which changes the impulse response of the system and therefore changes the spectrum of the received signal.

      Even if we consider only two elements, the speaker cone and the air in the room, it is fairly easy to see that the sound wave generated is NOT equivalent to the wave being sent to the speaker cone. Imagine a step signal (e.g. a Heaviside function) where the speaker deflection instantaneously goes from 0 to 1, then stays there. What does the AIR PRESSURE right next to the speaker cone do? Does it instantaneously jump from 0 to x and then stay there? No, of course not -- a WAVE propagates from the speaker into the air of the room. So the signal applied to the speaker and the signal in the room are not the same signal.

      Now in theory, if all of these effects are linear, then the total impulse response can be computed. This is the "calibration" you mention. The problem, though, is that the system is not TIME INVARIANT, meaning its impulse response changes with time simply because of all the variables which affect the system.

      So it's not only a matter of "recalibrating when you move your speakers." You have to recalibrate when the speakers move, when the temperature changes, when the air pressure changes, when the microphone moves, when the microphone has dust on it interfering with pickup, when anything at all in the room moves, when there is a draft in the room, etc etc.

      This would not be simple technology at all. Not impossible, but probably extremely expensive and unreliable.

  4. I tried to replicate the bug, but all I got was by knightmad · · Score: 5, Funny

    c:> Dear aunt, let's set so double the killer delete select all: Command not found

    1. Re:I tried to replicate the bug, but all I got was by teslar · · Score: 5, Funny

      Lucky you. I was watching Star Trek First Contact in the living room and fifteen minutes after Picard told the Enterprise computer to initiate the self-destruct protocol, my laptop exploded!

    2. Re:I tried to replicate the bug, but all I got was by Overzeetop · · Score: 4, Funny

      It's not Vista's fault your laptop uses a Sony battery. MS can't be blamed for everything, you know.

      --
      Is it just my observation, or are there way too many stupid people in the world?
  5. amusing, but not much else by Thansal · · Score: 2, Insightful

    If your computer starts spitting out voice commands, just create another sound that will interrupt it.

    Admittedly all I can think of is the Dilbert cartoon with Wally getting ticked at Dilbert having voice driven software.

    --
    Do Or Do Not, There Is No Spoon, There Is Only Zuul. Everything in the above post is probably opinion.
  6. Bug? by drinkypoo · · Score: 3, Insightful

    I wouldn't call it a bug. I'd call it a very bad idea to use a microphone without a switch for voice recognition. Your television could theoretically do things on your computer. Does that sound like a possibility you want to entertain? Get a mic with a switch, or get rooted.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  7. The Real Agenda of this Article? by ksalter · · Score: 4, Insightful

    All voice recognition software, no matter what platform, would suffer from this supposed "exploit". So why this article on Vista specifically? What is the real agenda here? Also, if the voice recognition software is trained for a specific user's voice, the chances of an exploit are reduced.

    1. Re:The Real Agenda of this Article? by shark72 · · Score: 2, Insightful

      "All voice recognition software, no matter what platform, would suffer from this supposed "exploit". So why this article on Vista specifically? What is the real agenda here? Also, if the voice recognition software is trained for a specific user's voice, the chances of an exploit are reduced."

      Yup, this is an old one. There's an apocryphal tale of a user group meeting from long ago of a vendor demonstrating voice-control software and a smart aleck in the back of the room yelling "DEL *.*!" (or whatever the MS-DOS command was).

      As you implied, the agenda is, of course, to have a laugh at Microsoft's expense. If they hadn't included voice control software, the opportunity would have been to point out that Microsoft spent $BIGNUM person-years working on Vista and didn't even include that feature. OSX's easy access to a shell prompt with root access is about as relevant an exploit as the voice control exploit, and the odds of a cat wandering into my house and walking on the keys in such a way to generate the wrong "rm" command are about the same as this Vista "exploit" happening to me. But, it's always fun to have a laugh at Microsoft's expense, isn't it?

      --
      Sitting in my day care, the art is decopainted.
    2. Re:The Real Agenda of this Article? by 99BottlesOfBeerInMyF · · Score: 4, Informative

      All voice recognition software, no matter what platform, would suffer from this supposed "exploit". So why this article on Vista specifically?

      This is untrue. Speech recognition software can be made to filter out anything coming in the mic that matches something going out the speaker channel. More simply, you can require all commands be preceded with an arbitrary word (like the computer's name). Call your computer "George" and then issue the command "George, kill dash nine star dot star." As opposed to "kill dash nine star dot star." Since the exploit writer won't know to include "George" their exploit fails almost all the time. This was a feature of MacOS 7, more than a decade ago, as I mentioned elsewhere.

      Also, if the voice recognition software is trained for a specific user's voice, the chances of an exploit are reduced.

      Depending upon the tolerance, this is entirely possible, but I don't see it as being as important or versatile as the other two methods I listed above. MS should have learned from the example of others.
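
      A minimal sketch of the keyword idea in Python, assuming the recognizer hands back a plain-text transcript. The function name `accept_command` and the default word "george" are made up for illustration and are not part of any real speech API:

```python
from typing import Optional


def accept_command(transcript: str, wake_word: str = "george") -> Optional[str]:
    """Return the command only if the transcript starts with the wake word.

    Anything that does not begin with the (user-chosen, hypothetical) wake
    word is ignored, so a canned audio file has to guess the word first.
    """
    # Normalize case and treat commas as spaces so "George, kill ..." matches.
    words = transcript.lower().replace(",", " ").split()
    if not words or words[0] != wake_word.lower():
        return None
    command = " ".join(words[1:])
    return command or None  # the wake word alone is not a command


print(accept_command("George, kill dash nine star dot star"))
print(accept_command("kill dash nine star dot star"))
```

      The first call prints the command without the keyword and the second prints None. The protection is only as good as the secrecy of the word, which is why a shipped default like "computer" would not help much.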

    3. Re:The Real Agenda of this Article? by billcopc · · Score: 4, Funny

      Voice control is fine, but having the computer react to its own output is ludicrous! You'd think Vista would be smart enough to recognize feedback... It's like having a retard talking into a mic that's hooked up to his own headphones.

      Bob: "Bob go jump off a bridge"
      Bob: "Who said that ?"
      Bob: "I said that. Now jump!"
      Bob: "Ok.. Aaaaaaaagh!"

      Stupid.

      --
      -Billco, Fnarg.com
    4. Re:The Real Agenda of this Article? by xoyoyo · · Score: 2, Interesting

      True, all speech recognition software *would* suffer from this exploit if the application designers hadn't thought about the likely scenarios in advance. I just checked the situation with my Mac, which comes with speech recognition built in (and has done since what, Mac OS 9?)

      Nothing destructive is enabled by default: the worst you can do on a Mac is log yourself out, but that will keep everything running as it was before.

      If you go to the Speech control panel you can, after putting your admin password in, enable Menu Bar actions which allow you to do things like trash files and restart the computer.

      So by default the computer will just do helpful stuff, but if you really need full control over the OS through speech recognition (eg, you are disabled) you can enable it.

      It's a good indicator of the different philosophies between the two OS vendors we also see in their approach to networking (this may have changed with Vista, I've not really been following it): Apple shuts down everything by default and requires the user to open ports; Windows boxes, on the other hand, are wide open from first boot and have to have their ports shut down by a knowledgeable user.

    5. Re:The Real Agenda of this Article? by Khabok · · Score: 2, Informative

      My Mac requires a keyword before accepting voice commands. Does Vista do this? If not, I'd call it a vulnerability, albeit a minor one.

      Maybe they should ask the user for a keyword without offering a default? But how many people would use "computer" anyway?

    6. Re:The Real Agenda of this Article? by krakelohm · · Score: 2

      "Hi, my name is Werner Brandes. My voice is my passport. Verify Me."

      --
      You are all a bunch of idots.
    7. Re:The Real Agenda of this Article? by mrbcs · · Score: 2, Funny
      I had this problem years ago. I was playing with something called verbex.. talk to your computer... it does stuff. You had to train it. It worked fairly well and freaked out visitors. I had it set up so that if I swore at a program (it was windows 95 after all) the computer would do an alt+F4. Funny stuff... until one day I'm on the phone and getting a bit... ummm.. upset. Apparently I was cursing a lot cause when I turned around.. my computer was off.

      I removed the software after that.

      --
      I'm not anti-social, I'm anti-idiot.
    8. Re:The Real Agenda of this Article? by man_of_mr_e · · Score: 2, Informative

      You do realize that it's a bit more complicated than that. Depending on where the speakers were from the microphone, any reflective surfaces that might bounce the sound back, etc... it can all fuck up a noise cancelation circuit, which is what we're basically talking about.

      I've seen EXPENSIVE noise canceling speakerphones screw this up.

    9. Re:The Real Agenda of this Article? by djh101010 · · Score: 2, Informative

      "OSX's easy access to a shell prompt with root access"

      Really? How do I get a shell prompt on a Mac with root access without typing my password?
      I notice he hasn't responded to this. I'm thinking it's because, well, there isn't an easy way to do it. In fact, I can't think of a _hard_ way to do it. Maybe an SUID script to open it as root, but then you have the display thing to deal with. Hm... more likely he was just talking out his arse.
    10. Re:The Real Agenda of this Article? by planetmn · · Score: 2, Insightful

      Except that it will never match. You are basically doing a D/A conversion to output the sound via the speakers, and then A/D when using the mic for input. Both of these stages will cause some distortion (lots of distortion with crappy speakers and microphones). Furthermore, the acoustical environment is going to affect different frequencies to different extents.

      For instance, the mic may not pick up any of the low frequencies due to location of a subwoofer, quality of speakers, sound absorbers (carpet, etc.). So in order to match the output to the input, you need to allow for these factors and by the time that you give yourself enough of a margin, you've in effect taken out all functionality.

      Sure, it's fun to bash MS here on slashdot. Just don't let reality get in the way.

      -dave

      --
      /., where "Apple and Google provide Iran with nukes" will be refuted with "But Microsoft is a convicted monopolist"
    11. Re:The Real Agenda of this Article? by moofo · · Score: 2, Interesting

      It worked pretty well in Mac OS 9. You could login to the machine by selecting your username and then saying a passphrase. The default was: "My Voice is my password"

      Thing is, it was local accounts only, no directory system at this point, much less for voiceprints !

      --
      "I've heard nonsense, compared with which that would be as sensible as a dictionary." Through the looking glass and what
    12. Re:The Real Agenda of this Article? by adolf · · Score: 2, Insightful

      All true.

      However, this should be a solvable problem with current DSP technology.

      If my cellular telephone can perform realtime echo cancellation, and subtract its own speakerphone audio from the microphone audio, and do it for several hours at a time on a battery the size of a matchbook, then I can only fucking hope that a modern dual-core machine would be able to tackle the task handily.

      Even after the variables are all multiplied by some factor because the speakers might move relative to the microphone, there seems to be plenty of horsepower available to throw at the problem. The fundamentals have all been solved by folks like Bell Labs, US Robotics, and Polycom a long fucking time ago, with less DSP power than my $20 optical mouse, using the widely variable POTS network as a testbed, where even the -remote- handset affects the quality of your own voice on the line.

      Just because there's layers of distortion, band limiting, spurious external noises, with dynamics and delay possibly being anywhere on the map and an echo signature that changes as people move around the room, does not mean that it's not all measurable, quantifiable, and possible to reduce it to acceptable levels.

      Remember, you don't have to get rid of all the feedback, and it doesn't have to be perfect. We're talking about limiting a computer's ability to hear itself, which is a far easier task than anything involving a human being. You only have to get rid of enough that the computer does not respond to its own voice. And also, remember that the resultant quality of the recorded microphone audio need not be production-grade, but only good enough for the computer to understand human-generated voice commands.
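
      For the curious, the continuous recalibration being argued about in this thread is roughly what an adaptive echo canceller does. Here is a minimal sketch of a normalized LMS (NLMS) filter in plain Python. It is not what Vista, a phone, or a speakerphone actually ships. The function names, tap count, step size, and the four-tap "room" are all assumptions for illustration, and the simulated room is linear, fixed, and noise-free, which a real room is not:

```python
import random


def simulate_room(played, impulse_response):
    """Toy stand-in for speaker -> air -> microphone: a short linear filter."""
    return [
        sum(h * played[n - k] for k, h in enumerate(impulse_response) if n - k >= 0)
        for n in range(len(played))
    ]


def nlms_echo_cancel(played, recorded, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the speaker echo from the mic signal.

    played   -- samples sent to the speakers (the known reference)
    recorded -- samples captured by the microphone
    Returns the residual left after the estimated echo is removed.
    """
    weights = [0.0] * taps   # running estimate of the room's impulse response
    history = [0.0] * taps   # most recent played samples, newest first
    residual = []
    for x, d in zip(played, recorded):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate                       # mic minus estimated echo
        power = sum(h * h for h in history) + eps   # normalizes the step size
        step = mu * e / power
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(e)
    return residual


random.seed(0)
played = [random.uniform(-1.0, 1.0) for _ in range(4000)]
recorded = simulate_room(played, [0.0, 0.6, 0.3, -0.1])  # made-up room response
residual = nlms_echo_cancel(played, recorded)

print("echo energy, last 500 samples:    ", sum(v * v for v in recorded[-500:]))
print("residual energy, last 500 samples:", sum(v * v for v in residual[-500:]))
```

      Because the weights are updated on every sample, the filter keeps chasing the impulse response as it drifts, which is the point made above about not needing perfection. In this idealized setup the residual shrinks to almost nothing; with nonlinear speakers, background noise, and someone talking over the playback it will do noticeably worse.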

  8. Format by jlebrech · · Score: 3, Funny

    "Open Terminal For Matt See Yes Im sure Reice Tart!!"

  9. I'm waiting for the audio exploit that responds to by StressGuy · · Score: 2, Funny

    the phrase "Simon Says"

    --
    A goal is a dream with a deadline
  10. A Whole Decade of Nothing by 99BottlesOfBeerInMyF · · Score: 4, Interesting

    More than ten years ago I was playing with the speech recognition software that shipped with MacOS 7 or something and I thought being able to check my e-mail without getting out of bed was pretty cool. At the time I wrote something about the technology and predicted that speech activated commands would never take off until: 1, most audio people listened to was controlled by the computer, and 2, the computer was smart enough to filter out the sounds it was emitting before processing commands. At the time a lot of people listened to music from their computer and I imagine many still do. Why can't the computer ignore all that sound? It knows it is outputting it so why not filter it? It is sad that the same missing feature is still a problem, so many years later.

    1. Re:A Whole Decade of Nothing by xappax · · Score: 4, Insightful

      Why can't the computer ignore all that sound? It knows it is outputting it so why not filter it?

      The sound that is output by the computer sounds similar to us when re-received through the mic and played back, but to the computer it's a totally alien waveform. A lot of distortion happens between when the computer sends a digital signal to the sound card and when it receives an analog signal from your microphone - so basically, the computer may know what it's playing, but it has very little idea how it'll sound when it reaches the mic.

      There are advanced filters and algorithms that can try to match and isolate particular patterns and "sounds" within a waveform, but they're not nearly as powerful as CSI would have us believe, and they also require far too much computing power to be run in realtime.

      Of course, the obvious low-tech solution to this issue is to wear headphones, as people in recording studios have for decades.

    2. Re:A Whole Decade of Nothing by Jerf · · Score: 4, Insightful

      The easiest answer to this question is, try it.

      Most simple schemes people come up with to address this are perfectly doable with a free sound program. Play some music, record the area while you're playing the music, then try your great idea. Like, you might think you can start out with inverting the source file and feeding it into the recording with a delay and modified amplitude. If you're really curious about this problem, this is a better way to learn about the difficulties than reading people on the internet, as, in my experience, you're quite likely to be skeptical about the explanations anyhow. The best (and in some sense, only true) explanations involve a lot of math.

      I can offer you this meta-rule, though: If it were so easy, it would already have been done. Many things that I see people posting on Slashdot about "Why don't they just do this thing?" are covered by this rule.
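
      The "invert, delay, and rescale" experiment can also be tried on synthetic data. This is a rough sketch in Python, not a measurement of any real room. The helper names and the two-path room (a direct path plus one reflection, with made-up delays and gains) are assumptions for illustration:

```python
import random


def simulate_paths(source, paths):
    """Toy room: each (delay, gain) pair is one path from speaker to mic."""
    return [
        sum(g * source[n - d] for d, g in paths if n - d >= 0)
        for n in range(len(source))
    ]


def best_delay_and_gain(source, recording, max_delay):
    """Find the single delay and amplitude that best cancel the recording.

    Returns (delay, gain, leftover_energy) for the best least-squares fit of
    gain * source[n - delay] to recording[n].
    """
    best = (0, 0.0, float("inf"))
    for delay in range(max_delay + 1):
        pairs = [(source[n - delay], recording[n]) for n in range(delay, len(recording))]
        sxx = sum(s * s for s, _ in pairs)
        sxy = sum(s * r for s, r in pairs)
        gain = sxy / sxx if sxx else 0.0
        leftover = sum((r - gain * s) ** 2 for s, r in pairs)
        if leftover < best[2]:
            best = (delay, gain, leftover)
    return best


random.seed(0)
source = [random.uniform(-1.0, 1.0) for _ in range(2000)]
# Hypothetical room: direct sound after 3 samples, one wall reflection after 7.
recording = simulate_paths(source, [(3, 0.6), (7, 0.3)])

delay, gain, leftover = best_delay_and_gain(source, recording, max_delay=10)
total = sum(r * r for r in recording[delay:])
print("best delay:", delay, "best gain:", round(gain, 3))
print("fraction of energy left after subtraction:", round(leftover / total, 3))
```

      Even in this toy setup with a single reflection, one delay and one amplitude leave a noticeable share of the energy behind, since the subtraction only accounts for the strongest path. A real room adds many more reflections plus speaker and microphone coloration.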

    3. Re:A Whole Decade of Nothing by fwr · · Score: 2, Insightful

      I call bull. What about that "echo cancellation" feature you find on all the popular web cam software? What about all the collaboration software out there that has echo cancellation? The basic premise is that if you don't use headphones and instead the computer speakers then the mic will pick up the sounds that the computer is transmitting from the other side, and you'll get an echo. Saying that it requires far too much computing power is incorrect. While it probably won't make it totally disappear, it will reduce the incoming signal from the mic to a level such that the voice processing feature on the computer won't be able to make out any of the commands. "totally alien waveform" right. Tell that to Sony and their noise cancellation headphones. If they can fit the technology in a headphone then a modern computer capable of running Vista certainly has enough horsepower.

  11. this makes for some fun sound files by SashaMan · · Score: 2, Funny

    website sound: "All your base are belong to us"
    Vista: "Do you want to reformat your hard drive?"
    website sound: "All your base are belong to us"
    Vista: "Are you sure you want to reformat?"
    website sound: "All your base are belong to us"
    Vista: "Reformatting.........."

  12. Shit... by thousandinone · · Score: 5, Funny

    I just watched 2001: A Space Odyssey on my machine... this may be my last post.

  13. Nothing new here by Ruprecht+the+Monkeyb · · Score: 5, Funny

    Years ago when I worked in a shop that used OS/2 (one late version of which included speech recognition), we used to play pranks on each other all the time using that 'feature'. Things like changing a startup sound to be two minutes of silence followed by a verbal shutdown command, or changing confirmation prompt sounds to be 'cancel'. Good fun. The random 'select all / delete / yes' was the best, though.

  14. or by www.sorehands.com · · Score: 3, Informative

    The geek watching Andromeda. "Fire all missiles"

  15. Hey, no need to panic... by Bertie · · Score: 3, Informative

    I mean, look:

    "Microsoft has said that even if the machine was primed to accept voice commands it would be unlikely the user would not be in the room to hear the file with malicious instructions being played."

    Yeah, nobody ever leaves their computer unattended.

    And of course, it would be completely impossible for a Trojan to pipe appropriate sounds directly to the input buffer of the sound hardware, thus negating the need for it to be played through your speakers at all. As we all know, Windows is completely watertight against that sort of thing.

    This raises an interesting possibility, though - what if you could confuse the recogniser itself into making false positives? You could, for example, persuade it to recognise silence as a command of your choosing.

    Best way round this is probably to prevent people doing potentially destructive operations via voice commands. But if this isn't suitable, you could employ clever confirmation strategies, like "If you're sure you want to delete c:\windows, please say the following words..." with the words in question being drawn from a dictionary. No malware could anticipate the sequence (although I suppose you could set the recogniser to work against itself, by playing the text-to-speech engine's own output back to it and triggering recognition).
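
    The random-words confirmation could look something like the sketch below, in Python. The word list, function names, and the idea of comparing against a plain-text transcript are all assumptions for illustration, not anything Vista provides. The challenge would need to be shown on screen rather than spoken, or the text-to-speech loop mentioned above comes straight back:

```python
import secrets

# Hypothetical word list; a real one would be far larger.
WORDS = ["amber", "bicycle", "canyon", "dolphin", "ember", "falcon", "glacier", "harbor"]


def make_challenge(count=3, words=WORDS):
    """Pick unpredictable words the user must read back to confirm."""
    return [secrets.choice(words) for _ in range(count)]


def confirm(challenge, heard_transcript):
    """Accept only if the recognized speech matches the challenge exactly."""
    return heard_transcript.lower().split() == [w.lower() for w in challenge]


challenge = make_challenge()
print("To delete c:\\windows, please say:", " ".join(challenge))
print(confirm(challenge, " ".join(challenge)))  # True: the user read it back
print(confirm(challenge, "yes"))                # False: a canned "yes" fails
```

    A prerecorded sound file cannot know the words in advance, so a canned "yes" no longer confirms anything.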

    Hmm. Promises to be quite fun, this.
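
    A rough sketch of the random-word confirmation idea, in Python. The word list and the function names (make_challenge, confirm) are made up for illustration, and it assumes the recogniser hands back a list of words. It does nothing about the text-to-speech loophole mentioned above, so the challenge would presumably have to be shown on screen rather than read aloud.

```python
import secrets

# Hypothetical word list; a real system would draw from a much larger dictionary.
WORDS = ["amber", "bicycle", "cactus", "dolphin", "ember", "falcon",
         "granite", "harbor", "igloo", "juniper", "kettle", "lantern"]


def make_challenge(n_words=3, words=WORDS):
    """Pick n_words at random using a cryptographically strong source."""
    return [secrets.choice(words) for _ in range(n_words)]


def confirm(challenge, heard):
    """Accept only if the recognised words match the challenge, in order."""
    normalised = [w.strip().lower() for w in heard]
    return normalised == list(challenge)
```

    Since the words are picked at confirmation time, a pre-recorded audio file cannot contain the right sequence except by luck.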

  16. howto for Mac users by sootman · · Score: 4, Informative

    To create malicious audio files with OS X (10.3 or later), fire up Terminal and use 'say':
    $ echo "format sea slash you" | say -o evil.aiff
    This renders your message in a nice, clear, even voice; wouldn't want a bunch of 'um's and 'ah's borking up your exploit, now would you? :-)
    `man say` for more options.

    --
    Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
  17. Fraternity Fun by Zerth · · Score: 5, Funny

    If they don't prevent them from running arbitrary commands, you know 5 years in the future that every time term end comes around there will be some naked freshman running through the uni library/labs shouting "quit without saving! yes! reboot! yes! shutdown -h now!"

  18. Yakking by SilverJets · · Score: 2

    As my coworker said when I told him about this, "That's not hacking it's....yakking!"

    (Or yacking for those who prefer the alternate spelling)

  19. We've been waiting for this (and joking about it) by Qbertino · · Score: 5, Funny

    Me and my friends have been waiting for this and joking about it since IBM Via Voice and Dragon Speak. A whole new era of IT pranks and cyberterrorism awaits us. Imagine bursting into a room full of PCs and yelling:

    "FORMAT DRIVE C! CONFIRM!".

    Instant fun.
    Makes me feel all soft and gooshy inside just thinking of it. :-)

    --
    We suffer more in our imagination than in reality. - Seneca
  20. Re:In One Ear and Out the Other by itsme1234 · · Score: 2, Insightful

    "Vista should be testing its incoming audio to detect whether it matches any outgoing audio that Vista is playing."

    I guess you never saw a room with more than one computer in it.

  21. Predictions from the past ... by Gopal.V · · Score: 4, Funny

    Userfriendly predicted the fate of voice recognition six years ago: rm -rf /, and yet again!

  22. Maybe a good start, but not that easy by mopslik · · Score: 2, Insightful

    Vista should be testing its incoming audio to detect whether it matches any outgoing audio that Vista is playing.

    I imagine it's not quite so straightforward. You'd need to take into account room acoustics, hardware effects, generic ambient noises, or even other interfering sounds in the same room that could all interfere with a comparison of outgoing sound to incoming sound. It's very rare that you'd ever have a time where your outgoing sound file exactly matches one that is sensed coming from the speakers.
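
    For what it's worth, the usual way around "it never matches exactly" is to look for similarity rather than equality. Here is a minimal sketch using normalised cross-correlation over a range of delays, in pure Python. The function name and the max_lag parameter are invented for this example, and it only copes with delay and volume changes, not with room reverb or other sounds mixed in.

```python
import math


def normalised_xcorr_peak(played, captured, max_lag):
    """Return (best_lag, score): the delay in samples at which `captured`
    best matches `played`, and the normalised correlation at that delay.
    A score near 1.0 means a scaled copy; near 0.0 means unrelated."""
    n = len(played)
    energy_p = math.sqrt(sum(x * x for x in played))
    best_lag, best_score = 0, 0.0
    for lag in range(max_lag + 1):
        window = captured[lag:lag + n]
        if len(window) < n:
            break  # ran out of captured samples
        energy_w = math.sqrt(sum(x * x for x in window))
        if energy_p == 0 or energy_w == 0:
            continue  # silence carries no information
        dot = sum(p * w for p, w in zip(played, window))
        score = dot / (energy_p * energy_w)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score
```

    Dividing by both energies makes the score independent of speaker volume, which takes care of one of the objections above but certainly not all of them.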

  23. Shocked! by Andrei+D · · Score: 3, Funny

    I am shocked! Damn you Bill, I really believed you when you said Vista is "dramatically more secure than any other operating system released". My world view is turned upside down now :(

    --
    We often refuse to accept an idea merely because the tone of voice in which it has been expressed is unsympathetic to us
  24. Best. Prank. Ever. by copponex · · Score: 3, Funny

    Find office with 10 or 15 stations with shiny new copies of Vista. Verify through other means that mics and voice commands are on. Run in, and yell as loud as you can the commands that will shut down the machines. Don't run out yet!

    Watch people panic at their keyboards. Listen to their gasps as the hard disk spins down and their monitors cut off, at which point they all stare at you. Wave. And then run.

  25. Re:OS X? by gkearney · · Score: 2, Interesting

    I tried this on MacOS X version 10.4.8 (the latest version). I could not make the Mac respond to voice commands being played from the speakers or from patching the sound out into an iMic. Here is what I did.

    1. Ran the voice command option and configured it as Apple suggests.
    2. Made sure that the voice command understood my command by issuing several and getting the correct replies back from the system.
    3. Recorded the command "What time is it?"
    4. Played back the command with voice commands on.

    The Mac did not respond. I then tried the same thing with a patch cable between the output and an iMic USB audio adapter. It still would not respond to the recording but will respond to my voice. I have no idea how Apple is able to distinguish where the voice is coming from.

  26. I'm feeling anal today, so ... by spellraiser · · Score: 4, Insightful

    An exploit is, by definition, a successful manipulation of a bug/omission/hole/whatever in a computer system to make it perform something that it was not designed to do. Usually this term is only applied when said action is harmful or potentially harmful.

    What is being described here is the possibility of controlling the voice recognition system in Vista remotely to make it perform potentially harmful tasks. Furthermore, this functionality is not something that said system was designed to do; it was only designed to accept commands via microphone.

    Therefore, what is being described here is an exploit.

    Q.E.D.

    --
    I hear there's rumors on the Slashdots
  27. "Hi, I'm a Mac..." by starglider29a · · Score: 2, Funny

    The Vista replies, "And I'm a PC."

  28. Bah... by eno2001 · · Score: 5, Funny

    I expect someone to come up with a site that says:

    "Start Internet Explorer"
    "Go aytch tee tee pee colon slash slash gee oh ay tee ess ee dot see ex"

    Brrr...

    --
    -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
  29. MS Security Response Blog: Adrian responds by davidwr · · Score: 4, Informative

    Adrian responded to this on the Microsoft Security Response Blog.

    Issue regarding Windows Vista Speech Recognition

    Hey everyone this is Adrian and I am writing to try and clear up some concerns regarding a recently reported vulnerability in the Speech Recognition feature of Windows Vista. An issue has been identified publicly where an attacker could use the speech recognition capability of Windows Vista to cause the system to take undesired actions. While it is technically possible, there are some things that should be considered when trying to determine what the threat of exposure is to your Windows Vista system.


    He goes on to list reasons why this is not a major issue. The first being that voice commands have to be turned on and configured for this to work.

    He ends with

    While we are taking the reports seriously and investigating them accordingly I am confident in saying that there is little if any need to worry about the effects of this issue on your new Windows Vista installation.

    I think he's right. If this was a serious problem, the MacOS and OS/2 "exploits" mentioned above would've received a lot more press. Still, I expect in a future version, the voice software will be smart enough to ignore the computer's own output.

    Personally, I don't like voice commands. They are necessary for users with certain impairments and useful for certain applications such as kiosks, but they are counterproductive in a shared-office environment and just plain weird on my desktop. Even on Star Trek - The Next Generation much of the computer input was via control consoles not voice.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  30. Re:Restart? Really? by inquisitor · · Score: 2, Insightful

    It's not necessary to restart the PC to turn off speech recognition - just say "stop listening" or click on the always visible recognition toolbar to turn the microphone off. It's also not on by default, and only those interested in it will find it anyway. Not really an "exploit" that's actually exploitable.

  31. Brilliant! by Kozar_The_Malignant · · Score: 2, Insightful

    The security advice is "A user can turn off their computer speakers..." before playing an audio file. We can also solve the problem of porn getting into our school network by unplugging the monitors. I didn't realize this security stuff was so easy.

    --
    Some mornings it's hardly worth chewing through the restraints to get out of bed.
  32. Startup Sound by EricJ2190 · · Score: 5, Interesting

    Now I see why Microsoft doesn't want you to change the Vista startup sound.

  33. Prior art by hweimer · · Score: 5, Funny

    Time to quote a usenet classic:

    Last year, out in California, at a PC users group, there was a demo of
    smart speech recognition software.

    Before the demonstrator could begin his demo, a voice called out from the
    audience:

    "Format c, return."
    "Yes, return."

    Damned short demo, it was.

    --
    OS Reviews: Free and Open Source Software
  34. You'll know your company is now a botnet... by sprior · · Score: 4, Funny

    When your machine room starts doing a gregorian chant...

  35. Filtering? by phorm · · Score: 2, Informative

    No, actually it isn't really agendized.

    Ever used a program such as Skype or other voice-chat software? Notice that when you have speakers and microphone on, you generally don't hear your voice constantly repeating into echoes (if echo-cancel is on, of course). Notice that you don't with the speakerphone on your cell either? That's because the software/hardware is smart enough to take the audio output and subtract it from the audio input (avoiding feedback loops etc). If used properly with voice-recognition software, it would prevent a webpage from sending output to be picked up again by your input system. Since MS presumably has control over the audio subsystem of the operating system, it should be able to snag the master combined output and filter it in this way.
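
    To make the "subtract the output from the input" part concrete, here is a toy echo canceller using a normalised LMS adaptive filter, in pure Python. This is only a sketch of the general technique, not what Skype or any phone actually ships; the function name and the taps/mu defaults are arbitrary choices for the example.

```python
def nlms_echo_cancel(reference, mic, taps=8, mu=0.5, eps=1e-8):
    """Remove an adaptively estimated echo of `reference` (what the speakers
    played) from `mic` (what the microphone heard). Returns the residual."""
    weights = [0.0] * taps   # current estimate of the speaker-to-mic path
    history = [0.0] * taps   # recent reference samples, newest first
    residual = []
    for ref_sample, mic_sample in zip(reference, mic):
        history = [ref_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic_sample - echo_estimate   # what is left after subtraction
        norm = sum(x * x for x in history) + eps
        # Nudge the weights so the next echo estimate is closer.
        weights = [w + mu * error * x / norm for w, x in zip(weights, history)]
        residual.append(error)
    return residual
```

    The recogniser would then listen to the residual instead of the raw microphone signal, so anything the machine itself played should mostly be gone once the filter has adapted.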

    Now that doesn't preclude some annoying twit from walking by and telling your computer to do things it shouldn't. However, that issue could be prevented by building an element of "speaker recognition" (the person speaking, not the ones on your computer) into the machine. Further, it could require a user-defined prefix or suffix to the command, such as "Computer, earl grey tea, hot!" or "Open the doors, Hal!"
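
    The prefix idea is easy to sketch, assuming the recogniser gives you a text transcript. The function name extract_command is made up for this example; anything that does not start with the user's chosen phrase is simply dropped.

```python
def extract_command(transcript, wake_phrase):
    """Return the command following wake_phrase, or None if the transcript
    does not start with that phrase (or nothing follows it)."""
    def tokens(text):
        # Lower-case and strip basic punctuation so "Computer," matches "computer".
        cleaned = (w.strip(".,!?;:") for w in text.lower().split())
        return [w for w in cleaned if w]

    words = tokens(transcript)
    wake = tokens(wake_phrase)
    if not wake or words[:len(wake)] != wake:
        return None
    command = " ".join(words[len(wake):])
    return command or None
```

    This only helps as long as the phrase is not something an attacker can guess, so a per-user secret would work better than "Computer".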

  36. Speech Researcher Here Confirms It by virtigex · · Score: 3, Interesting

    I have worked on speech both at Apple (on PlainTalk) and at MS Research. When I was at Apple (around 1996) I poked my head into the office of a co-worker who was testing PlainTalk and said loudly "Computer Shut Down". His computer then started shutting down. This "exploit" has been on the Mac since 1996 and nobody seems to have complained about it. I don't think it's a big deal.

  37. There's An Even Simpler Solution by sweatyboatman · · Score: 2, Interesting

    If the computer thinks you're saying a command, it should disable output to the speakers. If I am talking to my computer then it should stop making its own noises. Otherwise, that's just rude.

    --
    It breaks my pluginses, my precious!
  38. Saying it is unfixable is a copout by mattr · · Score: 2, Interesting

    Detection of whether a given sound is what was just emitted from the speaker may be very difficult, but it is relatively easy in terms of timing. So long as the system knows how much lag time is present in the system, it should be possible to disable detection of all sound that is being played at the same time (i.e. basically turn off the mic then). Nobody expects voice recognition to work when music or other sounds are playing, and the system, whether Vista or OS X, ought to be able to disable voice recognition instantaneously when sound output is generated.

    The problem of course is that the computer next to you might suffer from the exploit since it doesn't know what sound your computer is generating, though this might be diminished by subtracting other sound to some extent via sidepointing mics or even better by just refusing to do dangerous commands like format or delete via voice recognition in the first place. There are gray areas that probably make total safety impossible but some common sense things including disabling all recognition during sound generation from explorer and wmp sound like a good place to start.
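
    Here is roughly what I mean, as a Python sketch. The class name, the half-second lag default and the denylist are all invented for illustration, and timestamps are plain seconds from whatever clock the audio system uses.

```python
DANGEROUS = {"format", "delete", "shutdown"}  # hypothetical denylist


class RecognitionGate:
    """Ignore speech heard while this machine is playing sound, and refuse
    denylisted commands outright."""

    def __init__(self, lag_seconds=0.5):
        self.lag_seconds = lag_seconds   # allowance for output-to-mic delay
        self.playback = []               # list of (start, end) intervals

    def note_playback(self, start, duration):
        """Record that local audio plays from `start` for `duration` seconds."""
        self.playback.append((start, start + duration))

    def accepts(self, heard_at, command):
        """True only if the command arrived outside every playback window
        (plus lag) and its first word is not on the denylist."""
        for start, end in self.playback:
            if start <= heard_at <= end + self.lag_seconds:
                return False
        first = command.lower().split()[:1]
        return not (first and first[0] in DANGEROUS)
```

    As noted above, this does nothing about the computer on the next desk, which is why the denylist is there as a second line of defence.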

  39. Next Mac Ad is even better by jgc7 · · Score: 5, Funny

    PC: Hi I'm a PC
    Mac: and I'm a Mac
    PC: I have a cool new feature called voice control.
    Mac: That is stupid. I have the Time-Machine which lets you recover old documents. Let's say you accidentally delete the documents folder
    PC: Okay
    Mac: To get your documents back, all you have to do is slide the time machine back one minute.
    PC: Sounds cool, but can't you just get the documents out of the trash?
    Mac: Yes, but it works even if you accidentally empty the recycle bin

    --
    70% of statistics are made up.
    1. Re:Next Mac Ad is even better by curunir · · Score: 4, Funny

      Better yet, the next Mac ad could make light of this exploit.

      PC: Hi, I'm a PC.
      Mac: and I'm a Mac.
      PC: Now that I run Vista, I can accept voice commands!
      Mac: Wow, that sounds cool. But what if someone tells you to punch yourself in the face?
                PC punches self in the face and nose begins to bleed
      PC: Ouch, that hurt!
      Mac: I'm sorry PC, I didn't realize that just telling you to do something like "poke yourself in the eye"...
              PC pokes finger into his eye
      Mac: ...or "begin sneezing incessantly"...
              PC starts to uncontrollably sneeze, the blood from his nose splattering everywhere
      Mac: ...would make you actually do it.
      PC: groan I'm sorry if I splattered on you.
      Mac: That's ok PC, I'm pretty immune to viruses, so I think I'll be alright.

      --
      "Don't blame me, I voted for Kodos!"
  40. Re:What is the Vista Equivlent by Viceroy+Potatohead · · Score: 2, Funny

    Nobody's really sure, but it happens with surprising regularity.

  41. Thanks for the inspiration! by Em+Adespoton · · Score: 3, Funny

    PC: Hi I'm a PC
    Mac: and I'm a Mac
    PC: I have a cool new feature called voice control.
    Mac: That is stupid. I've had secure voice control for years
    PC: Yes, but with your primitive voice control, the statements had to be in the right format, see?
    Mac: OK, but that's why we call it secure. The user has to select a keyword that will trigger the commands.
    PC: ...
    Mac: I hope he has his XP install CD handy....

  42. Or... by Greyfox · · Score: 3, Funny

    PC: Hi! I'm a PC!
    Mac: And I'm a Mac!
    PC: I have a cool new feature called Voice Control!
    Mac: FORMAT C!

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

  43. Re:More difficult but better... by wirelessbuzzers · · Score: 3, Funny

    Am I the only one who thought "Nam-shub of Enki" when I read this?

    Yes.

    --
    I hereby place the above post in the public domain.