Slashdot Mirror


Consonants Not Required

billybob2001 writes: "A report at the BBC explains how voice-control of computers can be more successful using grunts and sighs, as "voice recognition programs often failed to accurately capture words". Dr Takeo Igarashi, of Brown University suggests the use of "ahhhh" for skipping tracks on a cd, or adjusting tv volume, but I wonder what the effect would be on pr0n sites? Another suggestion is "uh oh" for undo. Perfect for online banking. Is this going to confuse your system or what?"

60 of 139 comments

  1. Undo command by Shafalus · · Score: 5, Funny

    Surely "Ah, shit!" is the obvious choice for an undo command?

    --

    Linux advocates are in a no Win situation

    1. Re:Undo command by JJ · · Score: 2

      I think that would be more appropriate as a full "Mission abort!" or "Disconnect." command.

      --
      So long and thanks for all the fish . . . !!!
    2. Re:Undo command by FortKnox · · Score: 2

      No... that's the window command for "reboot".

      --
      Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
    3. Re:Undo command by VivianC · · Score: 3, Funny

      It sounds like they are trying to give control of my computer to the Teletubbies!

      Now my 16 month old will be able to run my machine!

      --
      Viv

      Gmail invites for ip
  2. Great use in showers! by ymgve · · Score: 2, Funny

    Now, whenever you yell YIEEEE! in a shower because the water is too hot or cold, it will immediately switch to a more pleasant temperature!

    1. Re:Great use in showers! by HiQ · · Score: 2

      And that would be hotter or colder? That must be one hell of a clever shower to decide from one and the same yell whether you mean "too hot" or "too cold". Knowing the state of most household technology, when you yell "YIEEEE" (too hot), your shower will undoubtedly give you hotter water, after which you can peel your skin right off your back.

    2. Re:Great use in showers! by Magumbo · · Score: 2

      Well the best solution to this is to get rid of your giant water heater and replace it with one of the flash heaters with a digital temperature control. These are really common in Japan and Hong Kong (surely elsewhere too). They are more economical, give you water heated to your desired temperature almost instantly, and you never run out of perfectly heated water.

  3. Help Desk by well_jung · · Score: 5, Insightful

    Anyone that's worked at a Help Desk should know that Users have been trying this for years.

    --
    Carl G. Jung
    --
    "With one breath, with one flow, You will know Synchronicity" -La Policia
    1. Re:Help Desk by SilentChris · · Score: 2
      Does it handle expletives?

      Can you imagine the Microsoft ad? "Now talk to the computer *the way you've always wanted to*. IntelliSense handles all forms of four letter words..."

  4. Well no shit. by BiggestPOS · · Score: 2, Funny
    But this isn't what I dream about doing on the bridge of the Enterprise D. Instead of saying "Computer, Tea, Earl Grey, Hot" I'd say something like "Oooh, Ahhhh, Grrrr"

    I don't think so.

    --
    What, me worry?
    1. Re:Well no shit. by zephc · · Score: 2

      more like
      ooo-errr! eee! errr aeee! ahhh!
      =]

      --
      "I would say that 99 per cent of what my father has written about his own life is false." - L. Ron Hubbard Jr.
  5. Dangerous, surely? by iainl · · Score: 2

    Surely this could really backfire. I'm just finishing up an important document, perhaps having a significant section of text highlighted as I move paragraphs around.

    "Sorry, I couldn't get that disc you were after today" says a colleague.

    "Ah, shit!". Oops, there goes a bunch of your document. Don't swear, though, or you'll lose it from the undo buffer as well!

    --
    "I Know You Are But What Am I?"
    1. Re:Dangerous, surely? by hoggoth · · Score: 3, Funny

      Be careful with this!
      I can just see it now. You are recounting a traffic accident to a colleague:

      You: "I rammed a sheriff!"
      Computer: "Executing: rm -(dash)rf"

      --
      - For the complete works of Shakespeare: cat /dev/random (may take some time)
  6. Re:Undo command - another possibility by Ed+Avis · · Score: 5, Insightful

    D'oh!

    --
    -- Ed Avis ed@membled.com
  7. Self Destruct by stinkydog · · Score: 3, Funny

    Just don't say Muad'Dib or the computer explodes.

    -He has the weirding way.

    --
    "Who knew something as harmless as willful ignorance could end up having real consequences?"
  8. It's cute, but... by d5w · · Score: 5, Interesting

    The computer can't distinguish words easily, so we'll give you a potentially much smaller vocabulary and see if it does better? Of course it'll do better, whether or not that smaller vocabulary contains consonants.

    What I'd worry about is whether these unarticulated sounds sound more like background noise than articulated speech; if so, then you've made the situation worse by making it harder for the computer to know when you're talking to it.

    On "uh oh": Dragon Dictate (discrete speech recognition from a few years ago) used "oops" for telling the SR system when it made a mistake; it was reasonably easy to distinguish from words that you actually wanted to put into your text with any frequency.

    1. Re:It's cute, but... by dollargonzo · · Score: 2, Interesting

      Well, you are actually not quite correct on the consonant thing. Ever try doing an FFT on some sound and keeping only the major frequencies? We humans hear consonants, but p and b, for example, are essentially the same thing. And in the case of, say, an S, it sounds like noise to the computer, making it harder to distinguish than an AAA, which makes one distinct frequency. So, although you are correct in saying that a smaller vocabulary would help, it would not help as much as removing consonants.
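The FFT argument above can be illustrated with a rough sketch. This is not the researchers' method: it uses a naive DFT from the standard library, a pure sine wave as a stand-in for a sustained "AAA", and uniform random noise as a stand-in for an "S". The function names, sample rate, and frame size are all assumptions chosen for illustration.

```python
import cmath
import math
import random


def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns magnitudes for the first n/2 frequency bins."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def peak_energy_fraction(samples):
    """Fraction of spectral energy that sits in the single strongest bin."""
    energy = [m * m for m in dft_magnitudes(samples)]
    total = sum(energy)
    return max(energy) / total if total else 0.0


N = 256        # frame length in samples (assumed)
RATE = 8000    # sample rate in Hz (assumed)

# Stand-in for a sustained vowel: one sine wave at 250 Hz.
tone = [math.sin(2 * math.pi * 250 * t / RATE) for t in range(N)]

# Stand-in for a hissing "S": seeded uniform noise.
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(N)]

print("tone :", round(peak_energy_fraction(tone), 3))
print("noise:", round(peak_energy_fraction(noise), 3))
```

The tone puts nearly all of its energy into one bin, while the noise spreads it across the spectrum, which is the "one distinct frequency" point. Real vowels have several formants rather than a single frequency, so this only shows the general idea.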

      --
      BSD is for people who love UNIX. Linux is for those who hate Microsoft.
    2. Re:It's cute, but... by plastik55 · · Score: 3, Interesting
      FFT is exactly the wrong technique for resolving transient or plosive sounds. Wavelets work better. Take the CWT of a person speaking, and you can *see* the shape of all the consonants.

      When people speak, it is the consonants that matter. Ever try listening closely to someone with a pronounced regional accent? The vowels are all jumbled up but the speech is still intelligible. IIRC, people tried to teach gorillas to communicate using different grunts, and gave up in favor of sign language. Reason being that you *can't* string two different vowels together without a consonant in between and have it be intelligible.

      --

      I have a positive modifier on Troll. When I mod someone Troll their karma should go UP!

  9. Just so you know... by tswinzig · · Score: 2

    The letter 'h' is a consonant.

    --

    "And like that ... he's gone."
  10. No, this is serious academic research! by Anonymous Coward · · Score: 3, Interesting


    Seriously. I have colleagues that work on this type of thing:

    "Sound Symbolism in Conversational Grunts in English"
    "The Challenge of Non-lexical Speech Sounds"
    "Issues in the Transcription of English Conversational Grunts"

    http://www.sanpo.t.u-tokyo.ac.jp/~nigel/publications.html

  11. If the speakers are aimed at you by wiredog · · Score: 2
    Then you explode.

    This assumes you are talking about the Muad'Dib in the movie, and not the one in the book. All that weirding module stuff isn't in the book. The "weirding way" is basically Super Ninja fighting techniques. Paul was taught by Jessica.

  12. Tim Allen will love this by wowbagger · · Score: 3, Interesting

    Of course, many have said that the GUI is a "caveman interface" - point and grunt, err, click.

    This really strikes me as the verbal equivalent of Palm's Graffiti - if normal interactions (printing/speaking) are too hard, make a simplified interface (Graffiti/grunting) that isn't.

    I don't know, but I already learned one interface (typing) to make my computer's life easier. Why should I do all the work?

    1. Re:Tim Allen will love this by bay43270 · · Score: 2

      I don't know, but I already learned one interface (typing) to make my computer's life easier. Why should I do all the work?

      Exactly! Wasn't the whole point of voice recognition to make computers interact with humans the same way we interact with each other? Let's be realistic... the reason Palm uses Graffiti is because the keyboard was too small to use... not because it recognizes handwriting so well. Graffiti does not satisfy the goals of handwriting recognition, and this technology does not satisfy the goals of voice recognition.

  13. I know what I want ... by HiQ · · Score: 2, Insightful

    I don't believe in the necessity of a voice operated computer. At the risk of reopening a very old discussion, a good command line will do better in most cases. It takes far less time (for a skilled person) to use a command than to explain the desired action in 'normal' language to a computer. I mean 'rm -r /*' is typed in a lot faster than saying: "Go to the root directory and delete every file, including all subdirectories".

  14. In Related News, Code Sex Virus Released by Myriad · · Score: 2
    In related news, police have closed in on a suspect believed to be responsible for creating the Code Sex virus that crippled thousands of systems across the net last week.

    When asked about the virus the unidentified man responded "It's not my fault! I didn't do it intentionally. All I was doing was surfing my favorite pr0n sites and, well, you know, enjoying myself, when all these windows started popping up! At first I thought it was the usual spam trick - but no, this code just started appearing everywhere. It just sort of created itself... really! You've gotta believe me!"

    The investigation continues.

    --
    "They do not preach that their god will rouse them, a little before the Nuts work loose." Kipling, 'The Sons of Martha'
  15. Ooo...eee.. by thewiz · · Score: 2, Funny

    "Ooo eee ooo ah ah, ting tang walla walla, bing bang"

    A line from "The Witch Doctor" by David Seville or voice command to shutdown Windows? Decide for yourself by playing it for your voice recognition software.

    --
    If "disco" means "I learn" in Latin, does "discothèque" mean "I learn technology"?
    1. Re:Ooo...eee.. by wirefarm · · Score: 3, Funny

      "Ooo eee ooo ah ah, ting tang walla walla, bing bang"

      The verbal equivalent of perl?

      Cheers,
      Jim in Tokyo

      --
      -- My Weblog.
  16. singing ditties for commands by peter303 · · Score: 3, Funny

    It's easier to recognize tonal changes than consonants. It's easier for humans to use full words than isolated vowels.

  17. Not particularly useful... by -dsr- · · Score: 2, Funny

    I spent the last ten minutes with a bad case of the hiccups. What do you think that would have done to my weekly report?

  18. Ahhhh by garoush · · Score: 3, Funny

    Now there is a whole new meaning to "Yada, yada, yada, ..."

    --

    Karma stuck at 50? Add 2-5 inches.. err.. 2-5x Karmas Count to your pen1es.. err.. Karma all naturally and private
  19. right on by unformed · · Score: 2

    and they should also have a hammer that beats the shit out of the computer whenever you say "Motherfucker!"

    or a little lesser violence with lesser curses. For example "Fucking A!" will just BSOD. Hey the irony itself would be funny...

  20. Technology Devolves Humans by scorp1us · · Score: 2, Funny

    After 30,000 years of having good communication skills, humans finally revert to pre-historic communication skills. Their technology is responsible for their de-civilization. It seems a computer interface consisting of only grunts and primitive sounds was selected for Windows XP, and as a result the entire human vocabulary has reverted back to pre-historic roots.

    Bill Gates said "We are proud to be responsible for the conversion to a much easier language. While XML can organize our data better, we needed a common language for human interaction. Leveraging our power on the desktop, we were able to achieve this." When asked about how aliens might perceive our change of language, Gates responded "I'm sure that they will appreciate the simplicity more. I mean, who ever liked French and all of its elegance anyway?"

    Grunt snort grr grr.

    --
    Slashdot's rate-of-post filter: Preventing you from posting too many great ideas at once.
  21. Re:Why?? by demaria · · Score: 2

    Well for general purpose operating systems like Windows, Linux, Mac this isn't as great. In the current GUI model keyboard and mouse are superior. Perhaps if someone invents a voice controlled GUI (maybe with an integrated touch screen of some sort) then you could, but the current GUIs aren't built for voice control.

    There are other applications though. For example, a car radio. Why press the buttons to find radio stations if you could tell the car "tune 95.3"? It has applications on a telephone menuing system.

    But don't underestimate dictation software. There are lots of advantages of dictation. It lets you 'type' faster (assuming it's good software and you train it), and people who are disabled or have injury (broken wrist, carpal tunnel) really need it.

  22. Oook! by JimPooley · · Score: 2

    Just whatever you do, do NOT take your computer to the monkey house. It'll probably self-destruct!

    --

    "Information wants to be paid"
  23. Time for Tellytubbies! by iapetus · · Score: 2, Funny

    Over the hills and far away, Teletubbies come to hack!

    Eh-oh!

    Uh-ehn! Uh-ehn!

    Time for tubby shutdown...

    Uh-oh...

    --
    ++ Say to Elrond "Hello.".
    Elrond says "No.". Elrond gives you some lunch.
  24. Re:Know who could be... by geomcbay · · Score: 2

    The voice is meant to be a generic "Kennedy"/Hyannis accent. But the character is modelled after many politicians, real and imaginary.

  25. Bad idea from a linguistic standpoint by dasmegabyte · · Score: 5, Interesting

    Asking people to use another language when dealing with machines -- especially one that's more visceral -- is just asking for trouble. Already computers are seriously affecting the ability of humans to communicate orally: by concentrating the language into short bursts used during chats, we lose the particles of sentences that help establish context in speech (yes, there is a reason for "the" and "a"). Besides, here's an opportunity to alleviate a lot of the bad habits that make dialectic English so tough to understand for those outside the dialect: set the machines to understand one sort of English, so that everybody has to speak at least that type along with their colloquial speech. Of course, there's always the possibility for eugenic practices with this, so my proposal is this: teach the computer the differences between the 8 vowel sounds used by people in Colorado, where pretty much every vowel approaches the schwa (the schwa being the neutral position for the human vocal system and therefore easiest to pronounce). After a while, people will realise that to be successful at using voice activated systems, they'll need to adjust their inflection, and after a while will adjust it automatically when dealing with people who don't understand them, either.

    But voice activated systems are stupid, anyway...speech is one of the slowest forms of human interaction, and is one of the few we have to actively concentrate on to perform. You know when people say, "Think before you speak?" That's because once you start speaking a large portion of your brain activity is devoted to doing so...it actually becomes harder to think about what to say next. Pressing a button or turning a dial takes practically no thought...which is another reason why a speech written in spontaneous draft still sounds better than one that is spoken aloud. If we convert machines to speech recognition, we're effectively asking people to interact with them in dumber ways. And can you imagine the logic involved with processing a fairly simple statement like "This check in my hand should be processed by you and in return I'd like fifty bucks in tens and ten one dollar bills." Since the command isn't linear, the machine not only has to recognize what each word means, but try and interpret them in queue. And if humans can't construct complicated sentences like the one above -- which any human over the age of about 4 can understand, before that kids can't identify the subject and object in complex sentences -- they'll be inconvenienced by speaking machines. Oh and for a simpler example, try this: "My pin number? 376 uhhhhhh...Forty-two thirteen...aaaaaaaaaaaand...is it six? no. Eight?...oh! oh! sixty eight!" A human can understand that...we'd be annoyed, but we'd get it.

    --
    Hey freaks: now you're ju
    1. Re:Bad idea from a linguistic standpoint by dasmegabyte · · Score: 2

      Well, despite five years of studies in rhetorical science I can still spell "fuck off."

      --
      Hey freaks: now you're ju
    2. Re:Bad idea from a linguistic standpoint by XNormal · · Score: 2

      "If we convert machines to speech recognition, we're effectively asking people to interact with them in dumber ways."

      Uh huh.

      --
      Stop worrying about the risks of nuclear power and start worrying about the risks of not using nuclear power.
  26. Background noises deleted my HDD! by glebite · · Score: 4, Interesting

    How selective would the speech recognition be? If I was playing music on that computer, would the computer pick up the tones coming in and start "doing stuff(tm)" on my computer? What about background noises? My friend's Jello Biafra spoken word CDs?

    I won't even go there with my Saturday Morning Cartoon CD - Eep Opp Ork Ah-Ah (This means mail all of my friends a copy of my resume)...

    --
    I donate all spillover Karma to the charity of my choice... Ada was still a babe despite what people may say...
  27. What I would do... by ch-chuck · · Score: 2

    just for the heck of it, is interface the voice synthesis output of one computer to the voice recognition interface of another and start a transfer of a large text file just to see how long it takes and how accurate it is. I might get about 10-20 bps thru phone line.
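A back-of-the-envelope check on that 10-20 bps guess, as a sketch only. The speaking rate (150 words per minute) and word length (6 characters including the space) are assumed round numbers, and the 1 bit per character figure is the rough order of magnitude of Shannon's entropy estimate for English text, not a measurement of any real synthesizer or recognizer. The function name is hypothetical.

```python
def speech_bits_per_second(words_per_minute, chars_per_word, bits_per_char):
    """Throughput of spoken text under the given (assumed) rates."""
    return words_per_minute * chars_per_word * bits_per_char / 60.0


# Raw transfer: every character counted as a full 8-bit byte.
raw_bps = speech_bits_per_second(150, 6, 8)

# Information content: roughly 1 bit per character of English text.
info_bps = speech_bits_per_second(150, 6, 1)

print("raw 8-bit characters:", raw_bps, "bps")
print("approx. information :", info_bps, "bps")
```

Under these assumptions the raw figure comes to 120 bps and the information-content figure to 15 bps, so the estimate depends heavily on which one is meant, and both ignore recognition errors entirely.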

    If they start standardizing on a vowel command system and people overcome the embarrassment of using it, how long before SharperImage starts selling little boxes that make the same sounds at the push of a button, to, you know, make life even better?

    --
    try { do() || do_not(); } catch (JediException err) { yoda(err); }
  28. Re:Whistle of Command by HiQ · · Score: 2

    I can picture myself working behind my computer, eating cookies (or whatever), and giving the computer a whistled command, and getting up to get a box of tissues to wipe the wet crumbs off my screen. I really don't think it will work...

  29. Re:Typing vs. speech by Asic+Eng · · Score: 3, Interesting
    Any new interface requires some accommodation from the user.

    Ok, that sounds fair, but I guess you'd want to have some sort of benefit after you invest your time?

    I just don't see this sort of interface catching on for standard applications. I mean - imagine you are in an office with 20 people grunting at their computers, the noise they make is just going to be unbearable. That's got to be worse than that annoying guy who's checking his voicemail via speaker phone. *shudder*

    From the article:

    By increasing the pitch of your voice, the scrolling speed increases. When you stop speaking, the scrolling ends.

    Can you imagine sitting next to a guy who uses this, and not have a headache after 10 mins?
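The behaviour quoted from the article (higher pitch scrolls faster, silence stops scrolling) could be sketched as a simple mapping. This is a guess at the idea, not the article's implementation: the function name, the pitch range of 80-400 Hz, the speed range, and the convention that `None` means "no voice detected" are all assumptions.

```python
def scroll_speed(pitch_hz, min_pitch=80.0, max_pitch=400.0,
                 min_speed=2.0, max_speed=30.0):
    """Map a detected voice pitch (Hz) to a scroll speed in lines per second.

    pitch_hz is None when no voice is detected, which stops scrolling.
    Pitches outside the assumed range are clamped to its ends.
    """
    if pitch_hz is None:
        return 0.0
    clamped = max(min_pitch, min(max_pitch, pitch_hz))
    fraction = (clamped - min_pitch) / (max_pitch - min_pitch)
    return min_speed + (max_speed - min_speed) * fraction


# A rising "ahhhh" followed by silence.
for pitch in (100.0, 200.0, 300.0, None):
    print(pitch, "->", scroll_speed(pitch))
```

A linear map is the simplest choice; since pitch is perceived roughly logarithmically, a log-scaled mapping might feel more natural, but that is speculation.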

  30. Re:Why?? by HiQ · · Score: 2

    Hmmm, if you owned radiostation 95.3, all it would take is to buy some advertising time on other radiostations, and just say 'tune 95.3'. Could have some serious fun with that..

  31. Wrong. by Haeleth · · Score: 2, Informative

    The letter 'h' is a letter, which is sometimes used to represent the sound [h], sometimes other sounds, and sometimes is silent.

    The sound [h] is usually considered a consonant.

  32. A potential timeline: by Noer · · Score: 5, Funny

    2020: Computers everywhere are controlled by grunts, moans, sighs, and snorts.

    2040: Computers are finally small enough that they're all embedded into our environments, but neural interfaces don't work, so we still grunt and snort into our computers, but it looks like we're just grunting and snorting in general. People use computers exclusively, and never talk to one another; thus, language is lost and we just grunt and snort a lot.

    2060: aliens visit hoping to find intelligent life, but instead find a bunch of snorting, grunting apes. They leave.

    --
    -- "Those who cast the votes decide nothing. Those who count the votes decide everything." -Joseph Stalin
  33. Turn off PBS. by Happy+Monkey · · Score: 2

    You don't want the Teletubbies on if you've got this setup.

    --
    __
    Do ya feel happy-go-lucky, punk?
  34. Won't work in New England by aredubya74 · · Score: 2, Funny
    Dr Takeo Igarashi, of Brown University suggests the use of "ahhhh" for skipping tracks on a cd, or adjusting tv volume

    As a Boston-area resident, I'd like to suggest that this choice of sound wouldn't work for us:

    "Hey paahl, gahhhttah go pahhk my caah." *CD skips 4 tracks*

    You'd figure the guy works for a New England university, he might've picked up on that. How about "y'all" instead?

    --

    RW

  35. Where's the Python foot? by ellem · · Score: 2

    "He wouldn't have written 'ahhhhh,' to skip tracks on his CD player."

    "Maybe he was dictating."

    --
    This .sig is fake but accurate.
  36. From the manual - step 1: Logging on. by CProgrammer98 · · Score: 2, Funny

    Pick up the mike and say "Waaaaaazzaaaaaaaaaappp"

    --
    And the people shall be oppressed, every one by another, and every one by his neighbour Isaiah 3:5
  37. Won't work in the South either by T1girl · · Score: 2

    We're well known for stretching every vowel into several syllables. "Well" comes out "way-uhl" and a long "I" sounds like "ah." Every time one referred to oneself, the TV or CD would start skipping around.

    "Way-uhl, Ah doan know wut Ah'm gonna do. Mah CD keeps skippin'. Wut are y'all gonna do?"

    Here we are at the peak of the greatest technological revolution the world has ever known, and this guy wants us to go back to communicating with grunts and moans.

    What would Rain-in-the-Face do?

  38. Tourette's GUI by gelfling · · Score: 2, Funny

    Clicks, wheezes, pops, random obscenities. Sounds like the way I interact with my computer NOW!

  39. Sheep by Kpechtunx · · Score: 2, Interesting

    Sounds kind of like how a farmer controls a sheepdog ... - !K

  40. Witch Doctor by sharkey · · Score: 2

    Great, let's get a roomful of people trying to control their PCs, and it'll sound something like this:

    Ooo, Eee, Ooo, Ahh ahh,
    Ting, Tang, Walla walla bing bang.

    --
    "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
  41. Fun with Windows Users by geekguy · · Score: 2, Funny

    Steps to mess with your friends (or enemies)

    1) Install this and set it up so that it starts up when Windows does.

    2) Set a sound to shutdown Windows

    3) Record that sound and set it to play whenever Windows starts or whenever there is an error.

    4) Loop the sound output into the input.

    5) Sit back and enjoy watching them turn on their computer only for it to grunt and turn off on them.

    *note* Don't know if all of this would be possible but I just had to share this thought

    --
    -- Any comments seen here are not mine, but a mixture of alchohol and lack of sleep.
  42. Undo by Fjord · · Score: 3, Interesting

    Great. I'm almost finished my ultra-long /. post and someone ICQs me.

    "Uh oh"

    On another note, I knew a guy who worked with voice rec software where the delete-word command was "oops". Whenever he would watch another person typing and they would typo, he would instinctively say "oops". I'm guessing it's kind of how my writing went bad when I was using Graffiti a lot. You get used to these quirky mannerisms you use to control the machines. Then you end up looking like a dork and annoying the people around you.

    --
    -no broken link
  43. No need for this by Arandir · · Score: 2

    There is no need for this. Voice recognition already works. And it works well. And it already works with REAL words. No need to grunt, squeal or burp into your microphone.

    I first used voice recognition software with OS/2 4.0 on a P100 with 16MB. I was amazed at how well it worked. Of course, 16MB was inadequate for dictation, but even with that puny system I had it trained in half an hour.

    There's a reason that voice recognition hasn't caught on. It's not because it doesn't work. It's because people don't want to talk to their computers. It's embarrassing. It's not convenient. It's awkward to say those commands that computers need, like "arrem minus arref slash star".

    --
    A Government Is a Body of People, Usually Notably Ungoverned
  44. Sex by ZaneMcAuley · · Score: 2, Funny

    And during sex the entire house becomes a party place (lamp on, lamp off, hifi on , hifi off ....) a:D

    --
    ----- Whats wrong with this picture? http://www.revoh.org:1234/whatswrong
  45. Benefits of speech recognition by einhverfr · · Score: 2

    I think that speech recognition as a computer interface would be very powerful for the following reason:

    In general (yes, there are exceptions), GUIs excel at bringing a greater density of information from the computer to the user, while command line technologies are better at delivering a greatly enhanced level of information density from the user to the computer. I remember trying to go from a command line FTP to WS-FTP and going RIGHT BACK because it made "simple" tasks, like downloading a file to a floppy disk under a different name, FAR more complicated.

    The advantage of a speech interface is that theoretically, you have nearly as much information density going to the computer as you do from the command line, and it does not conflict with the GUI.

    Of course this argument also works for X-term...

    --

    LedgerSMB: Open source Accounting/ERP
  46. Great for Slashdotters by afree87 · · Score: 2, Funny

    You could run the voice recognition system as a vital resource, so when the system crashes, you go "[Zarking] [buggering] [smegging] Windows!" and it installs Linux automatically. Good idea, right?