Slashdot Mirror


Voice Recognition for a Techie?

kaybee asks: "I am a long-time developer, sysadmin, and general computer junkie (for fun and for work) who needs to seriously curb the usage of his hands. I'm curious as to the current voice recognition options, preferably usable on Linux and Windows. I prefer the command-line to a GUI, I prefer Vim to anything else, and I still read my email with Pine. I'd like to hear options for sending email via voice, which I hope is easy, and I'd love to hear of any solutions that allow effective coding via voice, which seems much more difficult."

7 of 102 comments

  1. Find ways to save typing effort by Anonymous Coward · · Score: 3, Interesting

    Voice recognition is good for letter-writing but bad for overall computer usage, especially in UNIX shell (incl vi and especially Emacs). Picking programs that don't require jumping all over the keyboard for basic tasks can reduce the strain. Same goes for programming syntax: Python is a lot more RSI-friendly than Perl, for example. (IMHO) Write scripts that automate routine tasks, even if it's just one line with lots of regex.

  2. mmmmmmmaudio by MobileTatsu-NJG · · Score: 3, Interesting

    "I'd like to hear options for sending email via voice, which I hope is easy, and I'd love to hear of any solutions that allow effective coding via voice, which seems much more difficult."

    I've wondered about this myself. I tend to use my computer with the headphones on. Often, I'm listening to music or.. well just plain silence, just the standard dings of Windows. I do pay attention, though, to the sounds coming from the computer. (e.g. the traditional hoo-hoo of receiving an email.) I've always wondered about what more could be done with sound to make the user more aware of the goings-on with their computer, especially when a number of apps are actively working. I think I was inspired by an episode of Futurama I caught. One of the characters' personalities was in the Pilot's body. The Pilot, whose personality was in yet another body, was trying to describe how to interact with the ship. I remember him saying "Can you hear that faint little tone? That's the status of..".. or something or other.

    In any event, it's fun to imagine. I wouldn't mind if a soft low-volume voice were to say "You have received an email from: John Smith." I had a job a few years ago where that would have been a nice little feature since messages would come in that required urgent attention. My solution to the problem at the time was to use a custom filter that would specifically notify me of important messages by bringing a little window up to the surface. That was fairly annoying, though, when the computer was busy and it was slow as molasses to get the window to go away.
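
    For what it's worth, the spoken-notification idea is easy to sketch in shell. This is only an illustration under assumptions: the function name announce_sender is made up, it expects one raw message on stdin, and it only builds the sentence. Piping the result into a text-to-speech program (espeak is one possibility, if installed) is left to the reader.

```bash
#!/bin/bash
# Hypothetical helper: read one raw email message on stdin and print
# a sentence announcing the sender named in its From: header.
announce_sender() {
    local line sender=""
    while IFS= read -r line; do
        case $line in
            '') break ;;                     # blank line: headers are over
            [Ff]rom:*)
                sender=${line#*:}            # drop the "From:" prefix
                sender=${sender# }           # drop one leading space
                sender=${sender%%" <"*}      # drop a trailing " <address>" part
                break ;;
        esac
    done
    # Print nothing (and return non-zero) when no From: header was found.
    if [ -n "$sender" ]; then
        printf 'You have received an email from: %s\n' "$sender"
    else
        return 1
    fi
}
```

    A mail filter could call it as `announce_sender < message.txt`, where message.txt is a placeholder for wherever the filter stores the incoming message.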

    --

    "I like to lick butts!" by MobileTatsu-NJG (#32700246) (Score:5, Informative)

  3. OSSRI, VoiceCoder by robocyberdroidbot · · Score: 3, Interesting

    I ran into this problem while working (coding) & trying to do grad school (in Comp Sci). The first point I'd make is, take a rest break (no computer use) for a while if you can. ASR isn't really there yet, & it won't help you with other things you might want to be pain-free for... seriously. That said, there is a group called "Open Source Speech Recognition Initiative" whose mailing list I'm on, but they don't have any product yet. Might get a better answer posting there, though. Or not. There's also a group on Yahoo (I think) called VoiceCoder. That's your best bet right now, although it's all about Dragon NaturallySpeaking & various hacks & kludges to be able to do coding, use Dragon for Linux, etc. Dragon has been reported to run under WINE, but of course YMMV depending on your hardware, versions, etc., etc. Finally, whatever approach you try, expect it to take a good long while before you begin to approach your hand-using productivity. The technology isn't there yet, and even though I know how to improve it, I have no Ph.D. so no one would give me the $$ to do the research that could back up my claim.

  4. A thought for coding via voice: by Ayanami+Rei · · Score: 3, Interesting

    First, find a solution that makes it easy to enter text into a GUI (GNOME accessibility, WINE w/ Dragon NaturallySpeaking, whatever).

    Find a subset of words that are short, easy to remember, easy to say, and above all -- accurately translated by the chosen voice recognition software.

    Then create a small Perl script that can take this coded input and convert it into a nicely formatted chunk of code.

    You can have different translators for different target languages... for example

    In shell programming, you might have the following:

    hash -> #
    bang -> !
    pipe -> |
    test -> [
    end test -> ]
    mark -> '
    quote -> "
    end mark/quote (keeps them balanced for shell scripts)
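
    To make the table concrete, here is a rough sketch of the word-to-symbol step, written in shell rather than Perl. It is not a finished translator: the function names are invented for illustration, only the words listed above are handled, and spacing cleanup (e.g. joining "# !" into "#!") is left to a later formatting pass.

```bash
#!/bin/bash
# Hypothetical sketch of the spoken-word table above.

# Map one spoken word to its symbol; unknown words pass through unchanged.
open_symbol() {
    case $1 in
        hash)  printf '%s' '#' ;;
        bang)  printf '%s' '!' ;;
        pipe)  printf '%s' '|' ;;
        test)  printf '%s' '[' ;;
        mark)  printf '%s' "'" ;;
        quote) printf '%s' '"' ;;
        *)     printf '%s' "$1" ;;
    esac
}

# Map the word that follows "end" to a closing symbol.
# If it is not a known closer, keep both words (e.g. "end if").
close_symbol() {
    case $1 in
        test)  printf '%s' ']' ;;
        mark)  printf '%s' "'" ;;
        quote) printf '%s' '"' ;;
        *)     printf 'end %s' "$1" ;;
    esac
}

# Translate one dictated line, word by word, joining results with spaces.
translate_tokens() {
    local out="" after_end=0 word piece
    set -f                      # no filename expansion while splitting "$1"
    for word in $1; do
        if [ "$after_end" -eq 1 ]; then
            piece=$(close_symbol "$word")
            after_end=0
        elif [ "$word" = end ]; then
            after_end=1         # decide what to emit once the next word is seen
            continue
        else
            piece=$(open_symbol "$word")
        fi
        out="${out:+$out }$piece"
    done
    set +f
    if [ "$after_end" -eq 1 ]; then
        out="${out:+$out }end"  # a trailing "end" with nothing after it
    fi
    printf '%s\n' "$out"
}
```

    For example, `translate_tokens 'if test empty ref local 1 end test'` prints `if [ empty ref local 1 ]`; words such as "empty" and "ref" would need their own rules in a fuller version.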

    For identifiers... don't name them. For example, let's say you wanted to do this:

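    #!/bin/bash
    function hello_lcase {
        HELLO="$1"                  # no spaces around '=' in an assignment
        if [ -z "$HELLO" ] ; then   # quoted so an empty value still tests cleanly
            echo "Hello world"
        else
            echo -n "Hello from "
            # \L and \0 are GNU sed extensions: lower-case the whole match
            echo "$HELLO" | sed -e 's/.*/\L\0/'
        fi
    }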

    you would say:


    hash bang slash bin slash bash
    new function 1
    set local 1 ref in 1
    if test empty ref local 1 end test
    then
    echo string 1
    else
    echo option n string 2
    echo ref local 1 pipe program s e d option e space
    mark s slash dot star slash back upper l back 0 slash end mark
    end if
    end function 1


    you'd run the Perl script and it'd ask you:


    what do you want to call function 1: foo
    what do you want to call local variable 1 in function 1: HELLO
    what do you want to use for string resource 1: Hello World
    what do you want to use for string resource 2: Hello from

    and it'd output the script (maybe after running through indent)

    You could substitute "1" for any easily recalled mnemonic or symbol the speech->text translator is unlikely to mistranslate (in this case "foo" and "hello" would probably be fine as is)
    Then you'd get a chance to globally "refactor" your symbols and give them nice-looking names, only having to type them once.
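
    The renaming step could look something like the sketch below. Everything in it is an assumption for illustration: the `__FUNC_1__` / `__LOCAL_1__` placeholder spelling, the function name rename_placeholder, and the restriction that placeholders and names contain only letters, digits, underscores and spaces (anything sed treats specially would need escaping first).

```bash
#!/bin/bash
# Hypothetical sketch: swap a numbered placeholder for the name the user chose.
# Assumes placeholder and name contain no characters special to sed.
rename_placeholder() {
    local text="$1" placeholder="$2" name="$3"
    printf '%s\n' "$text" | sed "s/$placeholder/$name/g"
}

# Draft as the translator might emit it, before any names are chosen.
draft='function __FUNC_1__ {
    __LOCAL_1__="$1"
    echo "$__LOCAL_1__"
}'

# Answers to "what do you want to call ..." applied one at a time.
draft=$(rename_placeholder "$draft" __FUNC_1__ foo)
draft=$(rename_placeholder "$draft" __LOCAL_1__ HELLO)
printf '%s\n' "$draft"
```

    The trailing underscores in each placeholder keep `__FUNC_1__` from also matching inside `__FUNC_10__`, so every symbol only has to be typed once, at the prompt.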

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  5. More than just the technical aspect... by Myrano · · Score: 2, Interesting

    I just did a presentation on speech recognition software for the Office of Disabilities Services at my school, and since I see that you have a lot of response on the technical aspect of it, I'd like to bring up something else: how speaking to the computer affects *you*. One of the things that most surprised me about using speech recognition is how speaking comes from a different part of the brain than typing. Composition through speech is *very* difficult to start; don't think you're going to just dive in and compose an essay or report right off the top: even if the computer can understand you, you won't be able to coherently phrase your thoughts in a truly professional manner. Speech recognition is at its best when used for email and (ironically, at least I thought) instant messaging, because these forms of communication most accurately mimic speech. I don't know how it's going to affect coding, though, 'cause I wasn't brave enough to try that (but I can only imagine it would be difficult). I just wanted to offer a slightly different perspective on it. It certainly seemed like I was using different neural pathways or *something*, so just remember: as much as you're going to be training the speech recognition profile, you're going to be training yourself, as well!

  6. xvoice by TheRealDamion · · Score: 3, Interesting

    xvoice is a gtk1 X application which uses IBM's ViaVoice engine to provide voice control and dictation support to arbitrary X applications. xvoice.sf.net is the URL. The mailing list mainly covers issues of getting the ViaVoice libs working on modern distributions. The last release of VV was around the glibc2.0/2.1 era and most new ld.so's will struggle to execute the libraries and Java dependencies. It's also fairly hard to buy a copy of VV 2nd hand anywhere and IBM appear to ignore any request to release it.

    However once you get past all of these issues (actually even running the old gtk1 xvoice becomes hard on modern dists), it works a charm. As it's X clean, you can X to any X server, be it one run under OSX or Windows, or a Sun SPARC box. You just need the mic connected to the x86 Linux box the client runs on.

    This meets your requirement for editing in vim etc. The accuracy, I found, was fantastic.

  7. SpeechLion by rbrewer123 · · Score: 2, Interesting

    I have a small project based on sphinx4 that allows command and control of Linux. It is really not ready for primetime yet, but help and feedback are appreciated. I have looked into dictation (for email) with sphinx4 but have not implemented it yet.

    http://freshmeat.net/projects/speechlion