Slashdot Mirror


Using PDAs for Dictation?

SunPin asks: "I'm a writer who is 99% dependent, due to fine-motor disabilities, on voice dictation. I've been a dictation user since 1990. My preference is 'discrete' speech because of very low resource consumption and its effectively infinite flexibility. Over the years, my computer use has devolved to programming, FTP, email (Mozilla), word processing (OpenOffice) and Ricochet. Drop the game and there's nothing that I shouldn't be able to do on the go. The problem is that I can't. Back in 1990, the requirements for IBM VoiceType were: DOS, 8MB RAM, 10MB of drive space with one of those new-fangled scorching 386-16MHz processors... not exactly demanding by today's standards and, unless I'm outright wrong, not demanding by today's PDA standards. Why hasn't it occurred yet?"

"In the disability offices of the hundreds of universities across the US, such software would be a major money saver because not all students need a high-powered laptop. While natural speech is great from a marketing perspective, it is simply impractical for general use and cannot adapt to mildly noisy environments. IBM, L & H and Microsoft have all given me the run-around. IBM refused to entertain the possibility. L & H is on life support, in a deep coma. Only Microsoft had a remotely positive response saying that they were testing natural recognition in Mandarin Chinese in their Beijing research office. Does anyone believe in keeping it simple, anymore?"

17 of 302 comments

  1. Well... by acehole · · Score: 3, Interesting

    This can be put down to a couple of reasons.

    First off, buying a dictaphone is still much cheaper than a PDA with software.

    Secondly, the whole voice/word recognition program market hasn't really made any great leaps or bounds over the past five years, not to mention that it's not popular in the mainstream yet.

    --
    Be you Admins? nay, we are but lusers!
  2. More to do with perception by zanerock · · Score: 5, Interesting

    I think it has more to do with the perception of voice dictation as unreliable and resource intensive than with any actual fact. As the poster points out, it can be done fairly cheaply.

    I have not had much experience, but I think the other thing is that people are averse to any sort of training or teaching required, no matter the long-term dividends.

    Like most things, it comes down not to fact, but to perception and prejudice. Most people base their buying decisions on 30-second spots, not informed research, so the cost of educating people is too high for producers to incur.

    1. Re:More to do with perception by Locutus · · Score: 5, Interesting

      I met some people at COMDEX who have VR (voice recognition) running on the Sharp Zaurus. I've run IBM's VR software and it was pretty good 6 years ago. On the Zaurus, I would imagine that a 256MB CF card could hold a good-sized dictionary, so dictation appears to be possible, especially since this guy was doing it on a 16MHz 386 years ago.

      The ability of the Zaurus to take a MIC input makes a big difference, since a good MIC is important for its noise-cancelling features. All the PDAs with no external MIC option are pretty much useless for VR/dictation.

      LoB

      --
      "Anyone who stands out in the middle of a road looks like roadkill to me." --Linus
  3. It's not just the processor... by gpinzone · · Score: 5, Interesting

    It's the other, most overlooked piece of hardware used in speech recognition: the microphone. The junky headset given away with ViaVoice or the el cheapo unit sold in Radio Shack for under $10 makes most people's experiences with voice recognition software less than favorable. Invest in a $50-$60 professional headset and the ability of the software to accurately detect your speech patterns improves dramatically. How are they going to shoehorn a high-fidelity audio processor in there? A USB headset might be the answer, assuming the device can accept USB devices.

    I'm also going to assume that the current line of speech recognition products are MUCH better than what ran on your old 386.

    1. Re:It's not just the processor... by MobileC · · Score: 2, Interesting

      The main problem is not the microphone.
      It's the microphone circuit on the soundcard.
      My brand new AWE-64 had a crap mic circuit.
      The el-cheapo replacement was excellent.

      --

      Fran
      :):):)
      1st 1st Poster of the new Millennium!

    2. Re:It's not just the processor... by CrazyJoel · · Score: 5, Interesting

      I remember seeing a ViaVoice demo a couple of years ago. The guy doing the demo said they use these headmikes that are actually 2 microphones. One mike faces the mouth, the other faces away. The circuitry then filters out any environmental noise from your voice. Don't know how much they cost though. (I'm sure I could look it up.)
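
      As a rough illustration of the two-microphone idea, here is a minimal Python sketch. It assumes the outward-facing mike picks up exactly the same noise as the mouth-facing one and none of the voice, which real hardware would not manage. The name `cancel_noise` and the test signals are made up for this example, and nothing here is known to match what the ViaVoice headset actually did.

```python
import math


def cancel_noise(primary, reference, gain=1.0):
    """Subtract the reference (noise-only) signal from the primary signal."""
    return [p - gain * r for p, r in zip(primary, reference)]


# Hypothetical test signals: a 5-cycle "voice" tone and a 23-cycle "noise" tone.
N = 100
voice = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]
noise = [0.5 * math.sin(2 * math.pi * 23 * n / N) for n in range(N)]

primary = [v + x for v, x in zip(voice, noise)]  # mike facing the mouth
reference = noise                                # mike facing away (idealized)

cleaned = cancel_noise(primary, reference)

# Largest remaining difference between the cleaned signal and the true voice.
residual = max(abs(c - v) for c, v in zip(cleaned, voice))
print(f"largest residual error: {residual:.2e}")
```

      In practice the two mikes hear the noise at different levels and delays, so plain subtraction like this would presumably need to be replaced by some form of adaptive filtering.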

      --

      Such is the infinite Grace of Popeye.
  4. i hope that voice recognition never really flies by greechneb · · Score: 2, Interesting

    Can you imagine how bad it would be if everyone switched to voice recognition? Cellphones are bad enough, imagine if everyone was talking to their computers. The noise would be terrible. No matter how quiet you are, the noise would still grow rather large. Would you want to dictate something to your computer that is supposed to be private? Not when anyone can hear it. I'm waiting for something better, whatever it might be.

  5. Dependable dictation by t0qer · · Score: 3, Interesting

    Sorry no links....

    There are dictation services available on the net. Basically, you e-mail them an MP3 and they e-mail back a fully typed document.

    As far as the reason for voice recognition not being on a PDA, I think it's space requirements. Of the two packages I've tried (Dragon Dictate and IBM), both require a lot of disk space to contain the recognition engine and your personal voice pattern files. Much more than your average PDA can hold. We're probably only a few years off from PDAs having that type of storage.

  6. Sharp Zaurus + ViaVoice (or Dragon for Linux?) by Victor+Tramp · · Score: 2, Interesting

    just an idea.. it's a handheld Linux based system, so why would this be such a bad idea? hell, while you're at it, install festival, so it can talk back

    yes yes, a scripting nightmare.. perhaps some enterprising programmers could start something on sourceforge or something..

    it's not like the technology isn't out there. It's certainly not perfect; the Zaurus isn't big on storage space, and it's hardly cheap. and of course countless threads on the imperfection of voice recog.. blah blah.. but good enough is a fine answer on the path to [unattainable] perfection.

    Anyway; Keep It Simple, Stupid:

    Zaurus + Microdrive + ViaVoice/Dragon libs [+ festival?] + glueware = handheld voice recognition..

    what's the big deal?

    --
    US$0.02++
  7. Research is underway... by Cyclopedian · · Score: 5, Interesting

    This place at the University of Washington is working on a different model of speech recognition that could be conducive to PDA use (low power, filtering out extraneous info).

    Basically, they are working to analyze speech in slices (phonemes) instead of the more computationally intensive task of analyzing the whole word. This would lead to a higher success rate and could be easily used across multiple accents of the same language (English, engrish, etc).

    I'm excited about what they could accomplish there.

    -Cyc

  8. Cheap Palms by suitti · · Score: 1, Interesting

    My $150 Handspring Visor Platinum has 8 MB RAM, a 33 MHz Dragonball (68000) processor with no cache or FPU. It is claimed to perform at about 5 MIPS, about what a 386/25 could do. It has a microphone, but it is connected to pins for springboard modules only. You have to have a module to use it. The placement of the microphone suggests that it's there so that a cell phone module could be built.

    8 MB RAM should allow recording for over 5 minutes, uncompressed, of 22 kHz 8-bit sound. This is pretty good quality sound.
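
    That estimate can be checked with a little arithmetic. This sketch assumes mono sound, a 22,050 Hz sample rate for "22 kHz", and that the whole 8 MB is free for audio (in practice it would not be). The helper name `recording_seconds` is only for illustration.

```python
def recording_seconds(ram_bytes, sample_rate_hz, bits_per_sample, channels=1):
    """Seconds of uncompressed PCM audio that fit in ram_bytes."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return ram_bytes / bytes_per_second


RAM_BYTES = 8 * 1024 * 1024  # 8 MB, assuming all of it is available for audio
seconds = recording_seconds(RAM_BYTES, 22_050, 8)

print(f"{seconds:.0f} s, about {seconds / 60:.1f} minutes")
# prints "380 s, about 6.3 minutes"
```

    So "over 5 minutes" holds under these assumptions, with a bit of room to spare for the program itself.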

    Given that you need a module for this unit anyway, you might add some hardware beyond just an A/D converter to make speech recognition quicker.

    In any case, there's no reason that it can't be done. In fact, there are Palms with cell phones built in that can dial favorites in the address book using voice commands.

    --
    -- Stephen.
  9. Re:Because by FireballFreddy · · Score: 2, Interesting

    I don't think voice recognition is going to take off much at all, not for the general consumer. I don't think many people want to spend 8 hours a day talking at their computer (or handheld, as the case may be). I imagine it'd leave you pretty hoarse unless the technology got to the point where you could quietly mumble or subvocalize. There is also a certain amount of privacy that comes with a "quiet" input device... you can hack away at the Linux kernel or type a naughty fantasy to your girlfriend and nobody knows the difference unless they look at your screen. Now imagine speaking each of them at work. ;)

    Frankly I don't want the din of dozens of coworkers talking at their computers around me. I'll stick with my qwerty keyboard. And this means those with physical disabilities will be condemned to a corner of the market, getting less attention and, as a result, more expensive and lower-quality products.

    -FF

    --
    SQUEAK, the Death of Rats explained.
  10. Distributed Speech Recognition by kylef · · Score: 3, Interesting

    It is interesting that I JUST did a project on this subject for a Ubiquitous Computing class... My project was called "Distributed Speech Recognition." Here is a link:

    Distributed Speech Recognition Project

    I also have heard it through the grapevine that the big voice recognition companies are working on exactly this technology... I wouldn't be surprised if Speech .NET includes support for something like this in the near future. I believe I read on some website that support for Speech API on PocketPC was coming soon...

  11. We're working on that. by davids-world.com · · Score: 3, Interesting

    Well, in two or three years, you will be able to buy something like that. We're working on it (MIT's Media Lab Europe).

    Until recently, the PDA processors were not good enough, but that is changing rapidly (even though there is, in my view, little use for so much power except language technology).

    The resulting dictation systems will not replace conventional keyboard input for a while, however, as recognition rates are .97-.98 (accuracy), and that's a wrong word in at least every second sentence. In comparison to low-bandwidth input, however (as in the PDA with the stylus, or as in the author's case due to a fine-motor dysfunction), voice recognition is very competitive.
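
    To put the .97-.98 figure in per-sentence terms, here is a small back-of-the-envelope sketch. It assumes each word is misrecognized independently and that a sentence runs about 20 words; both are simplifying assumptions of this example, not numbers from the project, and the function names are made up.

```python
def words_per_error(accuracy):
    """Average number of words dictated per misrecognized word."""
    return 1.0 / (1.0 - accuracy)


def p_sentence_has_error(accuracy, words_per_sentence):
    """Chance that a sentence contains at least one wrong word,
    assuming word errors are independent of each other."""
    return 1.0 - accuracy ** words_per_sentence


WORDS_PER_SENTENCE = 20  # assumed typical sentence length

for accuracy in (0.97, 0.98):
    gap = words_per_error(accuracy)
    p_bad = p_sentence_has_error(accuracy, WORDS_PER_SENTENCE)
    print(f"accuracy {accuracy:.2f}: one wrong word per ~{gap:.0f} words, "
          f"{p_bad:.0%} of sentences affected")
```

    At 97% that is one wrong word roughly every 33 words, and at 98% roughly every 50, which under the 20-word assumption lands close to the "every second sentence" rule of thumb.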

    cheers from dublin.

  12. Re:Because by WatertonMan · · Score: 3, Interesting

    Speech recognition is only a niche market because of the way it is integrated at present. If there were reliable speech recognition on PDAs then I suspect many people would use that instead of the nearly as unreliable handwriting recognition. (I can speak a lot more clearly than I can write.) Further, it would be a boon for businessmen on the go. You could dictate notes and letters while driving, for instance.

    Sometimes niche markets turn out not to be. Just look at a lot of "desktop publishing" software. Back in 1986 that was still largely a niche market. Now it is indispensable for many, many people.

  13. I worked on this at MS by rufusdufus · · Score: 5, Interesting

    I worked on dictation and dialogue on a PDA prototype at MS several years ago. It was called MiPad and was pretty cool. Well except that it really had to use a wireless network to a computer to get the recognition done.

    There are a couple of reasons why this hasn't hit the market yet:
    1) the PDAs really are not powerful enough to do decent recognition. Mainly, they don't have good enough audio input systems for reasonable speech quality. Also not enough disk space for dictionary storage. And the cpus are slow and the RAM is too low.

    2) at least at MS it is not a top priority to make speech work for disabled users. Outrageous you say? Not so! Turns out when the speech guys approached the accessibility guys on the subject, they learned that speech recognition is not workable in most cases where accessibility is needed; that is to say, the market for disabled people who cannot use the keyboard but who CAN use speech input is actually quite small. Most people who don't have the motor function to type (or use some sort of keyed input like Stephen Hawking has) don't have the motor function to speak clearly enough for speech recognition to work. Bottom line: other solutions work better.

  14. Not enough CPU by bluGill · · Score: 3, Interesting

    Sure, a 386 could do voice recognition, but it required a special card that not only had higher-quality sound inputs, but also had some DSPs to do the hard work. When IBM put voice recognition in OS/2 they warned you that a 486 was not enough. (Several people tried it anyway, and it worked only within narrow limits.)

    Emulating a DSP requires a lot of floating-point math. Most PDAs do not have floating point in the CPU because nothing would use it. The few times it is needed, emulation is easy enough, just very slow. No problem though, because as I said, floating-point math isn't much used.
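
    For context, integer-only code commonly sidesteps a missing FPU with fixed-point arithmetic rather than emulated floats. The sketch below shows the Q15 format (15 fractional bits) and a tiny FIR filter using integer operations only. Whether any recognizer of the time worked this way is not stated in this thread, and the helper names are invented for the example.

```python
Q = 15          # number of fractional bits in the Q15 format
ONE = 1 << Q    # the integer 32768 represents 1.0


def to_q15(x):
    """Convert a float to its Q15 integer representation."""
    return int(round(x * ONE))


def q15_mul(a, b):
    """Multiply two Q15 numbers using only an integer multiply and a shift."""
    return (a * b) >> Q


def fir_q15(samples, coeffs):
    """FIR filter: integer samples in, Q15 coefficients, integer samples out."""
    out = []
    for i in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if i - k >= 0:
                acc += samples[i - k] * c   # accumulate at full precision
        out.append(acc >> Q)                # scale back down once per sample
    return out


half = to_q15(0.5)
print(q15_mul(half, half) == to_q15(0.25))

# Two-tap averaging filter applied to a few made-up integer samples.
print(fir_q15([1000, 3000, 5000], [half, half]))
```

    The trade-off is reduced dynamic range and the need to watch for overflow, which is part of why a dedicated DSP was attractive.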

    Don't forget that PDA CPUs are not designed for speed above all else. They are designed for low power, which means they have to compromise somewhere and need extra CPU cycles to get something done.

    Finally, don't forget power requirements. In normal use the CPU is shut down most of the time, drawing essentially no power. Voice recognition would change that, and your battery life would suffer drastically.