Slashdot Mirror


eDigital MXP100 with Voice Control

An anonymous reader writes: "Here is a lengthy review of eDigital's 1GB flash MP3 portable that is as much a review on Lucent's remarkable speech recognition technology VoiceNav as it is on the player. VoiceNav offers speaker-independent recognition, meaning it doesn't have to learn each individual user's particular speech patterns like IBM's ViaVoice. Just say the name of a music track into the player's microphone and VoiceNav pulls up and plays that song. In ideal conditions the reviewer was able to twice run through a list of 14 song titles without fail. This included titles with "non-real word" band names like Sum41 and U2. Neat technology that could make its way into PDAs soon. The player is a pretty good one too, using IBM's Microdrive for storage."

50 of 150 comments

  1. What about Ogg Vorbis support? by Shiny+Metal+S. · · Score: 2

    Have you seen any hardware player of Ogg Vorbis format?

    --

    ~shiny
    WILL HACK FOR $$$

    1. Re:What about Ogg Vorbis support? by radish · · Score: 2


      I believe the standard response whenever this question comes up (once a week or so, it seems) is:

      There cannot be support for Ogg Vorbis in a hardware device until someone writes an integer-only decoder. These units do not have FPUs.

      Of course I don't know if anyone has written such a decoder recently, but I see the same response so often I thought I'd repeat it ;-)

      --

      ---- Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"

    2. Re:What about Ogg Vorbis support? by Shiny+Metal+S. · · Score: 2
      I believe the standard response whenever this question comes up (once a week or so, it seems) is:

      There cannot be support for Ogg Vorbis in a hardware device until someone writes an integer-only decoder. These units do not have FPUs.

      Actually, that's not the standard response. The standard ones are "I won't use Ogg Vorbis, because it's not popular enough" or "I won't use Ogg Vorbis, because I already have so many MP3s". People seem to forget that they can have both MP3 files and Ogg Vorbis files.

      I remember when the best file format available for photos was GIF. Back then, when I digitized a photo, I stored it as a GIF file. But when I first heard about JPEG, I didn't say "it's nice but not popular". I also didn't say "I have lots of GIFs and I don't want to convert them". I just started saving new pictures in JPEG format, leaving the old GIFs alone. I have since converted those old files to PNG because of the problems with Unisys, but I didn't have to; I used old GIFs and new JPEGs side by side for many years.

      So your response is quite unique, in the sense that you're talking about technical aspects. But the lack of an FPU is not such a hard problem.

      When I had a 386SX I was writing programs with floating point operations, but I didn't have an FPU. At the time, I didn't think about it. Later I found out that my C compiler was emulating the floating point instructions using the standard, integer-only 80386 CPU.

      There are generally two ways of using real numbers without an FPU:

      • Emulate the floating point arithmetic, or
      • use fixed point arithmetic.

      There was a time, not so long ago, when almost no one had an FPU. Still, in some areas, integer resolution was not good enough.
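
      To make the fixed point option concrete, here is a minimal sketch in Python. The Q16.16 format and the helper names are assumptions chosen for illustration, not taken from any real decoder; the point is only that every run-time operation uses integers.

```python
# Q16.16 fixed point: a real number x is stored as the integer round(x * 2**16).
# The format and the helper names are illustrative assumptions only.
FRAC_BITS = 16
ONE = 1 << FRAC_BITS


def to_fixed(x: float) -> int:
    """Convert a float to Q16.16 (done ahead of time, e.g. when building tables)."""
    return int(round(x * ONE))


def to_float(f: int) -> float:
    """Convert back to a float, for inspection only."""
    return f / ONE


def fx_add(a: int, b: int) -> int:
    # Addition needs no correction: both operands share the same scale.
    return a + b


def fx_mul(a: int, b: int) -> int:
    # The raw product carries 32 fractional bits; shift back down to 16.
    # On real hardware this needs a 64-bit intermediate to avoid overflow.
    return (a * b) >> FRAC_BITS


def fx_div(a: int, b: int) -> int:
    # Pre-scale the numerator so the quotient keeps 16 fractional bits.
    return (a << FRAC_BITS) // b


half = to_fixed(0.5)
three = to_fixed(3.0)
product = fx_mul(half, three)
```

      Here `to_float(product)` recovers the product of 0.5 and 3.0 using integer multiplies and shifts only. The cost is a fixed range and resolution that the programmer has to choose up front.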
      --

      ~shiny
      WILL HACK FOR $$$

    3. Re:What about Ogg Vorbis support? by radish · · Score: 2


      Point taken, but my response is certainly not unique, I only learned about the problem from a previous answer to the same question posed some time ago on /.

      You are right that in theory you don't need an FPU to do floating point, but emulation (as in your SX) is very slow - try running Q3 with your FPU disabled (if that's even possible) - it'll crawl. So yes, you could run your floating point Ogg codec on the little embedded processor in a Rio or something, provided it (or you) had a library for doing FP emulation (which they don't) and it had sufficient processor power (which they also don't).

      The bottom line is that someone has to write an integer (i.e. fixed point - they are the same thing) version of the Ogg decoder. It can be done, it just hasn't (AFAIK). I'm nowhere near a good enough C coder to try it, so don't ask ;-)

      --

      ---- Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"

  2. OT: Why should it support Ogg? by Mdog · · Score: 2, Insightful

    I think I'm feeding the trolls on this one, but I can't understand why you think a company would spend money on adding support for that format unless it would be a selling point. I grant that mp3 is worse than ogg, but can you honestly say that ogg is big enough in the "real world" for a company to go to the trouble of supporting it? The vast majority of my linux using friends still use mp3, and you can bet almost no one in the windows world uses ogg.

  3. less than ideal conditions? by kithkaddith · · Score: 2, Interesting

    wonder how well it would work on, say, the side of a highway. if it worked well this would be a nice little toy for those of us who run (or bike) around.

    --
    Kith Kaddith Lizard Man Extraordinaire
    1. Re:less than ideal conditions? by oregon · · Score: 2, Insightful

      From the article ...

      Test 2 - Walking outside with occasional traffic passing by. All track names said in proper order. - Result: very good to excellent

      --

      ---
      Oregon
  4. How can it differentiate... by Anonymous Coward · · Score: 2, Interesting

    I guess I have too many obscure mp3s, but how can the voice control differentiate:

    Daydream Boat.mp3
    Day Dreamboat.mp3

    Alpha Betray.mp3
    Alphabet Ray.mp3

    Mont Anagram.mp3
    Montana Gram.mp3
    ...
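
    A rough way to see the collisions, as a sketch only: assume spoken input carries no reliable word boundaries, and use letters as a crude stand-in for sounds. The `spoken_key` helper is hypothetical; a real recognizer works on phonemes, not spelling.

```python
from collections import defaultdict

titles = [
    "Daydream Boat", "Day Dreamboat",
    "Alpha Betray", "Alphabet Ray",
    "Mont Anagram", "Montana Gram",
    "Stairway to Heaven",
]


def spoken_key(title: str) -> str:
    # Crude assumption: word boundaries are inaudible, so drop spaces and case.
    return "".join(title.lower().split())


# Group titles that collapse to the same key.
groups = defaultdict(list)
for t in titles:
    groups[spoken_key(t)].append(t)

# Keep only the keys shared by more than one title.
collisions = {k: v for k, v in groups.items() if len(v) > 1}
```

    Under that assumption each pair above collapses to a single key, so no recognizer could separate them from the audio alone.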

    1. Re:How can it differentiate... by Fjord · · Score: 2

      If it's really that big of a problem, you could always rename/retag the files to be more different.

      --
      -no broken link
  5. No Star Trek by fm6 · · Score: 2
    Bearing in mind that speech recognition is not yet the equivalent of the chatty computer on TV's Star Trek...
    Let us be grateful for small favors!
    1. Re:No Star Trek by fm6 · · Score: 2
      it's always puzzled me, why does Data have to ask the ship for things? Why doesn't he have a tricorder built into him?

      It's like "we need to make the equipment purposely braindead or else the viewers won't know wtf is going on."

      You've answered your own question. Bad TV shows don't trust their audiences to figure things out. So the characters waste a lot of time telling each other things they already know, but the audience might not. That's also why characters tend to be stereotypes. (Hence the "teaching black actors to act black" scene in Hollywood Shuffle.) Picard is brave-but-awkward, Worf is absurdly obsessed with "honor", yada yada.
  6. Re:Filters by d5w · · Score: 3, Informative
    Does the voice recognition filter itself out? When U2 sings "one" I don't necessarily want it switching to Aimee Mann's "one" and vice versa

    From the review:

    Navigation using VoiceNav only operates when a song is not playing (manual controls will allow navigation when a tune is pumping), therefore there is no "Stop" or "Pause" command.
    So they punted on that problem.

    On another front, it looks like "one" isn't likely to produce useful responses from the speech recognition in any case. The only times the reviewer seems to have gotten acceptable recognition of track names were when saying the entire artist and title.

  7. Let's hope there's no RIAA back door . . . by base3 · · Score: 2, Interesting

    . . . otherwise, there'll be a special broadcast on radio, cable, and embedded in trojan MP3s one day. It'll be Jack Valenti's voice saying "Don't play non-SDMI compliant content anymore." :).

    --
    One CPU cycle wasted on digital restrictions management is ONE TOO MANY.
  8. Re:Cool technology by mshomphe · · Score: 2

    You can get a domain-independent NL parser from www.sil.org, the PC PATR II parser. You may have to write a few grammar rules...

    --
    She sat at the window watching the evening invade the avenue.
  9. Even worse than cell phone by Ezubaric · · Score: 4, Funny

    When I can't get voice rec to work, I usually end up speaking louder because the frustration is just too much. It's bad enough listening to people yapping down the street or in stores with those little embedded mikes and earphones. Can you imagine hordes of people walking down the street screaming:

    "Uncle Fucker"
    "Baby Got Back"
    "Cocaine"
    "Copacabana"

    The last is probably worst of all. We know Barry exists, but it's horrible to be reminded that people actually listen to him.

    --

    ----------
    I am an expert in electricity. My father held the chair of applied electricity at the state prison.
    1. Re:Even worse than cell phone by Bilestoad · · Score: 3, Funny

      Even better - imagine being able to sneak up on people with one of these, and saying

      "Kenny G"

    2. Re:Even worse than cell phone by majcher · · Score: 2

      "Uncle Fucker"
      "Baby Got Back"
      "Cocaine"
      "Copacabana"

      I don't know where you live, but on the streets of New York and San Francisco, you're lucky if that's all the people on the street are muttering to themselves...

  10. Re:Voice Recognition by d5w · · Score: 4, Insightful
    Voice recognition on computers has been around for a while now with products like Dragon, Via Voice, etc. All of these programs are clunky, somewhat bloated, and need to be trained to individual speakers. A truly speaker-independent voice recognition system could be just what the doctor ordered for Lucent.
    This kind of thing comes up every time speech recognition is mentioned here, and it's largely missing the point. Desktop speech recognition, as handled by Dragon NaturallySpeaking, is a very different problem from simple commands and list selection, and it has very different solutions. If you have to recognize and transcribe arbitrary sentences in a given language you have to handle a much larger search space in basically every dimension -- so much larger that the optimal search techniques can be very different, and (as in your comment) the resources required to implement those techniques will be incomparable.

    I won't say the problems are fundamentally different, because the fundamentals are much the same between the two domains; but nearly every detail of the implementation of those fundamentals is likely to be different.

  11. Just to clear things up by Anonymous Coward · · Score: 3, Insightful

    IBM's voice recognition line extends past ViaVoice. We offer several products, including an embedded product, that do not require any training. Only the highest end dictation product requires training because of the demands on it to understand what you just said from tens of thousands of words. If all you can say is a hundred or so phrases like "play", "stop", "rewind", "livin' la vida loca", etc. then it's a lot easier to make a prediction and training is a waste of time. At that point it's just a matter of microphone quality and filtering out the background noise. We can even do untrained natural language voice recognition in situations like this with the proper processor power. Since we know what you're by and large going to say, we can pick out enough from the whole free-form sentence to get the gist of what you meant without any training.

    And believe me we're getting to the point where training isn't needed for dictation either :)

  12. Playlist with 14 entries isn't enough by hovik · · Score: 5, Informative
    In ideal conditions the reviewer was able to twice run through a list of 14 song titles without fail.

    This doesn't mean much. Picking the correct one out of only 14 possibilities is quite easy. The reviewer should rather have tried a playlist with more than 3000 entries. The error rate will grow exponentially with the number of songs, because statistically more songs will be phonetically more equal, the more you add. (bad way to say it, but you prob get the point)
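
    One way to put numbers on that intuition, as a sketch: count how many distinct pairs of titles there are to keep apart. This says nothing about VoiceNav's actual error rate, and the pair count grows with the square of the list size rather than exponentially, but it shows how quickly the chances for a near-collision pile up.

```python
def confusable_pairs(n: int) -> int:
    """Number of distinct title pairs in a playlist of n entries: n choose 2."""
    return n * (n - 1) // 2


# Playlist sizes mentioned in this thread: the review's 14, roughly 500, and 3000.
for n in (14, 500, 3000):
    print(n, confusable_pairs(n))
```

    Going from 14 entries to 3000 multiplies the list size by about 200 but the number of pairs by tens of thousands.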
    1. Re:Playlist with 14 entries isn't enough by sean23007 · · Score: 2, Insightful

      The error rate will grow exponentially with the number of songs, because statistically more songs will be phonetically more equal, the more you add. (bad way to say it, but you prob get the point)

      See sig. Wow.

      --

      Lack of eloquence does not denote lack of intelligence, though they often coincide.
    2. Re:Playlist with 14 entries isn't enough by radish · · Score: 2

      Or maybe it means he went through a list of 14 random selections from a play list of 500? Sounds like a pretty good test to me.

      --

      ---- Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"

  13. Ogg Vorbis by autopr0n · · Score: 2

    Ogg is just the name of the, uh, 'group' doing the work. The actual audio format is called Ogg Vorbis, in contrast with Ogg Tarkin, their proposed video codec.

    So your syllable count is really incorrect :P

    --
    autopr0n is like, down and stuff.
  14. Moving parts by DodgyGeezer · · Score: 2, Interesting

    For me, the biggest attraction of MP3 players is the ability to have no moving parts. This makes them truly portable and useful in more situations than what we had previously. So, my question is, how reliable is this IBM Microdrive? How robust is it? If I'm training to run a marathon, is it going to survive all of the pounding?

    1. Re:Moving parts by garcia · · Score: 2

      b/c the microdrive is not really FLASH (it is an actual HD in there), it would probably not survive the pounding that running gives...

      In a post a while back someone mentioned that they dropped their Microdrive from a height of about 4 ft onto a carpeted floor and it never worked again. I would suspect that long-term pounding from running would have much the same effect.

    2. Re:Moving parts by jovlinger · · Score: 2

      Interesting. I would have thought the smaller the drive the more able it was to withstand acceleration (what with the mass shrinking as the cube, but torsional strength as the square, of the feature size).

      Any experience with the toshibas that Apple uses?

  15. Style. by autopr0n · · Score: 2

    Hrm, the thing doesn't look quite as cool as the iPod. Not that I don't hate Apple or anything, but there don't seem to be a lot of players out there that have both a high capacity and esthetic styling approaching or surpassing the iPod. There are some cool looking mp3 players, and there are some that are better technically than the iPod. But unfortunately, they don't seem to be in the same group. (of course, given the price you could just get a real PDA that can play mp3s for about $100 more...)

    Personally, I doubt the voice nav in the current system is really that great, especially since you have to manually stop the music in order to use it. Of course with 200 or so songs it might come in handy (if it scales that well).

    --
    autopr0n is like, down and stuff.
  16. Hype Company by Anonymous Coward · · Score: 4, Informative

    edigital has a long history of using hype and grossly misleading tactics to, IMO, defraud investors. So far they've lost tens of millions of dollars, and recently had to resort to taking a loan at a 49% interest rate just to stay in business. Even the CEO has referred to the investors as a "cult".

    As for their history with their products, their much-hyped Treo barely sold any units in stores, and is now being sold by liquidators on ebay. A lot of customers were a bit pissed that their players didn't come with any storage media!

    This wasn't intended as flamebait, but E.digital has a long history of using hype and misleading tactics to pursue little more than an incursion of investment money from gullible public investors. I didn't lose any money to them, but a lot of people did, and will continue to.

    In fact, they recently registered 20 million more shares so they can stay in business a while longer. They really don't deserve this kind of attention from Slashdot.

    For those considering investing in them, I'd say stay away. For those considering a product purchase, I'd recommend the same.

  17. Do we really need voice control? by asv108 · · Score: 2
    Voice navigation systems are cool and they definitely have a "gee whiz" factor, but are they really useful? Sure they have a very short learning curve, but people tend to use alternative navigation methods after using the product for a while. I remember having voice nav way back in 93 with the Sound Blaster AWE32. That was really cool back then, but nobody actually used it. Sure, voice nav on the computer is much more reliable now via products such as ViaVoice and Dragon, but neither of those products is nearly as fast for a mildly experienced user as point and click or, especially, keyboard shortcuts.

    I have a lot of friends who have Sprint phones with voice nav. They all used it for the first week because it was "cool" but after a while, they went back to traditional methods. Another example is my father; he got the 02 Infiniti Q45 which has loads of tech toys built in. The voice nav is really cool but it's not nearly as fast as clicking a button.

    1. Re:Do we really need voice control? by Tazzy531 · · Score: 2

      The only thing I could see is that with up to 1 gig of MP3s..that's approximately 500 songs. It might be difficult to scroll through the list to find a particular song using those tiny buttons. Also, if you were driving or walking or doing something else, you don't want to have to keep looking down to change songs. But you're right, for the most part, consumers either love it or look at it as a fad.

      --


      _______________________________
      "I'm not Conceited...I'm just a realist..."
    2. Re:Do we really need voice control? by d5w · · Score: 2
      Voice navigation systems are cool and they definitely have a "gee whiz" factor, but are they really useful?
      Yes! Yes! Everyone needs speech recognition! Everywhere!

      Oh, wait, I don't work in that business anymore. Never mind.

      That said,

      It might be difficult to scroll through the list to find a particular song using those tiny buttons
      List selection is one of the areas where speech recognition can really shine. The recognition task is usually fairly easy (or, in the case of phonetic ambiguity, impossible), and it fills a real gap in the other available interfaces. On the down side, though, when it goes wrong it's a pain to correct a mistake. "No, I meant that other one of the six thousand items in the list."
  18. I was wondering when they'd come out with this by flacco · · Score: 3, Funny

    ...only I pictured it with the ability to retrieve a song by just singing a bit of it or speaking some lyrics.

    --
    pr0n - keeping monitor glass spotless since 1981.
    1. Re:I was wondering when they'd come out with this by flacco · · Score: 2

      hmm, why was this moderated "funny"? I'm serious - how many times have you had a tune in your head and you've been unsure of the title?

      --
      pr0n - keeping monitor glass spotless since 1981.
  19. Re:I wonder how it would handle japanese songs.. by 90XDoubleSide · · Score: 2
    i have yet to see a player (just me here) that supports kanji...

    You're telling me that you read Slashdot and you've never heard of the iPod?!?

    --
    "Reality is just a convenient measure of complexity" -Alvy Ray Smith
  20. needs to use CDs and support ogg by ukyoCE · · Score: 2

    If this thing ran off CDs and supported Ogg Vorbis I would buy it in an instant. As it is, I'm forced to drool over the spiffy voice recognition and keep waiting...

  21. No No No and No again by Kirkoff · · Score: 2

    It's tempting, but I won't go for it. I'm too much of a They Might Be Giants fan. I can see it now, sitting there in a public area with some weird looking device in my hand:
    "PUT YOUR HAND INSIDE THE PUPPET HEAD!"
    "...NO!" Someone speaks to me "Are you OK?"
    "Yeah Yeah," Yeh Yeh starts playing. "Ahh!"
    "DIG MY GRAVE"
    "Sir, are you sure you're alright?" [stopping]
    "Yeah, fine." suddenly person A asks person B for a light. "I've got a match."
    The thing starts playing again. Just then a Dirt Bike whizzes by and someone says "Man, that's a fast Dirt Bike." Guess what song starts playing. Then I stop it so I can play "I AM A HUMAN HEAD!" again getting more stares.

    Then what if I want to hear Chuck Berry? "MY DINGALING" *SMACK*

    No, for me this is nothing but trouble...

    --Josh

    --
    There are exactly 42,935,718 letter sized sheets in a square mile.
  22. If you don't mind better quality... by l1nuxhax0r · · Score: 2, Informative

    you might be interested in the fact that this has already been done

    1. Re:If you don't mind better quality... by ZxCv · · Score: 2

      Granted, I've never used an iPod, but I'd be curious as to how easy it is to upgrade the harddrive in it. With this thing, just pop in a new Microdrive. The most interesting thing about this new one, though, is the voice recognition. Does the iPod have it?

      --

      Perl - $Just @when->$you ${thought} s/yn/tax/ &couldn\'t %get $worse;
  23. It's the money, stupid by squarooticus · · Score: 2, Interesting

    The only reason we haven't seen OGG Vorbis support on solid state players is that they would only lose money by doing so, at least for now. This is coming from someone who encodes all of his own CD's as .ogg's.

    Alas, I wish there were some incentive for player manufacturers to add the support. There are two ways I can see for this to happen:

    (a) Make adding it as trivial as possible. If adding .ogg support required only a few days of extra development time, you'd see it.

    (b) Increase the market share that OGG Vorbis has. This one is trickier, mainly because of the slim market that a good, lossy codec serves. What do I mean? Well, audiophiles aren't going to want to listen to any compressed format (though these dinosaurs claim their hissy records are better-sounding than Super Audio CD), and Joe Sixpack isn't going to notice any difference at all between .mp3 and .ogg.

    Having done numerous sound quality tests of OGG Vorbis and MP3 on my own equipment, I can say without a doubt that were all things considered equal, OGG would win out. Unfortunately, OGG has had a very late start, and is up against lots of other competitors who are all "good enough" for the average person, so its supporters will have to reduce the barriers to its use before anyone will care.

    --
    [ home ]
  24. But what if... by zhensel · · Score: 2

    What if someone tries queuing up their favorite track from The Faint's Danse Macabre?

    1. Re:But what if... by zhensel · · Score: 2

      You obviously haven't seen the back cover of the faint's album :)

      I tried desperately to find an image, but no luck. They're not French either, that is unless there's a lot of new-new-wave-indie-rock-frenchmen in Omaha.

      I do, however, concur that some of Aphex Twin's tracks might be a bit harder to pronounce.

  25. Re:It's the libraries, stupid by mike_g · · Score: 2, Interesting

    The only reason we haven't seen OGG Vorbis support on solid state players is that they would only lose money by doing so, at least for now. This is coming from someone who encodes all of his own CD's as .ogg's.

    Actually I think that the only thing stopping OGG Vorbis on hardware players is the lack of a free fixed point decoding library. Right now you can find free floating point decoding libraries, but not fixed point. Most of the processors used in hardware players do not support floating point operations. The CPU's only have an integer unit. When a fixed point library is released, I think that you will find Ogg supported everywhere that MP3 is, since it should be trivial to add, and will only take up a little more ROM.

  26. New tech gets it first. by Restil · · Score: 2

    Portable MP3 players of all things get the voice tech first. Why? Same with phones. The cell phones have the voice recognition, but if there are POTS phones that have it, they aren't exactly making commercials about it (not that I watch TV anyways)

    This feature would be no less useful on a desktop. It's definitely ideal for a small portable unit where working with a tiny display screen and buttons to switch between a large selection of songs can be tedious. However, being able to swap songs by simply speaking to your computer without forcing yourself to do a task switch could be helpful as well. Certainly, the 10-20 seconds you spend doing so isn't significant by itself, but this does add up over time. It's all about productivity, people!

    MP3 players are pioneering the way in other areas as well. Other than perhaps digital cameras, they provide a market for flash memory. And getting realtime playback, and hopefully soon widespread use of unrestricted realtime mp3 encoding for these units, will enhance their use beyond the simple playback of music. And of course, don't forget, anything that pisses off the RIAA is a good thing. :)

    -Restil

    --
    Play with my webcams and lights here
  27. Voice = especially good for mp3.. but the accents? by wackybrit · · Score: 2

    Because of the number of songs mp3 allows us to carry around, indexing the songs we have with us is a tricky thing. There are numerous indexing methods on MP3 players at the moment.. playlists on the iPod, simple numeric 'album' jumps on MP3-CD players, search facilities on in-car units etc.. but voice definitely simplifies matters.

    However, I spy a problem. Even if it doesn't require training to recognise a voice, I bet it's still limited to a subset of accents.

    You notice it with voice-recognition computer programs here in the UK. You speak normally and it rarely works.. put on the dullest most monotone American-style accent you can, and hey presto, up and running!

    So, to get one of these, is a prerequisite that I practice my 'dull American drone'?

  28. My eDigital pain... don't let this happen to you. by jbuilder · · Score: 2

    I might get modded down for this, but eDigital has just left a bad taste in my mouth..... And I wanted to share... ;)

    I personally see this as being *on* topic because before you buy something from eDigital let me tell you what you *might* just be in for.

    I'll do a condensed version of my story and just say "don't let this happen to you". I got a Treo 10 MP3 Jukebox from http://www.treoplayer.com for an xmas present. I'll be looking for a new xmas present.

    My Treo 10 was basically D.O.A. the unit's harddrive would lock up during playback.

    It took me *one month* to get an RMA number.

    When I *finally did get* the RMA number and sent the unit back I was to "promptly have a new unit sent" to me.

    This didn't happen. The Treo 10 is on back order and no replacements will be sent out until *APRIL*. Like I'm going to wait three months for a replacement.

    SO, I demanded a full refund. Their main support center said 'OK'. I got my credit email today and was told they were going to keep 15% for a "restocking fee" (?!?!?).

    So, I called -- again -- raised hell, and am finally getting a full refund.

    During this time, I went back to doing realtime recording of MP3's using my Sony MZR-900 (minidisc Walkman) and my digital soundcard. What I found was that the sound quality of my MP3's coming off my computer and onto my MD Walkman was *better* sounding than anything coming out of the Treo 10. I guess there's something to be said for Sony's D/A chips. I also re-discovered how convenient the MD Walkman is. It, and 3 Minidiscs easily fit in my coat pocket. I also have more than enough battery power to get through the day at the office.... And MDLP 4 mode is certainly livable enough for my needs. Hell it *still* sounds better than a cassette tape walkman if you ask me and I can 'boost' highs and lows to compensate for the sound loss during compression via WinAMP if I need to.

    So that's it. No more MP3 jukebox BS for me. I'll stick to what works. And if you *do* decide to get an MP3 juke box - avoid eDigital like the PLAGUE! Their customer service is horrible and their product when it *does* work is only of passable sound quality.

    --
    Polymorphism -- It's what you make of it.
  29. Open Source speech recognition? by Ogerman · · Score: 2

    In one form or another, speech recognition is going to be used more and more in the future, perhaps especially with handheld devices and tablet PC's. So, in light of this, who is working on Open Source speech recognition? I'm aware of CMU's Sphinx project, but last I saw it was quite obsolete technologically compared to commercial offerings. Is there any other Open Source'd work being done with cutting edge SR techniques?

  30. /. story is inaccurate by acoustix · · Score: 2, Informative

    Nowhere in the actual article does it say that it uses "1GB flash" cards. However, the IBM Microdrive does store that much data (340 MB, 512 MB or 1GB).

    As far as I know the "SanDisk-compatible CompactFlash(TM) Cards" max out at 128 MB.

    They might want to update the article seeing how it may get some people's hopes up.

    --
    "A plan fiendishly clever in its intricacies"- Homer Simpson
  31. Voice recog? by Sj0 · · Score: 2

    I hope voice recog is better than the last time I used it!

    Trying to load stairway to heaven:

    "Stairway...delete that...Stairway...delete that...no! Delete that!...Shit...delete that...delete that...delete that... Stairway...to...delete that...to...delete that...to...delete that...to...heaven...delete that...heaven...delete that...heaven...delete that...heaven...play...delete that...play...delete that...play...delete that...play...delete that..."

    :)

    I hate voice recognition.

    --
    It's been a long time.
  32. Re:Voice Recognition by marphod · · Score: 2, Informative

    Actually, I work in this field.

    Dragon, ViaVoice, etc. are dictation recognizers. They work by analyzing the speech data, and attempting to do phoneme matching to generate words, from a huge dictionary, and then do word matching.

    This isn't an overly exciting model for different reasons. Large vocabulary recognizers have been around for 8-10 years. Nuance, SpeechWorks, Philips, and Temic end up being the big four in this market, although there is also a large vocabulary implementation of ViaVoice and others.

    These products take a fixed grammar set, compile it in a speaker-independent manner, and can be used to recognize the compiled grammar. Without getting overly technical, it is a very different speech recognition method than the dictation recognizers, as they aren't trying to recognize everything out of a dictionary, but simply out of what the known grammar is. The flexibility in how the user can phrase the requests is small, but for relatively simple tasks, it's a fine trade off.
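
    As a toy illustration of choosing from a fixed set rather than an open dictionary, here is a sketch in Python. It uses text similarity from the standard `difflib` module as a stand-in for acoustic scoring; the `GRAMMAR` list, the `recognize` helper, and the cutoff value are all assumptions for the example, not how any of the products above work internally.

```python
import difflib

# Hypothetical fixed "grammar": the only phrases the device has to tell apart.
GRAMMAR = [
    "play",
    "stop",
    "rewind",
    "stairway to heaven",
    "baby got back",
    "cocaine",
]


def recognize(heard: str, cutoff: float = 0.6):
    """Return the closest phrase in GRAMMAR, or None if nothing is close enough."""
    matches = difflib.get_close_matches(heard.lower(), GRAMMAR, n=1, cutoff=cutoff)
    return matches[0] if matches else None


best = recognize("stairway to heven")   # slightly garbled input
rejected = recognize("xyzzy qqq")       # nothing in the grammar resembles this
```

    Because the candidate set is tiny, a slightly garbled input still lands on the intended phrase, while input that resembles nothing in the list is rejected instead of being forced to a guess.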

    Look at SprintPCS's VoiceCommand for example. (I was one of the writers of the product -- not the handset based recognition, but the serverside voice activated dialing solution). The idea is very similar, but we handle the concept a little differently.

    This type of device is just waiting to happen. With VoiceXML, designing tools like this will be standardized, but it's not anything new, just a use of existing technology.

  33. Re:Voice Recognition by d5w · · Score: 2
    Large vocabulary recognizers have been around for 8-10 years. Nuance, SpeechWorks, Philips, and Temic end up being the big four in this market, although there is also a large vocabulary implementation of ViaVoice and others.
    Um... You meant "small", didn't you?