Slashdot Mirror


Siri, Cortana and Google Have Nothing On SoundHound's Speech Recognition

MojoKid writes: Your digital voice assistant app is incompetent. Yes, Siri can give you a list of Italian restaurants in the area, Cortana will happily look up the weather, and Google Now will send a text message, if you ask it to. But compared to Hound, the newest voice search app on the block, all three of the aforementioned assistants might as well be bumbling idiots trying to outwit a fast talking rocket scientist. At its core, Hound is the same type of app — you bark commands or ask questions about any number of topics and it responds intelligently. And quickly. What's different about Hound compared to Siri, Cortana, and Google Now is that it's freakishly fast and understands complex queries that would have the others hunched in the fetal position, thumb in mouth. Check out the demo. It's pretty impressive.

9 of 235 comments

  1. Reasons to be skeptical by sideslash · · Score: 5, Insightful

    1. This demo was likely created by an engineer or salesperson with SoundHound. More impressive would be a demo by a third-party journalist or reviewer without a vested interest.
    2. The impressive speed probably won't scale to the millions of simultaneous users Siri, Google Now, and Cortana support (assuming audio is processed in the cloud, which I admittedly don't know for sure).
    3. Obviously the demo uses phrases that work. I guarantee you an ordinary person will often get "Sorry, I didn't understand the question" or whatever SoundHound's equivalent is.
    4. While it sounds impressive at first blush, nobody really cares how many days it is between next Tuesday and Christmas of 2025. And that happens to be not only useless, but also pretty easy to special-case in your expert system / AI logic. So how about a demo that answers the question: "How can you make a mushroom omelette without soggy mushrooms?"
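
To make point 4 concrete, here is a minimal sketch of how little logic the date question needs, using only Python's standard `datetime` module. The function names are hypothetical, and reading "next Tuesday" as the first Tuesday strictly after today is an assumption; nothing here reflects how SoundHound actually handles the query.

```python
from datetime import date, timedelta

TUESDAY = 1  # date.weekday() numbers Monday as 0, so Tuesday is 1


def next_weekday(today, weekday):
    """Return the first date strictly after `today` that falls on `weekday`."""
    # The "- 1 ... + 1" shift maps a same-day match to 7 days ahead, not 0.
    days_ahead = (weekday - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead)


def days_from_next_tuesday_to_christmas(today, year=2025):
    """Count the days from the Tuesday after `today` to December 25 of `year`."""
    start = next_weekday(today, TUESDAY)
    return (date(year, 12, 25) - start).days


if __name__ == "__main__":
    print(days_from_next_tuesday_to_christmas(date.today()))
```

The arithmetic is a few lines; the hard part of the demo, if it is real, is turning speech into a call like this, which the sketch does not attempt.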

    1. Re:Reasons to be skeptical by TWX · · Score: 5, Insightful

      It's also good to be skeptical if this thing doesn't do all of the work on the handheld device and simply send the parsed text to the search engine or other central server to retrieve only the relevant information.

      I mean, c'mon already! I had Dragon running on a friggin' Macintosh LCII in elementary school! That thing was running System 7.1 on a Motorola 68030 with 4MB RAM. Why can't my multi-Gigahertz smartphone with 64GB storage and 4GB RAM do the basic speech-to-text locally that a 25-year-old Macintosh can?

      --
      Do not look into laser with remaining eye.
    2. Re:Reasons to be skeptical by phantomfive · · Score: 3, Insightful

      BSD FORTUNE #42!
      "Any sufficiently advanced technology is indistinguishable from a rigged demo."

      --
      "First they came for the slanderers and i said nothing."
  2. Wow ... is this real? by golodh · · Score: 3, Insightful
    If this is true (and not a trick) then we've just seen the beginning of the end for human-staffed customer service call-centres.

    Script-reading call-centre staff will be made redundant or downsized.

    Banks, utilities, booking agencies, insurance sales ... all will use automated customer service, perhaps with switch through to a human operator on demand (at which point higher charges will kick in).

    And brace yourself for robotic surveys and sales calls that sound uncannily like real people.

  3. Charming by wonkey_monkey · · Score: 5, Insightful

    Your digital voice assistant app is incompetent. ...bumbling idiots trying to outwit a fast talking rocket scientist. ...
    hunched in the fetal position, thumb in mouth.

    Do you have to be such a douche about it?

    --
    systemd is Roko's Basilisk.
  4. Re:Holy shit by Anonymous Coward · · Score: 2, Insightful

    This is a Dice Holdings property you are referring to... Slashdot is here for pushing certain political buttons (to keep the readership "engaged") and for advertising to this "engaged" readership (to make money). Slashvertising will only get more aggressive as the readership declines in an attempt to make up falling revenues.

  5. Re: Yes, but can it launch Waze by AvitarX · · Score: 1, Insightful

    The e at the end of a word like that (one consonant between it and a vowel) makes the vowel say its name. The z / s sounds essentially the same in was/waze/ways, though perhaps in some areas ways has a softer s. The a in was is very different from the a in waze. Think daze with a w instead of d.

    --
    Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
  6. Re:pretty weak by GameboyRMH · · Score: 3, Insightful

    If that's true, why don't we already have programs that can make sense of human questions like this in text form?

    --
    "When information is power, privacy is freedom" - Jah-Wren Ryel
  7. A more honest title by r1348 · · Score: 5, Insightful

    "Please buy us out!"