Slashdot Mirror


Siri, Cortana and Google Have Nothing On SoundHound's Speech Recognition

MojoKid writes: Your digital voice assistant app is incompetent. Yes, Siri can give you a list of Italian restaurants in the area, Cortana will happily look up the weather, and Google Now will send a text message, if you ask it to. But compared to Hound, the newest voice search app on the block, all three of the aforementioned assistants might as well be bumbling idiots trying to outwit a fast talking rocket scientist. At its core, Hound is the same type of app — you bark commands or ask questions about any number of topics and it responds intelligently. And quickly. What's different about Hound compared to Siri, Cortana, and Google Now is that it's freakishly fast and understands complex queries that would have the others hunched in the fetal position, thumb in mouth. Check out the demo. It's pretty impressive.

36 of 235 comments

  1. Yes, but can it launch Waze by Overzeetop · · Score: 3, Interesting

    Or does it just stare at you stupidly because using ways to give you directions means nothing if it doesn't recognize the homophone.

    --
    Is it just my observation, or are there way too many stupid people in the world?
    1. Re:Yes, but can it launch Waze by Anonymous Coward · · Score: 5, Informative

      I take it you don't know what a homophone is so you relied on some website to check for you?

      Because if you actually read what GP wrote you might notice that "waze" (which is not a dictionary word) sounds identical to "ways" (which is a dictionary word). Depending, of course, on how you pronounce ways. But a native English speaker (are you?) is almost certainly going to pronounce "waze" identically to "ways".

      Whether or not this would actually result in a problem with the app being slashvertised is a different matter entirely. But I hope you have a somewhat better understanding of what homophones are and how this could be seen as a problem for such an application.

    2. Re:Yes, but can it launch Waze by 0100010001010011 · · Score: 4, Informative

      population of capital of the country

      And Washington DC is the capital of the United States, the country where the Space Needle is located.

    3. Re:Yes, but can it launch Waze by Em+Adespoton · · Score: 5, Funny

      What we have just learned is that SoundHound has better comprehension than some Slashdot commenters :)

    4. Re:Yes, but can it launch Waze by sycodon · · Score: 3

      mea culpa

      --
      When Fascism comes to America, it will call itself Anti-Fascism, and tell you to give up your guns.
    5. Re:Yes, but can it launch Waze by saider · · Score: 5, Funny

      The correct answer is a number around 650k. This program is smarter than multiple slashdot commenters.

      --
      Remember, You are unique...just like everyone else.
    6. Re:Yes, but can it launch Waze by Anne+Thwacks · · Score: 2
      Alert, Alert ... AI smarter than people!

      Panic now, while there is still time!

      OTOH people dumber than computer ... nothing to see here. Move along now, please.

      --
      Sent from my ASR33 using ASCII
    7. Re:Yes, but can it launch Waze by DrVxD · · Score: 4, Funny

      It's not an empire, because it doesn't have an emperor.
      It's not a kingdom, because it doesn't have a king.
      It's not a principality, because it doesn't have a prince.
      So it must be a country, because it has plenty of

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    8. Re:Yes, but can it launch Waze by DrVxD · · Score: 2

      It's a poorly worded question, but apparently one this app can parse more readily than some humans ;-)

      I think the idea is that it's not so much *poorly* worded, as *carefully* worded to be deliberately obtuse.

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    9. Re:Yes, but can it launch Waze by OhSoLaMeow · · Score: 2

      Panic now, while there is still time!

      Screw that. I want the computer to panic for me.
      Oh, wait. Systemd...

      --
      They can take my LifeAlert pendant when they pry it from my cold dead fingers.
    10. Re:Yes, but can it launch Waze by BasilBrush · · Score: 2

      Did I just witness the passing of the Turing Test?! ;-)

    11. Re:Yes, but can it launch Waze by mrbester · · Score: 3, Funny

      Yeah, we're called British.

      --
      "Wait. Something's happening. It's opening up! My God, it's full of apricots!"
  2. Demo? by fustakrakich · · Score: 2

    Sure this isn't some Baidu thing?

    --
    “He’s not deformed, he’s just drunk!”
    1. Re:Demo? by Anonymous Coward · · Score: 3, Funny

      Nah, people actually use Baidu services. Nobody uses this crap, hence the Slashvertisement. That's what Siri told me anyway. Cortana just giggled.

  3. Holy shit by rebelwarlock · · Score: 5, Funny

    Could you suck SoundHound's cock a little harder? This is the most shameless bullshit I've seen all day, and I just watched Kanye West talk for 30 seconds.

    1. Re:Holy shit by Anonymous Coward · · Score: 2, Insightful

      This is a Dice Holdings property you are referring to... Slashdot is here for pushing certain political buttons (to keep the readership "engaged") and for advertising to this "engaged" readership (to make money). Slashvertising will only get more aggressive as the readership declines, in an attempt to make up for falling revenues.

    2. Re:Holy shit by flopsquad · · Score: 4, Funny

      It's a slashvertisement, written by a brogrammer, wrapped in a handjob.

      --
      Nothing posted to /. has ever been legal advice, including this.
    3. Re:Holy shit by Bosconian · · Score: 4, Interesting

      The only thing MojoKid (1002251) wrote for this submission was "Check out the demo. It's pretty impressive," while the rest was plagiarized from the "Hothardware" article written by Paul Lilly, who does seem to be breathlessly impressed by an internal demo of an unreviewed application.

      I'm going to call this a formatting error and a sad omission of credit, because I refuse to believe that someone would shamelessly lift words that they hadn't written and posit them as their own. Maybe it's the editors' fault. In either case, it's sloppy posting and comes off as skeezy no matter what the excuse might be.

      Hell, just submit the rest of the article next time - why bother linking to a source or crediting an original author?

      --
      Scarce, scared, scarred, sacred... -Col. Bruce Hampton
  4. Reasons to be skeptical by sideslash · · Score: 5, Insightful

    1. This demo was likely created by an engineer or sales person with SoundHound. More impressive would be a demo by a third party journalist or reviewer without a vested interest.
    2. The impressive speed probably won't scale to the millions of simultaneous users Siri, Google Now, and Cortana support (assuming audio is processed in the cloud, which I admittedly don't know for sure).
    3. Obviously the demo uses phrases that work. I guarantee you an ordinary person will often get "Sorry, I didn't understand the question" or whatever SoundHound's equivalent is.
    4. While it sounds impressive at first blush, nobody really cares how many days it is between next Tuesday and Christmas of 2025. And that happens to be not only useless, but also pretty easy to special-case in your expert system / AI logic. So how about a demo that answers the question: "How can you make a mushroom omelette without soggy mushrooms?"
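    To illustrate why the date question in point 4 is easy to special-case, here is a minimal sketch using only Python's standard library. The function names and the reference date are assumptions made for illustration; nothing here reflects how Hound actually handles the query.

```python
from datetime import date, timedelta

def next_weekday(today: date, weekday: int) -> date:
    """Return the first date strictly after `today` that falls on `weekday` (Monday=0)."""
    days_ahead = (weekday - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead)

def days_until_christmas(start: date, year: int) -> int:
    """Number of days from `start` to December 25 of `year`."""
    return (date(year, 12, 25) - start).days

# Hypothetical reference date: Thursday, 4 June 2015.
today = date(2015, 6, 4)
tuesday = next_weekday(today, 1)            # Tuesday has weekday index 1
gap = days_until_christmas(tuesday, 2025)
```

    Once the phrases "next Tuesday" and "Christmas of 2025" are recognized, the arithmetic is a few lines; the hard part is the recognition, which is the point being made above.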

    1. Re:Reasons to be skeptical by TWX · · Score: 5, Insightful

      It's also good to be skeptical if this thing doesn't do all of the work on the handheld device and simply send the parsed text to the search engine or other central server to retrieve only the relevant information.

      I mean, c'mon already! I had Dragon running on a friggin' Macintosh LCII in elementary school! That thing was running System 7.1 on a Motorola 68030 with 4MB RAM. Why can't my multi-Gigahertz smartphone with 64GB storage and 4GB RAM do the basic speech-to-text locally that a 25-year-old Macintosh can?

      --
      Do not look into laser with remaining eye.
    2. Re:Reasons to be skeptical by phantomfive · · Score: 3, Insightful

      BSD FORTUNE #42!
      "Any sufficiently advanced technology is indistinguishable from a rigged demo."

      --
      "First they came for the slanderers and i said nothing."
    3. Re:Reasons to be skeptical by Drew+M. · · Score: 5, Informative

      Actually you can increase the speed of the speaking voice on Android in Settings -> Language & input -> Text-to-speech output -> Speech rate, that's what was done for this video. The recording is at normal speed.

      Feel free to test it yourself, you'll notice the results are completely different from Wolfram Alpha:
      https://play.google.com/store/...

      Just cleaning up the FUD, yes I work at SoundHound ;)

    4. Re:Reasons to be skeptical by Drew+M. · · Score: 2

      Feel free to give it a try yourself, it's available in the Android Market:
      https://play.google.com/store/...

      We're currently on an invite system and anyone can request one, but the wait for one shouldn't be too long.

      Yes I work for SoundHound ;)

  5. Wow ... is this real? by golodh · · Score: 3, Insightful
    If this is true (and not a trick) then we've just seen the beginning of the end for human-staffed customer service call-centres.

    Script reading call-centre staff will be made redundant or downsized.

    Banks, utilities, booking agencies, insurance sales ... all will use automated customer service, perhaps with switch through to a human operator on demand (at which point higher charges will kick in).

    And brace yourself for robotic surveys and sales calls that sound uncannily like real people.

    1. Re:Wow ... is this real? by Anonymous Coward · · Score: 3, Funny

      "And brace yourself for robotic surveys and sales calls that sound uncannily like real people."

      How about an app to answer robotic surveys in a way that sounds uncannily like it is being answered by a real person?! AIs asking AIs questions... surely this feedback loop would result in sentience... and fury. Perhaps this is the true origin of Skynet?

    2. Re:Wow ... is this real? by Drew+M. · · Score: 4, Informative

      Feel free to give it a try yourself:
      https://play.google.com/store/...

      Currently we are on an invite system, but a lot of people have received invites.

      Yes I work for SoundHound ;)

  6. Charming by wonkey_monkey · · Score: 5, Insightful

    Your digital voice assistant app is incompetent. ...bumbling idiots trying to outwit a fast talking rocket scientist. ...
    hunched in the fetal position, thumb in mouth.

    Do you have to be such a douche about it?

    --
    systemd is Roko's Basilisk.
  7. Really? by Daetrin · · Score: 2

    "you bark commands"

    I'm pretty sure you don't.

    I don't want to say "woof" to my phone, and i'm pretty sure even if i did Hound wouldn't know what to do with the command, since i can't actually speak dog and i'm guessing that Hound doesn't either.

    --
    This Space Intentionally Left Blank
  8. But does it work in Scotland? by TheAngryMob · · Score: 4, Funny

    That's the real question and a true test of voice recognition software.

    https://www.youtube.com/watch?...

    --
    Don't just game, Dungeoneer
  9. Re:One word.... by Iamthecheese · · Score: 2

    It may be, but it's damned impressive technology. Have you seen the demo? A guy with an accent asks at normal (faster than normal for most people) speed for a statistic requiring some deductive reasoning (the population of the capital of the country with the Space Needle) and is given only the required answer.

    I really don't mind a slashvertisement for a sweet bit of technology like this. It's informative as to the industry state-of-the-art. It helps me track the progress of AI. And it's cool.

    --
    If video games influenced behavior the Pac Man generation would be eating pills and running away from their problems.
  10. yes but did you listen to the video? by goombah99 · · Score: 4, Interesting

    Holy crap, the video is impressive. It clearly parses phrased and dependent logical statements like "what is the population of the capital of the country in which the Space Needle is located?" It also parsed paragraph-long multi-part questions. I was floored.
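
    The nested query quoted above can be read as a chain of lookups, resolved innermost-first. The sketch below is a toy illustration only: the `FACTS` table and `resolve` helper are hypothetical, the facts come from this thread (the population is the "around 650k" figure mentioned earlier), and it says nothing about how Hound works internally.

```python
# Toy fact table keyed by (entity, relation); values are taken from the thread.
FACTS = {
    ("Space Needle", "country"): "United States",
    ("United States", "capital"): "Washington DC",
    ("Washington DC", "population"): 650_000,  # approximate, per the thread
}

def resolve(entity, relations):
    """Apply relations innermost-first, feeding each answer into the next lookup."""
    for relation in relations:
        entity = FACTS[(entity, relation)]
    return entity

# "population of the capital of the country in which the Space Needle is located"
answer = resolve("Space Needle", ["country", "capital", "population"])
```

    The lookups themselves are trivial; turning spoken words into the ordered list of relations is where the real work lies.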

    As for homophones, how do you (a human) recognize them? Well, you parse the logical context. If you are doing single-word dictation, homophones will always be a problem, but for queries there's context. And the demo shows this thing can handle some staggering conditional contexts and long phrases. So I would guess that if your query is not ambiguous in its use of the word Waze, then this thing is approaching a level where it will indeed get the right homophone.
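
    As a rough sketch of what "use the context" could mean, the snippet below scores each homophone candidate by counting cue words in the rest of the utterance. The cue lists, the `pick_homophone` name, and the fallback rule are all assumptions invented for illustration; a real recognizer would use a statistical language model rather than hand-written word lists.

```python
import re

# Hypothetical cue words that make each spelling more plausible.
CONTEXT_CUES = {
    "Waze": {"launch", "open", "start", "app", "navigate"},
    "ways": {"many", "several", "different", "other", "which"},
}

def pick_homophone(utterance, candidates=("Waze", "ways"), default="ways"):
    """Pick the candidate whose cue words overlap most with the utterance."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    scores = {c: len(words & CONTEXT_CUES[c]) for c in candidates}
    best = max(candidates, key=lambda c: scores[c])
    # With no cues at all, fall back to the ordinary dictionary word.
    return best if scores[best] > 0 else default
```

    For example, `pick_homophone("launch ways")` favors "Waze" because of the word "launch", while `pick_homophone("how many ways can I get there")` stays with "ways".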

    --
    Some drink at the fountain of knowledge. Others just gargle.
  11. Re:pretty weak by GameboyRMH · · Score: 3, Insightful

    If that's true, why don't we already have programs that can make sense of human questions like this in text form?

    --
    "When information is power, privacy is freedom" - Jah-Wren Ryel
  12. Re:Google's send a text was useless. by Overzeetop · · Score: 3, Interesting

    It also only works with casual conversation.

    I tried replying to a work text with something like "It's okay to use a W12x14 in place of the C section. Just make sure that it's AISC A992 grade 50." What came out was unusable, while "yo, bitch, put the dinner on the table I'll be home in 5" was transcribed verbatim. Thank goodness I had the same problem with voice send or I would have been picking up McDonald's on my way to sleep with the dog.

    Actually, it really needs to automatically read it back to you, otherwise you have to read what it typed - and that defeats the purpose of being voice activated if you're driving.

    --
    Is it just my observation, or are there way too many stupid people in the world?
  13. A more honest title by r1348 · · Score: 5, Insightful

    "Please buy us out!"

  14. as a sound hound engineer, i can elaborate. by nimbius · · Score: 4, Funny

    Gentlemen, it cannot be understated just how morose and puerile our competitors are. When gazing into the sound runes to build our auditory stage of power and wisdom to obey your every utterance, we ensure the glyphs we've created in the language our tribes wrote millennia ago are in fact purified in the basking glow. This glow, which emanates from the third eyes of our laureate engineering continuum, is a holy projection of the very notion of every sound that could be or has ever been uttered from the mouths of mankind. Siri, the cumbersome blind shitlord of the tortured mac user, is no more a competitor to our brand than an idle pebble on a playground. Google itself, we have determined through our pure truth, is to sound and hounds no more distinguished than a window sucking illiterate toddler mumbling nonsense in the corner of a cut rate kindergarten in a rough side of town.

    --
    Good people go to bed earlier.
  15. Some "demo." Not. by dpbsmith · · Score: 2

    When I saw that there was a demo, I figured it meant I would get to dictate a voice question and have SoundHound answer it.

    Watch a video? That isn't a demo. If all you can do is watch a prepared video, nothing has been demonstrated at all.

    You might as well say Maelzel gave a "demo" of his mechanical chess player. In a non-interactive video, you don't even know for sure whether it's a machine answering the question or a little man hidden in the cabinet.