Slashdot Mirror


Api.ai CEO Ilya Gelfenbeyn Talks About Conversational Voice Interfaces (Video)

Api.ai makes an Android voice-controlled utility called Assistant. I have it on my Android phone. It is one of many similar apps, and I have been trying them a little at a time. Are any of them as good as Siri? Let's just say, "Quality varies."

And Android voice assistants aren't the point of this interview, anyway. It's more about the process of developing interactive, voice-based IO systems. This whole voice/response thing is an area that's going to take off any year now -- and has been in that state for several decades -- but may finally be going somewhere, spurred by intense competition between the many companies working in this field, including Ilya's.

32 comments

  1. Complaint by Anonymous Coward · · Score: 0

    Maybe they'll add a transcript, but there isn't one now.

    1. Re:Complaint by HornWumpus · · Score: 1

      They are ironically showing just how shitty voice commands are.

      Would you rather 'tap, tap, tap' or say 'yes'...'Yes'...'YES GODDAMIT'....'no'...'proceed'.

      --
      John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
    2. Re:Complaint by JackieBrown · · Score: 1

      It's there now but you need javascript to see it. I hope to see more articles that reference videos start doing this.

    3. Re:Complaint by Ravaldy · · Score: 1

      There's a place for the tech but it's not meant to replace the keyboard altogether. I honestly see more value in interaction with augmented reality using hands than this stuff. I see engineers work with 3D models every day, and if they could they would plunge their hands into the monitor.

      Data input will always be better on a keyboard and remains far more private than any conversation you may have with your computer software.

    4. Re:Complaint by unrtst · · Score: 1

      Personally, I've found some good uses for it when I got an Amazon Echo. FWIW, this isn't an ad.

      I had expected to use some features much more, and I figured I probably wouldn't end up using it much. I have ended up using it more frequently than I thought, though it's still no more than a few times a day. Some examples:

      * while doing dishes, brushing my teeth, or toweling off from the shower, I might ask it for the weather, if it's going to rain today, what time is it, how long it will take me to get to work, etc (not all those at once... I usually have a general idea about most of them already, but just want to check on one of them).

      * while working at the computer, I might ask it to do something for me... maybe turn off some lights, or play some music, or spell some word, or look up some fact. Those are all things I can do from the computer, but I'm busy typing something else - a quick question like those is easily handled by it at the same time I'm doing other stuff.

      * All my grocery shopping lists are on it now (it has a built in shopping list, and a todo list). This is the thing I use most regularly. Ex. while making coffee in the morning, I run out of milk (or almost run out), and, while I'm still pouring it and putting it away and stirring my coffee, I just say, "alexa, add milk to my shopping list". Done. This has been the killer app for me. If this was all it did, it'd be worth it to me, and I know that sounds completely asinine, but that's why I felt like this was worth sharing :-)

      I'd like to list a few things it's missing, but I recently learned about the new "Skills" feature. "Skills" are something devs/others can make, and then you use them by doing stuff like:
      * Alexa, ask Campbell's Kitchen what's for a recipe
      Alternatively:
      * Alexa, launch
      * ...
      * Stop
      http://lovemyecho.com/2015/10/...

      I note this because as soon as I got this thing, I really wanted it to send me a copy of the transcribed text somewhere so I could pick out bits and sometimes do stuff based on the text I got (like play my personal music collection, or turn on my HTPC and tuner and set the inputs and turn on the projector). However, it looks like that's all quite possible now. There are about 150 Skills though, and I haven't waded through them all.
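A minimal sketch of the "do stuff based on the text I got" idea above, assuming the transcribed text arrives as a plain string. It does not use the Echo or the Skills feature in any way; the rule table, the `route` function, and both action functions are hypothetical names made up for illustration, and the actions only return a description instead of touching real hardware.

```python
import re
from typing import Callable, List, Tuple

Action = Callable[[], str]


def play_music() -> str:
    # Hypothetical stand-in for starting a personal music player.
    return "playing personal music collection"


def start_projector() -> str:
    # Hypothetical stand-in for powering up the HTPC, tuner and projector.
    return "HTPC on, inputs set, projector on"


# Each rule is (keywords that must all appear, action). Rules are checked in order.
RULES: List[Tuple[Tuple[str, ...], Action]] = [
    (("play", "music"), play_music),
    (("projector",), start_projector),
]


def route(transcript: str) -> str:
    """Run the first action whose keywords all occur in the transcript."""
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    for keywords, action in RULES:
        if all(keyword in words for keyword in keywords):
            return action()
    return "no match"
```

Keyword matching like this is crude: it ignores word order and negation, so it is only a starting point for picking bits out of a transcript.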

    5. Re:Complaint by unrtst · · Score: 1

      Hate replying to myself, but just enabled this "Skill": Alexa, ask the bartender, what's in a White Russian?
      When you're drunk and making drinks, do you really want to go rooting around on your computer, or trying to find your bartender app?

  2. Another dork on a webcam by Anonymous Coward · · Score: 0

    News at 11, losers.

  3. These things always suck by Anonymous Coward · · Score: 0

    Remind me why I need one again?

  4. Ah, Microsoft Bob for Microphones. by Anonymous Coward · · Score: 2, Insightful

    Pro-tip, hipsters: people don't need to make stuff more skeuomorphic (or whatever the non-visual equivalent is), because computers are part of the real world now. In particular, just because it was routine to ask humans for stuff in natural language, it doesn't mean it's the most efficient way of getting stuff from computers.

    This is why VR has been just round the corner since the '80s, and strong AI since forever. They're solutions looking for problems.

    (Well, OK, strong AI is a problem looking for a problem - since a silicon-based strong AI has the natural rights of a human.)

    1. Re:Ah, Microsoft Bob for Microphones. by Anonymous Coward · · Score: 0

      Strange logic, because of the hundreds of people I know approximately 99.5% of them are functionally illiterate with computers. Probably at least 35% are functionally illiterate, period.

    2. Re:Ah, Microsoft Bob for Microphones. by JoeMerchant · · Score: 1

      (Well, OK, strong AI is a problem looking for a problem - since a silicon-based strong AI has the natural rights of a human.)

      A bit optimistic for the AI - dolphins, apes, and any number of clearly sentient species may have "natural rights" of a human, but that's not been put into practice by any major government yet.

      When a corporation sinks billions into development of strong AI, it will be treated as property - regardless of any "natural rights" that it may have.

      Personally, I think a created AI should earn their rights, the way women, colored, peasants, and non-land owners have over the past centuries.

    3. Re: Ah, Microsoft Bob for Microphones. by fuzzyfuzzyfungus · · Score: 1

      Hopefully I won't be there when an AI with its hooks in something important says "challenge accepted" to your proposal.

    4. Re: Ah, Microsoft Bob for Microphones. by JoeMerchant · · Score: 1

      If the AI has any strategic thinking at all, it won't be accepting the challenge until it has its hooks into enough important things to win the challenge decisively - coup style. A failed attempt wouldn't be good for either side.

  5. Voice interfaces will fail until... by Anonymous Coward · · Score: 0

    ...they understand context. Understanding the words is nowhere near enough.

    A secretary understands context. I may dictate a novel, and the secretary types. If I get a phone call on the landline, the secretary will understand not to type the phone conversation into the middle of the novel. I can also give orders to change the main character's name to 'Pete' and have it done, with no risk of the command being typed into the text instead. But currently, no voice system understands context.

    An interesting test of voice systems is to write the user's manual using the voice system itself. This will have problem sentences like "To turn off your voice system, say 'voice system: deactivate' ". Obviously, you don't want the system to obey such commands while writing. Again - this is not a problem with a context-aware human secretary, but something AI systems really struggle with.
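The distinction drawn above, between text to be typed and commands to be obeyed, can be sketched as a small state machine. This is only an illustration under strong assumptions: the `Dictation` class and its fields are hypothetical, and the `quoted` and `on_call` flags are supplied by hand here, whereas working them out automatically is exactly the context problem the comment describes. The command prefix is taken from the 'voice system: deactivate' example.

```python
from dataclasses import dataclass, field
from typing import List

COMMAND_PREFIX = "voice system:"  # from the example phrase in the comment


@dataclass
class Dictation:
    """Toy dictation session that keeps commands out of the dictated text."""

    text: List[str] = field(default_factory=list)
    active: bool = True
    on_call: bool = False  # assumed to be set by some phone-state hook

    def hear(self, utterance: str, quoted: bool = False) -> str:
        """Handle one recognized utterance and report what was done with it."""
        if not self.active or self.on_call:
            return "ignored"  # deactivated, or speech belongs to a phone call
        is_command = utterance.lower().startswith(COMMAND_PREFIX)
        if is_command and not quoted:
            command = utterance[len(COMMAND_PREFIX):].strip().lower()
            if command == "deactivate":
                self.active = False
            return "command:" + command
        # Ordinary dictation, or a command phrase being quoted in the text.
        self.text.append(utterance)
        return "typed"
```

With `quoted=True`, the phrase 'voice system: deactivate' lands in the manual instead of switching the system off; without the flag, the same words are executed.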

  6. Backdrop is a distraction. Interviewer terrible. by wjcofkc · · Score: 1

    Why is he being interviewed in a nursery? The background is a distraction as I keep wondering what circumstances led to his being interviewed in a nursery. Also, who is doing the interviewing? One of the biggest problems I have making it more than ten-seconds into these interviews is the person doing the interview. Terrible voice for interviewing. Poor sentence structure and all around word usage. Oh yeah, "Uh" and "Um" are not words. Some time back someone else was conducting an interview and they did a passable job.

    --
    Brought to you by Carl's Junior.
  7. What _I_ want to know is... by jddj · · Score: 1

    ... Will such a voice interface be able to understand or pronounce "Api.ai CEO Ilya Gelfenbeyn"

    1. Re:What _I_ want to know is... by Zero__Kelvin · · Score: 1

      "... Will such a voice interface be able to understand or pronounce "Api.ai CEO Ilya Gelfenbeyn""

      Well if it could then it wouldn't be true AI now, would it ;-)

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
    2. Re:What _I_ want to know is... by Anonymous Coward · · Score: 0

      he'll be winning buzzword bingo when he comes,
      he'll be winning buzzword bingo when he comes,
      he'll be winning buzzword bingo, winning buzzword bingo,
      he'll be winning buzzword bingo when he comes.

      singing ai-ai- api-api-ai. ai-ai- api-api-ai.
      singing ai-ai- api, ai-ai api, ai-ai, api-api-ai.

      he'll be wearing a fedora when he comes.
      he'll be wearing a fedora when he comes.
      he'll be wearing a fedora, wearing a fedora,
      he'll be wearing a fedora when he comes.

      singing ai-ai- api-api-ai. ai-ai- api-api-ai.
      singing ai-ai- api, ai-ai api, ai-ai, api-api-ai.

      he'll be seeking first round funding when he comes,
      he'll be seeking first round funding when he comes,
      he'll be seeking first round funding, seeking first round funding,
      he'll be seeking first round funding when he comes.

      singing ai-ai- api-api-ai. ai-ai- api-api-ai.
      singing ai-ai- api, ai-ai api, ai-ai, api-api-ai.

      he'll be interviewed by slashdot when he comes,
      he'll be interviewed by slashdot when he comes,
      he'll be interviewed by slashdot, interviewed by slashdot,
      he'll be interviewed by dicedot when he comes.

      singing ai-ai- api-api-ai. ai-ai- api-api-ai.
      singing ai-ai- api, ai-ai api, ai-ai, api-api-ai.

      he'll be selling out to Apple when he comes,
      he'll be selling out to Apple when he comes,
      he'll be selling out to Apple, selling out to Apple,
      he'll be selling out to Apple when he comes.

  8. Our trial of it... by Anonymous Coward · · Score: 0

    didn't go so well. Siri is terrible, and api.ai is even worse. Also, we spent about two man years on the setup. Our customers hate it and don't use it, but it demos well so sales loves it. We are gaining customers, and thus income, from having it, so it's still a net positive. It's just a short term sales gain rather than a long term gain.

  9. Dear Ilya, by wjcofkc · · Score: 1

    There are a couple of things on your site you might want to change to make it more... better.

    "I'm in a mood for a comedy."
    Should read:
    "I'm in the mood for a comedy"

    "Show route to the Battery Park."
    Should read:
    "Show the route to Battery Park."

    "Hey Robot, can you clean in the living room now?"
    Should read:
    "Hey Robot, clean the living room."
    After all if it says no we have a big problem ; P

    You should also re-write the "requests processed" counter to at least look variable.

    I'm not picking on you. Constructive criticism is important and so are little details.

    Editors: fix summary link to his companies website. It's not like we all can't figure it out, but it is still unprofessional.

    --
    Brought to you by Carl's Junior.
    1. Re:Dear Ilya, by wonkey_monkey · · Score: 1

      There are a couple of things on your site you might want to change to make it more... better.

      "more... better."
      Should read:
      "better."

      "his companies website"
      Should read:
      "his company's website"

      --
      systemd is Roko's Basilisk.
    2. Re:Dear Ilya, by wjcofkc · · Score: 1

      I set it off with an ellipse to convey irony. The pause is a context clue. You got me on the other one. I knew someone would call me out on something if I didn't proofread. I did not proof read. My pre-selected response? It's a blog post. A bit of a difference there. I would call you a pendant, but the entire context of my post was pedantry.

      --
      Brought to you by Carl's Junior.
    3. Re:Dear Ilya, by wonkey_monkey · · Score: 1

      Can't... tell... if still being... ironic. Gah!

      --
      systemd is Roko's Basilisk.
    4. Re:Dear Ilya, by Zero__Kelvin · · Score: 1

      ""Hey Robot, clean the living room."
      After all if it says no we have a big problem ; P"

      Robot: Not right now, your mom is on the phone and I'll scare the new kitten.

      I'd say if it doesn't know how to say no, then that is where the real problems start.

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
    5. Re:Dear Ilya, by wjcofkc · · Score: 1

      I yield.

      --
      Brought to you by Carl's Junior.
    6. Re:Dear Ilya, by Anonymous Coward · · Score: 0

      Stop working for corporations for FREE. Let them get a spellchecker and hire a human with eyeballs to fix grammatical errors. Do not help them disguise themselves as a helpful product!

  10. Let's just say... by wonkey_monkey · · Score: 1

    Let's just say, "Quality varies."

    Or, instead, you can say something which actually means something. Does it vary from good to excellent? Or from terrible to abysmal?

    --
    systemd is Roko's Basilisk.
  11. Not as good as Siri by Anonymous Coward · · Score: 1

    It is not as good as Siri; it is just as bad. Anything outside a direct, simple question, and she gets her knickers in a twist. No common sense whatsoever, apparently not much in the way of memory of the conversation, and prone to come up with nonsense when it gets lost - which happens very quickly. Yep, just as pathetic as Siri. Another gimmick good for parties and for grins and giggles, but little else.

  12. Shit interview by Anonymous Coward · · Score: 0

    Seriously rob, you suck balls at interviews. Please get someone that isn't you to do this stuff because your work is crap.

  13. Nice! by Anonymous Coward · · Score: 0

    My home has a direct view on their building and its colorful signs. It's nice to know what they do ;)

  14. Limited license - NOT free software - BAD idea by Anonymous Coward · · Score: 0

    The Services are being licensed, not sold. The usage Agreement does not grant any ownership rights to you and gives you only a limited license to use the Services during the term of the Agreement. The Services and all related intellectual property rights, whether under copyright, trade secret, patent, or trademark laws, are owned by Speaktoit and/or its licensors.

  15. If your friends can't... by cwsumner · · Score: 1

    If other people can't understand what you say, what chance does a computer have?

    Worse than a person, not better.

    And probably slower than typing.