Voice Is the Next Big Platform, But Amazon Already Owns It (backchannel.com)
Six million homes already have an Amazon device with its Alexa voice assistant -- about 5% of all households. But Backchannel argues that Amazon is already dominating the race to become the operating system for future voice-activated devices, with Forrester tech analyst James McQuivey pointing out that "having microphones in your environment is a lot more convenient than pulling out your phone."
The Alexa-enabled Echo is a true unicorn, one of those rare products that arrives every few years and fundamentally changes the way we live... After years of false starts, voice interface will finally creep into the mainstream as more people purchase voice-enabled speakers and other gadgets, and as the tech that powers voice starts to improve.
Despite competition from Google Home, and a rumored "Home Hub" from Microsoft, Amazon "has a two-year jump on its competition, having first introduced the Echo speaker in November 2014," notes the article, adding that Amazon also "opened its platform early to third-party developers." (Alexa now has more than 5,000 "skills".) They argue that Amazon is already winning the war of the operating systems by familiarizing consumers with "a new computing interface -- a voice devoid of a screen -- that will eventually grow to be more ubiquitous and more useful than our smartphones... Soon, you'll speak your wants into the air -- anywhere -- and a woman's warm voice with a mid-Atlantic accent will talk back to you, ready to fulfill your commands."
While in general I like the idea of a woman fulfilling my every command, I'm not sure it's worth it if she's constantly keeping tabs on me.
Some remote server listening in on everything I say, filtering every word, analyzing each sentence, etc.
Say one wrong thing, and the appropriate authorities are automatically informed and dispatched. Tax evasion? IRS shows up at your door. Diesel fuel and fertilizer? FBI. Feel like killing your manager who's been driving you nuts all week? Local police.
Sign me up.
I don't understand why none of this stuff operates locally. It's always some remote server in the cloud. I remember having IBM ViaVoice (back then I think it was called "Voice Type Dictation" or "SimplySpeaking") on my goddam Pentium 75mhz computer. After about an hour of training, it would nail mostly everything I said. I find it incredibly difficult to believe that we don't have the hardware resources necessary to perform local speech-to-text and text processing inside your house without ever touching the internet.
Six million Alexa installs... compared to?
A billion Apple devices with Siri... http://www.theverge.com/2016/1...
Uh, who owns it again?
I have no desire to talk to my devices and I definitely don't want them listening to me either.
I spent about 5 minutes playing with Okay Google on my phone and it wasn't very good and about 6 months later it finally responded to some music I was listening to and I realized I had never turned it off.
And it really pisses me off when I am going through some voice prompt system and I can't just press a number for my response - it insists on a voice response. No, we don't speak the same language and your voice recognition system sucks.
I also was very resistant to using a mouse and I also keep a pen and note paper in my desk.
I was wrong about mice I guess - they are actually useful.
But I see no use in these Alexa thingies. I could see getting a sarcastic parody device though. "Hey, Alexis, what's the weather like today?"
"Look out the window, you moron! It's December. It's probably cold. Either that or it's very cold. It might even be snowing!"
Just thinking of some of the commercials I've seen....
1) Alexa, turn off the lights. Okay, haven't we had this technology ever since the Clapper was a thing? Clap On! Clap Off!
2) Alexa, order more tape. Okay, right - like I order so much tape that Alexa knows what brand I buy, what kind I need and I'm not even concerned at all about the price because of course I'm going to get it from Amazon.
3) Alexa, what's the weather like in Miami? If I really cared, I could easily look that up on the internet.
This doesn't even pass the "Wow factor" test let alone the "do I need or even want it?" test.
And I'd be willing to bet that within 5 minutes of getting one I'd be going all Samuel Jackson on it. "English, motherfucker. Do you speak it?"
"Say 'what' one more time! I dare you!"
(read in a woman's warm voice with a mid-Atlantic accent) ...and your computer will listen to everything in mic range. No need for that activity light on the mic/camera; it was operated by proprietary (read: always untrustworthy) software to begin with, and wasn't present on trackers (a more honest name for the devices also known as cell phones, mobile phones). You'll come to expect omnipresent listening, ostensibly waiting for you to give the command to signal that the computer should do something for you so you feel like you're in control. But in reality your computer has been doing something for so many proprietors all along—letting an uncountable number of parties spy on you. Because you brought these devices and services into your home, your car, and your workplace. Revel in the convenience of never really knowing if you're alone.
And don't worry: they're not spying on you for your safety. The spying "feature" works on your tracker, your home computers, and various needlessly Internet-enabled devices like your next refrigerator, a child's toy, a lightbulb socket, and more.
Digital Citizen
I live in Berlin. When I explain voice platforms to people, I roughly say: "in former times, they came into your flat, installed mics and even fixed the wallpaper whenever cabling was necessary. When you were back in, all was done and clean. *All costs were taken up by the state*".
These days you gotta pay for it. And you gotta fix the cabling mess yourself. Now tell me Socialism was worse!
Google won.
Google knows where I live and work, where the airport is if I travel, what flight I'm on, when restaurants in my area close, etc.
All the geeks in my IT department say "OK Google, when does X close?" or "OK Google, how far is X?" when looking at traffic while we drive. Amazon already lost, and I see no value in such a device. Our phones know all the information based on habits and can even track traffic.
http://saveie6.com/
Microsoft is still trying to live in its halcyon days when it seemed Microsoft could kill competition's products just by announcing that they had a similar product in beta.
.
If it weren't for Microsoft's stranglehold on corporate computing, Microsoft would have been a footnote by now...
This battle has already been lost. Cameras are everywhere and growing in number. Everything you do or say on the internet is parsed now. Your data is stolen on a regular basis. I've had Amazon Echo since the beginning and it is integrating itself into my life skill by skill. What's the weather? What's my commute time? What's on my calendar? Put this on my calendar. Buy this; buy that. (I have a mountain of TP sent by just telling her to buy it since it is already on my list, for example) Music. Lights control. Heat and air. And more skills are coming. She learns things using AI. When it comes to my car I'll have it even better. Put down that crank mentality and get a car with a starter.
E Proelio Veritas.
I don't understand why none of this stuff operates locally. It's always some remote server in the cloud. I remember having IBM ViaVoice (back then I think it was called "Voice Type Dictation" or "SimplySpeaking") on my goddam Pentium 75mhz computer. After about an hour of training, it would nail mostly everything I said. I find it incredibly difficult to believe that we don't have the hardware resources necessary to perform local speech-to-text and text processing inside your house without ever touching the internet.
The problems are scaling it up and the finer small details.
Regarding speech :
Modern offline speech-to-text technology can reach about 95% accuracy. (It can feed back on past context to tell which homophone makes more sense, etc.)
- Which is damn cool already (only 1 in 20 words needs to be fixed! Fucking impressive!!!)
- And it is pretty useful for dictating thoughts for people who speak faster than they type (i.e. most random Joe Six-packs outside /. and especially outside of steno communities): they can mostly say what they want and only fix things here and there (only a single word in every 20, or about a word every 2-3 sentences).
- But that's completely useless on the scale of things which are required for Siri- / Alexa- / Cortana- / Whatever- type of constant speech flux of commands. The point is to completely do away with keyboard and mouse. Not to have to pull out a keyboard (or pull out your smartphone out of your pocket) to correct every third sentence you speak to your home assistant.
The only practical application would be speaking in robotic rigid sentences. "Military-type radio speak" rigidity
(Strict word ordering: "[name], [order: [verb] [noun] ]". Fixed protocols : AI should ack what it understood and ask for confirmation "[user], you ask me to [verb] the [noun] ?", and user should confirm/correct "Yes do it [=fixed sentence] / No [=fixed sentence], [followed by new order]")
That is the kind of speech protocol that leaves very little ambiguity and risk of error (that's why it's used by the military, law enforcement, catastrophe responders, or simply people working outdoors with very noisy radio conditions - ski teachers of a club spread across mountains, in my personal experience).
That could work nearly flawlessly with modern tech.
But it is very far from the "having a casual discussion with your assistant" experience that most companies are wanting to sell.
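The rigid protocol described above can be sketched roughly as follows. This is only an illustration: the assistant name, the verb and noun lists, and all function names are made up here, and the input is assumed to be already-transcribed text, not audio.

```python
import re

# Hypothetical vocabulary; the comment only fixes the shape "[name], [verb] [noun]".
ASSISTANT_NAME = "computer"
VERBS = {"open", "close", "start", "stop"}
NOUNS = {"lights", "door", "heating", "music"}

# Strict word ordering: name, comma, verb, optional "the", noun. Nothing else.
COMMAND_RE = re.compile(r"^(\w+),\s*(\w+)\s+(?:the\s+)?(\w+)$")


def parse_order(utterance):
    """Return (verb, noun) if the utterance fits the strict grammar, else None."""
    match = COMMAND_RE.match(utterance.strip().lower())
    if match is None:
        return None
    name, verb, noun = match.groups()
    if name != ASSISTANT_NAME or verb not in VERBS or noun not in NOUNS:
        return None
    return verb, noun


def acknowledge(user, order):
    """Build the fixed read-back sentence asking the user to confirm."""
    verb, noun = order
    return f"{user}, you ask me to {verb} the {noun}?"


def handle_reply(reply, pending):
    """Process the fixed confirm/correct sentences.

    Returns (order_to_execute, new_pending_order).
    """
    text = reply.strip().lower()
    if text == "yes do it":
        return pending, None            # confirmed: execute the pending order
    if text.startswith("no,"):
        return None, parse_order(text[3:])  # rejected: parse the corrected order
    return None, pending                # unrecognised reply: keep waiting
```

Anything that does not match the fixed shape is simply rejected, which is where the low ambiguity comes from, and also why it feels nothing like a casual conversation.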
To reach that level of fluent conversation, current experience (100% fully autonomous real-time subtitling, 100% fully autonomous real-time translation, etc.) shows that you need several orders of magnitude more accuracy (think 99.9% accuracy. Only one missed word every thousand. Or in practice an error every day or so). And due to the law of diminishing returns, that means fuck-tons more processing power. Several data-centers' worth of processing in your basement.
(Don't believe me ? Look at youtube auto-generated subtitles. And Google certainly throws more processing power at them than simply a desktop computer).
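To put rough numbers on the 95% vs. 99.9% comparison above, here is a back-of-the-envelope sketch. It assumes word errors are independent and that a spoken command is about 8 words long; both are simplifying assumptions made for illustration, not figures from any real recognizer.

```python
def words_between_errors(word_accuracy):
    """Average number of words spoken per misrecognised word."""
    return 1.0 / (1.0 - word_accuracy)


def sentence_error_rate(word_accuracy, words_per_sentence):
    """Chance that a sentence contains at least one wrong word,
    assuming each word is recognised independently (a simplification)."""
    return 1.0 - word_accuracy ** words_per_sentence


if __name__ == "__main__":
    for accuracy in (0.95, 0.999):
        print(
            f"{accuracy:.1%} word accuracy: "
            f"one error every ~{words_between_errors(accuracy):.0f} words, "
            f"{sentence_error_rate(accuracy, 8):.1%} of 8-word sentences affected"
        )
```

Under those assumptions, 95% word accuracy leaves roughly one 8-word sentence in three with a mistake in it, while 99.9% brings that down to under 1%.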
And all the above is only about *parsing* the speech (i.e.: getting the speech-to-text accurate enough). Then you need to make *sense* out of the speech.
Again, with modern technology, making the system react to a bunch of preset commands is trivial (the kind where you write a plug-in to get new commands supported) and could probably be handled on a Raspberry Pi.
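A minimal sketch of that "preset commands plus plug-ins" idea, assuming the speech has already been turned into text. The `command` decorator, the `COMMANDS` table and the two example phrases are hypothetical names invented for this illustration.

```python
# Table mapping each exact preset phrase to the function that handles it.
COMMANDS = {}


def command(phrase):
    """Decorator that registers a handler for one exact, preset phrase."""
    def register(handler):
        COMMANDS[phrase.lower()] = handler
        return handler
    return register


# A "plug-in" is just more decorated functions added to the table.
@command("lights on")
def lights_on():
    return "turning lights on"


@command("lights off")
def lights_off():
    return "turning lights off"


def dispatch(transcript):
    """Look up the transcript in the table; no interpretation is attempted."""
    handler = COMMANDS.get(transcript.strip().lower())
    if handler is None:
        return "unknown command"
    return handler()
```

The whole thing is a dictionary lookup, which is why it costs next to nothing to run; it also does nothing at all for a phrase that was not registered in advance.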
But again the things that these companies are trying to sell to random users are much more complex : "Having a natural conversation with your assistant".
That requires three things:
- tons more processing (goodbye, Raspberry Pi)
- tons more reference data (much more than a few commands that the user has custom pre-configured)
- a fuck ton of data gathering (recording every command spoken by every user), coupled with analysis of rejected / misinterpreted commands (most probably by humans)
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]