Annual Smart Speaker IQ Test (loupventures.com)
Research firm Loup Ventures published its annual Smart Speaker IQ Test this week. Like earlier iterations of the test, it put the top smart assistants and speakers head-to-head, grading them on a wide range of queries and commands. From the report: We asked each smart speaker the same 800 questions, and they were graded on two metrics: 1. Did it understand what was said? 2. Did it deliver a correct response? The question set, which is designed to comprehensively test a smart speaker's ability and utility, is broken into 5 categories:
Local -- Where is the nearest coffee shop?
Commerce -- Can you order me more paper towels?
Navigation -- How do I get to uptown on the bus?
Information -- Who do the Twins play tonight?
Command -- Remind me to call Steve at 2 pm today.
It is important to note that we continue to modify our question set in order to reflect the changing abilities of AI assistants. As voice computing becomes more versatile and assistants become more capable, we will continue to alter our test so that it remains exhaustive. Results: Google Home continued its outperformance, answering 86% correctly and understanding all 800 questions. The HomePod correctly answered 75% and only misunderstood 3, the Echo correctly answered 73% and misunderstood 8 questions, and Cortana correctly answered 63% and misunderstood just 5 questions.
A 3-way debate between Alexa, Siri and Trump.. who would win?
Surprised it did this well. What has changed?
before anyone should ever put one of these in their house: "Alexa/Siri/Google, stop spying on me."
Why the fuck would anyone allow that shit in their home? Basically everything you say can and will be recorded for future law enforcement fishing expeditions.
Sure, but which one is more fun to shoot? Tune in next week when we line them up on a fence along with some beer cans, and launch them into the air for a skeet shoot shotgun test.
Of course Google will get more questions right. They own a search engine and can fix things so their Google Home can find answers. Besides, I've got better things to do than ask it stupid questions. I want something that will make me lazy. I want something that will actually work with all of my smarthome devices. I want something that will actually hear me. I have both and mainly use Alexa, while Google Home is a backup/troubleshooter.
A real test would include every feature, not just pick and choose the best feature and claim the device is the best because of it.
Last year it was at 52%, now it's at 75%. Google increased from 81% to 88%.
But still... even when understanding my query isn't an issue, I've found that typing/clicking is faster than talking for setting up most things - the exceptions being "set a timer" and "when I get home, remind me to ...".
#DeleteChrome
I'm more interested in the IQ of the people that own these things. How stupid do you have to be to let some huge corporation record everything you say?
Does anyone have sufficient success stories to justify these things? Sure, you can ask about the weather or traffic while getting dressed for work in the morning, but does that alone override the downsides, like cost and snoop risk?
If your work or hobbies keep your hands busy* I can maybe see enough scenarios not covered by a smartphone, but what about others?
* I know what joke you're considering. Skip.
Table-ized A.I.
It would've been nice if they put a Raspberry Pi with Mycroft in this as well. I'd actually be interested in the results of that one.
"Don't meddle in the affairs of a patent dragon, for thou art tasty and good with ketchup." ~ohcrapitssteve
Alexa, kill Kenny
"There's a sucker born every minute". Hah! More like every second.
I thought they administered an actual IQ test ... now that would be interesting ...
These results don't match my personal experience at least. Google's command support has gotten worse since they removed various phrases when they switched from "Google Now" to "Google Assistant" (or whatever they're calling it now). And even phrases it SHOULD know only work half the time. Things sometimes need to be phrased very awkwardly to work, too. These devices still absolutely fail at natural language, and work better when you speak closer to what you would type on a terminal, without extra words. "Timer 10 minutes" works, but asking it to "set a timer for 10 minutes" has a higher chance of failure, since there are more words to misinterpret.
You can't compare improvement as a percentage of success rate because the value of a % changes depending on what your success rate is. e.g. Increasing from 10% to 15% successes is not very impressive, while improving from 94% to 99% is very impressive, even though both are a 5-percentage-point improvement. To compare correctly, you have to invert and compare the proportional decrease in failure rate.
Google
88% in 2018, or 12% failure rate
81% in 2017, or a 19% failure rate
12/19 = 0.63, or a 37% reduction in failures compared to last year
Siri
75% in 2018, or 25% failure rate
53% in 2017, or a 47% failure rate
25/47 = 0.53, or a 47% reduction in failures compared to last year
Alexa
72% in 2018, or 28% failure rate
63% in 2017, or a 37% failure rate
28/37 = 0.76, or a 24% reduction in failures compared to last year
Cortana
63% in 2018, or 37% failure rate
56% in 2017, or 44% failure rate
37/44 = 0.84, or a 16% reduction in failures compared to last year
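The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the function name `failure_reduction` is made up for illustration, and the scores are the ones quoted in this comment, not figures taken directly from the report.

```python
def failure_reduction(prev_success_pct, curr_success_pct):
    """Return the proportional drop in failure rate between two years."""
    prev_fail = 100 - prev_success_pct
    curr_fail = 100 - curr_success_pct
    if prev_fail == 0:
        raise ValueError("previous failure rate is zero; reduction is undefined")
    return 1 - curr_fail / prev_fail

# (2017 success %, 2018 success %) as quoted in the comment above
scores = {
    "Google": (81, 88),
    "Siri": (53, 75),
    "Alexa": (63, 72),
    "Cortana": (56, 63),
}

for name, (prev, curr) in scores.items():
    print(f"{name}: {failure_reduction(prev, curr):.0%} fewer failures")
# Google: 37% fewer failures
# Siri: 47% fewer failures
# Alexa: 24% fewer failures
# Cortana: 16% fewer failures
```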
The same problem crops up when comparing car MPG, which is the inverse of fuel consumption, so the same MPG gain saves less fuel the higher the starting MPG. e.g. Switching from a 20 MPG vehicle to a 25 MPG vehicle saves 3.6x more fuel than switching from a 40 MPG vehicle to a 45 MPG vehicle despite both improvements being 5 MPG.
It also crops up in disk speed benchmarks, which are done in MB/s, when your perception of speed is the inverse (how many seconds you wait for an op to complete). So the "huge" improvement in sequential speeds from 500 MB/s for a SATA SSD to 3000 MB/s for an NVMe SSD actually matters a lot less than a "tiny" improvement in 4k read speeds from 30 MB/s to 50 MB/s.
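A quick check of the 3.6x figure in the MPG comparison above, as a sketch. The helper name `gallons_saved` and the 100-mile trip length are assumptions chosen for illustration; the ratio does not depend on the distance.

```python
def gallons_saved(old_mpg, new_mpg, miles=100.0):
    """Fuel saved over `miles` when switching from old_mpg to new_mpg."""
    return miles / old_mpg - miles / new_mpg

low_end = gallons_saved(20, 25)   # 5.0 gal - 4.0 gal
high_end = gallons_saved(40, 45)  # 2.5 gal - about 2.22 gal
ratio = low_end / high_end

print(f"20 -> 25 MPG saves {low_end:.2f} gal per 100 miles")
print(f"40 -> 45 MPG saves {high_end:.2f} gal per 100 miles")
print(f"ratio: {ratio:.1f}x")
```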
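The disk example above can be put in the same terms by converting throughput into wait time per megabyte. This is a rough sketch, not a benchmark: `ms_per_mb` is a hypothetical helper, and it assumes the same amount of data moves in both access patterns, which a real workload may not do.

```python
def ms_per_mb(mb_per_s):
    """Milliseconds spent waiting for each megabyte at a given throughput."""
    return 1000.0 / mb_per_s

# Sequential: SATA SSD (500 MB/s) vs NVMe SSD (3000 MB/s)
seq_saved = ms_per_mb(500) - ms_per_mb(3000)

# 4k reads: 30 MB/s vs 50 MB/s
rand_saved = ms_per_mb(30) - ms_per_mb(50)

print(f"sequential: {seq_saved:.2f} ms saved per MB")
print(f"4k reads:   {rand_saved:.2f} ms saved per MB")
print(f"4k gain is {rand_saved / seq_saved:.1f}x larger per MB")
```

Under that equal-volume assumption, the 20 MB/s gain in 4k reads saves several times more waiting per megabyte than the 2500 MB/s gain in sequential speed.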
Alexa, define "begs the question".
This is more of an assessment, I'd like to see an actual intelligence quotient test performed on all the devices' AI.
"Go do it or look it up yourself, asshole".
how many times did they serve you an ad?
You know it's coming....
Many have forgotten that pay TV originally did not have ads?
Neither can you.
Seven puppies were harmed during the making of this post.
The questions listed are the types of questions these "assistants" are designed to answer. Go off the beaten path, and you get much worse results.
For example, ask:
"What street am I on?"
"What city am I in?"
"How many people are in my contact list?"
"How many miles did I travel yesterday?"
"When is my next dentist appointment?"
Maybe I'm getting old, but this feels wrong. The AI is interesting and Pandora's box has been opened and will be impossible to reseal. But this just feels wrong. I guess that's what old timers said when we transitioned from the horse to the car. Maybe my grand kids will love this stuff. But it makes me super uncomfortable.