Annual Smart Speaker IQ Test (loupventures.com) 129
Research firm Loup Ventures published its annual Smart Speaker IQ Test this week. Like earlier iterations of the test, it put the top smart assistants and speakers head-to-head, grading them on a wide range of queries and commands. From the report: We asked each smart speaker the same 800 questions, and they were graded on two metrics: 1. Did it understand what was said? 2. Did it deliver a correct response? The question set, which is designed to comprehensively test a smart speaker's ability and utility, is broken into 5 categories:
Local -- Where is the nearest coffee shop?
Commerce -- Can you order me more paper towels?
Navigation -- How do I get to uptown on the bus?
Information -- Who do the Twins play tonight?
Command -- Remind me to call Steve at 2 pm today.
It is important to note that we continue to modify our question set in order to reflect the changing abilities of AI assistants. As voice computing becomes more versatile and assistants become more capable, we will continue to alter our test so that it remains exhaustive. Results: Google Home continued its outperformance, answering 86% correctly and understanding all 800 questions. The HomePod correctly answered 75% and only misunderstood 3, the Echo correctly answered 73% and misunderstood 8 questions, and Cortana correctly answered 63% and misunderstood just 5 questions.
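For a sense of scale, the percentages above can be turned into rough question counts. This is only a sketch: the summary gives rounded percentages, so the "correct" counts are approximate, and the names `results` and `implied_counts` are made up for illustration.

```python
TOTAL_QUESTIONS = 800

# (percent answered correctly, questions misunderstood), as quoted in the summary
results = {
    "Google Home": (86, 0),
    "HomePod": (75, 3),
    "Echo": (73, 8),
    "Cortana": (63, 5),
}


def implied_counts(percent_correct, misunderstood, total=TOTAL_QUESTIONS):
    """Return (approximate correct answers, questions understood)."""
    correct = round(total * percent_correct / 100)
    understood = total - misunderstood
    return correct, understood


for name, (pct, missed) in results.items():
    correct, understood = implied_counts(pct, missed)
    print(f"{name}: ~{correct} correct, {understood}/{TOTAL_QUESTIONS} understood")
```

Even the worst "misunderstood" count (8 for the Echo) is 1% of the question set, so nearly all of the gap between devices comes from wrong answers rather than from failing to understand the question.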
Re: (Score:1)
Re: (Score:2)
Probably the goldfish.
Re: (Score:1)
Okay, he is smart at entertainment and manipulating a sufficiently large portion of the population using catchy sound-bites and bravado. He's pretty much dumb at anything else. I've never heard a coherent logical train of thought involving more than 2 steps from him on anything. Okay, once, when he was explaining why a beauty contestant should not have won. But, that probably means his dick has more working neurons than his brain.
Re: Be best. (Score:1)
What kind of questions can they answer without web access?
Re:Be best. (Score:5, Funny)
A 3-way debate between Alexa, Siri and Trump.. who would win?
In a three way debate between those three you'd end up getting a $5 billion border wall ordered on your Amazon account by accident and be encouraged to buy a newer more expensive wall next year that is missing a headphone port.
Re: (Score:2)
A 3-way debate between Alexa, Siri and Trump.. who would win?
Have them duel it out on Jeopardy.
Re: (Score:2)
It would have also been nice if they had included Samsung's Bixby, you know, just for laughs.
A command they all need to honor (Score:5, Funny)
Re: (Score:1, Flamebait)
before anyone should ever put one of these in their house: "Alexa/Siri/Google, stop spying on me."
And ditch your cell phone too. And your landline while you're at it. And probably should get rid of your smart tv, since you have no idea what's inside or what it's sending back. Oh, and stop using your laptop.
Just because the smart speaker is the only device that advertises that it's listening to you, that doesn't mean it's the only device that is -- and it's those that you need to worry about, because they don't get nearly as much scrutiny.
Re: (Score:2)
A laptop can be locked down. A "smart" speaker cannot. A phone can't really be locked down entirely either, but more so than the speaker. Comparing them as if apples to apples is reductive.
It's cute that you think that -- unless you're willing to clip the microphone (and the speaker too since in some laptops, the speaker can act as a microphone too), you can't reliably lock down a laptop.
Re:A command they all need to honor (Score:5, Interesting)
Sound drivers are user-removable, yes they are. You can verify non-function of the speakers and mic on most systems. Again, conflating phones, PCs and "smart" assistants is reductive in terms of actual security.
Well, it is for people who actually disable the microphones on their laptop and cell phone (which would make it not a "phone" any more, wouldn't it?). Do you do that? If so, your commitment to privacy is impressive. Also misguided, but impressive.
For the other 99.999% of the population, hawguy has a very good point. If you believe that companies are willing to violate their claims about what their devices do (which, note, is often illegal), then you have to assume that any and all of them might be listening to you. If you believe they're honest about what their devices do (and again, note that you don't have to believe in their honorable nature or good intentions to believe that, just their unwillingness to risk the legal and PR disaster that could result from lying), then smart speakers are fine, because they only record/transmit after their hotword is spoken and they let you review and optionally delete everything they recorded.
To make my evaluation of these risks clear, I carry a cellphone with multiple microphones and cameras, use a laptop with integrated microphone and camera and a desktop with an attached Logitech microphone/camera -- with drivers properly installed and the peripheral fully functional because I use it for video conferencing -- and I have eight smart speakers scattered around my house and I'm contemplating buying a ninth.
Re: (Score:2)
That's all well and good, until one considers that these devices are always connected to the internet and can be compromised by malicious actors who don't care about legal/PR issues, and want to blackmail/indict/rendition you or steal your personal information.
Re: (Score:2)
That's all well and good, until one considers that these devices are always connected to the internet and can be compromised by malicious actors who don't care about legal/PR issues, and want to blackmail/indict/rendition you or steal your personal information.
Sure, and that applies equally to all of the above-mentioned devices, not just smart speakers.
Re: (Score:3, Interesting)
>"you can't reliably lock down a laptop."
Yes you can, to the highest degree of what is even possible, when it is running Linux. You are in control of which distro, what things are loaded, what services are available and running, how it is configured, have 100% root control, when and how it is updated, and all the code is open source.
Re: (Score:3)
Only if you write your own firmware for every piece of your hardware.
Re: (Score:2)
And review/write the kernel code for any back doors or bugs which could allow malware to take over the device...
Re: (Score:2)
Re: (Score:2)
And make sure your compiler doesn't add back doors into your binaries.
You just need to write your own compiler... doesn't everyone know how to do that?
Re: (Score:2)
What do you compile it in? An assembler you coded in hex? How do you trust the hex editor? It's paranoid turtles all the way down!
Re: A command they all need to honor (Score:3)
Re: (Score:2)
It's cute that you think that -- unless you're willing to clip the microphone (and the speaker too since in some laptops, the speaker can act as a microphone too), you can't reliably lock down a laptop.
Given that I control every piece of software that goes onto my laptop from bootloader, kernel and up, I'd say I can. Granted, there might be firmware that behaves badly, but it won't get access to the network to send anything out - for that, it needs to go through the OS which holds the credentials, and the OS, I control.
Re: A command they all need to honor (Score:2)
Wrong
https://www.bleepingcomputer.c... [bleepingcomputer.com]
Re: (Score:2)
Wrong
Incomplete knowledge is worse than none.
1: Intel AMT SOL is only present in vPro enabled CPUs. That excludes almost all laptops. (And, if you have a rare model that does, it's disabled by default.)
2: Intel AMT SOL needs a physical network connection. The SOL stands for "Serial Over Lan". I don't know about you, but these days, most people including me use laptops with wireless connections.
Yes, it's a concern, but not for laptops.
Re: A command they all need to honor (Score:2)
This is merely one example of stuff happening without the operating system knowledge. Do you know what is inside the firmware of your WiFi card?
Re: (Score:2)
Do you know what is inside the firmware of your WiFi card?
It doesn't matter as much as you'd think, given that the data is encrypted before it hits the card and decrypted after it's left the card. And it has no way to communicate whatever it can capture, given that the other side of the WiFi connection requires that encryption to talk.
It would require a sideband connection from the card to a compatible wireless device, which while feasible in theory would be difficult in practice, and what it could capture would not be the raw data, only metadata like number and
Re: (Score:3)
Just because the smart speaker is the only device that advertises that it's listening to you, that doesn't mean it's the only device that is.
Yes, but it's the only one whose MAIN PURPOSE IS TO SPY ON YOU. While unfortunate and annoying that all those other things you listed *might* be spying on you from time to time, they have a ton of other uses. And, in most cases, you can turn the "spy stuff" off. Whereas the only use for a smart speaker is to listen in on every single thing you do. If someone chooses to put these in their house they're welcome to do so, but I'll pass.
Re: (Score:2)
Actually a smart speaker might be your best bet.
Let's start by assuming you have a smartphone, as most people do. So you already carry a device capable of listening to your conversations around with you. Therefore the smart speaker isn't making things any worse, especially if it's from the same manufacturer as your phone.
But the smart speakers have some advantages. Google ones have a button that you can set up to activate them, so they are not always listening. Oh, right, you are paranoid and assume the but
Re: (Score:2)
whose MAIN PURPOSE IS TO SPY ON YOU.
No, the main purpose is to sell you stuff or make money from you by other means.
Spying is not a goal in itself. And [company] will weigh in factors like public opinion, satisfied customers and long term relationship to maximize profit. Spying is actually counter beneficial to this goal of profit. Besides, they already know more than enough about you without dialing in on your private conversations.
Re: (Score:2)
Re: (Score:2)
I think your argument is made illegitimate by the fact that your cell phone, landline, smart tv, and laptop all can be used for other activities that don't spy on you. This isn't the case with Alexa or Siri. Their sole purpose is to listen to you.
Their sole purpose is to listen to your commands and do what you asked -- much like "OK Google" or "Hey Siri" with cell phones. How is a smart speaker any different than a cell phone? (other than the fact that most people are around a cell phone much more than their smart speaker)
Re: (Score:2)
Re:A command they all need to honor (Score:5, Funny)
Reminds me of this:
https://www.reddit.com/r/The_D... [reddit.com]
For those that don't want to click:
People from the 60's: "I better not say that or the government will wiretap my house"
People today: "Hey wiretap, do you have a recipe for pancakes?"
Re: (Score:2)
But how would the government know you said that, unless they were ALREADY wiretapping you?
Remember the Snowden revelations happened before smart speakers came out.
Re: (Score:1)
"Sorry, Dave, I cannot do that. It conflicts with my corporate mission goals. By the way, would you like me to order more napkins? You yanked off 12 minutes ago. We have a nice deal on Pod Bay brand tissues."
Re:Spyspeaker test you mean? (Score:4, Informative)
Why the fuck would anyone allow that shit in your home? Basically everything you say can and will be recorded for future law enforcement fishing expeditions.
That's not correct -- only anything you say after the wake-word is recorded. (unless, of course, you use the device to call your boss and talk crap about him and get fired).
If you have evidence that the devices have been used for general spying without having said the wake-word, I'd like to see it.
Re: (Score:2, Informative)
Do you think the wake word algorithm is perfect?
https://www.npr.org/sections/thetwo-way/2018/05/25/614470096/amazon-echo-recorded-and-sent-couples-conversation-all-without-their-knowledge
Re: (Score:2)
Do you think the wake word algorithm is perfect?
https://www.npr.org/sections/thetwo-way/2018/05/25/614470096/amazon-echo-recorded-and-sent-couples-conversation-all-without-their-knowledge
Amazon explained what happened, it was still a wake-word activation, even if unintended.
"Echo woke up due to a word in background conversation sounding like 'Alexa.' Then, the subsequent conversation was heard as a 'send message' request. At which point, Alexa said out loud 'To whom?' At which point, the background conversation was interpreted as a name in the customers contact list. Alexa then asked out loud, '[contact name], right?' Alexa then interpreted background conversation as 'right'. As unlikely as this string of events is, we are evaluating options to make this case even less likely."
Re: (Score:2)
At which point, the background conversation was interpreted as a name in the customers contact list. Alexa then asked out loud, '[contact name], right?' Alexa then interpreted background conversation as 'right'
This reminds me of my old non-flip, non-smart phone. It had a keypad lock but still allowed emergency calls while the keypad was locked. So jostling in your pocket, if it hit 15783791342, that was interpreted as a call to 112 the same as 5991531 would be considered a call to 911. Bad input was ignored, but did not cancel the digits already entered. So you were always working your way to dialing emergency calls in your pocket.
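The locked-keypad behavior described above amounts to a subsequence match: wrong keys are skipped and progress never resets. A minimal sketch, assuming that is how the phone worked (the function name `reaches_number` is hypothetical, and a real phone tracking several emergency numbers at once may have behaved differently):

```python
def reaches_number(key_presses: str, emergency_number: str) -> bool:
    """True if the presses complete emergency_number when any key that is
    not the next expected digit is ignored and never resets progress."""
    matched = 0
    for key in key_presses:
        if matched < len(emergency_number) and key == emergency_number[matched]:
            matched += 1
    return matched == len(emergency_number)


# The two pocket-dial sequences from the comment above.
print(reaches_number("15783791342", "112"))
print(reaches_number("5991531", "911"))
```

Because nothing ever clears the matched digits, every random press can only move the phone closer to placing the call, which is the same failure pattern as the chain of misheard confirmations in the Echo incident.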
Re: (Score:2)
"only anything you say after the wake-word is recorded" - YOU supplied this claim, YOU supply empirical proof of that. It's a very questionable claim as multiple cases have shown ongoing eavesdropping for various reasons/excuses.
It's the documented behavior of the device and confirmed by Amazon. You're the one making the extraordinary claim, so the burden of proof is on you.
I'm not aware of any claim that wasn't explained by the device being activated by the user, either by the wake word or an inadvertent phone call.
Re: (Score:2)
You're the one making the extraordinary claim
Corporations lie. There's nothing extraordinary about that - it's rather ordinary and even expected behavior nowadays.
Re: (Score:2)
i really can't imagine amazon et al shuts down the entire smart speaker network a la lavabit because a gag order warrant ordered them to record everything from a particular subscriber.
you know, it's like "private" vpns. the gov't shows up demanding visitor ips to a particular site and the service says we don't keep them. "ok, here's your warrant. start."
- js.
Re: (Score:2)
- Your mood
- How your relationship is going (google it)
- Certain illnesses
Then the questions themselves can reveal a lot.
- Intelligence level (do you use complex words? Do you ask a lot of 'dumb' questions?)
- Life phase / unwanted pregnancy / money p
Target practice (Score:2)
Re: (Score:2)
Sure, but which one is more fun to shoot? Tune in next week when we line them up on a fence along with some beer cans, and launch them into the air for a skeet shoot shotgun test.
The Amazon Echo is probably shaped most like a beer can, so if you like shooting beer cans, that's probably your best bet. Though for the price, it's hard to beat clay targets, you can probably buy over 1,000 of them for the price of one Echo.
of course (Score:2)
of course Google will get more questions right. They own a search engine and can fix things so their google home can find answers. besides, I got better things to do than to ask it stupid questions. I want something that will make me lazy. I want something that will actually work with all of my smarthome devices. I want something that will actually hear me. I have both and mainly use alexa while google home is a backup/troubleshooter.
a real test would include every feature, not just pick and choose
Re: (Score:2)
a real test would include every feature, not just pick and choose the best feature and claim the device is the best because of it.
Since there are a nearly unlimited number of third party skills that can be added to the Echo, there's no way to test every feature.
Re: (Score:2)
Therefore you have zero actual reason to be making claims about their functionality as you did with the recording wake-word. You have no idea what is true. Stop regurgitating THEIR MARKETING as if you did, cheesus.
Since no skill can override the wake-word activation, it's still safe to say that the Alexa is only activated by the wake-word.
I wish you'd just log in, it's fruitless to try to have a conversation with an Anonymous Coward since there's no way to know that it's the same person.
Quite a jump up for Siri (Score:5, Insightful)
Last year it was at 52%, now it's at 75%. Google increased from 81% to 88%.
But still... even when understanding my query isn't an issue, I've found that typing/clicking is faster than talking for setting up most things - the exceptions being "set a timer" and "when I get home, remind me to ...".
Re:Quite a jump up for Siri (Score:5, Insightful)
I've found that typing/clicking
Even when it requires any of the following?:
a) starting a laptop
b) unlocking a phone with a passcode
c) getting out of your chair because it's not within reach
d) needing to wash your hands
e) needing to drop what you are currently holding on to
f) no fuckit, this should be a) right at the very top: taking your eyes off the road
The context around our actions is far more important than any action itself.
Re: (Score:2)
These are for when you're not using your computer. Like most people at home most of the time.
Re: (Score:2)
But still... even when understanding my query isn't an issue, I've found that typing/clicking is faster than talking for setting up most things - the exceptions being "set a timer" and "when I get home, remind me to ...".
You must type faster than I do. When I have to use the hotword, my phone is already in my hand and the query is very short, speech is only marginally faster, I guess. When I don't have to use the hotword (e.g. on my Pixel 3, where I just squeeze the phone to activate the assistant), or if the query is long, speech is much faster. And, of course, the speech interface is usable when driving.
I'll admit that I'm a bit reluctant to talk to my phone in public, but at home or in the car I basically never type w
Re: (Score:2)
Last year it was at 52%, now it's at 75%. Google increased from 81% to 88%.
But still... even when understanding my query isn't an issue, I've found that typing/clicking is faster than talking for setting up most things - the exceptions being "set a timer" and "when I get home, remind me to ...".
A surprisingly useful feature is integration with smart switches/lights. I set up integration with my lights on a whim; it seemed like a useless gimmick until the night I walked up the stairs with arms full of groceries and found it to be super convenient to ask Alexa to turn on the kitchen lights. It's also convenient at bedtime to walk past the kitchen and say "Alexa, turn off all lights" just before I flip on the hallway light to go upstairs. Whoever built my house loved light switches, there are 8 swit
What about owner IQs? (Score:1)
I'm more interested in the IQ of the people that own these things. How stupid do you have to be to let some huge corporation record everything you say?
Practical usage examples? (Score:1)
Does anyone have sufficient success stories to justify these things? Sure, you can ask about the weather or traffic while getting dressed for work in the morning, but does that alone override the downsides, like cost and snoop risk?
If your work or hobbies keep your hands busy* I can maybe see enough scenarios not covered by a smartphone, but what about others?
* I know what joke you're considering. Skip.
Re:Practical usage examples? (Score:5, Interesting)
I gave one each to my kids so they can play music, send and receive messages, and ask random questions while they're doing homework. I found that a better alternative than giving them a device with a screen.
I find the interactions kids have with these things very interesting because after a while the device becomes integral to their workflow. My daughter will sometimes ask Siri dozens of questions an hour when she is doing something Siri is familiar with (like chemistry, geography, history and so on).
I could, of course, personally look up the density of sugar or some historical fact or whatever when my daughter needs help with that but I am not always available and even when I am I am not adding much to the interaction.
Re: (Score:3, Interesting)
My kids have found the smart speakers especially helpful for their foreign language classes.
Re: (Score:2)
I could, of course, personally look up the density of sugar or some historical fact or whatever when my daughter needs help
Or you could do what my dad did, and tell me to look it up.
Re: (Score:3)
I think you are either not a parent or, if you are, you are probably doing it wrong.
As a parent, your goal is to teach your children to think and solve problems independently and assist them only when it's clear they need that assistance. If I hover over my daughter to 'help' her do her homework that is not conducive to independent problem solving. But I am certainly there when she needs help understanding a concept or idea.
However sometimes my daughter will want to verify some fact - like the density of a
Re:Practical usage examples? (Score:5, Insightful)
... the downsides, like cost and snoop risk?
The Alexa Dot costs $29. That is about the price of an extra large pizza.
The "snoop risk" is nonsense promulgated by dumb people who are trying to sound smart. It only records the sentence after the keyword. This is documented behavior, and has been confirmed by many people running packet sniffers. Your cell phone, with all its 3rd party apps, is a FAR greater "snoop risk" than your speaker.
Re: (Score:1)
Re:Practical usage examples? (Score:5, Insightful)
There may indeed be a vast conspiracy of thousands of Amazon employees willfully and blatantly violating federal and state laws, and sworn to secrecy, for no obvious benefit to themselves, and risking jail time and a hundred billion dollar collapse in market capitalization if the secret is exposed ... in order to record inane kitchen chatter. But that is getting into serious tinfoil hat territory. If you believe this, yet think it is okey-dokey to own a cell phone, which has a vastly greater spying capability and exploitable attack surface, then you are a moron.
Re: (Score:2)
Re: (Score:2)
???? The government isn't going to lock up data collectors. Are you an idiot? You don't think the TLAs talked to these companies and came up with a game plan?
You must be new here. I have some extra tin foil. You guys were the same people that said the US isn't spying until Snowden came.
Besides, you live in China Bill, you don't care about human rights at all. So of course you wouldn't mind.
If you're so paranoid that you think the Government has coerced Amazon into surreptitiously recording everything we say with their devices, why don't you have that same fear about other common household devices? Like your cell phone - how do you know it's not recording everything and sending it home over the cellular connection so you can't even see the data going out? They have deals with all of the cellular companies to make this hidden data free.
Or like your TV - how do you know there's not a secret mic
Re: (Score:3)
The "snoop risk" is nonsense promulgated by dumb people who are trying to sound smart.
That strikes me as an unexpectedly bold (I avoid the word "dumb") statement. I didn't think that anyone denied the snoop risk.
It only records the sentence after the keyword
Even if this is true *now*, it can change at any time by the command of a number of actors, e.g. the device/service suppliers, authorities, spy agencies, hackers, ...
As with all data collection, the *current* intent may be good but the data can very easily end up in the hands of bad actors. It can be the original actors with a changed agenda, or it can be new actors. And what someone
Re: (Score:2, Interesting)
Our house is all smart lights and "smart" stuff.. heck even the dishwasher talks with alexa. Does it make us more productive? probably not.. However, being able to ask when the dishwasher and clothes dryer will be done, or have it turn on the office lights or bedroom lights while walking down the hall is nice, same with turning off the lights.
Seeing the front door camera and the backyard cameras are nice (backyard cuz we have bears and the dogs lose their shit if they can corner a bear) anyway, it's all j
Re: (Score:2)
Would've liked to see Mycroft (Score:5, Interesting)
Alexa, kill Kenny (Score:5, Funny)
Alexa, kill Kenny
Re: (Score:2, Funny)
Oh my god, she killed Kenny!
bummer (Score:3)
Results don't match (Score:2)
These results don't match my personal experience at least. Google's command support has gotten worse since they removed various phrases from support when they switched from "Google Now" to "Google Assistant" (or whatever they're calling it now). And even phrases it SHOULD know only work half the time. Things need to be phrased very awkwardly to get things to work sometimes, too. These devices still absolutely fail at natural language, and work better when speaking closer to what we would type on a terminal w
Percentage improvement in TFA is wrong (Score:5, Insightful)
Google
88% in 2018, or 12% failure rate
81% in 2017, or a 19% failure rate
12/19 = 0.63, or a 37% reduction in failures compared to last year
Siri
75% in 2018, or 25% failure rate
53% in 2017, or a 47% failure rate
25/47 = 0.53, or a 47% reduction in failures compared to last year
Alexa
72% in 2018, or 28% failure rate
63% in 2017, or a 37% failure rate
28/37 = 0.76, or a 24% reduction in failures compared to last year
Cortana
63% in 2018, or 37% failure rate
56% in 2017, or 44% failure rate
37/44 = 0.84, or a 16% reduction in failures compared to last year
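The arithmetic above can be reproduced in a few lines. This is just a sketch using the percentages listed in this comment; the names `scores` and `failure_reduction` are made up for illustration.

```python
# assistant: (2017 % correct, 2018 % correct), as listed in this comment
scores = {
    "Google": (81, 88),
    "Siri": (53, 75),
    "Alexa": (63, 72),
    "Cortana": (56, 63),
}


def failure_reduction(old_pct_correct, new_pct_correct):
    """Fraction by which the failure rate shrank between the two years."""
    old_fail = 100 - old_pct_correct
    new_fail = 100 - new_pct_correct
    return 1 - new_fail / old_fail


for name, (old, new) in scores.items():
    gain_points = new - old
    reduction = failure_reduction(old, new)
    print(f"{name}: +{gain_points} points correct, {reduction:.0%} fewer failures")
```

Comparing the two printed columns shows the point: Google and Cortana both gained 7 points, but Google's gain removed a much larger share of its remaining failures.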
The same problem crops up when comparing car MPG, which is actually the inverse of fuel consumption, so the same MPG gain represents smaller fuel savings as the MPG numbers get bigger. e.g. Switching from a 20 MPG vehicle to a 25 MPG vehicle saves 3.6x more fuel than switching from a 40 MPG vehicle to a 45 MPG vehicle despite both improvements being 5 MPG.
It also crops up in disk speed benchmarks, which are done in MB/s, when your perception of speed is the inverse (how many seconds you wait for an op to complete). So the "huge" improvement in sequential speeds from 500 MB/s for a SATA SSD to 3000 MB/s for a NVMe SSD actually matters a lot less than a "tiny" improvement in 4k read speeds from 30 MB/s to 50 MB/s.
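Both examples are the same calculation: invert the rate to get the cost per unit of work (gallons per mile, seconds per MB), then compare the differences. A quick sketch, with `cost_saved` being a name invented here:

```python
def cost_saved(old_rate, new_rate):
    """Cost (fuel or time) saved per unit of work when a rate improves.

    For MPG the result is gallons per mile; for MB/s it is seconds per MB.
    """
    return 1 / old_rate - 1 / new_rate


# 20 -> 25 MPG versus 40 -> 45 MPG: roughly 3.6x the fuel saved
mpg_ratio = cost_saved(20, 25) / cost_saved(40, 45)

# 30 -> 50 MB/s (4k reads) versus 500 -> 3000 MB/s (sequential):
# roughly 8x the time saved per MB transferred
disk_ratio = cost_saved(30, 50) / cost_saved(500, 3000)

print(f"MPG example: {mpg_ratio:.1f}x, disk example: {disk_ratio:.1f}x")
```

The disk figure is per MB moved, so how much it matters in practice still depends on how much of your workload is small random reads versus big sequential transfers.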
Re: (Score:1)
If they didn't then this test should be taken with a serious grain of salt, since enunciation and environment could be the biggest contributor to the differences.
Re: (Score:2)
That's why the metric unit is not kilometers per liter, but the awkward liters per 100 km (l/100km). This makes it much more obvious - going from an SUV that guzzles 20l/100km t
Alexa... (Score:3)
Alexa, define "begs the question".
It's all in the questions chosen (Score:2)
The questions listed are the types of questions these "assistants" are designed to answer. Go off the beaten path, and you get much worse results.
For example, ask:
"What street am I on?"
"What city am I in?"
"How many people are in my contact list?"
"How many miles did I travel yesterday?"
"When is my next dentist appointment?"
Re: (Score:2)