AI Google Apple Technology

Annual Smart Speaker IQ Test (loupventures.com) 129

Research firm Loup Ventures published its annual Smart Speaker IQ Test this week. Like earlier iterations of the test, it put the top smart assistants and speakers head-to-head, grading them on a wide range of queries and commands. From the report: We asked each smart speaker the same 800 questions, and they were graded on two metrics: 1. Did it understand what was said? 2. Did it deliver a correct response? The question set, which is designed to comprehensively test a smart speaker's ability and utility, is broken into 5 categories:
Local -- Where is the nearest coffee shop?
Commerce -- Can you order me more paper towels?
Navigation -- How do I get to uptown on the bus?
Information -- Who do the Twins play tonight?
Command -- Remind me to call Steve at 2 pm today.

It is important to note that we continue to modify our question set in order to reflect the changing abilities of AI assistants. As voice computing becomes more versatile and assistants become more capable, we will continue to alter our test so that it remains exhaustive.
Results: Google Home continued its outperformance, answering 86% correctly and understanding all 800 questions. The HomePod correctly answered 75% and only misunderstood 3, the Echo correctly answered 73% and misunderstood 8 questions, and Cortana correctly answered 63% and misunderstood just 5 questions.
This discussion has been archived. No new comments can be posted.

  • by nwaack ( 3482871 ) on Friday December 21, 2018 @02:32PM (#57842930)
    before anyone should ever put one of these in their house: "Alexa/Siri/Google, stop spying on me."
    • Re: (Score:1, Flamebait)

      by hawguy ( 1600213 )

      before anyone should ever put one of these in their house: "Alexa/Siri/Google, stop spying on me."

      And ditch your cell phone too. And your landline while you're at it. And probably should get rid of your smart tv, since you have no idea what's inside or what it's sending back. Oh, and stop using your laptop.

      Just because the smart speaker is the only device that advertises that it's listening to you, that doesn't mean it's the only device that is -- and it's those that you need to worry about, because they don't get nearly as much scrutiny.

      • by nwaack ( 3482871 )

        Just because the smart speaker is the only device that advertises that it's listening to you, that doesn't mean it's the only device that is.

        Yes, but it's the only one whose MAIN PURPOSE IS TO SPY ON YOU. While unfortunate and annoying that all those other things you listed *might* be spying on you from time to time, they have a ton of other uses. And, in most cases, you can turn the "spy stuff" off. Whereas the only use for a smart speaker is to listen in on every single thing you do. If someone chooses to put these in their house they're welcome to do so, but I'll pass.

        • by AmiMoJo ( 196126 )

          Actually a smart speaker might be your best bet.

          Let's start by assuming you have a smartphone, as most people do. So you already carry a device capable of listening to your conversations around with you. Therefore the smart speaker isn't making things any worse, especially if it's from the same manufacturer as your phone.

          But the smart speakers have some advantages. Google ones have a button that you can set up to activate them, so they are not always listening. Oh, right, you are paranoid and assume the but

        • by xonen ( 774419 )

          whose MAIN PURPOSE IS TO SPY ON YOU.

          No, the main purpose is to sell you stuff or make money from you by other means.

          Spying is not a goal in itself. And [company] will weigh in factors like public opinion, satisfied customers and long term relationship to maximize profit. Spying is actually counter beneficial to this goal of profit. Besides, they already know more than enough about you without dialing in on your private conversations.

      • I think your argument is made illegitimate by the fact that your cell phone, landline, smart tv, and laptop all can be used for other activities that don't spy on you. This isn't the case with Alexa or Siri. Their sole purpose is to listen to you.
        • by hawguy ( 1600213 )

          I think your argument is made illegitimate by the fact that your cell phone, landline, smart tv, and laptop all can be used for other activities that don't spy on you. This isn't the case with Alexa or Siri. Their sole purpose is to listen to you.

          Their sole purpose is to listen to your commands and do what you asked -- much like "OK Google" or "Hey Siri" with cell phones. How is a smart speaker any different than a cell phone? (other than the fact that most people are around a cell phone much more than their smart speaker)

    • Never get tired of these comments on every single smart speaker article.
    • by Tablizer ( 95088 )

      "Alexa/Siri/Google, stop spying on me."

      "Sorry, Dave, I cannot do that. It conflicts with my corporate mission goals. By the way, would you like me to order more napkins? You yanked off 12 minutes ago. We have a nice deal on Pod Bay brand tissues."

  • Sure, but which one is more fun to shoot? Tune in next week when we line them up on a fence along with some beer cans, and launch them into the air for a skeet shoot shotgun test.
    • by hawguy ( 1600213 )

      Sure, but which one is more fun to shoot? Tune in next week when we line them up on a fence along with some beer cans, and launch them into the air for a skeet shoot shotgun test.

      The Amazon Echo is probably shaped most like a beer can, so if you like shooting beer cans, that's probably your best bet. Though for the price, it's hard to beat clay targets; you can probably buy over 1,000 of them for the price of one Echo.

  • Of course Google will get more questions right. They own a search engine and can fix things so their Google Home can find answers. Besides, I've got better things to do than to ask it stupid questions. I want something that will make me lazy. I want something that will actually work with all of my smarthome devices. I want something that will actually hear me. I have both and mainly use Alexa, while Google Home is a backup/troubleshooter.

    a real test would include every feature, not just pick and choose the best feature and claim the device is the best because of it.

    • by hawguy ( 1600213 )

      a real test would include every feature, not just pick and choose the best feature and claim the device is the best because of it.

      Since there are a nearly unlimited number of third party skills that can be added to the Echo, there's no way to test every feature.

  • by 93 Escort Wagon ( 326346 ) on Friday December 21, 2018 @02:47PM (#57843010)

    Last year it was at 52%, now it's at 75%. Google increased from 81% to 88%.

    But still... even when understanding my query isn't an issue, I've found that typing/clicking is faster than talking for setting up most things - the exceptions being "set a timer" and "when I get home, remind me to ...".

    • by thegarbz ( 1787294 ) on Friday December 21, 2018 @03:22PM (#57843210)

      I've found that typing/clicking

      Even when it requires any of the following?:
      a) starting a laptop
      b) unlocking a phone with a passcode
      c) getting out of your chair because it's not within reach
      d) needing to wash your hands
      e) needing to drop what you are currently holding on to
      f) no fuckit, this should be a) right at the very top: taking your eyes off the road

      The context around our actions is far more important than any action itself.

    • But still... even when understanding my query isn't an issue, I've found that typing/clicking is faster than talking for setting up most things - the exceptions being "set a timer" and "when I get home, remind me to ...".

      You must type faster than I do. When I have to use the hotword, my phone is already in my hand and the query is very short, speech is only marginally faster, I guess. When I don't have to use the hotword (e.g. on my Pixel 3, where I just squeeze the phone to activate the assistant), or if the query is long, speech is much faster. And, of course, the speech interface is usable when driving.

      I'll admit that I'm a bit reluctant to talk to my phone in public, but at home or in the car I basically never type w

    • by hawguy ( 1600213 )

      Last year it was at 52%, now it's at 75%. Google increased from 81% to 88%.

      But still... even when understanding my query isn't an issue, I've found that typing/clicking is faster than talking for setting up most things - the exceptions being "set a timer" and "when I get home, remind me to ...".

      A surprisingly useful feature is integration with smart switches/lights. I set up integration with my lights on a whim; it seemed like a useless gimmick until the night I walked up the stairs with arms full of groceries and found it to be super convenient to ask Alexa to turn on the kitchen lights. It's also convenient at bedtime to walk past the kitchen and say "Alexa, turn off all lights" just before I flip on the hallway light to go upstairs. Whoever built my house loved light switches, there are 8 swit

  • by Anonymous Coward

    I'm more interested in the IQ of the people that own these things. How stupid do you have to be to let some huge corporation record everything you say?

  • Does anyone have sufficient success stories to justify these things? Sure, you can ask about the weather or traffic while getting dressed for work in the morning, but does that alone override the downsides, like cost and snoop risk?

    If your work or hobbies keep your hands busy* I can maybe see enough scenarios not covered by a smartphone, but what about others?

    * I know what joke you're considering. Skip.

    • by Kristoph ( 242780 ) on Friday December 21, 2018 @03:18PM (#57843188)

      I gave one each to my kids so they can play music, send and receive messages, and ask random questions while they're doing homework. I found that a better alternative than giving them a device with a screen.

      I find the interactions kids have with these things very interesting because after a while the device becomes integral to their workflow. My daughter will sometimes ask Siri dozens of questions an hour when she is doing something Siri is familiar with (like chemistry, geography, history and so on).

      I could, of course, personally look up the density of sugar or some historical fact or whatever when my daughter needs help with that, but I am not always available, and even when I am, I am not adding much to the interaction.

      • Re: (Score:3, Interesting)

        My kids have found the smart speakers especially helpful for their foreign language classes.

      • by Dunbal ( 464142 ) *

        I could, of course, personally look up the density of sugar or some historical fact or whatever when my daughter needs help

        Or you could do what my dad did, and tell me to look it up.

    • by ShanghaiBill ( 739463 ) on Friday December 21, 2018 @03:38PM (#57843302)

      ... the downsides, like cost and snoop risk?

      The Alexa Dot costs $29. That is about the price of an extra large pizza.

      The "snoop risk" is nonsense promulgated by dumb people who are trying to sound smart. It only records the sentence after the keyword. This is documented behavior, and has been confirmed by many people running packet sniffers. Your cell phone, with all its 3rd party apps, is a FAR greater "snoop risk" than your speaker.

      • You mean that only the sentence after the keyword is recorded *immediately*? Other sentences are not transmitted immediately, but mixed in with the recorded sentences, so that packet sniffers can't distinguish them? Or did that obvious trick elude you?
        • by ShanghaiBill ( 739463 ) on Friday December 21, 2018 @04:20PM (#57843466)

          There may indeed be a vast conspiracy of thousands of Amazon employees willfully and blatantly violating federal and state laws, and sworn to secrecy, for no obvious benefit to themselves, and risking jail time and a hundred billion dollar collapse in market capitalization if the secret is exposed ... in order to record inane kitchen chatter. But that is getting into serious tinfoil hat territory. If you believe this, yet think it is okey-dokey to own a cell phone, which has a vastly greater spying capability and exploitable attack surface, then you are a moron.

      • The "snoop risk" is nonsense promulgated by dumb people who are trying to sound smart.

        That strikes me as an unexpectedly bold (I avoid the word "dumb") statement. I didn't think that anyone denied the snoop risk.

        It only records the sentence after the keyword

        Even if this is true *now*, it can change at any time by the command of a number of actors, e.g. the device/service suppliers, authorities, spy agencies, hackers, ...

        As with all data collection, the *current* intent may be good but the data can very easily end up in the hands of bad actors. It can be the original actors with a changed agenda, or it can be new actors. And what someone

    • Re: (Score:2, Interesting)

      by Anonymous Coward

      Our house is all smart lights and "smart" stuff.. heck even the dishwasher talks with alexa. Does it make us more productive? probably not.. However, being able to ask when the dishwasher and clothes dryer will be done, or have it turn on the office lights or bedroom lights while walking down the hall is nice, same with turning off the lights.

      Seeing the front door camera and the backyard cameras are nice (backyard cuz we have bears and the dogs lose their shit if they can corner a bear) anyway, it's all j

      • by Dunbal ( 464142 ) *
        I can't seem to remember the last time I had an existential crisis over turning off my lights.
  • by aitikin ( 909209 ) on Friday December 21, 2018 @02:58PM (#57843072)
    It would've been nice if they put a Raspberry Pi with Mycroft in this as well. I'd actually be interested in the results of that one.
  • by Joe_Dragon ( 2206452 ) on Friday December 21, 2018 @03:00PM (#57843082)

    Alexa, kill Kenny

    • Re: (Score:2, Funny)

      by Anonymous Coward

      Oh my god, she killed Kenny!

  • by cascadingstylesheet ( 140919 ) on Friday December 21, 2018 @03:13PM (#57843166) Journal
    I thought they administered an actual IQ test ... now that would be interesting ...
  • These results don't match my personal experience at least. Google's command support has gotten worse by them removing various phrases from support when they switched from "Google Now" to "Google Assistant" (or whatever they're calling it now). And even phrases it SHOULD know only work half the time. Things need to be phrased very awkwardly to get things to work sometimes, too. These devices still absolutely fail at natural language, and work better when speaking closer to what we would type on a terminal w

  • by Solandri ( 704621 ) on Friday December 21, 2018 @03:36PM (#57843292)
    You can't compare improvement as a percentage of success rate because the value of a percentage point changes depending on what your success rate is. e.g. Increasing from 10% to 15% success is not very impressive, while improving from 94% to 99% is very impressive, even though they're both a 5-percentage-point improvement. To correctly compare, you have to invert and compare based on proportional decrease in failure rate.

    Google
    88% in 2018, or 12% failure rate
    81% in 2017, or a 19% failure rate
    12/19 = 0.63, or a 37% reduction in failures compared to last year

    Siri
    75% in 2018, or 25% failure rate
    53% in 2017, or a 47% failure rate
    25/47 = 0.53, or a 47% reduction in failures compared to last year

    Alexa
    72% in 2018, or 28% failure rate
    63% in 2017, or a 37% failure rate
    28/37 = 0.76, or a 24% reduction in failures compared to last year

    Cortana
    63% in 2018, or 37% failure rate
    56% in 2017, or 44% failure rate
    37/44 = 0.84, or a 16% reduction in failures compared to last year
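
    The arithmetic above can be reproduced with a short Python sketch. The function name `failure_reduction` and the `scores` table are illustrative only; the accuracy figures are the ones quoted in this comment, not fresh data.

```python
# Hypothetical helper: proportional drop in failure rate between two years.
def failure_reduction(prev_accuracy, curr_accuracy):
    """Both arguments are accuracy percentages (0-100)."""
    prev_fail = 100 - prev_accuracy
    curr_fail = 100 - curr_accuracy
    return 1 - curr_fail / prev_fail

# (2017 accuracy, 2018 accuracy) as quoted in the comment above.
scores = {
    "Google": (81, 88),
    "Siri": (53, 75),
    "Alexa": (63, 72),
    "Cortana": (56, 63),
}

for name, (prev, curr) in scores.items():
    print(f"{name}: {failure_reduction(prev, curr):.0%} fewer failures")
```

    The same helper shows why equal point gains are not comparable: going from 10% to 15% removes only about 6% of the failures, while going from 94% to 99% removes about 83% of them.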

    The same problem crops up when comparing car MPG, which is actually the inverse of fuel consumption, so the same MPG gain represents a smaller fuel saving the higher the MPG already is. e.g. Switching from a 20 MPG vehicle to a 25 MPG vehicle saves 3.6x more fuel than switching from a 40 MPG vehicle to a 45 MPG vehicle despite both improvements being 5 MPG.
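
    A quick check of the 3.6x figure, as a minimal Python sketch. The helper `gallons_saved` and the 1,000-mile distance are assumptions chosen for illustration; the ratio does not depend on the distance.

```python
# Hypothetical helper: fuel saved over a fixed distance when MPG improves.
def gallons_saved(old_mpg, new_mpg, miles=1000):
    return miles / old_mpg - miles / new_mpg

low = gallons_saved(20, 25)    # 50 - 40 = 10 gallons
high = gallons_saved(40, 45)   # 25 - 22.2 = about 2.8 gallons
print(round(low / high, 1))    # 3.6
```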

    It also crops up in disk speed benchmarks, which are done in MB/s, when your perception of speed is the inverse (how many seconds you wait for an op to complete). So the "huge" improvement in sequential speeds from 500 MB/s for a SATA SSD to 3000 MB/s for an NVMe SSD actually matters a lot less than a "tiny" improvement in 4k read speeds from 30 MB/s to 50 MB/s.
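
    The disk example works the same way. This is a rough sketch assuming the same amount of data is moved in each access pattern, which real workloads will not match exactly; `seconds_saved_per_mb` is a made-up name for illustration.

```python
# Hypothetical helper: wait time saved per megabyte when throughput improves.
def seconds_saved_per_mb(old_mb_s, new_mb_s):
    return 1 / old_mb_s - 1 / new_mb_s

sequential = seconds_saved_per_mb(500, 3000)  # about 0.0017 s per MB
random_4k = seconds_saved_per_mb(30, 50)      # about 0.0133 s per MB
ratio = random_4k / sequential
print(round(ratio, 1))                        # 8.0
```

    Under that assumption, the 4k improvement saves about eight times as much waiting per megabyte as the sequential one.
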
    • Honestly, the comparison of percentage success is of minimal concern here. What I'm more curious about is their testing methodology. Did they record someone and play the corresponding queries at the same volume and distance from each of the products' microphones while keeping the acoustics the same?

      If they didn't, then this test should be taken with a serious grain of salt, since enunciation and environment could be the biggest contributors to the differences.

    • by tlhIngan ( 30335 )

      The same problem crops up when comparing car MPG, which is actually the inverse of fuel consumption, so the same MPG gain represents a smaller fuel saving the higher the MPG already is. e.g. Switching from a 20 MPG vehicle to a 25 MPG vehicle saves 3.6x more fuel than switching from a 40 MPG vehicle to a 45 MPG vehicle despite both improvements being 5 MPG.

      That's why the metric unit is not kilometers per liter, but the awkward liters per 100 km (l/100km). This makes it much more obvious - going from a SUV that guzzles 20l/100km t

  • by rthille ( 8526 ) <web-slashdotNO@SPAMrangat.org> on Friday December 21, 2018 @04:23PM (#57843476) Homepage Journal

    Alexa, define "begs the question".

  • The questions listed are the types of questions these "assistants" are designed to answer. Go off the beaten path, and you get much worse results.

    For example, ask:
    "What street am I on?"
    "What city am I in?"
    "How many people are in my contact list?"
    "How many miles did I travel yesterday?"
    "When is my next dentist appointment?"
