Apple's Upgraded AI Models Underwhelm On Performance (techcrunch.com) 24

Posted by msmash on Tuesday June 10, 2025 @12:46PM from the tough-luck dept.

Apple's latest AI models continue to lag behind competitors, according to the company's own benchmark testing it disclosed this week. The tech giant's newest "Apple On-Device" model, which runs locally on iPhones and other devices, performed only "comparably" to similarly-sized models from Google and Alibaba in human evaluations of text generation quality -- not better, despite being Apple's most recent release.

The performance gap widens with Apple's more powerful "Apple Server" model, designed for data center deployment. Human testers rated it behind OpenAI's year-old GPT-4o in text generation tasks. In image analysis tests, evaluators preferred Meta's Llama 4 Scout model over Apple Server, a particularly notable result given that Llama 4 Scout itself underperforms leading models from Google, Anthropic, and OpenAI on various benchmarks.

This discussion has been archived. No new comments can be posted.

Maybe Apple Is waiting for AI stability (Score:1)

by postbigbang ( 761081 ) writes:

Training data litigation, model integrity, the plateau of model crash, all these are good reasons to wait until there are clear winners.
Microsoft and Google and Meta are pissing off the public with aggressive moves surrounding AI in apps.
A wait-and-see attitude isn't likely to leave Apple in the dust.
- Re: (Score:2)
  
  by Tony Isaac ( 1301187 ) writes:
  
  Nobody wins by sitting on their hands and waiting for others to take the first mover prize. If you're not competing, you're not in the race.
  - Re: (Score:2)
    
    by nightflameauto ( 6607976 ) writes:
    
    Nobody wins by sitting on their hands and waiting for others to take the first mover prize. If you're not competing, you're not in the race.
    In most fields, Apple is used to waiting for the first runner to take their go, then they design a pretty interface around the concepts learned from those first goes. While I agree that AI is more of a "first to the line gets the prize" situation, corporate structures tend to repeat patterns, and Apple is a huge behemoth to try and turn toward "first to the line" from "take what others learned and build prettier."
  - Re: (Score:1)
    
    by gabebear ( 251933 ) writes:
    
    Considering most of the companies making models are disregarding copyright to create them, what advantage would being "first" have? The first mover can't claim copyright. A good model would be used by a competitor to train another nearly identical model fairly quickly, even if it is only allowed to run on its owner's servers.
    
    I see server-based AI as having a very time-boxed appeal; local hardware is handling things better and better and buying LLM as a cloud service seems like a huge boondoggle from a latency an
    - Re: (Score:2)
      
      by Tony Isaac ( 1301187 ) writes:
      
      I would argue that copyright holders got fast and loose with their conflicting desire to 1) control the money flowing from their copyrighted works and 2) be indexed on freely available services like Google. In their pursuit of these goals, these copyright holders literally exposed all of their content to index bots, indicating in robots.txt that they were free to index and catalog the data. Then, when people clicked the links in Google, they were immediately confronted with a paywall. IF AI bots are viewed
      - Re: (Score:1)
        
        by gabebear ( 251933 ) writes:
        
        What is the first-mover advantage for AI companies to train their own models?
        
        If I accept your argument that making work indexable makes that work fair game for including in LLMs... then the LLMs also fall into the same category. It's relatively easy to index an existing LLM and use that to train a "new" extremely similar LLM. The hundreds of billions spent curating these LLMs seem like a boondoggle for the investors. I think a couple more DeepSeek type events will see the massive money pulled out of trai
        
        - Re: (Score:2)
          
          by Tony Isaac ( 1301187 ) writes:
          
          Yes, if an owner of an LLM makes their model indexable via robots.txt, then I agree it's fair game to those who want to download and "copy" it.
  - Re: (Score:1)
    
    by Steveftoth ( 78419 ) writes:
    
    Hate to break it to you, but none of Apple's products was the first of its kind; they were just the most usable/best ones. Vision was not the first VR, Watch was not the first smart watch, iPod was not the first mp3 player, PowerBook was not the first laptop, Lisa/Macintosh was not the first WIMP computer, and Apple I/II was not the first Home Personal Computer (or homebrew computer).
    They don't have to be first, they just have to make one that is better. Usually they have been buying the competition if they are
  - Perhaps most reliable or trustworthy tool is goal (Score:2)
    
    by drnb ( 2434720 ) writes:
    
    Nobody wins by sitting on their hands and waiting for others to take the first mover prize. If you're not competing, you're not in the race.
    Slowing down to make sure you have more reliable training data, less biased sources, etc., may give you that lead as you become recognized as the most reliable or trustworthy AI tool.
- Re: (Score:2)
  
  by MachineShedFred ( 621896 ) writes:
  
  Either that, or they're doing what they've done for decades: suffer from "Not Invented Here" syndrome until either they come up with something that works, or buy something that works.
  There is no shortage of examples of Apple doing this.
- Re: (Score:2)
  
  by anonymouscoward52236 ( 6163996 ) writes:
  
  Don't worry. Apple will come out with AI, for the first time in the entire industry, 3 years later. They'll have a huge on-stage event about it saying, "Apple Invents AI".
  - Re: (Score:2)
    
    by Pinky's Brain ( 1158667 ) writes:
    
    Strictly speaking they did, because when Apple says AI they mean Apple Intelligence.
- Re: (Score:2)
  
  by DarkOx ( 621550 ) writes:
  
  From a hardware standpoint Apple SI is probably going to be a winner, at least in the workstation and professional space, likely the enthusiast space. PCs with some NPU or GPU will get the bottom end, and NVIDIA probably ends up owning the large enterprise and scientific part of the market.
  I guess where I am going is Apple probably has the right hardware solution and they can always license a better model, and if model quality is the problem they will probably end up doing exactly that, rather than lose out on
- Re: (Score:2)
  
  by schwit1 ( 797399 ) writes:
  
  Paralysis of analysis. Jobs would have been out front.
  - Re: (Score:2)
    
    by postbigbang ( 761081 ) writes:
    
    Patently not so. There were a myriad BAD phones out there until the iPhone won it from a design perspective.
    When Apple changed to OS/X, there was a huge wall of competition known as Windows stuff. Macs continue to thrive.
    The iPad seemed like it was plainly stupid, until it took off.
    Lots of watch makers until that worked for Apple.
    Apple is excellent if TRAILING-EDGE. Let other titans fight and cross-fertilize themselves with deals, litigation, and harrumphing. My guess is that Apple waits for the dust to se
"Comparable" is a win for a late mover (Score:2)

by Tony Isaac ( 1301187 ) writes:

Apple got caught completely flat-footed by the AI tsunami. They were late to the game, and quickly realized they had missed the boat. For them to be seen as "comparable" at this point seems like a win to me. And I say that as an Android user.
Never mind the AI quality, feel the glassy width (Score:2)

by greytree ( 7124971 ) writes:

The software is lame, you're stuck in their walled garden, there are much better products for the same ... ooh shiny.
- Definitely glassy (Score:2)
  
  by Powercntrl ( 458442 ) writes:
  
  We've actually kind of circled back around to where Slashdot's Apple story icon almost looks as if it fits the "new" design paradigm. Did I miss the story where we complained about iOS 26's UI changes, or are we leaving that one for Reddit?
  Instead of making Siri not a steaming pile of hot garbage, we got a distracting, low-contrast UI redesign that no one over 40 asked for.
Maybe they didn't go full pirate? (Score:2)

by Pinky's Brain ( 1158667 ) writes:

Apple is just scraping everything on the web for unintended purposes and pretending robots.txt is a license to copy stuff into the training set. Maybe they didn't go full pirate with books yet though?
Can't get SOTA without going full pirate.
Apple devs hate AI (Score:2)

by Neuroelectronic ( 643221 ) writes:

I think Apple's dev culture simply hates AI because (some made up moral reason) but actually because it supplants their creative customer base. So they resent working with it and integrating it into the ecosystem.
I don't think there are many places it should be integrated directly, honestly. Siri might get a boost but Siri is better as something that looks up authoritative sources than trying to solve complicated problems in a speech bubble that disappears.
There really isn't a good place for AI in an operat
