Report: Apple's AI and 'Siri' Efforts Hindered by Caution, Dysfunction (macrumors.com)
The Information reports:
Late last year, a trio of engineers who had just helped Apple modernize its search technology began working on the type of technology underlying ChatGPT... For Apple, there was only one problem: The engineers no longer worked there.
They'd left Apple last fall because "they believed Google was a better place to work on LLMs...according to two people familiar with their thinking... They're now working on Google's efforts to reduce the cost of training and improving the accuracy of LLMs and the products based on these models, according to one of those people."
MacRumors summarizes the article this way: "Siri and Apple's use of AI has been severely held back by caution and organizational dysfunction, according to over three dozen former Apple employees who spoke to The Information's Wayne Ma." The extensive paywalled report explains why former Apple employees who worked in the company's AI and machine learning groups believe that a lack of ambition and organizational dysfunction have hindered Siri and the company's AI technologies. Apple's virtual assistant is apparently "widely derided" inside the company for its lack of functionality and minimal improvement over time. By 2018, the team working on Siri had apparently "devolved into a mess, driven by petty turf battles between senior leaders and heated arguments over the direction of the assistant."
Siri's leadership did not want to invest in building tools to analyse Siri's usage, and engineers lacked the ability to obtain basic details such as how many people were using the virtual assistant and how often they were doing so. The data about Siri that did come from the data science and engineering team was simply not being used, with some former employees calling it "a waste of time and money..." Apple executives are said to have dismissed proposals to give Siri the ability to conduct extended back-and-forth conversations, claiming that the feature would be difficult to control and gimmicky. Apple's uncompromising stance on privacy has also created challenges for enhancing Siri, with the company pushing for more of the virtual assistant's functions to be performed on-device.
Cook and other senior executives requested changes to Siri to prevent embarrassing responses, and the company prefers Siri's responses to be pre-written by a team of around 20 writers rather than AI-generated. There were also specific decisions to exclude information such as iPhone prices from Siri in order to push users directly to Apple's website instead. In 2019, Siri engineers working on the feature that uses material from the web to answer questions clashed with the design team over how accurate the responses had to be. The design team demanded a near-perfect accuracy rate before the feature could be released. Engineers claim to have spent months persuading Siri designers that not every one of its answers needed human verification, a limitation that made it impossible to scale up Siri to answer the huge number of questions asked by users.
Similarly, Apple's design team repeatedly rejected the feature that enabled users to report a concern or issue with the content of a Siri answer, preventing machine-learning engineers from understanding mistakes, because it wanted Siri to appear "all-knowing."
I thought "Dysfunction" was the name of the game? (Score:4, Informative)
Because that is really where AI is now: It can sort-of communicate, but the rest is basically citations (and short-distance derivative stuff) and hallucinations.
Re: (Score:3)
To say that modern LLMs produce a "convincing illusion" is about as generous as I'm willing to be. Of course, Eliza also produced a "convincing illusion". It turns out we're easily fooled when we want to be.
I really don't like the term "hallucination" as it implies that far more is happening than is actually happening. Make no mistake, that term was selected specifically because it was misleading. "It tries to produce factual output, but it sometimes hallucinates facts" sounds a lot better than "it will
Re: (Score:2)
You are not wrong. I also agree that the term "hallucination" is basically a lie that tries to put a much better interpretation on what is actually happening: namely, that there is absolutely no understanding or insight in the box. A hallucination is something you can typically fix, and the term suggests that insight and understanding are otherwise at work. Of course, that is not the case.
Re: (Score:2, Troll)
Re: (Score:2)
AI continues to evolve and will soon exceed human intelligence.
That's just delusional. It shows a fundamental lack of understanding that, at this point, can only be deliberate. You don't want to acknowledge the reality here. I can only speculate as to why, but if you don't want facts to get in the way of your fantasy, you should probably find a different forum. I hear that the LessWrong cult is always looking for credulous members, I'm sure you'd fit right in.
Re: I thought "Dysfunction" was the name of the ga (Score:3)
Re: I thought "Dysfunction" was the name of the ga (Score:5, Insightful)
It's not denial, it's just reality. AI is not 'evolving' in any meaningful sense of the word. GPT-4 is "better" than GPT-2, sure, but that's primarily because the model is significantly larger. There isn't anything fundamentally different about it, certainly nothing that would make it capable of the things you seem to think it can do.
You can pretend that larger models are all you need, but that's silly nonsense. As I've pointed out many times in the past, the gains you get from increasing the size of the model decrease exponentially. Doubling the size of the model does not double the performance. At this point, I doubt it would be possible to notice any improvements from a mere doubling. GPT-3 has ~175 billion parameters, GPT-4 has ~1 trillion -- it's around 6 times the size of GPT-3, yet it's only marginally better.
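Published scaling-law results describe something like a power law of loss in parameter count, which makes the same qualitative point about doubling. A toy sketch under that assumption (the constants below are invented for illustration; they are not fitted to any real GPT model):

# Toy sketch of diminishing returns from parameter count alone, assuming a
# Chinchilla-style power law L(N) = a * N^(-alpha). All constants are made
# up for illustration, not fitted to any real model.
def loss(n_params: float, a: float = 400.0, alpha: float = 0.076) -> float:
    return a * n_params ** -alpha

for label, n in [("~GPT-3", 175e9), ("doubled", 350e9), ("~6x GPT-3", 1.05e12)]:
    print(f"{label:>10}: {n:.2e} params, loss {loss(n):.1f}")

# Approximate output: 55.8, 53.1, 48.8. Under these made-up constants,
# doubling buys roughly a 5% loss reduction; a 6x jump buys roughly 12%.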
Oh, and as absurdly large as the latest models are, they still can't do simple arithmetic. What does that tell you?
I strongly recommend that you take some time to learn about how these kinds of models work. It will give you a much more realistic understanding of their capabilities and limitations. I called your ridiculous claims delusional for a reason. These things simply aren't capable of the things you believe them to be.
In my line of work, we're already using AI for code generation, testing, documentation, training, etc.
I'm sure you are. People waste time, money, and effort on stupid fads all the time. I suspect this one will last until people figure out that it doesn't actually improve productivity.
Thinking that these things will save you time and effort writing code is probably the funniest thing I've ever seen. I've tried, knowing full well that it was a futile effort, because I'm open-minded. For my trouble, I now have a collection of some of the funnier failures that pointless exercise produced. It's amazing how much time I've seen people waste on that particular absurdity, all while insisting, against all reason, that it was actually making them more productive.
If you want a real way to boost your productivity, stop chasing fads and trends.
Re: (Score:2)
Bullshit AI (Score:5, Interesting)
eh, ChatGPT et al are convincing bullshitters, which is quite impressive, if not what AI should be targeting. More accurately, ChatGPT is a confabulator [wikipedia.org], because it cannot tell whether it is lying or not. It draws on stored information (memories) but is missing a lot and lacks understanding of context, so it stitches together a story that may or may not have any congruence with reality.
Confabulation is often associated with diseases such as Alzheimer's, or when there has been memory loss, or as part of the expression of other pervasive mental disorders.
From the Turing Test point of view, sure, I think it is difficult to tell whether I'm talking to a machine or a convincing con man, but I wouldn't use this so-called AI for anything critical.
The fact that it is commercially useful is food for thought.
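The confabulation framing matches the mechanics: generation draws each next token from a probability distribution, and nothing in that step distinguishes plausible from true. A minimal toy sketch (the vocabulary and logits are invented for illustration; no real model is involved):

import math
import random

def sample_next(logits: dict[str, float], temperature: float = 0.8) -> str:
    """Softmax-with-temperature sampling over candidate next tokens."""
    scaled = [(tok, l / temperature) for tok, l in logits.items()]
    peak = max(s for _, s in scaled)
    weights = [(tok, math.exp(s - peak)) for tok, s in scaled]
    r = random.uniform(0.0, sum(w for _, w in weights))
    for tok, w in weights:
        r -= w
        if r <= 0:
            return tok
    return weights[-1][0]  # guard against floating-point round-off

# Hypothetical logits after "The capital of Australia is": the wrong but
# fluent answer carries real probability mass, so sometimes it wins, and
# the sampler has no notion that only one of these strings is a fact.
print(sample_next({"Canberra": 2.0, "Sydney": 1.7, "Melbourne": 0.4}))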
Re: (Score:2)
"Confabulator" is a more accurate, agreed.
The fact that it is commercially useful is food for thought.
Well, given the quality of "tech support" often encountered, I am not that surprised. The natural language interface is obviously useful. But the world model is on the level of a somewhat advanced toy and more likely to cause problems than offer reasonable solutions.
Personally, I am waiting for the first attacks on "google-coders" using these things for writing their programs. All it takes is getting some somewhat subtly flawed code into the model that is not known
Re: (Score:2)
Whilst I am amazed at what ChatGPT manages to do, I agree that confabulation is a good description of it, although I kind of also like to call it regurgitation.
The bit about its passing Turing tests, I think, reflects more on how little thinking people do in real life than on whether there's any real sentience involved.
To give an example, if someone happened to ask me, hey what do you think about the situation in Sudan? It would be quite easy for me to just recall whatever I happ
Re: (Score:2)
I like it.
"ChatGROK"
Copyright that you should.
Caution is good... (Score:1)
... dysfunction, not so much.
I, for one, welcome a long delay before bowing down to our new Cupertino-based robot^H^H^H^H^HAI Overlords.
âOE (Score:3)
and Siri is not smart enough to realize that /. doesn't take Apple-specific characters like âOE
Re: (Score:2)
Nah - âOESiriâOE is the internal code name.
Re:âOE (Score:5, Insightful)
and Siri is not smart enough to realize that /. doesn't take Apple-specific characters like âOE
Wrong. You've got it backwards.
Slashdot isn't smart enough to realize that Unicode has been an accepted standard for decades.
Re:âOE (Score:4, Informative)
Actually, the slash engine has been fully Unicode compliant for decades now.
What the site admins did was put in a whitelist of allowed Unicode characters (basically the ASCII set, stripping the high bit off), leaving the mojibake you see.
The reason for this is that Unicode is poison if you don't sanitize it: there are numerous control codes that will screw with the site layout (e.g., RTL and LTR overrides) as well as decorations that can be used to render a whole page black. It's hard to parse and handle sanely, given the number of Unicode hacks out there. And Unicode keeps changing, with more codes added every year, so a blacklist would need routine updating. The admins hated doing that, so they implemented a whitelist instead.
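A minimal sketch of that allowlist idea (my own illustration of the approach, not the actual slashcode):

# Keep only characters we explicitly trust: printable ASCII plus a little
# whitespace. Newly minted dangerous code points are rejected by default,
# with no blacklist to maintain.
def sanitize(text: str) -> str:
    return "".join(ch for ch in text if 32 <= ord(ch) < 127 or ch in "\n\t")

# U+202E (RIGHT-TO-LEFT OVERRIDE) would visually reverse the rest of a
# rendered line; the allowlist drops it without ever having to name it.
print(sanitize("normal text \u202ehidden payload"))  # "normal text hidden payload"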
Re: (Score:2)
Actually, the slash engine has been fully Unicode compliant for decades now.
What the site admins did was put in a whitelist of allowed Unicode characters (basically the ASCII set, stripping the high bit off), leaving the mojibake you see.
The reason for this is that Unicode is poison if you don't sanitize it: there are numerous control codes that will screw with the site layout (e.g., RTL and LTR overrides) as well as decorations that can be used to render a whole page black. It's hard to parse and handle sanely, given the number of Unicode hacks out there. And Unicode keeps changing, with more codes added every year, so a blacklist would need routine updating. The admins hated doing that, so they implemented a whitelist instead.
First, thank you for a most considered, informative and erudite response!
One thing though: how is it that all the other forums I visit don't seem to have any problems with little johnny droptable, and yet evince a wide gamut of Unicode support, styled text, and more?
What Secret Sanitizer Sauce do they have that Slashdot's web coders don't? Serious question.
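I can't speak for those sites' actual code, but the textbook recipe (an assumption about what they do, not inside knowledge) treats the problems separately: parameterized SQL makes little johnny droptable inert data, escaping at output time neutralizes markup, and only a small, targeted set of layout-affecting code points needs filtering, so the rest of Unicode passes through untouched. A sketch:

import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")

# Parameterized query: SQL text and user data never mix, so the injection
# attempt is stored as an ordinary, harmless string.
user_input = "Robert'); DROP TABLE comments;--  <b>ünïcödé</b>"
conn.execute("INSERT INTO comments (body) VALUES (?)", (user_input,))

# Escape at output time: markup is neutralized, Unicode survives intact.
(body,) = conn.execute("SELECT body FROM comments").fetchone()
print(html.escape(body))

A targeted filter for the handful of bidi and zero-width controls, rather than a blanket ASCII allowlist, would then cover the layout attacks the parent describes.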
Accurate assistant (Score:5, Informative)
So they want to make an AI assistant that doesn't invent its answers and does not create nonsensical results?
Sounds good to me. Too bad it will never be released on time...
Re: (Score:2)
Probably cannot be done at this time at all. Fact-checking is way harder than regurgitating things and creating random fantasies.
Re: Accurate assistant (Score:2)
Re: (Score:2)
General population? Yes. Probably the reason why they all (mistakenly) believe this is a really fundamental breakthrough.
Experts? No. Something like Wolfram Alpha is impressive in its very limited domain. ChatGPT is not. Well, the natural language interface is a bit impressive but that is it.
Re:Accurate assistant (Score:5, Insightful)
Re: (Score:2)
Oddly enough, I think Apple made the right call here. Even with all of their careful curation and testing, they've still had a few embarrassing moments. Given how modern LLMs work, it's absolutely astonishing that the things ever output anything other than complete nonsense.
Re: (Score:2)
So they want to make an AI assistant that doesn't invent its answers and does not create nonsensical results?
When are they going to make that, then?
Have you ever tried using Siri for anything past searching the web? Because it's terrible. One of the most hilarious things is that when you ask for directions to some place, Siri triggers the generic map response, which causes it to ask you whether you want directions there or to call it.
I remember asking it for directions to my father's house and it ended up somehow looking up his name, and then searching for places with his name in it, rather than using the address associated w
Re: (Score:1)
Too bad it will never be released on time...
Or ever. AI is trained on material prepared and created by humans. You can't expect it to perform better than humans who give equally nonsensical results.
What is âOESiriâOE (Score:4, Insightful)
Re: What is âOESiriâOE (Score:2)
I think it's calling Siri an a**hole in polite technical jargon, in a public forum
Siri and Alexa are both shit because corporations (Score:1)
Corporate incentives cause apps like these virtual agents to dwindle into shit, because the KPIs are about shipping whiz-bang new features and (half-assed / unfinished) total replacements rather than about investing in making what exists actually better. No one ever got a raise for refactoring a shitty codebase.
Generative AI / IFTTT IoT agent startup would win (Score:1)
(Related to my other comment.) This is how one would make a decent personal virtual agent because the constraints of corporations don't welcome it.
And so it begins (Score:1)
These are the first signs of the absence of Jobs. Hell, even with Jobs they might have headed down this path and booted him out again.
Apple's success has always been a balance. Jobs' crazy vision vs Wozniak's stability. Then when Apple got big it was Jobs vs everyone else, and this didn't work, resulting in Jobs getting booted out of the company. Apple then sank into a pit of horrible ideas and was on the verge of bankruptcy. Jobs came back and brought a slew of new ideas. This time there was more
3rd Party Integrations (Score:2)
Re: (Score:2)
Re: (Score:2)
Apple can literally afford to make everything in house. For example, they even design their own processors. And often, by doing it in house, they do it better. For example, their processors are really good.
Being able to afford it and doing it well are two different things. Apple has been doing custom silicon pretty consistently since its founding, and custom CPUs are a natural extension of that. They have almost always done it well.
One thing that Apple has never done well is web services. The current offerings are better than previous ones, but that's not a high bar, given how bad iSync was. The iTunes Store is remarkable for a site that I'm assuming is still built on top of WebObjects, but at last check
Re: (Score:2)
Apple has been doing custom silicon pretty consistently since its founding, and custom CPUs are a natural extension of that. They have almost always done it well.
(Just an off-topic historical tidbit here - I do not disagree with any of your main points)
Roughly a decade after their founding, once Woz was no longer head engineer of their leading products starting with the Macintosh.
Woz was a *huge* open/standards hardware proponent. The Apple 1 and 2 ("][" and "][+") were built entirely from off-the-shelf components.
Thanks. I completely forgot about the pre-Mac hardware (which is particularly bad, because I own an Apple IIgs).
The //gs was probably the first candidate for truly custom silicon. (The Apple /// is the only system prior to the GS where I can't say for sure)
The Apple III does not have any custom silicon. But you missed a couple of custom ASICs [applelogic.org].
The MMU and IO chips in the Apple IIe were at least arguably their first custom ASICs, I think. Previous models used a bunch of discrete TTL logic. Those chips wrapped those bits into a single chip. The MMU consolidated the memory management hardware, and the IOU consolidated the discrete logic for the display (with two versions,
Re: 3rd Party Integrations (Score:2)
Re: (Score:2)
The reason these kind of problems start is ... (Score:1)
These are exactly the kind of problems most corporations have. Apple didn't use to have them because it had an actual qualified technologist at the helm. But since Steve Jobs died, the bean counters have been in charge, and, just like in every other large corporation besides Tesla & SpaceX once they take over, nobody high enough in the company to get stuff done understands the technology well enough to get anything done. Next step: mediocrity. Sigh. It's not Tim Cook's fault; he's just not technically qualified.
Re: (Score:2)
Neither was Steve Jobs. He was not a programmer or engineer; he was an ideas guy.
Cloud Or DIe (Score:1)
Re: Cloud Or DIe (Score:2)
I wonder if (Score:2)
some of the problems companies and the CCP have been having with their AI models involve keeping them properly politically correct and in line with the company's politics.
{O.O}
Copeland redux (Score:1)
The same Google that gave us Bard? (Score:2)
So let me get this straight. A bunch of Apple engineers left for Google because they felt Apple wasn't doing anything big with AI.
And this was the same Google that, after seeing ChatGPT, went nuts and gave us Bard?
Maybe there's a reason why Apple is taking things slow. Apple is almost never the first to do anything, but usually when they do something it comes out nice and polished. I would probably believe Apple is just sitting back and seeing where ChatGPT and the others are heading before actually deciding
Re: The same Google that gave us Bard? (Score:2)