


Apple To Analyze User Data on Devices To Bolster AI Technology
Apple will begin analyzing data on customers' devices in a bid to improve its AI platform, a move designed to safeguard user information while still helping it catch up with AI rivals. From a report: Today, Apple typically trains AI models using synthetic data -- information that's meant to mimic real-world inputs without any personal details. But that synthetic information isn't always representative of actual customer data, making it harder for its AI systems to work properly.
The new approach will address that problem while ensuring that user data remains on customers' devices and isn't directly used to train AI models. The idea is to help Apple catch up with competitors such as OpenAI and Alphabet, which have fewer privacy restrictions. The technology works like this: It takes the synthetic data that Apple has created and compares it to a recent sample of user emails within the iPhone, iPad and Mac email app. By using actual emails to check the fake inputs, Apple can then determine which items within its synthetic dataset are most in line with real-world messages.
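The ranking step described above can be sketched in a few lines. This is a hypothetical illustration, not Apple's actual pipeline: the device scores server-supplied synthetic candidates against local messages (here with simple bag-of-words cosine similarity) and reports back only the index of the best match, never the messages themselves.

```python
from collections import Counter
import math

def bag(text: str) -> Counter:
    """Bag-of-words vector for a piece of text."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_synthetic(local_emails, synthetic_candidates):
    """On-device step: rank the synthetic candidates by their best
    similarity to any local message and return only the winning
    index -- the raw emails never leave this function."""
    scores = [
        max(cosine(bag(email), bag(cand)) for email in local_emails)
        for cand in synthetic_candidates
    ]
    return scores.index(max(scores))

emails = ["dinner at seven tonight?", "quarterly report attached"]
candidates = [
    "let us schedule dinner tonight",
    "stock tips you cannot miss",
]
print(best_synthetic(emails, candidates))  # → 0 (the dinner candidate)
```

The privacy argument in the thread below turns on exactly what that returned index reveals, which is why even this minimal sketch keeps the emails inside the scoring function.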
Opt-in then forced opt-in again later? (Score:5, Interesting)
So, with Apple's tricks [macintouch.com], will this be an optional "opt-in", then a "whoops, you got opted in automatically without notice" with every subsequent update?
Re: (Score:2, Interesting)
Of course. Nobody would opt-in to this bullshit but Apple's feet are on the coals so they, like every business, must squeeze the lifeblood out of every customer.
We need do-not-tamper / do-not-profile regulations (Score:2)
Really, we need some regulator to get to the point of saying:
"Big tech wants common carrier liability waivers, right? Then, OK, you have to ensure that email, text messages, photos, and PII on the devices and online services are not used to train any AI models in original form, modified form, or anonymized form."
"And, note, this includes biometrics too: typing speed, typing cadence, mouse movement paths and speed, click rates, etc."
Re: (Score:2)
Opt-in, and that preference sticking, is critical. I'm all for Apple taking a different approach from the "if you haven't completely prevented me from getting your data it is mine to use however I want" style generally in use.
I wonder how much of the phone resources/battery life this on device processing will use.
Re: (Score:2)
I wonder how much of the phone resources/battery life this on device processing will use.
That'd be my main reason to disable it - their entire software stack is already so bloated, buggy and slow, it makes using even a 2-year old device 'un-delightful' already. This would only make it worse.
No thanks (Score:2)
Yet more reasons to stick with our 13 and 14 Pros
Re: (Score:2)
12, 11, and older.
Re: All of the AI companies will start doing this (Score:1)
Interestingly, this kind of tech is an excellent response to China's nosiness. Other smartphone OSes, and their related services in China, are an open book to the government. With the process described in the link, the government could discover that some X percentage of users are messaging on topics the government deems taboo. However, they won't know who is doing that.
It's possible China will ban this technology altogether, as it won't provide the level of surveillance the government wants.
This will not safeguard private data (Score:2)
An AI training program (which is presumably owned by Apple and licensed for your use on the iPhone) wouldn't be authorized to read these emails and will definitely not be authorized to arbitrarily act upon them.
In particular, comparing the contents of confidential emails against a synthetically produced external set of contents, so as to favour the more relevant, synthetically produced, samples, is a form of exfiltration of the data.
Not copying your data to synthetics ... (Score:3)
In particular, comparing the contents of confidential emails against a synthetically produced external set of contents, so as to favour the more relevant, synthetically produced, samples, is a form of exfiltration of the data.
Not really, there is no exfiltration. They are not copying anything of yours to their synthetics or anywhere else. What they are doing is ranking their synthetics for a match to your data. Anything matching your data was preexisting data in their synthetics. You can't exfiltrate something they already have.
What you can do is confirm a guess when actual data matches preexisting synthetic data, as in your hangman letters example. Calling this exfiltration is a little misleading; it's confirmation. So the risk here is one of Personally Identifiable Information (PII).
Re: (Score:2)
The ranking is enough. You use the synthetics as a basis, and look for combinations of synthetics that recreate the unknown data. This falls within the purview of latent semantic analysis, and what is kn
Personally Identifiable Information is key here (Score:2)
Yes, "guesswork" can be powerful; see WW2 Bletchley Park. They weren't always dealing with specific detailed orders to units. Sometimes it was something vague like a person i
Re: (Score:2)
Not really, there is no exfiltration. They are not copying anything of yours to their synthetics or anywhere else. What they are doing is ranking their synthetics for a match to your data. Anything matching your data was preexisting data in their synthetics. You can't exfiltrate something they already have.
This is like arguing oracle attacks can't be used to decrypt ciphertexts. They are not copying... they are deriving... a distinction without a difference.
What you can do is confirm a guess when actual data matches preexisting synthetic data, as in your hangman letters example. Calling this exfiltration is a little misleading; it's confirmation. So the risk here is one of Personally Identifiable Information (PII). Is this match recorded alongside your PII? Or is this match recorded without any PII, such that Apple could not connect it to a user even if they wanted to?
No this is actually exfiltrating data. No corporation would interpret unauthorized external queries "ranking synthetics" to their data as anything other than an attack because it is.
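The oracle-attack analogy can be made concrete. In this illustrative sketch (hypothetical names, not any real system), an attacker never sees the hidden text; it only asks a ranking service "how well does this candidate match?" and, hangman-style, that score signal alone is enough to reconstruct the text character by character:

```python
import string

ALPHABET = string.ascii_lowercase + " "

def match_oracle(secret: str, candidate: str) -> int:
    """Models a ranking service: returns how many leading characters
    of the candidate match the hidden secret. The caller sees only
    this score, never the secret itself."""
    n = 0
    for s, c in zip(secret, candidate):
        if s != c:
            break
        n += 1
    return n

def reconstruct(oracle, length: int) -> str:
    """Recover the hidden text one character at a time by extending
    the known prefix with whichever character the oracle scores
    highest -- confirmation queries doing the work of exfiltration."""
    known = ""
    for _ in range(length):
        best = max(ALPHABET, key=lambda ch: oracle(known + ch))
        known += best
    return known

secret = "tax audit pending"
oracle = lambda cand: match_oracle(secret, cand)
print(reconstruct(oracle, len(secret)))  # → tax audit pending
```

This is why "we only rank, we don't copy" is a weak defense in general: whether a real deployment leaks this way depends on how noisy and how rate-limited the match signal is, which is exactly the detail the thread is asking Apple to clarify.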
Re: (Score:2)
Can confirming guesses be powerful? Yes; see my other response and my Bletchley Park example. Also note that without the ability to connect the guesswork to Personally Identifiable Information, there is little utility beyond training an AI.
Personally Identifiable Information is the key issue and Apple needs to clarify things.
Re: (Score:2)
For example, a percentage of email communications that I receive includes confidentiality disclaimers in the footers, including legal requirements to not distribute the email to non-recipients, etc. I'm sure other slashdotters are in the same situation.
Those are nonsensical, and roughly as binding as the following paragraphs:
You are not authorized to read the paragraph above this one. If you read it anyway, you owe me $500 and need to turn yourself in to the nearest FBI headquarters to begin serving 6 months jail time in a federal minimum security prison.
In addition, responding to this post will place you in debt to me for $1947 per word in your reply, plus $193 for every paragraph beyond two.
Re: (Score:2)
Re: (Score:2)
What I said above is basically legal consensus. Enforcing confidentiality requires a contract, and your putting words at the end of an email does not magically create a contract between you and some random recipient. You will be hard-pressed to find a lawyer that will advise you otherwise.
As the Apex Law Group states rather well [apexlg.com]:
Re: (Score:1)
My secret word is "bullshit".
Now, tell me what the secret is.
Re: (Score:3)
If that's your standard for those emails with confidentiality statements (they are not disclaimers), you already have a problem, because your device and maybe your email provider already read them to determine if they are spam. Also, unless the email body has been encrypted, they've sat in SMTP server queues in plain text where nefarious people could read them.
Those "disclaimers" aren't really worth anything.
Nope (Score:1)
Re: (Score:2)
Why would you rather get random ads, rather than ads for shit you're actually interested in?
You are part of the problem. You fail to see any options other than the shit being thrown in your face. How about no ads and none of my information used in any way beyond what I explicitly allow. e.g. it's ok to look at the destination of the email I pressed send on, it's not ok to use the email contents to validate your AI model.
Re: (Score:1)
it's ok to look at the destination of the email I pressed send on, ...
No it's not ok - if I had bought a $100 android piece of junk then I'd assume that I was the product in that business model and expect all the crap that goes with it (lots of ads, poor privacy, no security, etc.). When I spend north of $1000 on a phone I don't need and rarely use, I at least expect the vendor to stick to (UK / EU) data protection law.
Re: (Score:2)
They need to look at the destination address of an email to know where to send it. Which I have implied consent to by pressing send. Now if they use that for anything other than routing the message, that's not what I consented to.
Re: (Score:2)
Why would you rather get random ads, rather than ads for shit you're actually interested in?
Because I prefer privacy.
The Grim AI Data Reaper is Coming for Your Work (Score:2)
Bold and smarmy. (Score:2)
What's great, of course, is that they are doing all this in a black box, whose security vs. people who aren't them they guard fairly jealously; and they are making no
Rotten Apple (Score:2)
Straight "fuck, no!" from me. Time to dump this rotten fruit into a bin.
It may ultimately only be symbolic, but... (Score:2)
Now try selling this BS to your customers (Score:2)
Reminds me of the NSA hearings years ago, where they claimed they didn't "collect" nearly everyone's data that they actually collected, because according to the NSA, collecting doesn't count as collecting until you look at it or use it.
Nobody on earth is even going to try to parse the distinction without a difference being made here. All they will hear is that Apple is rummaging through their shit to train AIs, and they won't be wrong in hearing that.