Apple to analyze on-device data for AI training, vows to uphold user privacy

midian182

A hot potato: The idea of Apple analyzing data from users' devices to train its AI models isn't going to be welcomed by most people. Nevertheless, the company is taking this action as it looks to improve its Apple Intelligence services. However, Cupertino says its unique approach to this process will protect user privacy.

Apple writes that its "differential privacy" approach works by first generating synthetic data to mimic the format and important properties of user data, such as emails.

"When creating synthetic data, our goal is to produce synthetic sentences or emails that are similar enough in topic or style to the real thing to help improve our models for summarization, but without Apple collecting emails from the device," the company said in a post.

The synthetic data is converted into what Apple calls an embedding, a numerical representation that contains key attributes such as language, topic, and length.

Apple then randomly polls devices of users who have agreed to share information with the company. It sends the embeddings of its synthetic data to a small number of users who have opted in to Device Analytics.

Participating devices then select a sample of recent user emails and compute their embeddings. Each device decides which of the synthetic embeddings is closest to these samples. The results show Apple how closely its synthetic data matches real emails, and the company uses them to improve its models if needed.
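For illustration, the on-device matching step might look something like the following Python sketch. It assumes cosine similarity as the closeness measure and uses tiny made-up three-dimensional vectors; the function names and data are hypothetical, and Apple has not published its implementation in this form.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def closest_synthetic(synthetic, local):
    """Return the index of the synthetic embedding that is most similar
    to any of the locally computed email embeddings (hypothetical helper)."""
    best_index, best_score = -1, -2.0  # cosine similarity is never below -1
    for i, candidate in enumerate(synthetic):
        score = max(cosine_similarity(candidate, e) for e in local)
        if score > best_score:
            best_index, best_score = i, score
    return best_index

# Made-up data: three synthetic embeddings sent by the server,
# two embeddings computed on the device from recent emails.
synthetic_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
local_embeddings = [[0.1, 0.9, 0.2], [0.0, 0.8, 0.1]]

choice = closest_synthetic(synthetic_embeddings, local_embeddings)
print(choice)
```

In this sketch only the index of the winning synthetic embedding would ever need to leave the device; the emails and their embeddings stay local.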

Apple emphasizes that it does not collect emails or texts from users, and only sees commonly used prompts. Moreover, data from a device is not associated with an IP address or any ID that could be linked to an Apple Account.

According to the post, differential privacy allows Apple to train its AI models to create better text outputs in features like email summaries, while protecting privacy.
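The article does not say which noise mechanism Apple uses. As a rough illustration of how a differential privacy scheme can hide any single device's answer while still revealing aggregate trends, here is a Python sketch of randomized response, a classic textbook technique. The function names, the 50% truth probability, and the simulated counts are all assumptions made for this example, not details from Apple's post.

```python
import random

def randomized_response(true_index, num_options, p_truth, rng):
    """Report the true choice with probability p_truth,
    otherwise report a uniformly random option."""
    if rng.random() < p_truth:
        return true_index
    return rng.randrange(num_options)

def estimate_counts(reports, num_options, p_truth):
    """Undo the expected effect of the noise on the aggregate counts.
    E[observed_k] = p_truth * true_k + (1 - p_truth) * n / num_options."""
    n = len(reports)
    observed = [0] * num_options
    for r in reports:
        observed[r] += 1
    return [(c - (1 - p_truth) * n / num_options) / p_truth for c in observed]

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Simulated population: 10,000 devices, each with a "closest" synthetic option.
true_choices = [0] * 7000 + [1] * 2000 + [2] * 1000
reports = [randomized_response(c, 3, 0.5, rng) for c in true_choices]

estimates = estimate_counts(reports, 3, 0.5)
print([round(e) for e in estimates])
```

Because any individual report may be random, the server cannot tell what a particular device really chose, yet the corrected totals still land close to the true 7,000 / 2,000 / 1,000 split.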

Apple is using differential privacy to improve its Genmoji models and will eventually use it for Image Playground, Image Wand, Memories Creation, Writing Tools, and Visual Intelligence.

Apple will roll out the AI training system in upcoming beta versions of iOS 18.5, iPadOS 18.5, and macOS 15.5.

The iPhone maker has long prided itself on being a bastion of user privacy. While this differential privacy approach is certainly better than AI companies simply scraping people's data, it could result in Apple Intelligence not being as effective as rival AI platforms.

It's time there was a law letting people fully opt out of things like telemetry and other services that treat consumers as potential data-mining projects.
A better law would be that data collection is illegal unless a user explicitly opts in to a specific form of tracking. This would save us all the idiotic cookie prompts, etc.
 
So what happens if you have your own or someone else's intellectual property or copyrighted material on your device? Can Apple legally train its AI with that data? This is the same issue that is plaguing other AI systems, which face lawsuits for using copyrighted books and articles to train their AIs.