Apple Denies Using Unethically Sourced Data for Apple Intelligence Training

Apple Confirms Ethical Data Practices and Clarifies Usage of OpenELM Models

18-07-2024 01:54

Apple has addressed concerns that unethically sourced data was used to train its AI systems. The clarification follows reports that "the Pile," a dataset compiled by the AI research lab EleutherAI, includes subtitles harvested from YouTube videos without creators' consent, along with data from Wikipedia, the European Parliament, and Enron staff emails.

Key Points of Clarification:

EleutherAI's Dataset Usage: The Pile was created by EleutherAI to lower the barrier for AI development outside Big Tech. While companies such as Nvidia, Salesforce, and Apple have utilized the Pile for various AI projects, Apple has specified that this dataset was not used for training Apple Intelligence.
OpenELM Models: Apple confirmed that the Pile dataset was employed to train its open-source OpenELM models, released in April. However, these models do not power any of Apple’s AI or machine learning features. Instead, OpenELM was developed to contribute to the broader research community.
Ethical Data Practices: Apple reiterated its commitment to ethical data sourcing for its artificial intelligence projects. The company emphasized that it does not use unethically obtained data and has invested millions in obtaining licensed content from publishers and photo library firms.
No Integration with Apple Intelligence: Apple stated that OpenELM models were never intended for use in Apple Intelligence and that there are no plans to develop new versions of the OpenELM model. This clear separation underscores Apple's stance on maintaining ethical standards in its AI development.

Ethical Commitment:

Apple’s response highlights its dedication to transparency and ethical practices in AI training. By clarifying the distinct use of the Pile dataset for OpenELM and not for Apple Intelligence, Apple aims to reassure users and stakeholders of its integrity in data sourcing and AI development.

The statement also distances Apple's primary AI projects from the controversy over how the Pile's contents were collected, in keeping with the company's stated emphasis on user privacy.