Apple Pushes AI Boundaries with New Open-Source Language Models

Apple's Machine Learning team releases language models that match or outperform industry leaders, showcasing its commitment to AI innovation and community collaboration.

iOS - 22-07-2024 02:39

Apple is making significant strides in the artificial intelligence (AI) sector with the release of two new open-source language models from its Machine Learning research team. These high-performing models are part of DataComp for Language Models (DCLM), an ongoing industry research project in which Apple participates alongside other major tech companies.

The newly released models have already proven to be formidable contenders, matching or surpassing other leading open models such as Llama 3 and Gemma. Language models like these form the foundation of AI systems, including popular platforms like ChatGPT, by combining a model architecture, its parameters, and high-quality training datasets.

Apple's latest contributions to the project are two models: a larger one with seven billion parameters and a smaller one with 1.4 billion. In benchmarks, the larger model outperformed MAP-Neo, the previous top model, by 6.6 percent while using 40 percent less computing power, making it the best-performing model among those trained on open datasets and competitive with models trained on private ones.

In a significant move for the AI research community, Apple has made its models fully open-source. This includes the datasets, model weights, and training code, all available for other researchers to use and build upon. Both models scored highly on the Massive Multitask Language Understanding (MMLU) benchmark, positioning them as competitive alternatives to commercial models.

Apple's Machine Learning team debuted these advancements during the WWDC conference in June, effectively silencing critics who previously claimed that Apple was lagging behind in AI applications for its devices. Research papers published by the team before and after the conference further solidified Apple's status as a leader in the AI industry.

Importantly, the models released by Apple's team are not intended for use in future Apple products. Instead, they are part of community research projects aimed at demonstrating how effective dataset curation, at both small and large scales, improves the models trained on that data. This initiative highlights Apple's commitment to advancing AI research and fostering collaboration within the AI community.

For those interested in exploring these models and the research behind them, all related datasets, research notes, and other assets are available on HuggingFace.co, a platform dedicated to the AI research community.
