Apple has unveiled its latest artificial intelligence system, Reference Resolution As Language Modeling (ReALM), asserting that it surpasses GPT-4 in certain aspects, according to a report by TechXplore.

(Photo: The Apple logo on a smartphone in Mulhouse, eastern France, March 25, 2024. SEBASTIEN BOZON/AFP via Getty Images)

Apple Unveils ReALM

In recent years, large language models (LLMs) like GPT-4 have been at the forefront of technological advancements as companies strive to enhance their offerings and attract more users. 

Apple, however, has been perceived as trailing behind in this domain, particularly with its Siri digital assistant, which has seen minimal advancements in artificial intelligence capabilities.

The Apple team contends that their ReALM system represents more than just an effort to catch up with competitors; it is positioned as a superior product that outperforms existing LLMs, especially in handling specific types of queries.

According to the paper authored by the Apple team, ReALM stands out by providing more precise responses to user inquiries, thanks to its ability to interpret ambiguous on-screen references and to draw on both conversational and background information.

By leveraging contextual cues from the user's screen and ongoing device processes, ReALM aims to better understand the intent behind a query, thereby enhancing the accuracy of its responses.


Apple Claims ReALM Surpasses GPT-4

The researchers assert that extensive testing against various LLMs, including GPT-4, has demonstrated ReALM's superior performance in certain tasks.

They further suggest that Apple plans to integrate ReALM into its ecosystem, potentially improving Siri's ability to provide more relevant answers, albeit likely requiring users to upgrade to iOS 18 upon its release later this year.

The researchers highlighted the significance of reference resolution in understanding and effectively handling various contexts in their paper, including both conversational and non-conversational elements such as on-screen entities and background processes. 

They underscored the transformative potential of LLMs in resolving references of diverse types, showcasing substantial enhancements over existing systems across different reference categories.

The paper's findings indicate promising results, with ReALM's smallest model achieving comparable performance to GPT-4 and its larger models significantly surpassing it. 

This suggests that ReALM could represent a notable advancement in AI technology, particularly in the realm of reference resolution, where traditional LLMs have faced limitations.

"This paper demonstrates how LLMs can be used to create an extremely effective system to resolve references of various types, by showing how reference resolution can be converted into a language modeling problem, despite involving forms of entities like those on screen that are not traditionally conducive to being reduced to a text-only modality," the researchers wrote.

"We demonstrate large improvements over an existing system with similar functionality across different types of references, with our smallest model obtaining absolute gains of over 5% for on-screen references. We also benchmark against GPT-3.5 and GPT-4, with our smallest model achieving performance comparable to that of GPT-4, and our larger models substantially outperforming it."
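The quoted passage describes reducing on-screen entities to a text-only modality so that reference resolution becomes an ordinary language modeling problem. As a rough illustration only (this is not the paper's actual encoding; the entity fields, prompt wording, and function names below are invented for the sketch), on-screen items can be flattened into a numbered textual prompt that a language model would then complete with the index of the referenced entity:

```python
# Toy sketch of text-serializing on-screen entities for an LLM.
# This is NOT Apple's implementation; entity types and prompt
# format are hypothetical stand-ins for the idea in the paper.

def entities_to_prompt(entities, query):
    """Serialize on-screen entities into a text-only prompt.

    An LLM given this prompt can resolve a reference like
    "that number" by predicting the matching entity's index.
    """
    lines = ["Screen contents:"]
    for i, entity in enumerate(entities):
        # Each entity becomes one numbered line, e.g.
        # "2. phone_number: 555-0123"
        lines.append(f"{i + 1}. {entity['type']}: {entity['text']}")
    lines.append(f"User request: {query}")
    lines.append("Which numbered entity does the request refer to?")
    return "\n".join(lines)

# Example entities, as a hypothetical screen parser might emit them.
screen = [
    {"type": "business_name", "text": "Joe's Pizza"},
    {"type": "phone_number", "text": "555-0123"},
    {"type": "address", "text": "12 Main St"},
]

prompt = entities_to_prompt(screen, "call that number")
print(prompt)
```

The key design point, per the quote, is that even entities "not traditionally conducive to being reduced to a text-only modality" (screen layout, background processes) get a textual encoding, so no separate vision or structured-input module is needed at resolution time.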

The research team detailed their findings on the arXiv preprint server. 


ⓒ 2024 TECHTIMES.com All rights reserved. Do not reproduce without permission.