ReALM. A groundbreaking language model from Apple? | AI in business #121

What is ReALM?

ReALM stands for “Reference Resolution As Language Modeling,” an approach developed by Apple researchers. It is a large language model (LLM) that treats reference resolution, that is, working out which item a phrase such as “this one” or “that number” points to, as a language modeling task.

ReALM effectively converts various types of context into a textual representation, which it then processes as part of a language task. This can include:

  • conversations – such as text messages, voice commands to an assistant, or emails,
  • elements on the screen – for example, photos, calendar, weather widget, or applications and processes running in the background.

What makes ReALM different from other reference resolution models? First, the approach: instead of relying on image processing, ReALM works entirely in the text domain. This makes it much lighter and more efficient, which should allow it to run directly on mobile devices while preserving user privacy.
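To make the idea of a textual representation concrete, here is a minimal sketch in Python. It is not Apple's implementation: the `ScreenElement` class, the `screen_to_text` function, the row tolerance, and the sample coordinates are all assumptions made for illustration. The sketch sorts on-screen elements top to bottom and left to right, then writes them out as plain text with a number in front of each one, so that a language model could refer to an element by its number.

```python
from dataclasses import dataclass


@dataclass
class ScreenElement:
    """Hypothetical on-screen item: its visible text and top-left position."""
    text: str
    x: int
    y: int


def screen_to_text(elements, line_tolerance=10):
    """Render screen elements as plain text, one visual row per line.

    An element joins the current row when its y coordinate is within
    `line_tolerance` pixels of the first element in that row.
    """
    ordered = sorted(elements, key=lambda e: (e.y, e.x))
    rows = []
    for element in ordered:
        if rows and abs(element.y - rows[-1][0].y) <= line_tolerance:
            rows[-1].append(element)
        else:
            rows.append([element])

    lines = []
    index = 1
    for row in rows:
        cells = []
        # Within a row, order elements left to right.
        for element in sorted(row, key=lambda e: e.x):
            cells.append(f"[{index}] {element.text}")
            index += 1
        lines.append("\t".join(cells))
    return "\n".join(lines)


# Made-up sample screen: a business listing with a name, phone number, and hours.
screen = [
    ScreenElement("555-0100", x=200, y=52),
    ScreenElement("Acme Plumbing", x=10, y=50),
    ScreenElement("Open until 6 pm", x=10, y=90),
]
print(screen_to_text(screen))
```

The name and phone number end up on the same text line because they sit on roughly the same visual row, which keeps some of the screen layout visible to a model that only reads text.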

In what ways is ReALM better than GPT-4?

Apple’s research team compared ReALM with two of the most capable language models available at the time, GPT-3.5 and GPT-4 from OpenAI. The results were impressive. In reference resolution tasks, the smallest ReALM variant achieved accuracy comparable to GPT-4, and the larger ReALM models outperformed GPT-4 in resolving references to items displayed on the screen (http://arxiv.org/abs/2403.20329).

What explains this advantage? First, ReALM handles domain-specific queries well, such as those concerning smart home appliances. This is because the model is fine-tuned on domain-specific data, which gives it a better grasp of context in those domains.

What’s more, unlike GPT-4, whose image understanding is believed to be geared mainly toward pictures of real-world objects, ReALM excels at recognizing textual elements and components of application user interfaces. And it is this understanding of interfaces that is critical to smooth interaction between voice assistants and the applications we use today.

Source: DALL·E 3, prompt: Marta M. Kania (https://www.linkedin.com/in/martamatyldakania/)

Is this the beginning of the era of truly intelligent assistants?

Indeed, the integration of ReALM with Siri could open a whole new chapter in human-computer interaction. With ReALM, Siri would be able to understand commands that refer to items displayed on the smartphone screen, as well as to processes and applications running in the background. But when will this option be available to users? That is still unknown.

We are left with speculation based on the technical capabilities of the model. So how might a ReALM-powered Siri work? For example, if you’re browsing a business listings site and see a company you’re interested in, you could simply say to Siri, “Call this company,” and the assistant, using ReALM to analyze the context, would find that company’s phone number and initiate the call. You wouldn’t even have to explain which company you mean.
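The “Call this company” scenario can be sketched as a text-only task. The snippet below is an illustration under stated assumptions, not a description of how Siri or ReALM is actually wired up: the prompt wording, the `build_prompt` and `parse_answer` helpers, and the sample entities are hypothetical, and the model's reply is hard-coded in place of a real model call. The sketch lists the candidate entities with numbers, asks which one the request refers to, and maps the numeric answer back to an entity.

```python
import re


def build_prompt(request, entities):
    """Turn a user request and candidate entities into one text prompt."""
    lines = ["Entities on screen:"]
    for number, entity in enumerate(entities, start=1):
        lines.append(f"{number}. {entity['type']}: {entity['value']}")
    lines.append(f"User request: {request}")
    lines.append("Answer with the number of the entity the request refers to.")
    return "\n".join(lines)


def parse_answer(reply, entities):
    """Map the first number in the model's reply back to an entity, or None."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    position = int(match.group()) - 1
    if 0 <= position < len(entities):
        return entities[position]
    return None


# Made-up candidates, as they might be extracted from a business listing page.
entities = [
    {"type": "business name", "value": "Acme Plumbing"},
    {"type": "phone number", "value": "555-0100"},
    {"type": "address", "value": "12 Example Street"},
]

prompt = build_prompt("Call this company", entities)
reply = "2"  # hard-coded stand-in for a model's reply; no model is called here
target = parse_answer(reply, entities)
print(prompt)
print("Resolved entity:", target)
```

In a real assistant, the resolved phone number would then be handed to whatever component places the call. That step is outside the scope of this sketch.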

And this is just the beginning of what ReALM can do. Commands like “Play the last playlist” would enable intuitive control of media applications and smart home devices. ReALM could also enable Siri to understand the context of conversations and command history, so that the assistant responds in a way that fits the user’s earlier requests. This is a step toward intelligent agents, bringing us closer not so much to an artificial intelligence that understands our requests as to one that can actually carry them out.

Unfortunately, users of Android devices will have to wait. Currently, there is no information about Google’s plans to add Gemini’s capabilities to Google Assistant. A Google Gemini app for Android devices has been developed (https://play.google.com/store/apps/details?id=com.google.android.apps.bard&hl=en_US), but it is not yet available outside the United States.

Source: Google Play (https://play.google.com/store/apps/details?id=com.google.android.apps.bard&hl=en_US)

Summary

ReALM is Apple’s innovative approach to the problem of context recognition by voice assistants. Instead of relying on image processing, this language model converts different types of context into a textual representation, which it then processes as a language task. This approach offers not only high accuracy, but also the ability to run on a mobile device while preserving user privacy.

Giving Siri access to ReALM could make voice interactions more natural and context-aware. With ReALM, Siri would be able to respond to commands related to screen items, applications, and background processes. One thing is certain: improving the contextual awareness of assistants is the key to truly intelligent and natural voice interactions, and ReALM is an important step in that direction.

Author: Robert Whitney

JavaScript expert and instructor who coaches IT departments. His main goal is to up-level team productivity by teaching others how to effectively cooperate while coding.
