Just say, “Turn on the bright lights in the living room,” and the smart home adjusts to your preferences. With one sentence, you can also play music or set an alarm. This is all thanks to an intelligent assistant that truly understands the context of your commands and promises to revolutionize the way we communicate with devices and applications. This is the promise of Apple’s new ReALM language model. This advanced artificial intelligence system can recognize the meaning of conversational references, read the context of displayed content, and understand the background of current device processes. Are these just promises, or is a truly new level of interaction with voice assistants coming? Read on to find out more.
What is ReALM?
ReALM stands for “Reference Resolution As Language Modeling,” a groundbreaking solution developed by Apple researchers. It is an approach built on a large language model (LLM) that treats reference resolution, the problem of working out what expressions such as “this” or “that one” refer to, as a language modeling task.
ReALM effectively converts various types of context into a textual representation, which it then processes as part of a language task. This can include:
- conversations – such as text messages, voice commands to an assistant, or emails,
- elements on the screen – for example, photos, calendar, weather widget, or applications and processes running in the background.
What makes ReALM different from other reference resolution models? First, the approach: instead of relying on image processing, ReALM works entirely in the text domain. This makes it much lighter and more efficient, which should allow it to run directly on mobile devices while preserving user privacy.
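To make the idea of turning screen content into text more concrete, here is a minimal sketch in Python. It assumes each on-screen element is known by its visible text and the centre of its bounding box, and it lays the elements out top-to-bottom and left-to-right. The `ScreenElement` class, the `screen_to_text` function, and the `line_margin` threshold are all hypothetical names chosen for illustration; this is not Apple's implementation.

```python
from dataclasses import dataclass


@dataclass
class ScreenElement:
    text: str  # visible text of the UI element
    x: float   # horizontal centre of its bounding box
    y: float   # vertical centre of its bounding box


def screen_to_text(elements, line_margin=10.0):
    """Render UI elements as plain text, top-to-bottom and left-to-right."""
    # Sort by vertical position first so elements can be grouped into rows.
    ordered = sorted(elements, key=lambda e: (e.y, e.x))
    rows, current, current_y = [], [], None
    for el in ordered:
        # Start a new row when the element sits clearly below the current row.
        if current and abs(el.y - current_y) > line_margin:
            rows.append(current)
            current = []
        if not current:
            current_y = el.y
        current.append(el)
    if current:
        rows.append(current)
    # Within a row, order left-to-right and separate elements with tabs.
    return "\n".join(
        "\t".join(el.text for el in sorted(row, key=lambda e: e.x))
        for row in rows
    )


# Illustrative data only: a made-up business listing on screen.
screen = [
    ScreenElement("Call", 200, 52),
    ScreenElement("Acme Plumbing", 40, 50),
    ScreenElement("555-0100", 40, 90),
]
print(screen_to_text(screen))
```

The resulting string keeps a rough sense of the layout (what is next to what, and what is below what) without any image processing, which is what allows a text-only model to reason about the screen.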
In what ways is ReALM better than GPT-4?
Apple’s research team compared ReALM to the most powerful language models on the market today – GPT-3.5 and GPT-4 from OpenAI. The results were impressive. In reference recognition tasks, the smallest ReALM variant achieved accuracy comparable to GPT-4! The larger ReALM models even outperformed GPT-4 in recognizing references to items displayed on the screen (http://arxiv.org/abs/2403.20329).
What explains this advantage? First, ReALM is great with domain-specific queries, such as those concerning smart home appliances. This is because the model was fine-tuned on domain-specific data, which gives it a deeper understanding of context in those domains.
What’s more, unlike GPT-4, which was trained primarily on images of real objects, ReALM excels at recognizing textual elements and components of application user interfaces. And it is interface understanding that is critical to the smooth interaction of voice assistants with the applications we use today.
Source: DALL·E 3, prompt: Marta M. Kania (https://www.linkedin.com/in/martamatyldakania/)
Is this the beginning of the era of truly intelligent assistants?
Indeed, the integration of ReALM with Siri could open a whole new chapter in human-computer interaction. With ReALM, Siri would be able to understand commands that include references to items displayed on the smartphone screen, as well as processes and applications running in the background. But when will this option be available to users? That is still unknown.
We are left with speculation based on the technical capabilities of the model. So how might a ReALM-powered Siri work? For example, if you’re browsing a business listings site and see a company you’re interested in, you could simply say to Siri, “Call this company,” and the assistant – using ReALM to analyze context – will find the phone number of the company you specify and initiate the call. You don’t even have to explain exactly which company you mean.
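The “Call this company” scenario can be sketched as a plain text task: list the candidate entities found on the screen, append the user's request, and ask a language model which entities are relevant. The prompt layout, the numeric tags, and the `build_prompt` and `parse_reply` helpers below are assumptions made for illustration, and the model reply is hard-coded rather than produced by a real model.

```python
import re


def build_prompt(request, entities):
    """List candidate entities with numeric tags, followed by the user request."""
    lines = [
        f"[{i}] {kind}: {value}"
        for i, (kind, value) in enumerate(entities, start=1)
    ]
    lines.append(f"User request: {request}")
    lines.append("Relevant entity tags:")
    return "\n".join(lines)


def parse_reply(reply, entities):
    """Map tags such as '[2]' in the model reply back to the listed entities."""
    tags = [int(t) for t in re.findall(r"\[(\d+)\]", reply)]
    # Ignore tags that do not correspond to any listed entity.
    return [entities[t - 1] for t in tags if 1 <= t <= len(entities)]


# Illustrative data only: entities extracted from a made-up listings page.
entities = [
    ("business_name", "Acme Plumbing"),
    ("phone_number", "555-0100"),
    ("address", "1 Example Street"),
]
prompt = build_prompt("Call this company", entities)

# A real system would send `prompt` to a language model; the reply is hard-coded here.
reply = "[2]"
print(parse_reply(reply, entities))
```

Framing the problem this way means the assistant only has to pick from a short list of tagged candidates, which is a much narrower job than open-ended text generation.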
And this is just the beginning of what ReALM could do. Commands like “Play the last playlist” would enable intuitive control of media applications and smart home devices. ReALM could also let Siri take the conversation and command history into account, so that the assistant responds appropriately in light of the user’s earlier requests. This is a step toward intelligent agents: artificial intelligence that does not merely understand our requests but can also carry them out.
Unfortunately, users of Android devices will have to wait. Currently, there is no information about Google’s plans to add Gemini’s capabilities to Google Assistant. A Google Gemini app for Android devices has been developed (https://play.google.com/store/apps/details?id=com.google.android.apps.bard&hl=en_US), but it is not yet available outside the United States.
Source: Google Play (https://play.google.com/store/apps/details?id=com.google.android.apps.bard&hl=en_US)
Summary
ReALM is Apple’s innovative approach to solving the problem of context recognition by voice assistants. Instead of relying on image processing, this language model converts different types of context into a textual representation, which it then processes in a language task. This approach ensures not only high recognition accuracy, but also the ability to operate on a mobile device while maintaining user privacy.
Giving Siri access to ReALM could provide more natural and contextual voice interactions, an important step toward truly intelligent assistants. With ReALM, Siri could respond instantly to commands related to screen items, applications, and background processes. One thing is certain: improving the contextual awareness of assistants is the key to creating truly intelligent and natural voice interactions, and ReALM is an important step in that direction.