Google Expands Gemini AI With Audio Output, Multilingual Support, and Long-Form Reasoning

Real-time speech, cross-language fluency, and AI that can analyze information at scale

Gemini 2.5 gets a voice. (Image: Google)

At Google I/O on Tuesday, the company announced another round of updates to its latest AI model, Gemini 2.5.

Today's updates add new capabilities to 2.5 Pro and 2.5 Flash, including native audio output, which brings Gemini's voice interface in line with other LLMs such as OpenAI's ChatGPT. Google also announced advanced security safeguards; Project Mariner, an ability to navigate and interact with web pages on your behalf; and Deep Think, a new experimental reasoning feature that works with Gemini 2.5 Pro (originally announced in March).

You can already use Gemini 2.5 Flash in the Gemini app itself, and the updated version will show up in Google AI Studio for devs and Vertex AI for enterprise customers in early June. Gemini 2.5 Pro should be available soon after that, says the company.
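As a rough illustration of what that developer access looks like, here is a minimal sketch using the google-genai Python SDK with a Google AI Studio API key; the model identifier and prompt are assumptions for illustration, not details from Google's announcement.

```python
# Minimal sketch, assuming the google-genai Python SDK (pip install google-genai)
# and an API key created in Google AI Studio. The model string below is an
# assumed identifier; check Google's model list for the current 2.5 Flash name.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash",  # assumed identifier for the updated Flash model
    contents="Summarize the Gemini 2.5 updates from Google I/O in two sentences.",
)
print(response.text)
```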

This latest Pro version, says Google, has been updated to help developers build richer interactive web apps and is leading various coding leaderboards and academic benchmarks for AI performance.

New Gemini 2.5 abilities include a preview version of audio-visual input and native audio-output dialogue via the Live API, so developers can build full conversational experiences with Gemini. Early features include Affective Dialogue, where the model picks up on the speaker's emotions and responds to them, and Proactive Audio, where the model ignores background conversations and knows when to respond, along with support for more complex thinking tasks.
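To make the Live API idea concrete, here is a hedged sketch of a native-audio session using the google-genai Python SDK; the model name, method names, and config fields are assumptions based on the SDK's published Live API surface and may differ in the actual preview.

```python
# Hedged sketch of a Live API session with native audio output, assuming the
# google-genai Python SDK. Model and method names are assumptions and may not
# match the preview release exactly.
import asyncio

from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")


async def main():
    config = types.LiveConnectConfig(response_modalities=["AUDIO"])
    async with client.aio.live.connect(
        model="gemini-2.5-flash-preview-native-audio-dialog",  # assumed name
        config=config,
    ) as session:
        # Send one user turn as text; the model replies with streamed audio.
        await session.send_client_content(
            turns=types.Content(role="user", parts=[types.Part(text="Hi, Gemini!")]),
            turn_complete=True,
        )
        async for message in session.receive():
            if message.data:
                pass  # raw audio bytes; feed these to an audio player or file
            if message.server_content and message.server_content.turn_complete:
                break


asyncio.run(main())
```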

Deep Think, currently in testing, is what Google calls an "enhanced reasoning mode" that uses new research techniques to let the model consider multiple hypotheses before responding. For now, it's available only to "trusted testers" through the Gemini API before it becomes more widely available.

"Because we're defining the frontier with 2.5 Pro DeepThink," senior director of product management Tulsee Doshi wrote in a blog post, "we're taking extra time to conduct more frontier safety evaluations and get further input from safety experts.

Gemini 2.5 Flash, designed for speed and lower cost, has also improved in many ways, Google said. The company says Flash is now better at reasoning, multimodality, code, and long context while also becoming more efficient, claiming a 20% to 30% reduction in token use. Flash is available to preview via AI Studio, Vertex AI, and the Gemini app, with general availability for production set for early June.

