Google I/O 2026: Not Its Smartest Model — Its Cheapest, Fastest One, and an All-In Bet on Agents

Google I/O 2026 was, almost in its entirety, a bet on the "agentic era." Across roughly two hours at Shoreline Amphitheatre, Sundar Pichai, Demis Hassabis and a rotating cast walked through a model, a world simulator, a developer platform, a personal agent, a rebuilt Search and an agentic-commerce stack. The most revealing thing about the keynote is what the flagship was not: not "Gemini 4.0," but a fast, cheap workhorse called Gemini 3.5 Flash.

The real weapon is economics, not a number

Google made Gemini 3.5 Flash generally available the same day, pitching it as its strongest agentic and coding model, roughly four times faster than comparable frontier models at less than half the price. Pichai's own framing was unusually blunt about the ceiling: 3.5 Flash sits at "almost 90%" of frontier performance, not above it. The argument is cost, not supremacy. As VentureBeat reported, Google's pitch is that a customer processing a trillion tokens a day could save more than $1 billion a year by shifting 80% of workloads to a Flash-and-frontier blend. Gemini 3.5 Pro, the higher-capability sibling, is not out — Pichai said it arrives next month. That sequencing matters: Google led with the economical model, not the smartest one, which tells you where it thinks the competitive battle is.
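The savings pitch can be sanity-checked with simple arithmetic. The sketch below uses only the figures quoted above (a trillion tokens a day, 80% of workloads shifted, $1 billion a year saved) to back out the price gap per million tokens that the claim implies. It rests on two assumptions that are not in Google's pitch: that "80% of workloads" means 80% of tokens, and that the saving is spread evenly across the shifted tokens. The function name is illustrative.

```python
# Back-of-envelope check of the savings pitch quoted above.
# Inputs come from the article; the 365-day year and the reading of
# "80% of workloads" as "80% of tokens" are assumptions.
TOKENS_PER_DAY = 1e12        # "a trillion tokens a day"
SHIFTED_SHARE = 0.80         # share of workloads moved to the cheaper blend
ANNUAL_SAVINGS_USD = 1e9     # "more than $1 billion a year" (lower bound)
DAYS_PER_YEAR = 365


def implied_saving_per_million(tokens_per_day, shifted_share,
                               annual_savings_usd, days_per_year=DAYS_PER_YEAR):
    """Dollars saved per million shifted tokens implied by the headline claim."""
    shifted_tokens_per_year = tokens_per_day * days_per_year * shifted_share
    return annual_savings_usd / (shifted_tokens_per_year / 1e6)


saving = implied_saving_per_million(TOKENS_PER_DAY, SHIFTED_SHARE, ANNUAL_SAVINGS_USD)
print(f"Implied saving: ${saving:.2f} per million shifted tokens")
```

Under those assumptions the claim works out to a gap of roughly $3.42 per million shifted tokens. Because the $1 billion figure is a floor ("more than"), this is the minimum gap the pitch implies, not a published price.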

Gemini Omni: an AGI pitch with an early-product asterisk

Demis Hassabis unveiled Gemini Omni, framed as a "world model" that simulates physics — kinetic energy, gravity — rather than only predicting text, and demonstrated it generating a claymation explainer and editing video through conversation. Gemini Omni Flash began rolling out immediately to AI Plus, Pro and Ultra users in the Gemini app and Flow, with YouTube Shorts to follow. The honest caveat, stated by Google itself: more substantial Omni updates are "coming later this year," meaning what shipped is an early, fast variant, not the full world model the AGI rhetoric implies.

Agents everywhere

The agentic core was Antigravity 2.0, the standalone desktop platform led by Varun Mohan. Its showcase was a stress test in which agents built a working operating system entirely from scratch. VentureBeat's reporting confirms that Google described the demo this way, citing 93 parallel subagents and a sub-$1,000 API bill, but those figures are Google's own demo claims and were not independently verified on stage. Google also said an optimized Flash runs up to 12 times faster inside Antigravity, a number CTO Koray Kavukcuoglu repeated to reporters.

Consumer agents arrived as Gemini Spark, a 24/7 personal agent that runs on dedicated Google Cloud virtual machines so tasks continue with the laptop closed. It was demonstrated organizing a block party by reading HOA rules from Drive, building a Sheets RSVP tracker and chasing replies over Gmail. Spark, reportedly the project leaked as "Remy," enters beta for Google AI Ultra subscribers, with Ultra pricing announced at $100 and $200 tiers (figures to treat as announced, not independently confirmed).

Search got what Google called its biggest change in about 25 years: AI Mode now runs on Gemini 3.5, the search box accepts multimodal input, and "Information Agents" monitor the web in the background — stock criteria, sneaker drops, apartment listings — and push synthesized updates. Search can also code custom mini-apps on the fly. On commerce, Google introduced the Universal Commerce Protocol and Agent Payments Protocol plus a cross-service Universal Cart — an open standard Google says Amazon, Meta, Microsoft, Salesforce and Stripe have adopted, consistent with the agentic-commerce standardization already underway across the industry.

Hardware was lighter: Android XR eyewear with Samsung, Qualcomm, Warby Parker and Gentle Monster was confirmed for this fall, with an on-stage demo of hands-free navigation, an agent ordering a "usual" cold brew via DoorDash on a pocketed phone, and a Nano Banana photo-to-cartoon trick. On trust, Google expanded SynthID watermarking — now adopted by NVIDIA, OpenAI, ElevenLabs and Kakao — and launched Content Credentials checking in Search and Chrome, alongside science efforts including Gemini for Science, WeatherNext and Isomorphic Labs' drug work.

Fact-check: the recap numbers that don't hold up

The widely circulating summary of the keynote contains one significant overstatement worth correcting. Several recaps claim Google said it now processes "3.2 quadrillion tokens per month" (up from 480 trillion a year earlier) and "19 billion tokens per minute." Google's own published keynote materials do not support the quadrillion figure. Pichai's official I/O post frames scale differently: top companies process about one trillion tokens a day, and Google's internal Antigravity usage went from roughly half a trillion tokens a day in March to more than three trillion a day now. The most recent first-party API figure Google has published — from Cloud Next in April — was more than 16 billion tokens per minute, up from 10 billion the prior quarter. The "$180–190 billion" 2026 capital expenditure and the dual-chip eighth-generation TPUs (8t for training, 8i for inference, three times Ironwood's power) are accurate. But the headline token statistic in many write-ups appears to be an extrapolation, not a quote — and the OS-build cost, the 1,500-tokens-per-second demo and SynthID's "100 billion images" are Google's stated figures, not third-party measurements.
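A unit conversion shows why the recap figures deserve scrutiny. The sketch below puts the per-minute and per-month numbers quoted above on a common basis. The 30-day month is an assumption, and the figures may describe different surfaces (all Google products versus the API alone), so a mismatch does not by itself prove any one of them wrong.

```python
# Put the token figures quoted above on a common basis.
# Assumption: a 30-day month. The figures may also describe different
# surfaces (all products vs. API only), so this is a consistency check,
# not a verdict on any single number.
MINUTES_PER_MONTH = 60 * 24 * 30  # 43,200

RECAP_TOKENS_PER_MONTH = 3.2e15        # "3.2 quadrillion tokens per month"
RECAP_TOKENS_PER_MINUTE = 19e9         # "19 billion tokens per minute"
CLOUD_NEXT_TOKENS_PER_MINUTE = 16e9    # first-party API figure from Cloud Next


def per_minute_to_per_month(tokens_per_minute):
    return tokens_per_minute * MINUTES_PER_MONTH


def per_month_to_per_minute(tokens_per_month):
    return tokens_per_month / MINUTES_PER_MONTH


recap_minute_as_month = per_minute_to_per_month(RECAP_TOKENS_PER_MINUTE)
cloud_next_as_month = per_minute_to_per_month(CLOUD_NEXT_TOKENS_PER_MINUTE)
recap_month_as_minute = per_month_to_per_minute(RECAP_TOKENS_PER_MONTH)
mismatch = RECAP_TOKENS_PER_MONTH / recap_minute_as_month

print(f"19B/min  -> {recap_minute_as_month / 1e12:.1f} trillion/month")
print(f"16B/min  -> {cloud_next_as_month / 1e12:.1f} trillion/month")
print(f"3.2Q/mo  -> {recap_month_as_minute / 1e9:.1f} billion/minute")
print(f"Monthly claim is {mismatch:.1f}x the per-minute claim")
```

On that basis, 19 billion tokens per minute is about 821 trillion a month and the Cloud Next figure about 691 trillion, while 3.2 quadrillion a month would require roughly 74 billion a minute. The two recap numbers differ by nearly a factor of four, so they cannot both describe the same traffic.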

The strategic read

Asked about the frontier amid recent rival advances, Pichai conceded the landscape is "very dynamic" — a notable hedge in a year when OpenAI's GPT-5.5, Anthropic's restricted Mythos and Cursor's cut-price Composer have reset what "leading" means. Google's answer was not a claim to the smartest model. It was a claim to the most economical agentic one, shipped the same day across Search, the Gemini app's 900-million-plus users, Workspace and Android. The bet is that in an agentic era where a single task can burn millions of tokens, distribution and price per token decide the market more than a benchmark crown. Whether Gemini 3.5 Pro next month, and the full Omni world model later this year, deliver the intelligence half of that equation is the question I/O 2026 deliberately deferred.

ⓒ 2026 TECHTIMES.com All rights reserved. Do not reproduce without permission.
