Genie 3 Trains Waymo’s Robotaxis on Rare Scenarios: Street View Grounding Now Global for $200 Ultra Subscribers

Google DeepMind Connects 280 Billion Street View Images to World Model, Opening Robotics Pipeline to Consumer Subscribers

A Waymo robotaxi is seen on Centre Street on April 9, 2026, in New York City. Michael M. Santiago/Getty Images

Google DeepMind opened Street View grounding inside Project Genie globally on May 19, 2026. Subscribers who pay $200 per month for the company's top-tier Google AI Ultra plan now have their first access to a generative world model capable of building interactive, navigable environments anchored to real U.S. locations. It is the same underlying engine that Waymo, Alphabet's robotaxi unit, already uses to train its autonomous vehicles on dangerous scenarios that rarely happen on public roads. The launch was announced at the Google I/O 2026 developer conference.

The announcement coincided with Google cutting the Ultra 20x plan's monthly price from $250 to $200, a 20 percent reduction. Project Genie remains exclusive to that higher tier; subscribers to the newly introduced $99.99 Ultra plan do not have access.

Street View as World-Building Substrate

Users on the $200 tier can now tap a Maps pin, choose any location in the United States, pick a visual style — "Desert Sands," "Stone Age," "Ocean World," or others — describe a character, and receive a 60-second, 720p, 24-frames-per-second navigable environment that uses Street View imagery as its structural foundation. The resulting worlds are stylized rather than photorealistic: scuba-dive around a submerged Golden Gate Bridge, or walk through the Fort Worth Stockyards rendered in black-and-white with a silent-film aesthetic. Geographic coverage for Street View grounding starts with U.S. locations; Google has said broader expansion is planned.
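For a sense of scale, the quoted session specs can be turned into a frame budget. The sketch below is only back-of-the-envelope arithmetic: it assumes "720p" means 1280x720 pixels, which the announcement does not spell out, and the function names are illustrative rather than part of any Google API.

```python
# Rough frame budget for one Project Genie session, using the figures
# quoted above: 60 seconds at 24 frames per second.
DURATION_S = 60
FPS = 24

# Assumption: "720p" is taken to mean 1280x720 pixels per frame.
WIDTH, HEIGHT = 1280, 720


def session_frames(duration_s: int = DURATION_S, fps: int = FPS) -> int:
    """Number of frames generated over one session."""
    return duration_s * fps


def session_pixels(duration_s: int = DURATION_S, fps: int = FPS,
                   width: int = WIDTH, height: int = HEIGHT) -> int:
    """Total pixels generated over one session."""
    return session_frames(duration_s, fps) * width * height


if __name__ == "__main__":
    print(session_frames())   # 1440
    print(session_pixels())   # 1327104000
```

Under those assumptions, a single 60-second world amounts to 1,440 generated frames, or roughly 1.3 billion pixels.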

The feature runs on a Google DeepMind technology called Maps Imagery Grounding. The Street View archive spans 280 billion images collected across 110 countries and all seven continents over nearly two decades. No other AI lab holds a comparable dataset, making it a structural competitive advantage for any simulation application Google chooses to build on top of it.

Spatial Continuity: What Separates Genie from Prior World Models

Jonathan Herbert, Principal Product Manager at Google Maps, said during the I/O demonstration that Genie 3 cannot yet create a faithful reconstruction of any given street. What it can do, he said, is maintain spatial continuity: spin 360 degrees inside a generated environment and the model correctly remembers the scene behind you rather than regenerating it from scratch. Herbert described this memory-consistent spatial awareness as the actual technical breakthrough, separate from the visual results users see on screen.
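The distinction Herbert draws, remembering a scene versus regenerating it, can be illustrated with a toy. The sketch below is not how Genie 3 works internally; `ToyWorld`, `view`, and `full_spin_is_consistent` are hypothetical names, and the "scene" is just a labeled string. It only shows what the 360-degree test checks: after a full spin, the view at the starting heading should match what was there before.

```python
import itertools


class ToyWorld:
    """Toy stand-in for a generated scene (illustrative only)."""

    def __init__(self, remember: bool):
        self._remember = remember
        self._seen = {}                      # heading (degrees) -> content
        self._counter = itertools.count()    # source of "new" content

    def view(self, heading_deg: int) -> str:
        """Return the content visible at a heading, in whole degrees."""
        key = heading_deg % 360
        if self._remember and key in self._seen:
            # Spatial continuity: reuse what was already generated here.
            return self._seen[key]
        # Otherwise generate fresh content for this heading.
        content = f"scene-{next(self._counter)}"
        self._seen[key] = content
        return content


def full_spin_is_consistent(world: ToyWorld, step_deg: int = 45) -> bool:
    """Look ahead, turn through 360 degrees, and compare the two views."""
    first = world.view(0)
    for heading in range(step_deg, 360, step_deg):
        world.view(heading)
    return world.view(360) == first


if __name__ == "__main__":
    print(full_spin_is_consistent(ToyWorld(remember=True)))   # True
    print(full_spin_is_consistent(ToyWorld(remember=False)))  # False
```

A real world model has no lookup table of this kind; the point of the toy is only that consistency after a full turn is a property that can be checked, and that a generator without memory fails it.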

Genie 2, which debuted in late 2024, could hold a scene in memory for roughly 10 seconds before losing coherence. Genie 3, released as a research preview in August 2025, extended that window to several minutes and added real-time interactivity, meaning the model renders the path ahead as the user moves rather than pre-computing a static environment. The jump from seconds to minutes of persistent memory is what makes the model practically useful for training autonomous agents — robots and self-driving cars need far more than a ten-second simulation horizon to learn reliable behavior.

How Waymo Uses This Engine Today

The connection between a consumer world-building tool and Alphabet's robotaxi fleet is not hypothetical. Genie 3 already powers one of Waymo's simulators, where the company uses it to train its autonomous driver on edge cases that would be dangerous, illegal, or logistically impossible to stage on real roads: tornadoes, animals crossing highways, simultaneous equipment failures.

Jack Parker-Holder, a research scientist on DeepMind's open-endedness team, explained the Street View integration's specific value for this use case. Waymo's existing simulators are locked to the vehicle's point of view. Street View allows the same simulation engine to shift perspective to other agents — a pedestrian, a cyclist, a delivery robot — opening the possibility of training multi-agent scenarios that today's car-centric simulators cannot produce.

Parker-Holder illustrated the robotics case with a concrete example: a robot deployed in London, a city that rarely sees direct sun, can use Genie to pre-simulate the rare occasions when sunlight glints off Victorian-era buildings. Without that training data, the robot risks disorientation the first time it encounters the actual glare. Real-world events that are too rare to appear in standard training sets are precisely where generative world models earn their value.

Known Limitations at Launch

Diego Rivas, Group Product Manager at Google DeepMind, was clear at I/O that the Street View integration remains an experiment. Accuracy still needs improvement. The environments currently render at video-game quality rather than photorealistic fidelity. Physics simulation is incomplete: a character running through a Genie-generated Joshua Tree landscape passes through cacti and brush as if they were not there.

Independent analysis of Genie 3 has flagged a structural issue that researchers at TechTalks described as a reliability paradox: if the simulated environments contain physics inaccuracies, any agent trained on them may develop behaviors that fail in the real world. Google's own researchers have acknowledged this, arguing that even an imperfect simulator provides value by revealing which agent behaviors break down under degraded conditions — failure in simulation is still informative, even if the simulation is not perfectly faithful.

An additional constraint: the model currently supports only a few minutes of continuous interaction, while meaningful agent training typically requires hours-long simulation runs. Genie 3's lead researchers identified this gap explicitly when the model launched in August 2025.

Competitive Landscape: A Dataset No Rival Can Replicate Quickly

World Labs, founded by Stanford researcher Fei-Fei Li, released a competing world model called Marble in November 2025, offering commercial access through a freemium structure priced up to $95 per month. Runway, the AI video generation company, launched its own world model in December 2025 with a focus on cinematic applications. Nvidia and Cadence are both building simulation pipelines targeting robotics training from a hardware-first perspective.

None of those competitors arrived with anything approaching 280 billion real-world images collected across two decades. The Street View archive is the specific asset that distinguishes Google's approach: it provides geographic specificity and temporal depth that synthetically generated training data cannot match.

Access Details

Project Genie with Street View grounding is rolling out globally to Google AI Ultra subscribers on the $200-per-month plan (18 or older). Street View location support covers U.S. locations at launch, with other geographies coming later. The $200 Ultra 20x plan, which also bundles YouTube Premium and 20 terabytes of storage, was reduced from $250 at the same I/O keynote at which the Street View feature was announced. Google AI Ultra starts at $99.99 per month with a separate feature set; Project Genie is not included at that price.

Project Genie remains a Google Labs research prototype. Google has emphasized the product is not a game engine: it lacks persistent mechanics, narrative flow, and the production controls that game development requires.

ⓒ 2026 TECHTIMES.com All rights reserved. Do not reproduce without permission.
