
Large language models have come a long way in a very short space of time, powering all kinds of innovations in the digital world. But these algorithms have yet to make a dent in the physical world. Enter Decart, which has just unveiled its most advanced world model, designed to enable autonomous robots and other systems to interact with real-world environments and accelerate the momentum behind "physical AI."
Decart's new model is called Oasis 3, and it's nothing like the legions of LLMs that excel in tasks like generating text, debugging code, and mastering chess. Instead, Oasis 3 aims to construct the virtual playgrounds needed for autonomous drones, humanoid robots, and vehicles to master the laws of physical reality.
LLMs have captured the attention of billions of people globally, amazing us with their ability to instantly generate speeches, spin up promotional videos, and create powerful code for almost any kind of software application. But these models fall short when it comes to building a truly intelligent entity that can perform tasks in the real world.
While LLMs know all about language syntax and grammar, considerations like gravity and textures are a whole different ball game. An LLM can be taught that a glass will break if it's dropped onto the floor, but that doesn't mean it knows how far that glass needs to fall to make certain it shatters into thousands of tiny pieces. This is where vision-language-action (VLA) models come in.
Simulating Variants of Reality for Training
Unlike LLMs, VLA models are grounded in the intricacies of dynamic real-world physics. Physical AI has unique cognitive demands that are quite different from what LLMs are good at. In essence, physical AI systems need to perceive an environment, process their objectives, and then translate their findings into commands they can execute using advanced motor skills.
The difference between the two kinds of model is stark. The core focus of physical AI models is spatial awareness rather than semantic logic and syntax. They take multi-perspective video and actions as their primary inputs, instead of text tokens and symbols. And the best way to generate these inputs is through interactive simulations that support the scale necessary to teach AI to react to almost any kind of situation.
Decart says it has developed Oasis 3, a "world model," in order to make up for the drastic shortage of data needed to train intelligent robots and cars to function among humans. For AI systems to learn, they need to be trained on massive amounts of information, but unlike LLMs, they don't have an easily accessible public resource such as the internet.
To generate that data, AI systems need to be able to simulate real-world actions so they can learn through a series of action-consequence loops, but that can only happen if the simulations are an accurate representation of reality in all its variations.
Moreover, the bar for learning is much higher. While a text-generation model can usually get away with the odd hallucination, an autonomous taxi ferrying passengers across a city simply cannot take that chance.
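To make the idea of an action-consequence loop concrete, here is a minimal sketch in Python. It is not Decart's method or API: the `ToySimulator` class, the one-dimensional track, the reward values, and the use of tabular Q-learning are all assumptions chosen for illustration. The point is only the shape of the loop: the agent acts, the simulator returns a consequence, and the agent updates its estimates.

```python
import random


class ToySimulator:
    """Hypothetical stand-in for a simulated world: a robot on a 1-D track."""

    def __init__(self, length=6):
        self.length = length  # positions 0 .. length-1, goal at the far end
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (-1 or +1); return (observation, reward, done)."""
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else -0.01  # small cost per step, bonus at goal
        return self.position, reward, done


def train(episodes=300, seed=0, alpha=0.5, gamma=0.9, epsilon=0.2):
    """Learn by trial and error with tabular Q-learning (an assumed choice)."""
    rng = random.Random(seed)
    sim = ToySimulator()
    actions = (-1, +1)
    q = {(s, a): 0.0 for s in range(sim.length) for a in actions}
    for _ in range(episodes):
        state, done, steps = sim.reset(), False, 0
        while not done and steps < 100:
            # Action: occasionally explore, otherwise take the best-known move.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            # Consequence: the simulator reports what happened.
            next_state, reward, done = sim.step(action)
            # Update: nudge the estimate toward the observed outcome.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state, steps = next_state, steps + 1
    return q


def greedy_policy(q, length=6):
    """Best-known action for every non-goal position."""
    return [max((-1, +1), key=lambda a: q[(s, a)]) for s in range(length - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

A world model would replace the toy track with streamed video and physics, but the loop of acting, observing, and updating keeps the same structure.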
Getting Robots Ready for Anything
Oasis 3 builds upon Decart's earlier interactive video game-focused engines, Oasis 1 and Oasis 2, and is a much more complete world model that acts as a generative engine for infinite physical experiences. Whereas the original Oasis models output blocky Minecraft-style computer graphics, Oasis 3 leverages Nvidia's infrastructure to stream continuous, photorealistic simulated worlds that are grounded in hyper-realistic physics and responsive to anything a robot tries to do.
These closed-loop simulations enable robots to engage with almost any kind of environment and develop a deep understanding of it through simple trial-and-error. Because Oasis 3 can run thousands of real-time training sessions at once, it can be used to generate millions of hours of simulated data. It spits out video simulations at a rapid 22 frames per second, and responds to any input with less than 200 ms of latency, ensuring that its feedback loops are virtually instantaneous.
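A quick back-of-the-envelope calculation puts those figures in context. The sketch below uses the quoted 22 frames per second and 200 ms latency bound; the session count of 1,000 is a hypothetical value picked for illustration, since the article only says "thousands", and the calculation assumes each session runs at real-time speed.

```python
FPS = 22              # frames per second, as quoted in the article
MAX_LATENCY_MS = 200  # upper bound on response latency, as quoted

# Time between consecutive frames, and the latency bound measured in frames.
frame_interval_ms = 1000 / FPS
latency_in_frames = MAX_LATENCY_MS / frame_interval_ms


def simulated_hours(parallel_sessions, wall_clock_hours):
    """Simulated experience produced if every session runs in real time."""
    return parallel_sessions * wall_clock_hours


# Hypothetical session count, not a published figure.
ASSUMED_SESSIONS = 1_000
TARGET_HOURS = 1_000_000
wall_clock_hours_needed = TARGET_HOURS / ASSUMED_SESSIONS

if __name__ == "__main__":
    print(f"Frame interval: {frame_interval_ms:.1f} ms")
    print(f"Latency bound: {latency_in_frames:.1f} frames")
    print(f"Wall-clock time for {TARGET_HOURS:,} simulated hours: "
          f"{wall_clock_hours_needed / 24:.1f} days")
```

Under these assumptions, a frame arrives roughly every 45 ms, so a response within 200 ms lands within about four to five frames of the input, and a million simulated hours would take on the order of weeks rather than a century of wall-clock time.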
Crucially, Oasis 3 overcomes the most critical flaw of earlier simulators by enhancing visual quality without breaking the laws of physics. Spatial relationships are preserved through the synchronization of three camera views, ensuring that if a robot extends its arm towards an object, the depth, reflections, shadows, and shifting perspective remain perfectly aligned across every viewpoint.
According to Decart, the most important litmus test for physical AI systems is their ability to handle the seemingly random, low-probability events that occur with alarming regularity in the real world. This is where traditional code-based systems have been known to fail, because it's basically impossible to pre-program an algorithm to anticipate every conceivable anomaly.
During training runs, developers can use Oasis 3 to conjure up all kinds of chaos within a simulation using only natural language commands. Someone trying to teach an autonomous car how to navigate city streets safely could tell Oasis 3 to add a few tornadoes and some oil slicks, and to make a pedestrian suddenly stumble into the path of the oncoming vehicle, so the car can practice avoiding these types of hazards.
Upon being prompted to do so, Oasis 3 will immediately generate the requested anomalies, so that the car's VLA brain can attempt to work out what to do through trial and error. In the unlikely event that a similar scenario unfolds in real life, the model will know exactly how to react. After all, you never know what Mother Nature might do next, and physical AI systems have to be ready for anything.
Oasis 3 is now available through Decart's API platform. It runs on CoreWeave's cloud platform, providing a low-cost way for developers to access advanced simulated universes at scale.
ⓒ 2026 TECHTIMES.com All rights reserved. Do not reproduce without permission.




