Nvidia’s Cosmos is one world model being trained on physics data about real-world environments. Credit: NVIDIA Corporation

An ongoing trend in artificial intelligence (AI) could have huge implications for how the technology is used in research.

Machine-learning systems such as large language models (LLMs), which turn prompts into text, images and video, are becoming increasingly sophisticated and continue to make astonishing progress, including in science. But such ‘generative AI’ tools also have limitations. They do not always make accurate predictions about the physical world, and could fail to model correctly what would happen if a car were to go off the edge of a cliff, for example. This has implications for developing effective and safe AI-powered robots and self-driving vehicles.

Some researchers, including the computer scientist and AI pioneer Yann LeCun, who founded the firm Advanced Machine Intelligence (AMI) Labs in Paris, have turned their attention to a different type of AI tool, developing systems known as ‘world models’ that are trained on real-world data and can embody virtual, interactive and 3D environments.

The approach is attracting huge investment and business interest. AMI Labs — which is taking a radical approach to world models — has raised more than US$1 billion, a record initial infusion of money for a European company. Technology giants such as Google and Nvidia are also developing world models, as are several other start-up companies.

What is a world model?

There are several different definitions of what a world model is. In the broadest sense, any neural network trained on data about the real world (or even about some alternative universe) has some sort of model of a world embedded in it. But over the past two years or so, many researchers have begun to use the term to describe AI that can produce a consistent, explorable and often interactive world that is reminiscent of a first-person video game. A world model has to ‘know’ enough about physics that if the user pushes an object off a table then the object will fall down.

World models also provide a more-interactive experience for a user than does generating images or video material from text prompts. For example, Google DeepMind’s world model Genie 3, which the company released in August 2025, uses simple text descriptions to generate photorealistic environments that can be explored in real time.

‘World models’ are AI’s latest sensation: what are they and what can they do?