General agents need world models
- URL: http://arxiv.org/abs/2506.01622v2
- Date: Mon, 16 Jun 2025 12:07:32 GMT
- Title: General agents need world models
- Authors: Jonathan Richens, David Abel, Alexis Bellot, Tom Everitt
- Abstract summary: We show that any agent capable of generalizing to multi-step goal-directed tasks must have learned a predictive model of its environment. We show that this model can be extracted from the agent's policy, and that increasing the agent's performance or the complexity of the goals it can achieve requires learning increasingly accurate world models.
- Score: 22.608210395958224
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Are world models a necessary ingredient for flexible, goal-directed behaviour, or is model-free learning sufficient? We provide a formal answer to this question, showing that any agent capable of generalizing to multi-step goal-directed tasks must have learned a predictive model of its environment. We show that this model can be extracted from the agent's policy, and that increasing the agent's performance or the complexity of the goals it can achieve requires learning increasingly accurate world models. This has a number of consequences: from developing safe and general agents, to bounding agent capabilities in complex environments, and providing new algorithms for eliciting world models from agents.
Related papers
- Current Agents Fail to Leverage World Model as Tool for Foresight [61.82522354207919]
Generative world models offer a promising remedy: agents could use them to foresee outcomes before acting. This paper empirically examines whether current agents can leverage such world models as tools to enhance their cognition.
arXiv Detail & Related papers (2026-01-07T13:15:23Z)
- Embedded Universal Predictive Intelligence: a coherent framework for multi-agent learning [57.23345786304694]
We introduce a framework for prospective learning and embedded agency centered on self-prediction. We show that in multi-agent settings, self-prediction enables agents to reason about others running similar algorithms. We extend the theory of AIXI, and study universally intelligent embedded agents which start from a Solomonoff prior.
arXiv Detail & Related papers (2025-11-27T08:46:48Z)
- A Step Toward World Models: A Survey on Robotic Manipulation [58.8419978790227]
We look at approaches that exhibit the core capabilities of world models through a review of methods in robotic manipulation. We analyze their roles across perception, prediction, and control, identify key challenges and solutions, and distill the core components, capabilities, and functions that a fully realized world model should possess.
arXiv Detail & Related papers (2025-10-31T00:57:24Z)
- Generative World Models of Tasks: LLM-Driven Hierarchical Scaffolding for Embodied Agents [0.0]
We propose an effective world model for decision-making that models the world's physics and its task semantics. A systematic review of 2024 research in low-resource multi-agent soccer reveals a clear trend towards integrating symbolic and hierarchical methods. We formalize this trend into a framework for Hierarchical Task Environments (HTEs), which are essential for bridging the gap between simple, reactive behaviors and sophisticated, strategic team play.
arXiv Detail & Related papers (2025-09-05T01:03:51Z)
- SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model [88.04128601981145]
We introduce SimuRA, a goal-oriented architecture for generalized agentic reasoning. SimuRA overcomes the limitations of autoregressive reasoning by introducing a world model for planning via simulation. World-model-based planning, in particular, shows a consistent advantage of up to 124% over autoregressive planning.
arXiv Detail & Related papers (2025-07-31T17:57:20Z)
- Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective [54.77404771454794]
We develop a flexible and robust world model for Multi-Agent Reinforcement Learning (MARL) using diffusion models. Our method, Diffusion-Inspired Multi-Agent world model (DIMA), achieves state-of-the-art performance across multiple multi-agent control benchmarks.
arXiv Detail & Related papers (2025-05-27T09:11:38Z)
- AI in a vat: Fundamental limits of efficient world modelling for agent sandboxing and interpretability [84.52205243353761]
Recent work proposes using world models to generate controlled virtual environments in which AI agents can be tested before deployment. We investigate ways of simplifying world models that remain agnostic to the AI agent under evaluation.
arXiv Detail & Related papers (2025-04-06T20:35:44Z)
- AgentRM: Enhancing Agent Generalization with Reward Modeling [78.52623118224385]
We find that finetuning a reward model to guide the policy model is more robust than directly finetuning the policy model. We propose AgentRM, a generalizable reward model, to guide the policy model for effective test-time search.
arXiv Detail & Related papers (2025-02-25T17:58:02Z)
- Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement [112.04307762405669]
Gödel Agent is a self-evolving framework inspired by the Gödel machine. Gödel Agent can achieve continuous self-improvement, surpassing manually crafted agents in performance, efficiency, and generalizability.
arXiv Detail & Related papers (2024-10-06T10:49:40Z)
- Toward Universal and Interpretable World Models for Open-ended Learning Agents [0.0]
We introduce a generic, compositional and interpretable class of generative world models that supports open-ended learning agents.
This is a sparse class of Bayesian networks capable of approximating a broad range of processes, which provide agents with the ability to learn world models in a manner that may be both interpretable and computationally scalable.
arXiv Detail & Related papers (2024-09-27T12:03:15Z)
- Mental Modeling of Reinforcement Learning Agents by Language Models [14.668006477454616]
This study empirically examines, for the first time, how well large language models can build a mental model of agents.
This research may unveil the potential of leveraging LLMs for elucidating RL agent behaviour.
arXiv Detail & Related papers (2024-06-26T17:14:45Z)
- Modeling Bounded Rationality in Multi-Agent Simulations Using Rationally Inattentive Reinforcement Learning [85.86440477005523]
We study more human-like RL agents which incorporate an established model of human-irrationality, the Rational Inattention (RI) model.
RIRL models the cost of cognitive information processing using mutual information.
We show that using RIRL yields a rich spectrum of new equilibrium behaviors that differ from those found under rational assumptions.
arXiv Detail & Related papers (2022-01-18T20:54:00Z)
- Procedural Generalization by Planning with Self-Supervised World Models [10.119257232716834]
We measure the generalization ability of model-based agents in comparison to their model-free counterparts.
We identify three factors of procedural generalization -- planning, self-supervised representation learning, and procedural data diversity.
We find that these factors do not always provide the same benefits for the task generalization.
arXiv Detail & Related papers (2021-11-02T13:32:21Z)
- A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning [104.3643447579578]
We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state.
The design allows agents to learn to plan effectively, by attending to the relevant objects, leading to better out-of-distribution generalization.
arXiv Detail & Related papers (2021-06-03T19:35:19Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.