Web World Models
- URL: http://arxiv.org/abs/2512.23676v1
- Date: Mon, 29 Dec 2025 18:31:45 GMT
- Title: Web World Models
- Authors: Jichen Feng, Yifan Zhang, Chenggong Zhang, Yifu Lu, Shilong Liu, Mengdi Wang
- Abstract summary: We introduce the Web World Model (WWM), a middle ground where world state and "physics" are implemented in ordinary web code. We build a suite of WWMs on a realistic web stack, including an infinite travel atlas grounded in real geography, fictional galaxy explorers, web-scale encyclopedic and narrative worlds, and simulation- and game-like environments. Our results suggest that web stacks themselves can serve as a scalable substrate for world models, enabling controllable yet open-ended environments.
- Score: 60.208836336654315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language agents increasingly require persistent worlds in which they can act, remember, and learn. Existing approaches sit at two extremes: conventional web frameworks provide reliable but fixed contexts backed by databases, while fully generative world models aim for unlimited environments at the expense of controllability and practical engineering. In this work, we introduce the Web World Model (WWM), a middle ground where world state and "physics" are implemented in ordinary web code to ensure logical consistency, while large language models generate context, narratives, and high-level decisions on top of this structured latent state. We build a suite of WWMs on a realistic web stack, including an infinite travel atlas grounded in real geography, fictional galaxy explorers, web-scale encyclopedic and narrative worlds, and simulation- and game-like environments. Across these systems, we identify practical design principles for WWMs: separating code-defined rules from model-driven imagination, representing latent state as typed web interfaces, and utilizing deterministic generation to achieve unlimited but structured exploration. Our results suggest that web stacks themselves can serve as a scalable substrate for world models, enabling controllable yet open-ended environments. Project Page: https://github.com/Princeton-AI2-Lab/Web-World-Models.
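As a rough illustration of the three design principles named in the abstract, the hypothetical TypeScript sketch below keeps the world's "physics" in ordinary code, represents latent state as a typed interface, and derives state from a seeded PRNG so exploration is unbounded yet reproducible. All names (`CityState`, `generateCity`, the atlas ids) are invented for illustration and do not come from the paper's codebase:

```typescript
// Principle 2: latent world state as a typed interface.
interface CityState {
  id: string;
  name: string;        // in a real WWM, an LLM would narrate this field
  population: number;
  neighbors: string[];
}

// Principle 3: deterministic generation -- a seeded PRNG (mulberry32)
// makes every city reproducible from its id alone.
function mulberry32(seed: number): () => number {
  return () => {
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function hashId(id: string): number {
  let h = 0;
  for (const c of id) h = (Math.imul(h, 31) + c.charCodeAt(0)) | 0;
  return h;
}

// Principle 1: world rules live in ordinary code, not in the model.
function generateCity(id: string): CityState {
  const rand = mulberry32(hashId(id));
  const population = Math.floor(rand() * 1_000_000) + 1_000;
  const neighbors = [0, 1, 2].map((i) => `${id}-n${i}`);
  return { id, name: `<LLM names ${id}>`, population, neighbors };
}

// Revisiting the same id always yields the same state, so the atlas is
// unbounded yet logically consistent.
const a = generateCity("atlas:rome");
const b = generateCity("atlas:rome");
console.log(a.population === b.population); // true
```

In this reading, the LLM only fills in the free-text fields, while population, adjacency, and identity are fixed by code, which is one way the abstract's "controllable yet open-ended" trade-off could be realized.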
Related papers
- WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World [100.68103378427567]
Generative world models are reshaping embodied AI, enabling agents to synthesize realistic 4D driving environments that look convincing but often fail physically or behaviorally. We introduce WorldLens, a full-spectrum benchmark evaluating how well a model builds, understands, and behaves within its generated world. We further construct WorldLens-26K, a large-scale dataset of human-annotated videos with numerical scores and textual rationales, and develop WorldLens-Agent.
arXiv Detail & Related papers (2025-12-11T18:59:58Z) - Affordance Representation and Recognition for Autonomous Agents [64.39018305018904]
This paper introduces a pattern language for world modeling from structured data. The DOM Transduction Pattern addresses the challenge of web page complexity. The Hypermedia Affordances Recognition Pattern enables the agent to dynamically enrich its world model.
arXiv Detail & Related papers (2025-10-28T14:27:28Z) - PoE-World: Compositional World Modeling with Products of Programmatic Experts [50.35012247866856]
Learning how the world works is central to building AI agents that can adapt to complex environments. Recent advances in program synthesis using Large Language Models (LLMs) offer an alternative approach that learns world models represented as source code. We show that this approach can learn complex world models from just a few observations. We evaluate the learned world models by embedding them in a model-based planning agent, demonstrating efficient performance and generalization to unseen levels on Atari's Pong and Montezuma's Revenge.
arXiv Detail & Related papers (2025-05-16T03:28:42Z) - Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents [22.608219492706876]
We propose a model-based planning framework for web agents that employs a world model to simulate and deliberate over the outcome of each candidate action before committing to one. Empirical results demonstrate that WebDreamer achieves substantial performance improvements over reactive baselines. Our trained world model, Dreamer-7B, performs comparably to GPT-4o, highlighting the potential of specialized world models for efficient and effective planning in complex web environments.
arXiv Detail & Related papers (2024-11-10T18:50:51Z) - One-shot World Models Using a Transformer Trained on a Synthetic Prior [37.027893127637036]
One-Shot World Model (OSWM) is a transformer world model that is learned in an in-context learning fashion from purely synthetic data.
OSWM is able to quickly adapt to the dynamics of a simple grid world, as well as the CartPole gym and a custom control environment.
arXiv Detail & Related papers (2024-09-21T09:39:32Z) - WorldGPT: Empowering LLM as Multimodal World Model [51.243464216500975]
We introduce WorldGPT, a generalist world model built upon Multimodal Large Language Model (MLLM)
WorldGPT acquires an understanding of world dynamics through analyzing millions of videos across various domains.
We conduct evaluations on WorldNet, a multimodal state transition prediction benchmark.
arXiv Detail & Related papers (2024-04-28T14:42:02Z) - Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents [111.15288256221764]
The Grounded Decoding project aims to solve complex, long-horizon tasks in a robotic setting by leveraging the knowledge of both language and grounded models.
We frame this as a problem similar to probabilistic filtering: decode a sequence that both has high probability under the language model and high probability under a set of grounded model objectives.
We demonstrate how such grounded models can be obtained across three simulation and real-world domains, and that the proposed decoding strategy solves such tasks by combining the knowledge of both models.
arXiv Detail & Related papers (2023-03-01T22:58:50Z)
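The filtering view described in the Grounded Decoding summary, picking tokens with high probability under both the language model and the grounded objectives, can be caricatured in a few lines. The probability tables below are toy numbers invented for illustration and do not come from the paper:

```typescript
type Scores = Record<string, number>; // token -> probability

// One decoding step: maximize the product of the LM probability and the
// grounded-model probability (equivalently, the sum of log-probabilities).
function groundedStep(lm: Scores, grounded: Scores): string {
  let best = "";
  let bestScore = -Infinity;
  for (const token of Object.keys(lm)) {
    const score = Math.log(lm[token]) + Math.log(grounded[token] ?? 1e-9);
    if (score > bestScore) {
      bestScore = score;
      best = token;
    }
  }
  return best;
}

// The LM prefers "fly", but a grounded model (e.g. an affordance score
// for a wheeled robot) rules it out, so the product favors "drive".
const lm: Scores = { fly: 0.6, drive: 0.3, swim: 0.1 };
const grounded: Scores = { fly: 0.01, drive: 0.9, swim: 0.05 };
console.log(groundedStep(lm, grounded)); // "drive"
```

Real grounded decoding operates over a full vocabulary at every step of autoregressive generation; this sketch only shows the per-step score combination.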
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.