Aligning Agentic World Models via Knowledgeable Experience Learning
- URL: http://arxiv.org/abs/2601.13247v1
- Date: Mon, 19 Jan 2026 17:33:31 GMT
- Title: Aligning Agentic World Models via Knowledgeable Experience Learning
- Authors: Baochang Ren, Yunzhi Yao, Rui Sun, Shuofei Qiao, Ningyu Zhang, Huajun Chen
- Abstract summary: We introduce WorldMind, a framework that constructs a symbolic World Knowledge Repository by synthesizing environmental feedback. WorldMind achieves superior performance compared to baselines with remarkable cross-model and cross-environment transferability.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current Large Language Models (LLMs) exhibit a critical modal disconnect: they possess vast semantic knowledge but lack the procedural grounding to respect the immutable laws of the physical world. Consequently, while these agents implicitly function as world models, their simulations often suffer from physical hallucinations: generating plans that are logically sound but physically unexecutable. Existing alignment strategies predominantly rely on resource-intensive training or fine-tuning, which attempt to compress dynamic environmental rules into static model parameters. However, such parametric encapsulation is inherently rigid, struggling to adapt to the open-ended variability of physical dynamics without continuous, costly retraining. To bridge this gap, we introduce WorldMind, a framework that autonomously constructs a symbolic World Knowledge Repository by synthesizing environmental feedback. Specifically, it unifies Process Experience to enforce physical feasibility via prediction errors and Goal Experience to guide task optimality through successful trajectories. Experiments on EB-ALFRED and EB-Habitat demonstrate that WorldMind achieves superior performance compared to baselines with remarkable cross-model and cross-environment transferability.
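The abstract describes two experience streams feeding a symbolic repository: Process Experience distilled from prediction errors, and Goal Experience stored from successful trajectories. A minimal Python sketch of that idea follows; every class, method, and rule name here is a hypothetical illustration, not taken from the paper's implementation:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a World Knowledge Repository in the spirit of
# WorldMind; names and data layout are assumptions for illustration only.

@dataclass
class WorldKnowledgeRepository:
    # Process experience: symbolic feasibility rules distilled from
    # prediction errors (mismatches between imagined and real outcomes).
    process_rules: list = field(default_factory=list)
    # Goal experience: successful trajectories indexed by task description.
    goal_trajectories: dict = field(default_factory=dict)

    def add_process_experience(self, predicted, observed, rule):
        # Record a feasibility rule only when the agent's prediction
        # diverged from the environment's feedback (a "physical hallucination").
        if predicted != observed:
            self.process_rules.append(rule)

    def add_goal_experience(self, task, trajectory):
        # Store a trajectory that actually achieved the task.
        self.goal_trajectories[task] = trajectory

    def check_feasible(self, action, state):
        # An action is feasible if no stored rule rejects it.
        return all(rule(action, state) for rule in self.process_rules)


repo = WorldKnowledgeRepository()
repo.add_process_experience(
    predicted="drawer opens",
    observed="drawer blocked: hands full",
    rule=lambda action, state: not (action == "open" and state["hands_full"]),
)
repo.add_goal_experience(
    "heat the apple", ["pick apple", "open microwave", "put apple"]
)

print(repo.check_feasible("open", {"hands_full": True}))   # False
print(repo.check_feasible("open", {"hands_full": False}))  # True
```

The key design point the sketch captures is that the rules live in an external symbolic store rather than in model parameters, so they can be updated from fresh feedback without retraining.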
Related papers
- From Word to World: Can Large Language Models be Implicit Text-based World Models? [82.47317196099907]
Agentic reinforcement learning increasingly relies on experience-driven scaling. World models offer a potential way to improve learning efficiency through simulated experience. We study whether large language models can reliably serve this role and under what conditions they meaningfully benefit agents.
arXiv Detail & Related papers (2025-12-21T17:28:42Z)
- Prismatic World Model: Learning Compositional Dynamics for Planning in Hybrid Systems [38.4555621948915]
Prismatic World Model (PRISM-WM) is designed to decompose complex hybrid dynamics into composable primitives. PRISM-WM significantly reduces rollout drift by accurately modeling sharp mode transitions in system dynamics.
arXiv Detail & Related papers (2025-12-09T09:40:34Z)
- Thinking by Doing: Building Efficient World Model Reasoning in LLMs via Multi-turn Interaction [53.745458605360675]
We explore world-model internalization through efficient interaction and active reasoning (WMAct). WMAct liberates the model from structured reasoning, allowing the model to shape thinking directly through its doing. Our experiments on Sokoban, Maze, and Taxi show that WMAct yields effective world model reasoning capable of resolving tasks in a single turn.
arXiv Detail & Related papers (2025-11-28T18:59:47Z)
- Object-Centric World Models for Causality-Aware Reinforcement Learning [13.063093054280946]
We propose Transformer Imagination with CAusality-aware reinforcement learning (STICA), a unified framework in which object-centric Transformers serve as the world model alongside causality-aware policy and value networks. Experiments on object-rich benchmarks demonstrate that STICA consistently outperforms state-of-the-art agents in both sample efficiency and final performance.
arXiv Detail & Related papers (2025-11-18T08:53:09Z)
- One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration [77.8436947454471]
Symbolic world modeling requires inferring and representing an environment's transitional dynamics as an executable program. OneLife is a framework that models world dynamics through conditionally-activated programmatic laws. OneLife can successfully learn key environment dynamics from minimal, unguided interaction.
arXiv Detail & Related papers (2025-10-14T02:49:32Z)
- Dynamics-Aligned Latent Imagination in Contextual World Models for Zero-Shot Generalization [1.6332728502735252]
We introduce Dynamics-Aligned Latent Imagination (DALI), a framework that infers latent context representations from agent-environment interactions. DALI generates actionable representations conditioning the world model and policy, bridging perception and control. On challenging cMDP benchmarks, DALI achieves significant gains over context-unaware baselines.
arXiv Detail & Related papers (2025-08-27T22:02:56Z)
- Revealing the Challenges of Sim-to-Real Transfer in Model-Based Reinforcement Learning via Latent Space Modeling [31.74241286023207]
Reinforcement learning (RL) is playing an increasingly important role in fields such as robotic control and autonomous driving. The gap between simulation and the real environment remains a major obstacle to the practical deployment of RL. We propose a latent-space-based approach to analyze the impact of simulation on real-world policy improvement.
arXiv Detail & Related papers (2025-06-15T06:02:42Z)
- Adapting World Models with Latent-State Dynamics Residuals [10.892848566977369]
ReDRAW is a latent-state autoregressive world model pretrained in simulation and calibrated to target environments. It enables RL agents to be optimized with imagined rollouts under corrected dynamics and then deployed in the real world.
arXiv Detail & Related papers (2025-04-03T03:41:30Z)
- Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning [65.85335291827086]
This paper learns and transfers underlying semantic variations from distracting videos via offline-to-online latent distillation and flexible disentanglement constraints. We pretrain the action-free video prediction model offline with disentanglement regularization to extract semantic knowledge from distracting videos. For finetuning in the online environment, we exploit the knowledge from the pretrained model and introduce a disentanglement constraint to the world model.
arXiv Detail & Related papers (2025-03-11T13:50:22Z)
- Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics [50.191655141020505]
This work advances model-based reinforcement learning by addressing the challenges of long-horizon prediction, error accumulation, and sim-to-real transfer. By providing a scalable and robust framework, the introduced methods pave the way for adaptive and efficient robotic systems in real-world applications.
arXiv Detail & Related papers (2025-01-17T10:39:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.