Learning Local Causal World Models with State Space Models and Attention
- URL: http://arxiv.org/abs/2505.02074v1
- Date: Sun, 04 May 2025 11:57:02 GMT
- Title: Learning Local Causal World Models with State Space Models and Attention
- Authors: Francesco Petri, Luigi Asprino, Aldo Gangemi,
- Abstract summary: We show that a SSM can model the dynamics of a simple environment and learn a causal model at the same time.<n>We pave the way for further experiments that lean into the strength of SSMs and further enhance them with causal awareness.
- Score: 1.5498250598583487
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: World modelling, i.e. building a representation of the rules that govern the world so as to predict its evolution, is an essential ability for any agent interacting with the physical world. Despite their impressive performance, many solutions fail to learn a causal representation of the environment they are trying to model, which would be necessary to gain a deep enough understanding of the world to perform complex tasks. With this work, we aim to broaden the research in the intersection of causality theory and neural world modelling by assessing the potential for causal discovery of the State Space Model (SSM) architecture, which has been shown to have several advantages over the widespread Transformer. We show empirically that, compared to an equivalent Transformer, a SSM can model the dynamics of a simple environment and learn a causal model at the same time with equivalent or better performance, thus paving the way for further experiments that lean into the strength of SSMs and further enhance them with causal awareness.
Related papers
- The Trinity of Consistency as a Defining Principle for General World Models [106.16462830681452]
General World Models are capable of learning, simulating, and reasoning about objective physical laws.<n>We propose a principled theoretical framework that defines the essential properties requisite for a General World Model.<n>Our work establishes a principled pathway toward general world models, clarifying both the limitations of current systems and the architectural requirements for future progress.
arXiv Detail & Related papers (2026-02-26T16:15:55Z) - Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models [60.543714835980325]
Humans construct internal world models and reason by manipulating the concepts within these models.<n>Recent advances in AI approximate such human cognitive abilities, where world models are believed to be embedded within large language models.<n>This paper presents the first principled study of when and how visual generation benefits reasoning.
arXiv Detail & Related papers (2026-01-27T17:40:07Z) - Aligning Agentic World Models via Knowledgeable Experience Learning [68.85843641222186]
We introduce WorldMind, a framework that constructs a symbolic World Knowledge Repository by synthesizing environmental feedback.<n>WorldMind achieves superior performance compared to baselines with remarkable cross-model and cross-environment transferability.
arXiv Detail & Related papers (2026-01-19T17:33:31Z) - From Word to World: Can Large Language Models be Implicit Text-based World Models? [82.47317196099907]
Agentic reinforcement learning increasingly relies on experience-driven scaling.<n>World models offer a potential way to improve learning efficiency through simulated experience.<n>We study whether large language models can reliably serve this role and under what conditions they meaningfully benefit agents.
arXiv Detail & Related papers (2025-12-21T17:28:42Z) - Thinking by Doing: Building Efficient World Model Reasoning in LLMs via Multi-turn Interaction [53.745458605360675]
We explore world-model internalization through efficient interaction and active reasoning (WMAct)<n>WMAct liberates the model from structured reasoning, allowing the model to shape thinking directly through its doing.<n>Our experiments on Sokoban, Maze, and Taxi show that WMAct yields effective world model reasoning capable of resolving tasks in a single turn.
arXiv Detail & Related papers (2025-11-28T18:59:47Z) - Zero-shot World Models via Search in Memory [7.15414423703749]
We leverage similarity search and representations to approximate a world model without a training procedure.<n>We evaluate the models on the quality of latent reconstruction and on the perceived similarity of the reconstructed image.<n>Our model show stronger performance in long-horizon prediction with respect to the baseline on a range of visually different environments.
arXiv Detail & Related papers (2025-10-17T18:09:36Z) - Can World Models Benefit VLMs for World Dynamics? [59.73433292793044]
We investigate the capabilities when world model priors are transferred into Vision-Language Models.<n>We name our best-performing variant Dynamic Vision Aligner (DyVA)<n>We find DyVA to surpass both open-source and proprietary baselines, achieving state-of-the-art or comparable performance.
arXiv Detail & Related papers (2025-10-01T13:07:05Z) - World Models for Cognitive Agents: Transforming Edge Intelligence in Future Networks [55.90051810762702]
We present a comprehensive overview of world models, highlighting their architecture, training paradigms, and applications across prediction, generation, planning, and causal reasoning.<n>We propose Wireless Dreamer, a novel world model-based reinforcement learning framework tailored for wireless edge intelligence optimization.
arXiv Detail & Related papers (2025-05-31T06:43:00Z) - Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning [11.762260966376125]
A motion dynamic model is essential for efficient skill acquisition and effective planning.<n>We introduce the neural motion simulator (MoSim), a world model that predicts the future physical state of an embodied system.<n>MoSim achieves state-of-the-art performance in physical state prediction.
arXiv Detail & Related papers (2025-04-09T17:59:32Z) - AdaWorld: Learning Adaptable World Models with Latent Actions [76.50869178593733]
We propose AdaWorld, an innovative world model learning approach that enables efficient adaptation.<n>Key idea is to incorporate action information during the pretraining of world models.<n>We then develop an autoregressive world model that conditions on these latent actions.
arXiv Detail & Related papers (2025-03-24T17:58:15Z) - Multimodal Dreaming: A Global Workspace Approach to World Model-Based Reinforcement Learning [2.5749046466046903]
In Reinforcement Learning (RL), world models aim to capture how the environment evolves in response to the agent's actions.<n>We show that performing the dreaming process inside the latent space allows for training with fewer environment steps.<n>We conclude that the combination of GW with World Models holds great potential for improving decision-making in RL agents.
arXiv Detail & Related papers (2025-02-28T15:24:17Z) - Making Large Language Models into World Models with Precondition and Effect Knowledge [1.8561812622368763]
We show that Large Language Models (LLMs) can be induced to perform two critical world model functions.
We validate that the precondition and effect knowledge generated by our models aligns with human understanding of world dynamics.
arXiv Detail & Related papers (2024-09-18T19:28:04Z) - Elements of World Knowledge (EWoK): A Cognition-Inspired Framework for Evaluating Basic World Knowledge in Language Models [51.891804790725686]
Elements of World Knowledge (EWoK) is a framework for evaluating language models' understanding of conceptual knowledge underlying world modeling.<n>EWoK-core-1.0 is a dataset of 4,374 items covering 11 world knowledge domains.<n>All tested models perform worse than humans, with results varying drastically across domains.
arXiv Detail & Related papers (2024-05-15T17:19:42Z) - Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond [101.15395503285804]
General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI)
In this survey, we embark on a comprehensive exploration of the latest advancements in world models.
We examine challenges and limitations of world models, and discuss their potential future directions.
arXiv Detail & Related papers (2024-05-06T14:37:07Z) - Learning World Models With Hierarchical Temporal Abstractions: A Probabilistic Perspective [2.61072980439312]
Devising formalisms to develop internal world models is a critical research challenge in the domains of artificial intelligence and machine learning.
This thesis identifies several limitations with the prevalent use of state space models as internal world models.
The structure of models in formalisms facilitates exact probabilistic inference using belief propagation, as well as end-to-end learning via backpropagation through time.
These formalisms integrate the concept of uncertainty in world states, thus improving the system's capacity to emulate the nature of the real world and quantify the confidence in its predictions.
arXiv Detail & Related papers (2024-04-24T12:41:04Z) - The Essential Role of Causality in Foundation World Models for Embodied AI [102.75402420915965]
Embodied AI agents will require the ability to perform new tasks in many different real-world environments.
Current foundation models fail to accurately model physical interactions and are therefore insufficient for Embodied AI.
The study of causality lends itself to the construction of veridical world models.
arXiv Detail & Related papers (2024-02-06T17:15:33Z) - HarmonyDream: Task Harmonization Inside World Models [93.07314830304193]
Model-based reinforcement learning (MBRL) holds the promise of sample-efficient learning.
We propose a simple yet effective approach, HarmonyDream, which automatically adjusts loss coefficients to maintain task harmonization.
arXiv Detail & Related papers (2023-09-30T11:38:13Z) - Causal World Models by Unsupervised Deconfounding of Physical Dynamics [20.447000858907646]
The capability of imagining internally with a mental model of the world is vitally important for human cognition.
We propose Causal World Models (CWMs) that allow unsupervised modeling of relationships between the intervened and alternative futures.
We show reductions in complexity sample for reinforcement learning tasks and improvements in counterfactual physical reasoning.
arXiv Detail & Related papers (2020-12-28T13:44:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.