Related papers: Compositional Planning with Jumpy World Models

URL: http://arxiv.org/abs/2602.19634v1
Date: Mon, 23 Feb 2026 09:22:21 GMT
Title: Compositional Planning with Jumpy World Models
Authors: Jesse Farebrother, Matteo Pirotta, Andrea Tirinzoni, Marc G. Bellemare, Alessandro Lazaric, Ahmed Touati
Abstract summary: We study agents that compose pre-trained policies as temporally extended actions, enabling solutions to complex tasks that no constituent alone can solve. Motivated by the geometric policy composition framework introduced in arXiv:2206.08736, we address these challenges by learning predictive models of multi-step dynamics.
Score: 70.74595987225908
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The ability to plan with temporal abstractions is central to intelligent decision-making. Rather than reasoning over primitive actions, we study agents that compose pre-trained policies as temporally extended actions, enabling solutions to complex tasks that no constituent alone can solve. Such compositional planning remains elusive as compounding errors in long-horizon predictions make it challenging to estimate the visitation distribution induced by sequencing policies. Motivated by the geometric policy composition framework introduced in arXiv:2206.08736, we address these challenges by learning predictive models of multi-step dynamics -- so-called jumpy world models -- that capture state occupancies induced by pre-trained policies across multiple timescales in an off-policy manner. Building on Temporal Difference Flows (arXiv:2503.09817), we enhance these models with a novel consistency objective that aligns predictions across timescales, improving long-horizon predictive accuracy. We further demonstrate how to combine these generative predictions to estimate the value of executing arbitrary sequences of policies over varying timescales. Empirically, we find that compositional planning with jumpy world models significantly improves zero-shot performance across a wide range of base policies on challenging manipulation and navigation tasks, yielding, on average, a 200% relative improvement over planning with primitive actions on long-horizon tasks.

Related papers

Closing the Train-Test Gap in World Models for Gradient-Based Planning [64.36544881136405]
We propose improved methods for training world models that enable efficient gradient-based planning. At test time, our approach outperforms or matches the classical gradient-free cross-entropy method.
arXiv Detail & Related papers (2025-12-10T18:59:45Z)
Spatiotemporal Forecasting as Planning: A Model-Based Reinforcement Learning Approach with Generative World Models [45.523937630646394]
We propose Spatiotemporal Forecasting as Planning (SFP), a new paradigm in Model-Based Reinforcement Learning. SFP constructs a novel World Model to simulate diverse high-temporal future states, enabling an "imagination-based" environmental simulation.
arXiv Detail & Related papers (2025-10-05T03:57:38Z)
Adaptive Conformal Prediction Intervals Over Trajectory Ensembles [50.31074512684758]
Future trajectories play an important role across domains such as autonomous driving, hurricane forecasting, and epidemic modeling. We propose a unified framework based on conformal prediction that transforms sampled trajectories into calibrated prediction intervals with theoretical coverage guarantees.
arXiv Detail & Related papers (2025-08-18T21:14:07Z)
Next-Generation Conflict Forecasting: Unleashing Predictive Patterns through Spatiotemporal Learning [0.0]
This study presents a neural network architecture for forecasting three distinct types of violence up to 36 months in advance. The model jointly performs probabilistic classification and regression tasks, producing both estimates and expected magnitudes of future events. It is a promising tool for warning systems, humanitarian response planning, and evidence-based peacebuilding initiatives.
arXiv Detail & Related papers (2025-06-08T20:42:29Z)
Deep Active Inference Agents for Delayed and Long-Horizon Environments [1.693200946453174]
AIF agents rely on accurate immediate predictions and exhaustive planning, a limitation that is exacerbated in delayed environments. We propose a generative-policy architecture featuring a multi-step latent transition that lets the generative model predict an entire horizon in a single look-ahead. We evaluate our agent in an environment that mimics a realistic industrial scenario with delayed and long-horizon settings.
arXiv Detail & Related papers (2025-05-26T11:50:22Z)
Adaptive Planning with Generative Models under Uncertainty [20.922248169620783]
Planning with generative models has emerged as an effective decision-making paradigm across a wide range of domains. While continuous replanning at each timestep might seem intuitive because it allows decisions to be made based on the most recent environmental observations, it results in substantial computational challenges. Our work addresses this challenge by introducing a simple adaptive planning policy that leverages the generative model's ability to predict long-horizon state trajectories.
arXiv Detail & Related papers (2024-08-02T18:07:53Z)
Interactive Joint Planning for Autonomous Vehicles [19.479300967537675]
In interactive driving scenarios, the actions of one agent greatly influence those of its neighbors. We present Interactive Joint Planning (IJP), which bridges MPC with learned prediction models. IJP significantly outperforms baselines that either lack joint optimization or rely on sampling-based planning.
arXiv Detail & Related papers (2023-10-27T17:48:25Z)
Forethought and Hindsight in Credit Assignment [62.05690959741223]
We work to understand the gains and peculiarities of planning employed as forethought via forward models or as hindsight operating with backward models. We investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)-evaluated.
arXiv Detail & Related papers (2020-10-26T16:00:47Z)
Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors [124.30562402952319]
The ability to predict and plan into the future is fundamental for agents acting in the world. Current learning approaches for visual prediction and planning fail on long-horizon tasks. We propose a framework for visual prediction and planning that is able to overcome both of these limitations.
arXiv Detail & Related papers (2020-06-23T17:58:56Z)
Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search. We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.