Value-guided action planning with JEPA world models
- URL: http://arxiv.org/abs/2601.00844v1
- Date: Sun, 28 Dec 2025 20:17:49 GMT
- Title: Value-guided action planning with JEPA world models
- Authors: Matthieu Destrade, Oumayma Bounou, Quentin Le Lidec, Jean Ponce, Yann LeCun,
- Abstract summary: Building deep learning models that can reason about their environment requires capturing its underlying dynamics. Joint-Embedded Predictive Architectures (JEPA) provide a promising framework to model such dynamics. We propose an approach to enhance planning with JEPA world models by shaping their representation space.
- Score: 44.84158001773079
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Building deep learning models that can reason about their environment requires capturing its underlying dynamics. Joint-Embedded Predictive Architectures (JEPA) provide a promising framework to model such dynamics by learning representations and predictors through a self-supervised prediction objective. However, their ability to support effective action planning remains limited. We propose an approach to enhance planning with JEPA world models by shaping their representation space so that the negative goal-conditioned value function for a reaching cost in a given environment is approximated by a distance (or quasi-distance) between state embeddings. We introduce a practical method to enforce this constraint during training and show that it leads to significantly improved planning performance compared to standard JEPA models on simple control tasks.
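The abstract's core idea, that a (quasi-)distance between state embeddings should approximate the negative goal-conditioned value of a reaching cost, can be sketched in toy form. The sketch below is a hedged illustration, not the paper's method: the environment is a hypothetical 1-D chain where V(s, g) = -|s - g|, the embeddings are trained directly by gradient descent on the distance-matching constraint, and the JEPA prediction objective itself is omitted.

```python
import numpy as np

# Toy illustration (not the paper's method): a 1-D chain of N states with a
# unit reaching cost, so the goal-conditioned value is V(s, g) = -|s - g|.
# We learn embeddings e[s] such that ||e[s] - e[g]|| ~ -V(s, g) = |s - g|.
rng = np.random.default_rng(0)
N, dim, lr, iters = 8, 2, 0.2, 3000

# Initialize near a line so this toy optimization is well behaved.
emb = np.stack([0.5 * np.arange(N), np.zeros(N)], axis=1)
emb += rng.normal(scale=0.05, size=emb.shape)

def shaping_loss_grad(emb):
    """Mean squared mismatch between embedding distances and |s - g|."""
    grad, loss, pairs = np.zeros_like(emb), 0.0, N * (N - 1)
    for s in range(N):
        for g in range(N):
            if s == g:
                continue
            diff = emb[s] - emb[g]
            dist = np.linalg.norm(diff) + 1e-8
            err = dist - abs(s - g)
            loss += err ** 2 / pairs
            g_vec = 2.0 * err * diff / dist / pairs
            grad[s] += g_vec
            grad[g] -= g_vec
    return loss, grad

for _ in range(iters):
    _, grad = shaping_loss_grad(emb)
    emb -= lr * grad

def plan(start, goal, max_steps=16):
    """Greedy planning: step to the neighbor closest to the goal embedding."""
    s, path = start, [start]
    while s != goal and len(path) <= max_steps:
        candidates = [a for a in (s - 1, s + 1) if 0 <= a < N]
        s = min(candidates, key=lambda a: np.linalg.norm(emb[a] - emb[goal]))
        path.append(s)
    return path

print(plan(0, 5))  # greedy descent on embedding distance reaches state 5
```

Once the constraint holds, planning reduces to greedy descent on embedding distance to the goal embedding, which is the property the shaping constraint is meant to provide.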
Related papers
- Self-Supervised JEPA-based World Models for LiDAR Occupancy Completion and Forecasting [11.278785857643575]
We propose AD-LiST-JEPA, a self-supervised world model for autonomous driving that predicts future temporal evolution from LiDAR data. We evaluate the quality of the learned representations through downstream occupancy completion and forecasting tasks.
arXiv Detail & Related papers (2026-02-13T02:42:21Z) - From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction [57.56072009935036]
We introduce a new driving paradigm named Policy World Model (PWM). PWM integrates world modeling and trajectory planning within a unified architecture. Our method matches or exceeds state-of-the-art approaches that rely on multi-view and multi-modal inputs.
arXiv Detail & Related papers (2025-10-22T14:57:51Z) - Reinforced Reasoning for Embodied Planning [18.40186665383579]
Embodied planning requires agents to make coherent multi-step decisions based on dynamic visual observations and natural language goals. We introduce a reinforcement fine-tuning framework that brings R1-style reasoning enhancement into embodied planning.
arXiv Detail & Related papers (2025-05-28T07:21:37Z) - Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution Approach [82.27842884709378]
We propose a framework that prioritizes natural language understanding and structured reasoning to enhance the agent's global understanding of the environment. Our method outperforms previous approaches, achieving a 44.4% relative improvement in task success rate.
arXiv Detail & Related papers (2025-05-22T09:08:47Z) - ACT-JEPA: Novel Joint-Embedding Predictive Architecture for Efficient Policy Representation Learning [90.41852663775086]
ACT-JEPA is a novel architecture that integrates imitation learning and self-supervised learning. We train a policy to predict action sequences and abstract observation sequences. Our experiments show that ACT-JEPA improves the quality of representations by learning temporal environment dynamics.
arXiv Detail & Related papers (2025-01-24T16:41:41Z) - Adaptive Planning with Generative Models under Uncertainty [20.922248169620783]
Planning with generative models has emerged as an effective decision-making paradigm across a wide range of domains.
While replanning at every timestep may seem intuitive, since it allows decisions to be based on the most recent environmental observations, it incurs substantial computational cost.
Our work addresses this challenge by introducing a simple adaptive planning policy that leverages the generative model's ability to predict long-horizon state trajectories.
arXiv Detail & Related papers (2024-08-02T18:07:53Z) - Interactive Joint Planning for Autonomous Vehicles [19.479300967537675]
In interactive driving scenarios, the actions of one agent greatly influence those of its neighbors.
We present Interactive Joint Planning (IJP) that bridges MPC with learned prediction models.
IJP significantly outperforms baselines that either lack joint optimization or rely on sampling-based planning.
arXiv Detail & Related papers (2023-10-27T17:48:25Z) - COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL [50.385005413810084]
Dyna-style model-based reinforcement learning contains two phases: model rollouts to generate samples for policy learning, and real-environment exploration.
COPlanner is a planning-driven framework for model-based methods that addresses the problem of inaccurately learned dynamics models.
arXiv Detail & Related papers (2023-10-11T06:10:07Z) - DiMSam: Diffusion Models as Samplers for Task and Motion Planning under Partial Observability [58.75803543245372]
Task and Motion Planning (TAMP) approaches are suited for planning multi-step autonomous robot manipulation.
We propose to overcome these limitations by composing diffusion models using a TAMP system.
We show how the combination of classical TAMP, generative modeling, and latent embedding enables multi-step constraint-based reasoning.
arXiv Detail & Related papers (2023-06-22T20:40:24Z) - Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
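As an aside, the "Adaptive Planning with Generative Models under Uncertainty" entry above rests on a simple loop: plan a long-horizon trajectory once, execute it open loop, and replan only when reality diverges from the model's prediction. The sketch below is a hypothetical toy, not that paper's algorithm: a noisy 1-D point mass, a hand-written proportional controller inside the model rollout, and a fixed divergence threshold for triggering replanning are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy (not the paper's algorithm): a noisy 1-D point mass.
# We plan a long-horizon action sequence with a noise-free model, execute
# it open loop, and replan only when the observed state drifts from the
# model's prediction beyond a threshold -- instead of replanning every step.
rng = np.random.default_rng(1)
goal, horizon, thresh = 10.0, 20, 0.5

def plan(x0):
    """Model rollout: proportional controller under noise-free dynamics."""
    actions, preds, x = [], [], x0
    for _ in range(horizon):
        u = 0.3 * (goal - x)   # controller assumed inside the model
        x = x + u              # model dynamics (no noise)
        actions.append(u)
        preds.append(x)
    return actions, preds

x, k, replans = 0.0, 0, 1
actions, preds = plan(x)
for _ in range(60):
    u = actions[min(k, horizon - 1)]
    x = x + u + rng.normal(scale=0.1)              # true dynamics are noisy
    diverged = abs(x - preds[min(k, horizon - 1)]) > thresh
    k += 1
    if diverged:                                   # replan only on divergence
        actions, preds = plan(x)
        replans, k = replans + 1, 0

print(f"final state {x:.2f}, replans {replans}/60")
```

The divergence check plays the role of the adaptive policy: planning effort is spent only when the long-horizon prediction stops matching the observed states.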
This list is automatically generated from the titles and abstracts of the papers listed on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.