Map-World: Masked Action planning and Path-Integral World Model for Autonomous Driving
- URL: http://arxiv.org/abs/2511.20156v1
- Date: Tue, 25 Nov 2025 10:30:26 GMT
- Title: Map-World: Masked Action planning and Path-Integral World Model for Autonomous Driving
- Authors: Bin Hu, Zijian Lu, Haicheng Liao, Chengran Yuan, Bin Rao, Yongkang Li, Guofa Li, Zhiyong Cui, Cheng-zhong Xu, Zhenning Li
- Abstract summary: The Masked Action Planning (MAP) module treats future ego motion as masked sequence completion. A lightweight world model rolls out future BEV semantics conditioned on each candidate trajectory. On NAVSIM, the method matches anchor-based approaches and achieves state-of-the-art performance.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motion planning for autonomous driving must handle multiple plausible futures while remaining computationally efficient. Recent end-to-end systems and world-model-based planners predict rich multi-modal trajectories, but typically rely on handcrafted anchors or reinforcement learning to select a single best mode for training and control. This selection discards information about alternative futures and complicates optimization. We propose MAP-World, a prior-free multi-modal planning framework that couples masked action planning with a path-weighted world model. The Masked Action Planning (MAP) module treats future ego motion as masked sequence completion: past waypoints are encoded as visible tokens, future waypoints are represented as mask tokens, and a driving-intent path provides a coarse scaffold. A compact latent planning state is expanded into multiple trajectory queries with injected noise, yielding diverse, temporally consistent modes without anchor libraries or teacher policies. A lightweight world model then rolls out future BEV semantics conditioned on each candidate trajectory. During training, semantic losses are computed as an expectation over modes, using trajectory probabilities as discrete path weights, so the planner learns from the full distribution of plausible futures instead of a single selected path. On NAVSIM, our method matches anchor-based approaches and achieves state-of-the-art performance among world-model-based methods, while avoiding reinforcement learning and maintaining real-time inference latency.
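The expectation-over-modes training objective described in the abstract can be sketched in a few lines: instead of picking one best trajectory, the semantic loss is averaged over all candidate modes, weighted by their predicted probabilities. This is a minimal illustrative sketch, not the paper's implementation; the function names, the softmax weighting of mode scores, and the toy numbers are all assumptions.

```python
import math

def softmax(scores):
    # Convert per-mode scores into discrete path weights (probabilities).
    # Subtracting the max keeps the exponentials numerically stable.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def path_weighted_loss(mode_scores, per_mode_semantic_losses):
    """Expected semantic loss over candidate trajectory modes.

    Rather than selecting a single mode, the loss is the expectation
        L = sum_k p_k * L_k,
    where p_k is the predicted probability of mode k and L_k is the
    semantic (BEV rollout) loss computed for that mode.
    """
    probs = softmax(mode_scores)
    return sum(p * l for p, l in zip(probs, per_mode_semantic_losses))

# Toy example: three equally scored modes simply average their losses.
loss = path_weighted_loss([0.0, 0.0, 0.0], [0.9, 0.6, 0.3])
```

With equal mode scores the weights are uniform, so the expected loss reduces to the plain average; unequal scores shift the gradient toward the more probable futures while still propagating signal through every mode.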
Related papers
- FutureX: Enhance End-to-End Autonomous Driving via Latent Chain-of-Thought World Model [103.2513470454204]
FutureX is a pipeline that enhances end-to-end planners to perform complex motion planning via future scene latent reasoning and trajectory refinement.
FutureX improves existing methods by producing more rational motion plans and fewer collisions without compromising efficiency.
arXiv Detail & Related papers (2025-12-12T02:12:49Z)
- Closing the Train-Test Gap in World Models for Gradient-Based Planning [64.36544881136405]
We propose improved methods for training world models that enable efficient gradient-based planning.
At test time, our approach outperforms or matches the classical gradient-free cross-entropy method.
arXiv Detail & Related papers (2025-12-10T18:59:45Z)
- Autonomous Vehicle Path Planning by Searching With Differentiable Simulation [55.46735086899153]
Planning allows an agent to safely refine its actions before executing them in the real world.
In autonomous driving, this is crucial to avoid collisions and navigate complex, dense traffic scenarios.
Here we propose Differentiable Simulation for Search (DSS), a framework that leverages the differentiable simulator Waymax as both a next-state predictor and a critic.
arXiv Detail & Related papers (2025-11-14T07:56:34Z)
- Bootstrap Off-policy with World Model [59.129118672069644]
We propose BOOM, a framework that tightly integrates planning and off-policy learning through a bootstrap loop.
BOOM achieves state-of-the-art results in both training stability and final performance.
arXiv Detail & Related papers (2025-11-01T06:33:04Z)
- From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction [57.56072009935036]
We introduce a new driving paradigm named Policy World Model (PWM).
PWM integrates world modeling and trajectory planning within a unified architecture.
Our method matches or exceeds state-of-the-art approaches that rely on multi-view and multi-modal inputs.
arXiv Detail & Related papers (2025-10-22T14:57:51Z)
- Autoregressive End-to-End Planning with Time-Invariant Spatial Alignment and Multi-Objective Policy Refinement [15.002921311530374]
Autoregressive models are a formidable baseline for end-to-end planning in autonomous driving.
Their performance is constrained by a temporal misalignment, as the planner must condition future actions on past sensory data.
We propose a Time-Invariant Spatial Alignment (TISA) module that learns to project initial environmental features into a consistent ego-centric frame.
We also introduce a multi-objective post-training stage using Direct Preference Optimization (DPO) to move beyond pure imitation.
arXiv Detail & Related papers (2025-09-25T09:24:45Z)
- Drive As You Like: Strategy-Level Motion Planning Based on A Multi-Head Diffusion Model [7.3078271605135114]
We propose a diffusion-based multi-head trajectory planner (M-diffusion planner).
During the early training stage, all output heads share weights to learn to generate high-quality trajectories.
We incorporate a large language model (LLM) to guide strategy selection, enabling dynamic, instruction-aware planning.
arXiv Detail & Related papers (2025-08-23T08:33:11Z) - Predictive Planner for Autonomous Driving with Consistency Models [5.966385886363771]
Trajectory prediction and planning are essential for autonomous vehicles to navigate safely and efficiently in dynamic environments.
Recent diffusion-based generative models have shown promise in multi-agent trajectory generation, but their slow sampling is less suitable for high-frequency planning tasks.
We leverage the consistency model to build a predictive planner that samples from a joint distribution of ego and surrounding agents, conditioned on the ego vehicle's navigational goal.
arXiv Detail & Related papers (2025-02-12T00:26:01Z) - DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving [55.53171248839489]
We propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving.
Specifically, DiFSD mainly consists of sparse perception, hierarchical interaction and an iterative motion planner.
Experiments conducted on the nuScenes and Bench2Drive datasets demonstrate the superior planning performance and great efficiency of DiFSD.
arXiv Detail & Related papers (2024-09-15T15:55:24Z) - SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation [11.011219709863875]
We propose a new end-to-end autonomous driving paradigm named SparseDrive.
SparseDrive consists of a symmetric sparse perception module and a parallel motion planner.
For motion prediction and planning, we exploit the strong similarity between the two tasks, which motivates a parallel design for the motion planner.
arXiv Detail & Related papers (2024-05-30T02:13:56Z) - Planning as In-Painting: A Diffusion-Based Embodied Task Planning
Framework for Environments under Uncertainty [56.30846158280031]
Task planning for embodied AI has been one of the most challenging problems.
We propose a task-agnostic method named 'planning as in-painting'.
The proposed framework achieves promising performances in various embodied AI tasks.
arXiv Detail & Related papers (2023-12-02T10:07:17Z) - Integration of Reinforcement Learning Based Behavior Planning With
Sampling Based Motion Planning for Automated Driving [0.5801044612920815]
We propose a method to employ a trained deep reinforcement learning policy for dedicated high-level behavior planning.
To the best of our knowledge, this work is the first to apply deep reinforcement learning in this manner.
arXiv Detail & Related papers (2023-04-17T13:49:55Z)
- Deep Interactive Motion Prediction and Planning: Playing Games with Motion Prediction Models [162.21629604674388]
This work presents a game-theoretic Model Predictive Controller (MPC) that uses a novel interactive multi-agent neural network policy as part of its predictive model.
Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information.
arXiv Detail & Related papers (2022-04-05T17:58:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.