Related papers: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving

DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving

URL: http://arxiv.org/abs/2512.07745v1
Date: Mon, 08 Dec 2025 17:29:52 GMT
Title: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving
Authors: Jialv Zou, Shaoyu Chen, Bencheng Liao, Zhiyu Zheng, Yuehao Song, Lefei Zhang, Qian Zhang, Wenyu Liu, Xinggang Wang,
Abstract summary: Generative diffusion models for end-to-end autonomous driving often suffer from mode collapse.<n>We propose DiffusionDriveV2, which leverages reinforcement learning to constrain low-quality modes and explore for superior trajectories.<n>This significantly enhances the overall output quality while preserving the inherent multimodality of its core Gaussian Mixture Model.
Score: 65.7087560656003
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative diffusion models for end-to-end autonomous driving often suffer from mode collapse, tending to generate conservative and homogeneous behaviors. While DiffusionDrive employs predefined anchors representing different driving intentions to partition the action space and generate diverse trajectories, its reliance on imitation learning lacks sufficient constraints, resulting in a dilemma between diversity and consistent high quality. In this work, we propose DiffusionDriveV2, which leverages reinforcement learning to both constrain low-quality modes and explore for superior trajectories. This significantly enhances the overall output quality while preserving the inherent multimodality of its core Gaussian Mixture Model. First, we use scale-adaptive multiplicative noise, ideal for trajectory planning, to promote broad exploration. Second, we employ intra-anchor GRPO to manage advantage estimation among samples generated from a single anchor, and inter-anchor truncated GRPO to incorporate a global perspective across different anchors, preventing improper advantage comparisons between distinct intentions (e.g., turning vs. going straight), which can lead to further mode collapse. DiffusionDriveV2 achieves 91.2 PDMS on the NAVSIM v1 dataset and 85.5 EPDMS on the NAVSIM v2 dataset in closed-loop evaluation with an aligned ResNet-34 backbone, setting a new record. Further experiments validate that our approach resolves the dilemma between diversity and consistent high quality for truncated diffusion models, achieving the best trade-off. Code and model will be available at https://github.com/hustvl/DiffusionDriveV2

Related papers

MeanFuser: Fast One-Step Multi-Modal Trajectory Generation and Adaptive Reconstruction via MeanFlow for End-to-End Autonomous Driving [23.013043338076745]
MeanFuser is an end-to-end autonomous driving method.<n>We introduce GMN to guide generative sampling and adapt MeanFlow Identity" to end-to-end planning.<n>Experiments on the NAVSIM closed-loop benchmark demonstrate that MeanFuser achieves outstanding performance without the supervision of the PDM Score.
arXiv Detail & Related papers (2026-02-23T17:17:26Z)
WAM-Diff: A Masked Diffusion VLA Framework with MoE and Online Reinforcement Learning for Autonomous Driving [9.719456684859606]
WAM-Diff is a framework that employs masked diffusion to refine a discrete sequence representing future ego-trajectories.<n>Our model achieves 91.0 PDMS on NAVSIM-v1 and 89.7S on NAVSIM-v2, demonstrating the effectiveness of masked diffusion for autonomous driving.
arXiv Detail & Related papers (2025-12-06T10:51:53Z)
DM$^3$T: Harmonizing Modalities via Diffusion for Multi-Object Tracking [10.270441242480482]
This paper proposes DM$3$T, a novel framework that reformulates multimodal fusion as an iterative feature alignment process.<n>Our approach performs iterative cross-modal harmonization through a proposed Cross-Modal Diffusion Fusion (C-MDF) module.<n>To further improve tracking robustness, we design a Hierarchical Tracker that adaptively handles confidence estimation.
arXiv Detail & Related papers (2025-11-28T06:02:58Z)
Which Layer Causes Distribution Deviation? Entropy-Guided Adaptive Pruning for Diffusion and Flow Models [77.55829017952728]
EntPruner is an entropy-guided automatic progressive pruning framework for diffusion and flow models.<n>Experiments on DiT and SiT models demonstrate the effectiveness of EntPruner, achieving up to 2.22$times$ inference speedup.
arXiv Detail & Related papers (2025-11-26T07:20:48Z)
Dual-Stream Diffusion for World-Model Augmented Vision-Language-Action Model [62.889356203346985]
We propose DUal-STream diffusion (DUST), a world-model augmented VLA framework that handles the modality conflict.<n>DUST achieves up to 6% gains over a standard VLA baseline and implicit world-modeling methods.<n>On real-world tasks with the Franka Research 3, DUST outperforms baselines in success rate by 13%.
arXiv Detail & Related papers (2025-10-31T16:32:12Z)
FindRec: Stein-Guided Entropic Flow for Multi-Modal Sequential Recommendation [57.577843653775]
We propose textbfFindRec (textbfFlexible unified textbfinformation textbfdisentanglement for multi-modal sequential textbfRecommendation)<n>A Stein kernel-based Integrated Information Coordination Module (IICM) theoretically guarantees distribution consistency between multimodal features and ID streams.<n>A cross-modal expert routing mechanism that adaptively filters and combines multimodal features based on their contextual relevance.
arXiv Detail & Related papers (2025-07-07T04:09:45Z)
DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving [38.867860153968394]
Diffusion model has emerged as a powerful generative technique for robotic policy learning.<n>We propose a novel truncated diffusion policy that incorporates prior multi-mode anchors and truncates the diffusion schedule.<n>The proposed model, DiffusionDrive, demonstrates 10$times$ reduction in denoising steps compared to vanilla diffusion policy.
arXiv Detail & Related papers (2024-11-22T18:59:47Z)
Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114]
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion. In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
arXiv Detail & Related papers (2023-07-17T07:12:29Z)
Haar Wavelet based Block Autoregressive Flows for Trajectories [129.37479472754083]
Prediction of trajectories such as that of pedestrians is crucial to the performance of autonomous agents. We introduce a novel Haar wavelet based block autoregressive model leveraging split couplings. We illustrate the advantages of our approach for generating diverse and accurate trajectories on two real-world datasets.
arXiv Detail & Related papers (2020-09-21T13:57:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.