Related papers: DiffRefiner: Coarse to Fine Trajectory Planning via Diffusion Refinement with Semantic Interaction for End to End Autonomous Driving

URL: http://arxiv.org/abs/2511.17150v1
Date: Fri, 21 Nov 2025 11:16:00 GMT
Title: DiffRefiner: Coarse to Fine Trajectory Planning via Diffusion Refinement with Semantic Interaction for End to End Autonomous Driving
Authors: Liuhan Yin, Runkun Ju, Guodong Guo, Erkang Cheng
Abstract summary: We propose DiffRefiner, a novel two-stage trajectory prediction framework. The first stage uses a transformer-based Proposal Decoder to generate coarse trajectory predictions. The second stage applies a Diffusion Refiner that iteratively denoises and refines these initial predictions.
Score: 28.22372133560876
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Unlike discriminative approaches in autonomous driving that predict a fixed set of candidate trajectories of the ego vehicle, generative methods, such as diffusion models, learn the underlying distribution of future motion, enabling more flexible trajectory prediction. However, since these methods typically rely on denoising human-crafted trajectory anchors or random noise, there remains significant room for improvement. In this paper, we propose DiffRefiner, a novel two-stage trajectory prediction framework. The first stage uses a transformer-based Proposal Decoder to generate coarse trajectory predictions by regressing from sensor inputs using predefined trajectory anchors. The second stage applies a Diffusion Refiner that iteratively denoises and refines these initial predictions. In this way, we enhance the performance of diffusion-based planning by incorporating a discriminative trajectory proposal module, which provides strong guidance for the generative refinement process. Furthermore, we design a fine-grained denoising decoder to enhance scene compliance, enabling more accurate trajectory prediction through enhanced alignment with the surrounding environment. Experimental results demonstrate that DiffRefiner achieves state-of-the-art performance, attaining 87.4 EPDMS on NAVSIM v2, and 87.1 DS along with 71.4 SR on Bench2Drive, thereby setting new records on both public benchmarks. The effectiveness of each component is validated via ablation studies as well.
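
The two-stage flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names `propose` and `refine`, the noise schedule, the blending rule, and the stand-in denoiser are all assumptions made for illustration, and a real system would use learned networks conditioned on sensor features.

```python
import numpy as np

def propose(anchors, offsets):
    # Stage 1 stand-in (hypothetical): coarse proposals are predefined
    # anchors plus regressed offsets.
    return anchors + offsets

def refine(coarse, predict_clean, num_steps=4, noise_scale=0.5, seed=0):
    # Stage 2 stand-in (hypothetical): start from noised coarse proposals
    # and denoise them iteratively.
    rng = np.random.default_rng(seed)
    traj = coarse + noise_scale * rng.standard_normal(coarse.shape)
    for step in range(num_steps, 0, -1):
        clean_estimate = predict_clean(traj, step)
        # Move 1/step of the remaining distance toward the estimate;
        # the final step (step == 1) lands on the estimate itself.
        traj = traj + (clean_estimate - traj) / step
    return traj

# Toy data: K=3 anchors, T=4 waypoints, (x, y) coordinates.
anchors = np.zeros((3, 4, 2))
anchors[:, :, 0] = np.arange(1, 5)                       # straight ahead along x
anchors[:, :, 1] = np.array([-1.0, 0.0, 1.0])[:, None]   # three lateral positions
offsets = np.full_like(anchors, 0.1)                     # pretend regression output
coarse = propose(anchors, offsets)

# Assumed stand-in for a learned denoiser: it always predicts the coarse
# path with its lateral (y) component halved.
target = coarse * np.array([1.0, 0.5])
refined = refine(coarse, lambda traj, step: target)
```

In this sketch the refiner starts near the coarse proposals rather than from pure noise, which is one reading of how a proposal stage could guide the generative stage. The abstract does not specify the actual noising or sampling procedure.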

Related papers

TrajDiff: End-to-end Autonomous Driving without Perception Annotation [65.49718343700319]
End-to-end autonomous driving systems directly generate driving policies from raw sensor inputs. TrajDiff is a Trajectory-oriented BEV Conditioned Diffusion framework that establishes a perception annotation-free generative method for end-to-end autonomous driving. Evaluated on the NAVSIM benchmark, TrajDiff achieves 87.5 PDMS, establishing state-of-the-art performance among all annotation-free methods.
arXiv Detail & Related papers (2025-11-30T04:34:20Z)
ResAD: Normalized Residual Trajectory Modeling for End-to-End Autonomous Driving [64.42138266293202]
ResAD is a Normalized Residual Trajectory Modeling framework. It reframes the learning task to predict the residual deviation from an inertial reference. On the NAVSIM benchmark, ResAD achieves a state-of-the-art PDMS of 88.6 using a vanilla diffusion policy.
arXiv Detail & Related papers (2025-10-09T17:59:36Z)
Diffusion^2: Dual Diffusion Model with Uncertainty-Aware Adaptive Noise for Momentary Trajectory Prediction [18.85021503551474]
Earlier studies primarily utilized sufficient observational data to predict future trajectories. In real-world scenarios, such as pedestrians suddenly emerging from blind spots, sufficient observational data is often unavailable. We propose a novel framework termed Diffusion^2, tailored for momentary trajectory prediction.
arXiv Detail & Related papers (2025-10-05T21:19:33Z)
Adaptive Conformal Prediction Intervals Over Trajectory Ensembles [50.31074512684758]
Future trajectories play an important role across domains such as autonomous driving, hurricane forecasting, and epidemic modeling. We propose a unified framework based on conformal prediction that transforms sampled trajectories into calibrated prediction intervals with theoretical coverage guarantees.
arXiv Detail & Related papers (2025-08-18T21:14:07Z)
Intention-Aware Diffusion Model for Pedestrian Trajectory Prediction [15.151965172049271]
We propose a diffusion-based pedestrian trajectory prediction framework that incorporates both short-term and long-term motion intentions. The proposed framework is evaluated on the widely used ETH, UCY, and SDD benchmarks, demonstrating competitive results against state-of-the-art methods.
arXiv Detail & Related papers (2025-08-10T02:36:33Z)
Foresight in Motion: Reinforcing Trajectory Prediction with Reward Heuristics [34.570579623171476]
"First Reasoning, Then Forecasting" is a strategy that explicitly incorporates behavior intentions as spatial guidance for trajectory prediction. We introduce an interpretable, reward-driven intention reasoner grounded in a novel query-centric Inverse Reinforcement Learning scheme. Our approach significantly enhances trajectory prediction confidence, achieving highly competitive performance relative to state-of-the-art methods.
arXiv Detail & Related papers (2025-07-16T09:46:17Z)
Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation [49.49868273653921]
Diffusion models are promising for joint trajectory prediction and controllable generation in autonomous driving. We introduce Optimal Gaussian Diffusion (OGD) and Estimated Clean Manifold (ECM) Guidance. Our methodology streamlines the generative process, enabling practical applications with reduced computational overhead.
arXiv Detail & Related papers (2024-08-01T17:59:59Z)
GDTS: Goal-Guided Diffusion Model with Tree Sampling for Multi-Modal Pedestrian Trajectory Prediction [15.731398013255179]
We propose a novel Goal-Guided Diffusion Model with Tree Sampling for multi-modal trajectory prediction. A two-stage tree sampling algorithm is presented, which leverages common features to reduce the inference time and improve accuracy for multi-modal prediction. Experimental results demonstrate that our proposed framework achieves comparable state-of-the-art performance with real-time inference speed in public datasets.
arXiv Detail & Related papers (2023-11-25T03:55:06Z)
Pre-training on Synthetic Driving Data for Trajectory Prediction [61.520225216107306]
We propose a pipeline-level solution to mitigate the issue of data scarcity in trajectory forecasting. We adopt HD map augmentation and trajectory synthesis for generating driving data, and then we learn representations by pre-training on them. We conduct extensive experiments to demonstrate the effectiveness of our data expansion and pre-training strategies.
arXiv Detail & Related papers (2023-09-18T19:49:22Z)
Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion [88.45326906116165]
We present a new framework to formulate the trajectory prediction task as a reverse process of motion indeterminacy diffusion (MID). We encode the history behavior information and the social interactions as a state embedding and devise a Transformer-based diffusion model to capture the temporal dependencies of trajectories. Experiments on the human trajectory prediction benchmarks including the Stanford Drone and ETH/UCY datasets demonstrate the superiority of our method.
arXiv Detail & Related papers (2022-03-25T16:59:08Z)
Trajectory Forecasting from Detection with Uncertainty-Aware Motion Encoding [121.66374635092097]
Trajectories obtained from object detection and tracking are inevitably noisy. We propose a trajectory predictor directly based on detection results without relying on explicitly formed trajectories.
arXiv Detail & Related papers (2022-02-03T09:09:56Z)
BiTraP: Bi-directional Pedestrian Trajectory Prediction with Multi-modal Goal Estimation [28.10445924083422]
BiTraP is a goal-conditioned bi-directional multi-modal trajectory prediction method based on the CVAE. BiTraP generalizes to both first-person view (FPV) and bird's-eye view (BEV) scenarios and outperforms state-of-the-art results by 10-50%.
arXiv Detail & Related papers (2020-07-29T02:40:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.