TrajDiff: End-to-end Autonomous Driving without Perception Annotation
- URL: http://arxiv.org/abs/2512.00723v1
- Date: Sun, 30 Nov 2025 04:34:20 GMT
- Title: TrajDiff: End-to-end Autonomous Driving without Perception Annotation
- Authors: Xingtai Gui, Jianbo Zhao, Wencheng Han, Jikai Wang, Jiahao Gong, Feiyang Tan, Cheng-zhong Xu, Jianbing Shen,
- Abstract summary: End-to-end autonomous driving systems directly generate driving policies from raw sensor inputs. TrajDiff is a Trajectory-oriented BEV Conditioned Diffusion framework that establishes a perception annotation-free generative method for end-to-end autonomous driving. Evaluated on the NAVSIM benchmark, TrajDiff achieves 87.5 PDMS, establishing state-of-the-art performance among all annotation-free methods.
- Score: 65.49718343700319
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: End-to-end autonomous driving systems directly generate driving policies from raw sensor inputs. While these systems can extract effective environmental features for planning by relying on auxiliary perception tasks, developing perception annotation-free planning paradigms has become increasingly critical due to the high cost of manual perception annotation. In this work, we propose TrajDiff, a Trajectory-oriented BEV Conditioned Diffusion framework that establishes a fully perception annotation-free generative method for end-to-end autonomous driving. TrajDiff requires only raw sensor inputs and future trajectories, constructing Gaussian BEV heatmap targets that inherently capture driving modalities. We design a simple yet effective trajectory-oriented BEV encoder to extract the TrajBEV feature without perceptual supervision. Furthermore, we introduce the Trajectory-oriented BEV Diffusion Transformer (TB-DiT), which leverages ego-state information and the predicted TrajBEV features to directly generate diverse yet plausible trajectories, eliminating the need for handcrafted motion priors. Beyond these architectural innovations, TrajDiff enables exploration of data-scaling benefits in the annotation-free setting. Evaluated on the NAVSIM benchmark, TrajDiff achieves 87.5 PDMS, establishing state-of-the-art performance among all annotation-free methods. With data scaling, it further improves to 88.5 PDMS, which is comparable to advanced perception-based approaches. Our code and model will be made publicly available.
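The core annotation-free idea, supervising a BEV representation with Gaussian heatmaps rendered from future waypoints, can be illustrated with a short sketch. The function below is an assumption for illustration, not the paper's implementation; the grid size, metric range, and `sigma` are hypothetical parameters:

```python
import numpy as np

def gaussian_bev_heatmap(waypoints, grid_size=128, bev_range=32.0, sigma=1.5):
    """Render future ego waypoints (metres, ego frame) as Gaussian blobs
    on a BEV grid, yielding a training target derived from trajectories
    alone -- no boxes, maps, or other perception labels."""
    heatmap = np.zeros((grid_size, grid_size), dtype=np.float32)
    ys, xs = np.mgrid[0:grid_size, 0:grid_size]
    scale = grid_size / (2.0 * bev_range)          # metres -> grid cells
    for x, y in waypoints:
        cx = (x + bev_range) * scale               # ego sits at the grid centre
        cy = (y + bev_range) * scale
        blob = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
        heatmap = np.maximum(heatmap, blob)        # overlapping blobs: keep the max
    return heatmap

hm = gaussian_bev_heatmap([(0.0, 0.0), (4.0, 0.5), (8.0, 1.2)])
print(hm.shape, round(float(hm.max()), 3))
```

A network trained to regress such heatmaps from raw sensor input would learn a trajectory-oriented BEV feature, which is the role the abstract assigns to the TrajBEV encoder.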
Related papers
- DiffRefiner: Coarse to Fine Trajectory Planning via Diffusion Refinement with Semantic Interaction for End to End Autonomous Driving [28.22372133560876]
We propose DiffRefiner, a novel two-stage trajectory prediction framework. The first stage uses a transformer-based Proposal Decoder to generate coarse trajectory predictions. The second stage applies a Diffusion Refiner that iteratively denoises and refines these initial predictions.
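The coarse-then-denoise scheme above can be mimicked with a toy numerical sketch. Everything here is a hypothetical stand-in: the straight-line `coarse_proposal` replaces the transformer Proposal Decoder, and the simple update rule replaces a learned noise-prediction network:

```python
import numpy as np

rng = np.random.default_rng(0)

def coarse_proposal(n_points=8):
    # Stand-in for the Proposal Decoder: a straight-line first guess.
    return np.stack([np.linspace(0.0, 8.0, n_points), np.zeros(n_points)], axis=1)

def diffusion_refine(traj, target, steps=10, noise=0.3):
    # Toy denoising loop: perturb the proposal, then iteratively remove
    # "noise" by stepping toward the target. A real Diffusion Refiner
    # predicts the noise with a network conditioned on scene features.
    x = traj + rng.normal(0.0, noise, traj.shape)
    for t in range(steps, 0, -1):
        x = x + (target - x) / t                  # hypothetical update rule
    return x

target = coarse_proposal() + np.array([0.0, 0.5])  # pretend ground truth
refined = diffusion_refine(coarse_proposal(), target)
print(float(np.abs(refined - target).max()))
```

At t = 1 the update steps all the way onto the target, so the residual is numerically zero; a trained refiner converges only approximately, guided by its learned denoising steps.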
arXiv Detail & Related papers (2025-11-21T11:16:00Z)
- FlowDrive: Energy Flow Field for End-to-End Autonomous Driving [50.89871153094958]
FlowDrive is a novel framework that introduces physically interpretable energy-based flow fields to encode semantic priors and safety cues into the BEV space. Experiments on the NAVSIM v2 benchmark demonstrate that FlowDrive achieves state-of-the-art performance with a score of 86.3, surpassing prior baselines in both safety and planning quality.
arXiv Detail & Related papers (2025-09-17T13:51:33Z)
- End-to-End Driving with Online Trajectory Evaluation via BEV World Model [52.10633338584164]
We propose an end-to-end driving framework, WoTE, which leverages a BEV World model to predict future BEV states for Trajectory Evaluation. We validate our framework on the NAVSIM benchmark and the closed-loop Bench2Drive benchmark based on the CARLA simulator, achieving state-of-the-art performance.
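Trajectory evaluation against a predicted future BEV state can be illustrated with a minimal occupancy-cost sketch. This is an assumed simplification, not WoTE's actual scoring: candidates are simply ranked by how much predicted occupancy they traverse:

```python
import numpy as np

def pick_trajectory(candidates, future_occupancy, bev_range=32.0):
    # Score each candidate by summed occupancy along its waypoints in the
    # predicted future BEV grid (1 = occupied); return the safest index.
    grid = future_occupancy.shape[0]
    scale = grid / (2.0 * bev_range)               # metres -> grid cells
    costs = []
    for traj in candidates:
        cells = ((traj + bev_range) * scale).astype(int).clip(0, grid - 1)
        costs.append(float(future_occupancy[cells[:, 1], cells[:, 0]].sum()))
    return int(np.argmin(costs))

occ = np.zeros((64, 64), dtype=np.float32)
occ[30:34, 40:48] = 1.0                            # predicted obstacle on the straight path
straight = np.stack([np.linspace(0.0, 10.0, 8), np.zeros(8)], axis=1)
swerve = np.stack([np.linspace(0.0, 10.0, 8), -np.linspace(0.0, 4.0, 8)], axis=1)
best = pick_trajectory([straight, swerve], occ)
print(best)
```

Here the swerving candidate (index 1) is selected because it avoids the predicted occupancy; a world-model-based evaluator would additionally roll the BEV state forward in time.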
arXiv Detail & Related papers (2025-04-02T17:47:23Z)
- Hierarchical End-to-End Autonomous Driving: Integrating BEV Perception with Deep Reinforcement Learning [23.21761407287525]
End-to-end autonomous driving offers a streamlined alternative to the traditional modular pipeline.
Deep Reinforcement Learning (DRL) has recently gained traction in this domain.
We bridge this gap by mapping the DRL feature extraction network directly to the perception phase.
arXiv Detail & Related papers (2024-09-26T09:14:16Z)
- DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving [55.53171248839489]
We propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving. Specifically, DiFSD mainly consists of sparse perception, hierarchical interaction and an iterative motion planner. Experiments conducted on the nuScenes and Bench2Drive datasets demonstrate the superior planning performance and great efficiency of DiFSD.
arXiv Detail & Related papers (2024-09-15T15:55:24Z)
- OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising [49.86409475232849]
Trajectory prediction is fundamental in computer vision and autonomous driving.
Existing approaches in this field often assume precise and complete observational data.
We present a novel method for out-of-sight trajectory prediction that leverages a vision-positioning technique.
arXiv Detail & Related papers (2024-04-02T18:30:29Z)
- BAT: Behavior-Aware Human-Like Trajectory Prediction for Autonomous Driving [24.123577277806135]
We pioneer a novel behavior-aware trajectory prediction model (BAT).
Our model consists of behavior-aware, interaction-aware, priority-aware, and position-aware modules.
We evaluate BAT's performance across the Next Generation Simulation (NGSIM), Highway Drone (HighD), Roundabout Drone (RounD), and Macao Connected Autonomous Driving (MoCAD) datasets.
arXiv Detail & Related papers (2023-12-11T13:27:51Z)
- Pre-training on Synthetic Driving Data for Trajectory Prediction [61.520225216107306]
We propose a pipeline-level solution to mitigate the issue of data scarcity in trajectory forecasting.
We adopt HD map augmentation and trajectory synthesis for generating driving data, and then we learn representations by pre-training on them.
We conduct extensive experiments to demonstrate the effectiveness of our data expansion and pre-training strategies.
arXiv Detail & Related papers (2023-09-18T19:49:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.