DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Autonomous Driving
- URL: http://arxiv.org/abs/2409.09777v1
- Date: Sun, 15 Sep 2024 15:55:24 GMT
- Title: DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Autonomous Driving
- Authors: Haisheng Su, Wei Wu, Junchi Yan
- Abstract summary: We propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving.
Specifically, DiFSD mainly consists of a sparse perception module, a hierarchical interaction module, and an iterative motion planner.
Experiments conducted on the nuScenes dataset demonstrate the superior planning performance and efficiency of DiFSD.
- Score: 55.53171248839489
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current end-to-end autonomous driving methods resort to unifying modular designs for various tasks (e.g., perception, prediction, and planning). Although optimized in a planning-oriented spirit with a fully differentiable framework, existing end-to-end driving systems without ego-centric designs still suffer from unsatisfactory performance and inferior efficiency, owing to rasterized scene representation learning and redundant information transmission. In this paper, we revisit human driving behavior and propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving. Specifically, DiFSD mainly consists of a sparse perception module, a hierarchical interaction module, and an iterative motion planner. The sparse perception module performs detection, tracking, and online mapping based on a sparse representation of the driving scene. The hierarchical interaction module aims to select the Closest In-Path Vehicle / Stationary (CIPV / CIPS) from coarse to fine, benefiting from an additional geometric prior. As for the iterative motion planner, both the selected interactive agents and the ego-vehicle are considered for joint motion prediction, where the output multi-modal ego-trajectories are optimized in an iterative fashion. Besides, both position-level motion diffusion and trajectory-level planning denoising are introduced for uncertainty modeling, thus improving the training stability and convergence of the whole framework. Extensive experiments conducted on the nuScenes dataset demonstrate the superior planning performance and efficiency of DiFSD, which reduces the average L2 error by 66% and the collision rate by 77% relative to UniAD while running 8.2× faster.
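The Closest In-Path Vehicle / Stationary selection mentioned in the abstract can be illustrated with a purely geometric filter. The sketch below is not DiFSD's learned hierarchical interaction module; it only shows the underlying idea under stated assumptions: agents and the ego path are given in the ego frame, an agent counts as "in path" if it lies within a fixed corridor around the path, and a speed threshold separates moving from stationary agents. The names `Agent`, `project_onto_path`, `select_closest_in_path`, and the default values of `half_width` and `static_speed` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Agent:
    agent_id: int
    x: float      # longitudinal offset in the ego frame, metres
    y: float      # lateral offset in the ego frame, metres
    speed: float  # metres per second


def project_onto_path(
    px: float, py: float, path: Sequence[Tuple[float, float]]
) -> Tuple[float, float]:
    """Return (arc_length, lateral_distance) of the closest point on a polyline."""
    best_lateral, best_arc = math.inf, math.inf
    travelled = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip duplicated waypoints
        # Clamp the projection parameter so the closest point stays on the segment.
        t = ((px - x0) * dx + (py - y0) * dy) / (seg_len ** 2)
        t = max(0.0, min(1.0, t))
        lateral = math.hypot(px - (x0 + t * dx), py - (y0 + t * dy))
        if lateral < best_lateral:
            best_lateral, best_arc = lateral, travelled + t * seg_len
        travelled += seg_len
    return best_arc, best_lateral


def select_closest_in_path(
    agents: Sequence[Agent],
    path: Sequence[Tuple[float, float]],
    half_width: float = 1.5,    # assumed corridor half-width, metres
    static_speed: float = 0.5,  # assumed speed threshold for "stationary"
) -> Tuple[Optional[Agent], Optional[Agent]]:
    """Return (closest moving agent, closest stationary agent) along the path."""
    cipv: Optional[Tuple[float, Agent]] = None
    cips: Optional[Tuple[float, Agent]] = None
    for agent in agents:
        arc, lateral = project_onto_path(agent.x, agent.y, path)
        if lateral > half_width or arc <= 0.0:
            continue  # outside the corridor, or not ahead of the ego vehicle
        if agent.speed > static_speed:
            if cipv is None or arc < cipv[0]:
                cipv = (arc, agent)
        elif cips is None or arc < cips[0]:
            cips = (arc, agent)
    return (cipv[1] if cipv else None, cips[1] if cips else None)


# Hypothetical scene: a path that goes straight and then bends left.
ego_path = [(0.0, 0.0), (10.0, 0.0), (20.0, 5.0)]
scene = [
    Agent(1, 8.0, 0.5, 5.0),   # moving, in path
    Agent(2, 5.0, 4.0, 3.0),   # moving, outside the corridor
    Agent(3, 15.0, 2.4, 0.0),  # stationary, in path on the bend
    Agent(4, 4.0, -0.2, 6.0),  # moving, in path, nearest
]
moving, stationary = select_closest_in_path(scene, ego_path)
print(moving, stationary)
```

Measuring distance along the path rather than straight-line range keeps the selection sensible on curved routes, where the nearest agent by Euclidean distance may sit in an adjacent lane.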
Related papers
- End-to-end Driving in High-Interaction Traffic Scenarios with Reinforcement Learning [24.578178308010912]
We propose an end-to-end model-based RL algorithm named Ramble to address these issues.
By learning a dynamics model of the environment, Ramble can foresee upcoming traffic events and make more informed, strategic decisions.
Ramble achieves state-of-the-art performance regarding route completion rate and driving score on the CARLA Leaderboard 2.0, showcasing its effectiveness in managing complex and dynamic traffic situations.
arXiv Detail & Related papers (2024-10-03T06:45:59Z)
- Enhancing End-to-End Autonomous Driving with Latent World Model [78.22157677787239]
We propose a novel self-supervised method to enhance end-to-end driving without the need for costly labels.
Our framework LAW uses a LAtent World model to predict future latent features based on the predicted ego actions and the latent feature of the current frame.
As a result, our approach achieves state-of-the-art performance in both open-loop and closed-loop benchmarks without costly annotations.
arXiv Detail & Related papers (2024-06-12T17:59:21Z)
- SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation [11.011219709863875]
We propose a new end-to-end autonomous driving paradigm named SparseDrive.
SparseDrive consists of a symmetric sparse perception module and a parallel motion planner.
For motion prediction and planning, we review the great similarity between these two tasks, leading to a parallel design for the motion planner.
arXiv Detail & Related papers (2024-05-30T02:13:56Z)
- Pioneering SE(2)-Equivariant Trajectory Planning for Automated Driving [45.18582668677648]
Planning the trajectory of the controlled ego vehicle is a key challenge in automated driving.
We propose a lightweight equivariant planning model that generates multi-modal joint predictions for all vehicles.
We also propose equivariant route attraction to guide the ego vehicle along a high-level route provided by an off-the-shelf GPS navigation system.
arXiv Detail & Related papers (2024-03-17T18:53:46Z)
- Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory forecasting of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
- VAD: Vectorized Scene Representation for Efficient Autonomous Driving [44.070636456960045]
VAD is an end-to-end vectorized paradigm for autonomous driving.
VAD exploits the vectorized agent motion and map elements as explicit instance-level planning constraints.
VAD runs much faster than previous end-to-end planning methods.
arXiv Detail & Related papers (2023-03-21T17:59:22Z)
- NMR: Neural Manifold Representation for Autonomous Driving [2.2596039727344452]
We propose a representation for autonomous driving that learns to infer semantics and predict way-points on a manifold over a finite horizon.
We do this using an iterative attention mechanism applied on a latent high dimensional embedding of surround monocular images and partial ego-vehicle state.
We propose a sampling algorithm based on edge-adaptive coverage loss of BEV occupancy grid to generate the surface manifold.
arXiv Detail & Related papers (2022-05-11T14:58:08Z)
- Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [59.60483620730437]
We propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention.
Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
arXiv Detail & Related papers (2021-04-19T11:48:13Z)
- PnPNet: End-to-End Perception and Prediction with Tracking in the Loop [82.97006521937101]
We tackle the problem of joint perception and motion forecasting in the context of self-driving vehicles.
We propose PnPNet, an end-to-end model that takes sensor data as input and outputs, at each time step, object tracks and their future trajectories.
arXiv Detail & Related papers (2020-05-29T17:57:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.