DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving
- URL: http://arxiv.org/abs/2409.09777v3
- Date: Tue, 26 Nov 2024 04:25:13 GMT
- Title: DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving
- Authors: Haisheng Su, Wei Wu, Junchi Yan
- Abstract summary: We propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving.
Specifically, DiFSD mainly consists of sparse perception, hierarchical interaction and iterative motion planner.
Experiments conducted on nuScenes and Bench2Drive datasets demonstrate the superior planning performance and great efficiency of DiFSD.
- Score: 55.53171248839489
- Abstract: Current end-to-end autonomous driving methods resort to unifying modular designs for various tasks (e.g., perception, prediction and planning). Although optimized in a planning-oriented spirit with a fully differentiable framework, existing end-to-end driving systems without ego-centric designs still suffer from unsatisfactory performance and inferior efficiency, owing to rasterized scene-representation learning and redundant information transmission. In this paper, we revisit human driving behavior and propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving. Specifically, DiFSD mainly consists of sparse perception, hierarchical interaction and an iterative motion planner. The sparse perception module performs detection, tracking and online mapping based on a sparse representation of the driving scene. The hierarchical interaction module aims to select the Closest In-Path Vehicle / Stationary (CIPV / CIPS) from coarse to fine, benefiting from an additional geometric prior. As for the iterative motion planner, both the selected interactive agents and the ego vehicle are considered for joint motion prediction, where the output multi-modal ego trajectories are optimized in an iterative fashion. In addition, both position-level motion diffusion and trajectory-level planning denoising are introduced for uncertainty modeling, facilitating the training stability and convergence of the whole framework. Extensive experiments conducted on the nuScenes and Bench2Drive datasets demonstrate the superior planning performance and efficiency of DiFSD.
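Taken at face value, the abstract describes a three-stage data flow: sparse perception, CIPV/CIPS selection, and iterative ego-trajectory refinement. Below is a minimal PyTorch sketch of that flow only; every module name, dimension and the hard top-1 CIPV selection are illustrative assumptions, and the paper's motion-diffusion and planning-denoising terms are omitted entirely.

```python
import torch
import torch.nn as nn

class DiFSDSketch(nn.Module):
    """Hypothetical three-stage pipeline mirroring the abstract:
    sparse perception -> hierarchical (CIPV/CIPS) interaction -> iterative planner."""

    def __init__(self, d=256, num_modes=6, horizon=12, iters=3):
        super().__init__()
        self.perception = nn.Linear(d, d)    # stand-in for sparse detection/tracking/mapping heads
        self.score_cipv = nn.Linear(d, 1)    # coarse-to-fine interactive-agent scoring
        self.planner = nn.GRUCell(d, d)      # iterative refinement of the ego query
        self.to_traj = nn.Linear(d, num_modes * horizon * 2)
        self.num_modes, self.horizon, self.iters = num_modes, horizon, iters

    def forward(self, agent_queries, ego_query):
        # 1) sparse perception: refine per-instance agent queries (no BEV rasterization)
        agents = self.perception(agent_queries)         # (N, d)
        # 2) hierarchical interaction: keep only the closest in-path vehicle/stationary
        scores = self.score_cipv(agents).squeeze(-1)    # (N,)
        cipv = agents[scores.argmax()]                  # hard top-1 selection for the sketch
        # 3) iterative motion planner: refine the ego query conditioned on the CIPV
        h = ego_query
        for _ in range(self.iters):
            h = self.planner(cipv.unsqueeze(0), h.unsqueeze(0)).squeeze(0)
        return self.to_traj(h).view(self.num_modes, self.horizon, 2)  # multi-modal ego trajectories

model = DiFSDSketch()
trajs = model(torch.randn(32, 256), torch.randn(256))
print(trajs.shape)  # torch.Size([6, 12, 2])
```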
Related papers
- Enhancing End-to-End Autonomous Driving with Latent World Model [78.22157677787239]
We propose a novel self-supervised method to enhance end-to-end driving without the need for costly labels.
Our framework, LAW, uses a LAtent World model to predict future latent features based on the predicted ego actions and the latent feature of the current frame.
As a result, our approach achieves state-of-the-art performance in both open-loop and closed-loop benchmarks without costly annotations.
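The mechanism in this summary reduces to a simple self-supervised objective: regress the next frame's latent from the current latent and the ego action, with no manual labels. A sketch under assumed shapes (all names and sizes hypothetical):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d_latent, d_action = 256, 2  # assumed dimensions

# Stand-in for the latent world model: next latent from (current latent, ego action).
world_model = nn.Sequential(
    nn.Linear(d_latent + d_action, 512), nn.ReLU(), nn.Linear(512, d_latent)
)

def law_step(z_t, a_t, z_next):
    """Self-supervised world-model loss: the observed next latent is the target,
    so no costly annotations are needed."""
    z_pred = world_model(torch.cat([z_t, a_t], dim=-1))
    return F.mse_loss(z_pred, z_next.detach())

loss = law_step(torch.randn(8, d_latent), torch.randn(8, d_action), torch.randn(8, d_latent))
```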
arXiv Detail & Related papers (2024-06-12T17:59:21Z)
- SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation [11.011219709863875]
We propose a new end-to-end autonomous driving paradigm named SparseDrive.
SparseDrive consists of a symmetric sparse perception module and a parallel motion planner.
For motion prediction and planning, we observe the great similarity between these two tasks, which motivates a parallel design for the motion planner.
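One way to read the "parallel design" is that agent prediction and ego planning share one trajectory head and run in a single batched call rather than as sequential stages. The sketch below makes that concrete under assumed dimensions; the plain linear head is a placeholder, not SparseDrive's actual module.

```python
import torch
import torch.nn as nn

d, T = 256, 12
shared_motion_head = nn.Linear(d, T * 2)  # one head design reused for agents and ego

agent_q = torch.randn(16, d)  # agent queries from the symmetric sparse perception module
ego_q = torch.randn(1, d)     # ego query

# Exploit the similarity of prediction and planning: one batched pass for both.
all_trajs = shared_motion_head(torch.cat([agent_q, ego_q], dim=0)).view(17, T, 2)
agent_trajs, ego_traj = all_trajs[:16], all_trajs[16]
```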
arXiv Detail & Related papers (2024-05-30T02:13:56Z)
- PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving [57.89801036693292]
PPAD (Iterative Interaction of Prediction and Planning for Autonomous Driving) considers timestep-wise interaction to better integrate prediction and planning.
We design ego-to-agent, ego-to-map, and ego-to-BEV interaction mechanisms with hierarchical dynamic key objects attention to better model the interactions.
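The timestep-wise interaction suggests interleaving one prediction step and one planning step per future timestep, so each plan update attends to freshly predicted agent states. A schematic sketch (all modules are stand-ins, and only the ego-to-agent branch of the three interaction mechanisms is shown):

```python
import torch
import torch.nn as nn

d, T = 256, 6
ego_to_agent = nn.MultiheadAttention(d, num_heads=8, batch_first=True)
predict_step = nn.Linear(d, d)  # stand-in for one-step agent motion prediction
plan_step = nn.Linear(d, d)     # stand-in for one-step ego planning

ego = torch.randn(1, 1, d)      # ego query
agents = torch.randn(1, 16, d)  # agent queries

# Interleave prediction and planning per timestep instead of running them once each.
for t in range(T):
    agents = predict_step(agents)               # roll agent states forward one step
    ctx, _ = ego_to_agent(ego, agents, agents)  # ego-to-agent attention at step t
    ego = plan_step(ego + ctx)                  # refine the ego plan for step t
```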
arXiv Detail & Related papers (2023-11-14T11:53:24Z)
- End-to-end Autonomous Driving: Challenges and Frontiers [45.391430626264764]
We provide a comprehensive analysis of more than 270 papers, covering the motivation, roadmap, methodology, challenges, and future trends in end-to-end autonomous driving.
We delve into several critical challenges, including multi-modality, interpretability, causal confusion, robustness, and world models, amongst others.
We discuss current advancements in foundation models and visual pre-training, as well as how to incorporate these techniques within the end-to-end driving framework.
arXiv Detail & Related papers (2023-06-29T14:17:24Z)
- VAD: Vectorized Scene Representation for Efficient Autonomous Driving [44.070636456960045]
VAD is an end-to-end vectorized paradigm for autonomous driving.
VAD exploits the vectorized agent motion and map elements as explicit instance-level planning constraints.
VAD runs much faster than previous end-to-end planning methods.
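An instance-level planning constraint over vectorized outputs could look like the following hinge penalty on ego-to-agent waypoint distances. The margin value and function name are illustrative, and VAD's real constraints also cover map elements and driving direction, not just agents.

```python
import torch

def instance_collision_penalty(ego_traj, agent_trajs, margin=1.5):
    """Hedged sketch of a vectorized, instance-level constraint: penalize ego
    waypoints closer than `margin` metres to any predicted agent waypoint."""
    # ego_traj: (T, 2); agent_trajs: (N, T, 2)
    dists = torch.norm(agent_trajs - ego_traj.unsqueeze(0), dim=-1)  # (N, T)
    return torch.clamp(margin - dists, min=0).sum()

pen = instance_collision_penalty(torch.randn(12, 2), torch.randn(8, 12, 2))
```

Operating on a handful of vectorized instances, rather than a dense rasterized cost map, is what lets this style of constraint stay cheap at runtime.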
arXiv Detail & Related papers (2023-03-21T17:59:22Z)
- Planning-oriented Autonomous Driving [60.93767791255728]
We argue that a favorable framework should be devised and optimized in pursuit of the ultimate goal, i.e., planning of the self-driving car.
We introduce Unified Autonomous Driving (UniAD), a comprehensive framework that incorporates full-stack driving tasks in one network.
arXiv Detail & Related papers (2022-12-20T10:47:53Z)
- NMR: Neural Manifold Representation for Autonomous Driving [2.2596039727344452]
We propose a representation for autonomous driving that learns to infer semantics and predict waypoints on a manifold over a finite horizon.
We do this using an iterative attention mechanism applied to a latent high-dimensional embedding of surround monocular images and partial ego-vehicle state.
We propose a sampling algorithm based on an edge-adaptive coverage loss over the BEV occupancy grid to generate the surface manifold.
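How "edge-adaptive coverage" sampling works is not specified in this summary; one hedged guess is to sample grid locations in proportion to occupancy-gradient magnitude, placing more manifold vertices along object and road boundaries and fewer on flat regions. The sketch below implements that guess, nothing more.

```python
import torch
import torch.nn.functional as F

def edge_adaptive_sample(occ, k=512):
    """Hypothetical edge-adaptive sampler over a BEV occupancy grid.
    occ: (H, W) occupancy probabilities; returns (k, 2) sampled pixel coords."""
    gx = occ[:, 1:] - occ[:, :-1]  # horizontal gradient, (H, W-1)
    gy = occ[1:, :] - occ[:-1, :]  # vertical gradient,   (H-1, W)
    edge = F.pad(gx.abs(), (0, 1)) + F.pad(gy.abs(), (0, 0, 0, 1))  # (H, W)
    probs = edge.flatten() + 1e-6  # higher edge magnitude -> higher sampling odds
    idx = torch.multinomial(probs / probs.sum(), k, replacement=True)
    return torch.stack([idx // occ.shape[1], idx % occ.shape[1]], dim=-1)

pts = edge_adaptive_sample(torch.rand(128, 128))
```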
arXiv Detail & Related papers (2022-05-11T14:58:08Z)
- End-to-End Interactive Prediction and Planning with Optical Flow Distillation for Autonomous Driving [16.340715765227475]
In this paper, we propose an end-to-end interactive neural motion planner (INMP) for autonomous driving.
Our INMP first generates a feature map in bird's-eye-view space, which is then processed to detect other agents and perform interactive prediction and planning jointly.
Also, we adopt an optical flow distillation paradigm, which can effectively improve the network performance while still maintaining its real-time inference speed.
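Optical flow distillation as described, where a flow-aware teacher supervises a flow-free student, keeps the deployed network fast because flow is never computed at inference. It can be sketched as a feature-mimicking loss; the networks and shapes below are placeholders, not INMP's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical nets: the teacher sees image + optical-flow features,
# the student sees image features only.
teacher = nn.Linear(512 + 128, 256)  # flow-augmented network (training only)
student = nn.Linear(512, 256)        # deployed network (real-time, no flow)

img_feat, flow_feat = torch.randn(4, 512), torch.randn(4, 128)
with torch.no_grad():
    t_out = teacher(torch.cat([img_feat, flow_feat], dim=-1))
s_out = student(img_feat)
distill_loss = F.mse_loss(s_out, t_out)  # feature-mimicking distillation term
```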
arXiv Detail & Related papers (2021-04-18T14:05:18Z)
- IntentNet: Learning to Predict Intention from Raw Sensor Data [86.74403297781039]
In this paper, we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor as well as dynamic maps of the environment.
Our multi-task model achieves better accuracy than the respective separate modules while saving computation, which is critical to reducing reaction time in self-driving applications.
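The shared-computation claim amounts to one backbone pass feeding several task heads, so detection and forecasting add little marginal latency. A toy multi-task sketch (the backbone, head sizes and loss weights are all assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d = 256
backbone = nn.Linear(64, d)       # stand-in for the LiDAR + dynamic-map feature extractor
det_head = nn.Linear(d, 7)        # box parameters per candidate (illustrative)
intent_head = nn.Linear(d, 8)     # discrete intention classes (e.g. keep lane, turn left)
traj_head = nn.Linear(d, 10 * 2)  # future waypoints

feats = backbone(torch.randn(32, 64))  # one shared pass per candidate object
det, intent, traj = det_head(feats), intent_head(feats), traj_head(feats)

# Joint multi-task objective over all heads (random targets stand in for labels).
loss = (F.smooth_l1_loss(det, torch.randn_like(det))
        + F.cross_entropy(intent, torch.randint(0, 8, (32,)))
        + F.smooth_l1_loss(traj, torch.randn_like(traj)))
```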
arXiv Detail & Related papers (2021-01-20T00:31:52Z)
- Perceive, Predict, and Plan: Safe Motion Planning Through Interpretable Semantic Representations [81.05412704590707]
We propose a novel end-to-end learnable network that performs joint perception, prediction and motion planning for self-driving vehicles.
Our network is learned end-to-end from human demonstrations.
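Learning a planner end-to-end from human demonstrations is often cast as a max-margin cost objective: the demonstrated trajectory should score lower than any sampled alternative. The sketch below is one such generic recipe under assumed shapes, not necessarily this paper's exact formulation.

```python
import torch
import torch.nn as nn

d = 256
cost_net = nn.Linear(d + 12 * 2, 1)  # hypothetical: scores a trajectory given scene features

def imitation_margin_loss(scene_feat, candidates, human_traj, margin=1.0):
    """Max-margin imitation: push the human trajectory's cost below each
    sampled candidate's cost by at least `margin`."""
    def cost(traj):  # traj: (T, 2) -> scalar cost
        return cost_net(torch.cat([scene_feat, traj.flatten()]))
    c_human = cost(human_traj)
    c_cands = torch.stack([cost(t) for t in candidates])  # (K, 1)
    return torch.clamp(c_human - c_cands + margin, min=0).mean()

loss = imitation_margin_loss(torch.randn(d), torch.randn(20, 12, 2), torch.randn(12, 2))
```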
arXiv Detail & Related papers (2020-08-13T14:40:46Z)