Planning-oriented Autonomous Driving
- URL: http://arxiv.org/abs/2212.10156v2
- Date: Thu, 23 Mar 2023 16:26:08 GMT
- Title: Planning-oriented Autonomous Driving
- Authors: Yihan Hu, Jiazhi Yang, Li Chen, Keyu Li, Chonghao Sima, Xizhou Zhu,
Siqi Chai, Senyao Du, Tianwei Lin, Wenhai Wang, Lewei Lu, Xiaosong Jia, Qiang
Liu, Jifeng Dai, Yu Qiao, Hongyang Li
- Abstract summary: We argue that a favorable framework should be devised and optimized in pursuit of the ultimate goal, i.e., planning of the self-driving car.
We introduce Unified Autonomous Driving (UniAD), a comprehensive framework that incorporates full-stack driving tasks in one network.
- Score: 60.93767791255728
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Modern autonomous driving systems are characterized by modular
tasks performed in sequential order, i.e., perception, prediction, and
planning. To perform a wide diversity of tasks and achieve advanced-level
intelligence, contemporary approaches either deploy standalone models for
individual tasks or design a multi-task paradigm with separate heads. However,
these approaches may suffer from accumulated errors or deficient task
coordination. Instead, we argue that a favorable framework should be devised
and optimized in pursuit of the ultimate goal, i.e., planning of the
self-driving car. Oriented toward this goal, we revisit the key components
within perception and prediction, and prioritize the tasks such that all of
them contribute to planning. We introduce Unified Autonomous Driving (UniAD),
an up-to-date comprehensive framework that incorporates full-stack driving
tasks in one network. It is carefully devised to leverage the advantages of
each module and to provide complementary feature abstractions for agent
interaction from a global perspective. Tasks communicate through unified query
interfaces to facilitate each other toward planning. We instantiate UniAD on
the challenging nuScenes benchmark. With extensive ablations, the
effectiveness of this philosophy is proven by substantially outperforming
previous state-of-the-art methods in all aspects. Code and models are public.
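As a rough illustration of the query-interface idea described above (the module names, dimensions, and attention wiring here are our own assumptions, not UniAD's actual code), each task head can refine a set of learnable queries against the queries emitted by the upstream task, so perception, prediction, and planning stay connected through a single representation:

```python
# Hypothetical sketch of query-based task communication: each module
# cross-attends its own learnable queries to the queries produced by
# the upstream task and emits refined queries for the next one.
import torch
import torch.nn as nn

class QueryModule(nn.Module):
    """One task head that refines its queries via cross-attention."""
    def __init__(self, dim: int, num_queries: int):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        # context: (batch, n_ctx, dim) queries from the upstream task
        q = self.queries.unsqueeze(0).expand(context.size(0), -1, -1)
        out, _ = self.attn(q, context, context)
        return out  # (batch, num_queries, dim), passed downstream

dim = 64
track, motion, plan = (QueryModule(dim, n) for n in (32, 32, 1))
bev = torch.randn(2, 100, dim)     # stand-in for BEV scene features
plan_q = plan(motion(track(bev)))  # perception -> prediction -> planning
print(tuple(plan_q.shape))         # (2, 1, 64)
```

The point of the sketch is only the wiring: every stage consumes and produces queries of the same shape, so downstream tasks can attend to whichever upstream abstractions help planning.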
Related papers
- DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Self-Driving [55.53171248839489]
We propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving.
Specifically, DiFSD mainly consists of sparse perception, hierarchical interaction and iterative motion planner.
Experiments conducted on nuScenes and Bench2Drive datasets demonstrate the superior planning performance and great efficiency of DiFSD.
arXiv Detail & Related papers (2024-09-15T15:55:24Z)
- DeepInteraction++: Multi-Modality Interaction for Autonomous Driving [80.8837864849534]
We introduce a novel modality interaction strategy that allows individual per-modality representations to be learned and maintained throughout.
DeepInteraction++ is a multi-modal interaction framework characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder.
Experiments demonstrate the superior performance of the proposed framework on both 3D object detection and end-to-end autonomous driving tasks.
arXiv Detail & Related papers (2024-08-09T14:04:21Z)
- SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation [11.011219709863875]
We propose a new end-to-end autonomous driving paradigm named SparseDrive.
SparseDrive consists of a symmetric sparse perception module and a parallel motion planner.
For motion prediction and planning, we note the great similarity between these two tasks, which leads to a parallel design for the motion planner.
arXiv Detail & Related papers (2024-05-30T02:13:56Z)
- SparseAD: Sparse Query-Centric Paradigm for Efficient End-to-End Autonomous Driving [13.404790614427924]
We propose a Sparse query-centric paradigm for end-to-end Autonomous Driving.
We design a unified sparse architecture for perception tasks including detection, tracking, and online mapping.
On the challenging nuScenes dataset, SparseAD achieves SOTA full-task performance among end-to-end methods.
arXiv Detail & Related papers (2024-04-10T10:34:34Z)
- Beyond One Model Fits All: Ensemble Deep Learning for Autonomous Vehicles [16.398646583844286]
This study introduces three distinct neural network models corresponding to Mediated Perception, Behavior Reflex, and Direct Perception approaches.
Our architecture fuses information from the base, future latent vector prediction, and auxiliary task networks, using global routing commands to select appropriate action sub-networks.
arXiv Detail & Related papers (2023-12-10T04:40:02Z)
- Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving [85.62076860189116]
Video Task Decathlon (VTD) includes ten representative image and video tasks spanning classification, segmentation, localization, and association of objects and pixels.
We develop our unified network, VTDNet, that uses a single structure and a single set of weights for all ten tasks.
arXiv Detail & Related papers (2023-09-08T16:33:27Z)
- Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving [100.3848723827869]
We present an effective multi-task framework, VE-Prompt, which introduces visual exemplars via task-specific prompting.
Specifically, we generate visual exemplars based on bounding boxes and color-based markers, which provide accurate visual appearances of target categories.
We bridge transformer-based encoders and convolutional layers for efficient and accurate unified perception in autonomous driving.
arXiv Detail & Related papers (2023-03-03T08:54:06Z)
- CERBERUS: Simple and Effective All-In-One Automotive Perception Model with Multi Task Learning [4.622165486890318]
In-vehicle embedded computing platforms cannot cope with the computational effort required to run a heavy model for each individual task.
We present CERBERUS, a lightweight model that leverages a multitask-learning approach to enable the execution of multiple perception tasks at the cost of a single inference.
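The single-inference multi-task idea can be sketched as follows (the layer sizes and head names are illustrative assumptions, not CERBERUS's actual architecture): one shared backbone pass feeds several lightweight task heads, so all tasks are served by a single forward computation:

```python
# Illustrative multi-task model with a shared backbone: the expensive
# feature extraction runs once, and each cheap head reuses the result.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, num_classes: int = 10, num_lanes: int = 4):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.detect_head = nn.Linear(32, num_classes)  # e.g. object classes
        self.lane_head = nn.Linear(32, num_lanes)      # e.g. lane attributes

    def forward(self, x):
        feats = self.backbone(x)  # computed once per image
        return self.detect_head(feats), self.lane_head(feats)

model = MultiTaskNet()
det, lane = model(torch.randn(1, 3, 64, 64))
print(tuple(det.shape), tuple(lane.shape))  # (1, 10) (1, 4)
```

On an embedded platform the saving comes from amortizing the backbone: adding a head costs one small linear layer rather than another full model.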
arXiv Detail & Related papers (2022-10-03T08:17:26Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
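A toy rendering of the goal-aware idea above (a deliberate simplification with a hypothetical binary relevance mask, not the paper's actual method): instead of reconstructing the full next state, the dynamics model is trained only on the state dimensions that matter for the current task:

```python
# Goal-aware dynamics training, simplified: the prediction error on
# task-irrelevant state dimensions is masked out of the loss, so the
# model spends its capacity on what the downstream planner needs.
import torch
import torch.nn as nn

state_dim, action_dim = 8, 2
model = nn.Sequential(nn.Linear(state_dim + action_dim, 32), nn.ReLU(),
                      nn.Linear(32, state_dim))

# Hypothetical mask marking which dimensions are relevant to the goal.
relevance = torch.tensor([1., 1., 0., 0., 1., 0., 0., 0.])

def goal_aware_loss(state, action, next_state):
    pred = model(torch.cat([state, action], dim=-1))
    err = (pred - next_state) ** 2
    return (err * relevance).sum(-1).mean()  # irrelevant dims contribute 0

s = torch.randn(4, state_dim)
a = torch.randn(4, action_dim)
s2 = torch.randn(4, state_dim)
loss = goal_aware_loss(s, a, s2)
print(float(loss))
```

In the paper this relevance signal comes from conditioning on the goal rather than a fixed mask; the sketch only shows how a task-aware objective differs from a task-agnostic reconstruction loss.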
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.