Planning-oriented Autonomous Driving
- URL: http://arxiv.org/abs/2212.10156v2
- Date: Thu, 23 Mar 2023 16:26:08 GMT
- Title: Planning-oriented Autonomous Driving
- Authors: Yihan Hu, Jiazhi Yang, Li Chen, Keyu Li, Chonghao Sima, Xizhou Zhu,
Siqi Chai, Senyao Du, Tianwei Lin, Wenhai Wang, Lewei Lu, Xiaosong Jia, Qiang
Liu, Jifeng Dai, Yu Qiao, Hongyang Li
- Abstract summary: We argue that a favorable framework should be devised and optimized in pursuit of the ultimate goal, i.e., planning of the self-driving car.
We introduce Unified Autonomous Driving (UniAD), a comprehensive framework that incorporates full-stack driving tasks in one network.
- Score: 60.93767791255728
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Modern autonomous driving systems are characterized by modular tasks in
sequential order, i.e., perception, prediction, and planning. In order to
perform a wide diversity of tasks and achieve advanced-level intelligence,
contemporary approaches either deploy standalone models for individual tasks,
or design a multi-task paradigm with separate heads. However, they might suffer
from accumulative errors or deficient task coordination. Instead, we argue that
a favorable framework should be devised and optimized in pursuit of the
ultimate goal, i.e., planning of the self-driving car. Oriented at this, we
revisit the key components within perception and prediction, and prioritize the
tasks such that all these tasks contribute to planning. We introduce Unified
Autonomous Driving (UniAD), an up-to-date comprehensive framework that
incorporates full-stack driving tasks in one network. It is carefully designed
to leverage the advantages of each module and to provide complementary feature
abstractions for agent interaction from a global perspective. Tasks
communicate through unified query interfaces to facilitate one another toward
planning. We instantiate UniAD on the challenging nuScenes benchmark. With
extensive ablations, the effectiveness of using such a philosophy is proven by
substantially outperforming previous state-of-the-art methods in all aspects. Code and
models are public.
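The query-based task communication described in the abstract can be sketched roughly as follows. This is a toy illustration, not UniAD's actual implementation: the dot-product attention, module ordering, and dimensions are all assumptions made for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

def attend(queries, context):
    """Toy dot-product attention: refine queries against a context set."""
    scores = queries @ context.T                              # (Nq, Nc)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)             # row-wise softmax
    return queries + weights @ context                        # residual update

# Shared BEV features stand in for the perception backbone output.
bev = rng.normal(size=(16, 64))                    # 16 BEV tokens, 64-d

track_q  = attend(rng.normal(size=(4, 64)), bev)       # detection/tracking queries read the scene
motion_q = attend(rng.normal(size=(4, 64)), track_q)   # prediction queries read tracked agents
plan_q   = attend(rng.normal(size=(1, 64)), motion_q)  # a single ego plan query reads the forecasts
```

The point of the sketch is that each downstream module consumes the upstream module's query outputs rather than its raw detections, so every task's representation is shaped by (and can be optimized for) the final planning query.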
Related papers
- SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation [11.011219709863875]
We propose a new end-to-end autonomous driving paradigm named SparseDrive.
SparseDrive consists of a symmetric sparse perception module and a parallel motion planner.
For motion prediction and planning, we note the strong similarity between these two tasks, which leads to a parallel design for the motion planner.
arXiv Detail & Related papers (2024-05-30T02:13:56Z)
- SparseAD: Sparse Query-Centric Paradigm for Efficient End-to-End Autonomous Driving [13.404790614427924]
We propose a Sparse query-centric paradigm for end-to-end Autonomous Driving.
We design a unified sparse architecture for perception tasks including detection, tracking, and online mapping.
On the challenging nuScenes dataset, SparseAD achieves SOTA full-task performance among end-to-end methods.
arXiv Detail & Related papers (2024-04-10T10:34:34Z)
- Beyond One Model Fits All: Ensemble Deep Learning for Autonomous Vehicles [16.398646583844286]
This study introduces three distinct neural network models corresponding to the Mediated Perception, Behavior Reflex, and Direct Perception approaches.
Our architecture fuses information from the base, future latent vector prediction, and auxiliary task networks, using global routing commands to select appropriate action sub-networks.
arXiv Detail & Related papers (2023-12-10T04:40:02Z)
- PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving [57.89801036693292]
PPAD (Iterative Interaction of Prediction and Planning Autonomous Driving) considers the timestep-wise interaction to better integrate prediction and planning.
We design ego-to-agent, ego-to-map, and ego-to-BEV interaction mechanisms with hierarchical dynamic key objects attention to better model the interactions.
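A rough sketch of what such interleaved ego-to-agent, ego-to-map, and ego-to-BEV interactions might look like. The toy attention, token counts, and loop structure are assumptions for illustration, not PPAD's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_attend(q, kv):
    """Toy cross-attention: queries q gather information from tokens kv."""
    s = q @ kv.T / np.sqrt(q.shape[-1])
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return q + w @ kv                     # residual refinement

ego    = rng.normal(size=(1, 32))         # ego plan query
agents = rng.normal(size=(5, 32))         # agent motion queries
bev    = rng.normal(size=(10, 32))        # map / BEV tokens

for t in range(3):                        # one prediction/planning round per future timestep
    agents = cross_attend(agents, np.vstack([ego, bev]))  # agents react to ego plan + map
    ego = cross_attend(ego, agents)       # ego-to-agent interaction
    ego = cross_attend(ego, bev)          # ego-to-map / ego-to-BEV interaction
```

The key idea being illustrated is timestep-wise alternation: instead of predicting all agent motion once and then planning, prediction and planning condition on each other at every step of the rollout.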
arXiv Detail & Related papers (2023-11-14T11:53:24Z)
- Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving [85.62076860189116]
Video Task Decathlon (VTD) includes ten representative image and video tasks spanning classification, segmentation, localization, and association of objects and pixels.
We develop our unified network, VTDNet, that uses a single structure and a single set of weights for all ten tasks.
arXiv Detail & Related papers (2023-09-08T16:33:27Z)
- Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving [100.3848723827869]
We present an effective multi-task framework, VE-Prompt, which introduces visual exemplars via task-specific prompting.
Specifically, we generate visual exemplars based on bounding boxes and color-based markers, which provide accurate visual appearances of target categories.
We bridge transformer-based encoders and convolutional layers for efficient and accurate unified perception in autonomous driving.
arXiv Detail & Related papers (2023-03-03T08:54:06Z)
- CERBERUS: Simple and Effective All-In-One Automotive Perception Model with Multi Task Learning [4.622165486890318]
In-vehicle embedded computing platforms cannot cope with the computational effort required to run a heavy model for each individual task.
We present CERBERUS, a lightweight model that leverages a multitask-learning approach to enable the execution of multiple perception tasks at the cost of a single inference.
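A minimal sketch of the multitask-learning idea: a shared trunk with lightweight per-task heads, so one forward pass serves every task. The task names, shapes, and plain linear layers are illustrative assumptions, not CERBERUS's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared backbone weights plus one lightweight linear head per task.
W_backbone = rng.normal(size=(128, 64)) * 0.1
heads = {
    "detection": rng.normal(size=(64, 10)) * 0.1,   # e.g. object class logits
    "lane":      rng.normal(size=(64, 4)) * 0.1,    # e.g. lane parameters
    "drivable":  rng.normal(size=(64, 2)) * 0.1,    # e.g. drivable-area logits
}

def infer(image_feat):
    """One trunk forward pass feeds all heads at once."""
    shared = np.maximum(image_feat @ W_backbone, 0.0)   # ReLU trunk features, computed once
    return {task: shared @ W for task, W in heads.items()}

outputs = infer(rng.normal(size=(1, 128)))
```

Because the expensive trunk runs once and each head is a small projection on top of it, the marginal cost of an additional task is tiny, which is the property that makes this attractive for in-vehicle embedded platforms.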
arXiv Detail & Related papers (2022-10-03T08:17:26Z)
- Autonomous Open-Ended Learning of Tasks with Non-Stationary Interdependencies [64.0476282000118]
Intrinsic motivations have been shown to generate a task-agnostic signal that properly allocates training time among goals.
While the majority of works in the field of intrinsically motivated open-ended learning focus on scenarios where goals are independent of each other, only a few have studied the autonomous acquisition of interdependent tasks.
In particular, we first deepen the analysis of a previous system, showing the importance of incorporating information about the relationships between tasks at a higher level of the architecture.
Then we introduce H-GRAIL, a new system that extends the previous one by adding a new learning layer to store the autonomously acquired sequences.
arXiv Detail & Related papers (2022-05-16T10:43:01Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.