VAD: Vectorized Scene Representation for Efficient Autonomous Driving
- URL: http://arxiv.org/abs/2303.12077v3
- Date: Thu, 24 Aug 2023 08:15:35 GMT
- Title: VAD: Vectorized Scene Representation for Efficient Autonomous Driving
- Authors: Bo Jiang, Shaoyu Chen, Qing Xu, Bencheng Liao, Jiajie Chen, Helong
Zhou, Qian Zhang, Wenyu Liu, Chang Huang, Xinggang Wang
- Abstract summary: VAD is an end-to-end vectorized paradigm for autonomous driving.
VAD exploits the vectorized agent motion and map elements as explicit instance-level planning constraints.
VAD runs much faster than previous end-to-end planning methods.
- Score: 44.070636456960045
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Autonomous driving requires a comprehensive understanding of the surrounding
environment for reliable trajectory planning. Previous works rely on dense
rasterized scene representation (e.g., agent occupancy and semantic map) to
perform planning, which is computationally intensive and misses the
instance-level structure information. In this paper, we propose VAD, an
end-to-end vectorized paradigm for autonomous driving, which models the driving
scene as a fully vectorized representation. The proposed vectorized paradigm
has two significant advantages. On one hand, VAD exploits the vectorized agent
motion and map elements as explicit instance-level planning constraints which
effectively improves planning safety. On the other hand, VAD runs much faster
than previous end-to-end planning methods by getting rid of
computation-intensive rasterized representation and hand-designed
post-processing steps. VAD achieves state-of-the-art end-to-end planning
performance on the nuScenes dataset, outperforming the previous best method by
a large margin. Our base model, VAD-Base, greatly reduces the average collision
rate by 29.0% and runs 2.5x faster. Besides, a lightweight variant, VAD-Tiny,
greatly improves the inference speed (up to 9.3x) while achieving comparable
planning performance. We believe the excellent performance and the high
efficiency of VAD are critical for the real-world deployment of an autonomous
driving system. Code and models are available at https://github.com/hustvl/VAD
for facilitating future research.
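The linked repository holds the authors' implementation; purely as a self-contained illustration of what "vectorized agent motion and map elements as explicit instance-level planning constraints" can look like, here is a minimal numpy sketch. The polylines, the distance thresholds, and both cost functions are our own assumptions for the example, not VAD's actual formulation.

```python
import numpy as np

# Vectorized scene: an agent's predicted motion is a polyline of future
# waypoints (T, 2); a map element is a polyline of points (P, 2).
ego_plan = np.array([[0.0, 0.0], [1.5, 0.1], [3.0, 0.3], [4.5, 0.6]])  # (T, 2)
agent_motion = {
    "agent_0": np.array([[5.0, 0.5], [4.0, 0.5], [3.2, 0.5], [2.5, 0.6]]),
}
lane_divider = np.array([[0.0, 1.8], [2.0, 1.8], [4.0, 1.8], [6.0, 1.8]])

def collision_cost(plan, agents, safe_dist=1.0):
    """Instance-level agent constraint: penalize ego waypoints closer than
    safe_dist to any agent's waypoint at the same future step."""
    cost = 0.0
    for traj in agents.values():
        steps = min(len(plan), len(traj))
        gaps = np.linalg.norm(plan[:steps] - traj[:steps], axis=-1)  # (steps,)
        cost += np.clip(safe_dist - gaps, 0.0, None).sum()
    return cost

def boundary_cost(plan, polyline, safe_dist=0.5):
    """Instance-level map constraint: penalize ego waypoints that drift
    within safe_dist of a map polyline."""
    d = np.linalg.norm(plan[:, None, :] - polyline[None, :, :], axis=-1)  # (T, P)
    return np.clip(safe_dist - d.min(axis=1), 0.0, None).sum()

total = collision_cost(ego_plan, agent_motion) + boundary_cost(ego_plan, lane_divider)
print(f"instance-level constraint cost: {total:.3f}")
```

In VAD such constraints supervise the planning head; here they are just evaluated once on toy data to show why no rasterized grid is needed.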
Related papers
- DiFSD: Ego-Centric Fully Sparse Paradigm with Uncertainty Denoising and Iterative Refinement for Efficient End-to-End Autonomous Driving [55.53171248839489]
We propose an ego-centric fully sparse paradigm, named DiFSD, for end-to-end self-driving.
Specifically, DiFSD consists mainly of sparse perception, hierarchical interaction, and an iterative motion planner.
Experiments on the nuScenes dataset demonstrate the superior planning performance and high efficiency of DiFSD.
arXiv Detail & Related papers (2024-09-15T15:55:24Z)
- OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at nearly 2x the FPS, while our heaviest model surpasses the previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
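As a rough, assumption-laden sketch of the query-set idea summarized above (not the OPUS authors' code), each learnable query can regress a 3D point and classify it, so occupancy comes out as a sparse point set rather than a dense voxel grid; the module name `SparseOccHead` and all sizes are invented for the example.

```python
import torch
import torch.nn as nn

class SparseOccHead(nn.Module):
    """Toy query-set head: each learnable query yields one (x, y, z) point
    plus class logits. Dimensions are illustrative, not OPUS's."""

    def __init__(self, num_queries=256, dim=128, num_classes=17):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)  # learnable query set
        self.point_head = nn.Linear(dim, 3)            # (x, y, z) per query
        self.class_head = nn.Linear(dim, num_classes)  # semantic class logits

    def forward(self, scene_feat):
        # scene_feat: (B, dim) pooled scene feature; a real model would use
        # cross-attention between queries and image/BEV features instead.
        q = self.queries.weight.unsqueeze(0) + scene_feat.unsqueeze(1)  # (B, Q, dim)
        return self.point_head(q), self.class_head(q)

head = SparseOccHead()
points, logits = head(torch.randn(2, 128))
print(points.shape, logits.shape)  # (2, 256, 3) and (2, 256, 17)
```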
- End-to-End Autonomous Driving without Costly Modularization and 3D Manual Annotation [34.070813293944944]
We propose UAD, a method for vision-based end-to-end autonomous driving (E2EAD).
Our motivation stems from the observation that current E2EAD models still mimic the modular architecture in typical driving stacks.
Our UAD achieves a 38.7% relative improvement over UniAD on the average collision rate in nuScenes and surpasses VAD by 41.32 points on the driving score in CARLA's Town05 Long benchmark.
arXiv Detail & Related papers (2024-06-25T16:12:52Z)
- SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation [11.011219709863875]
We propose a new end-to-end autonomous driving paradigm named SparseDrive.
SparseDrive consists of a symmetric sparse perception module and a parallel motion planner.
For motion prediction and planning, we revisit the great similarity between these two tasks, which leads to a parallel design for the motion planner.
arXiv Detail & Related papers (2024-05-30T02:13:56Z)
- VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning [42.681012361021224]
VADv2 is an end-to-end driving model based on probabilistic planning.
It runs stably in a fully end-to-end manner, even without the rule-based wrapper.
arXiv Detail & Related papers (2024-02-20T18:55:09Z)
- Trajectory Prediction with Observations of Variable-Length for Motion Planning in Highway Merging Scenarios [5.193470362635256]
Existing methods cannot initiate prediction for a vehicle unless it has been observed for a fixed duration of two or more seconds.
This paper proposes a novel transformer-based trajectory prediction approach, specifically trained to handle any observation length larger than one frame.
We perform a comprehensive evaluation of the proposed method using two large-scale highway trajectory datasets.
arXiv Detail & Related papers (2023-06-08T18:03:48Z)
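A minimal sketch of how a transformer can consume observation histories of any length of one frame or more: pad short histories and mask the padded steps. This padding-and-masking recipe is a standard technique we assume for illustration, not the paper's published model.

```python
import torch
import torch.nn as nn

dim = 64
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)

def encode_history(histories):
    """histories: list of (T_i, dim) tensors with T_i >= 1 observed frames."""
    t_max = max(h.size(0) for h in histories)
    batch = torch.zeros(len(histories), t_max, dim)
    pad_mask = torch.ones(len(histories), t_max, dtype=torch.bool)  # True = padded
    for i, h in enumerate(histories):
        batch[i, : h.size(0)] = h
        pad_mask[i, : h.size(0)] = False
    # Masked positions are ignored by attention, so a 1-frame history works.
    return encoder(batch, src_key_padding_mask=pad_mask)

out = encode_history([torch.randn(1, dim), torch.randn(8, dim)])  # 1 and 8 frames
print(out.shape)  # torch.Size([2, 8, 64])
```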
- GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting [121.42898228997538]
We propose an efficient shared encoding for all agents and the map without sacrificing accuracy or generalization.
We leverage pair-wise relative positional encodings to represent geometric relationships between the agents and the map elements in a heterogeneous spatial graph.
Our decoder is also viewpoint agnostic, predicting agent goals on the lane graph to enable diverse and context-aware multimodal prediction.
arXiv Detail & Related papers (2022-11-04T16:10:50Z)
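A small sketch of a pair-wise relative positional encoding and why it is viewpoint-invariant; the exact feature set (local offset plus heading difference) is our assumption, not necessarily GoRela's.

```python
import numpy as np

def relative_pos_encoding(pose_i, pose_j):
    """Express node j's position and heading in node i's local frame, so the
    features do not change when the whole scene is translated or rotated."""
    (xi, yi, thi), (xj, yj, thj) = pose_i, pose_j
    dx, dy = xj - xi, yj - yi
    c, s = np.cos(-thi), np.sin(-thi)
    local = (c * dx - s * dy, s * dx + c * dy)  # offset rotated into i's frame
    dth = thj - thi
    return np.array([*local, np.cos(dth), np.sin(dth)])

a = (1.0, 2.0, 0.3)  # agent pose (x, y, heading)
m = (4.0, 1.0, 1.1)  # map-element pose
enc = relative_pos_encoding(a, m)

def transform(pose, t, dx, dy):
    """Apply a global rotation by t and translation by (dx, dy)."""
    x, y, th = pose
    return (np.cos(t) * x - np.sin(t) * y + dx,
            np.sin(t) * x + np.cos(t) * y + dy, th + t)

# Move the whole scene: the pair-wise encoding is unchanged.
enc2 = relative_pos_encoding(transform(a, 0.7, 5.0, -2.0), transform(m, 0.7, 5.0, -2.0))
print(np.allclose(enc, enc2))  # True
```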
- The Importance of Prior Knowledge in Precise Multimodal Prediction [71.74884391209955]
Roads have well-defined geometries, topologies, and traffic rules.
In this paper we propose to incorporate structured priors as a loss function.
We demonstrate the effectiveness of our approach on real-world self-driving datasets.
arXiv Detail & Related papers (2020-06-04T03:56:11Z)
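To make "structured priors as a loss function" concrete, a hedged sketch: penalize predicted waypoints by their distance to the nearest lane-centerline point, encoding the prior that vehicles tend to follow lanes. The nearest-point penalty and the weight are illustrative assumptions, not the paper's loss.

```python
import torch

def prior_loss(pred_traj, lane_centerline, weight=0.1):
    """pred_traj: (T, 2) predicted waypoints; lane_centerline: (P, 2) points.
    Returns a penalty that grows as the trajectory strays from the lane."""
    d = torch.cdist(pred_traj, lane_centerline)  # (T, P) pairwise distances
    return weight * d.min(dim=1).values.mean()   # mean nearest-point distance

pred = torch.tensor([[0.0, 0.2], [1.0, 0.4], [2.0, 0.9]])
lane = torch.tensor([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
loss = prior_loss(pred, lane)  # added to the usual regression loss in training
print(loss)
```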
- VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation [74.56282712099274]
This paper introduces VectorNet, a hierarchical graph neural network that exploits the spatial locality of individual road components represented by vectors.
By operating on the vectorized high definition (HD) maps and agent trajectories, we avoid lossy rendering and computationally intensive ConvNet encoding steps.
We evaluate VectorNet on our in-house behavior prediction benchmark and the recently released Argoverse forecasting dataset.
arXiv Detail & Related papers (2020-05-08T19:07:03Z)
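A minimal sketch of the vectorization step VectorNet builds on: each map or trajectory polyline becomes a set of segment vectors tagged with an instance id, instead of being rasterized. The exact attribute layout varies by implementation and is a guess here.

```python
import numpy as np

def polyline_to_vectors(points, poly_id):
    """Turn one polyline (N, 2) into segment features
    [x_start, y_start, x_end, y_end, poly_id], one row per segment."""
    starts, ends = points[:-1], points[1:]
    ids = np.full((len(starts), 1), poly_id, dtype=float)
    return np.hstack([starts, ends, ids])  # (N - 1, 5)

lane = np.array([[0.0, 0.0], [2.0, 0.1], [4.0, 0.4], [6.0, 0.9]])
agent = np.array([[1.0, -2.0], [1.2, -1.0], [1.5, 0.0]])

vectors = np.vstack([polyline_to_vectors(lane, 0), polyline_to_vectors(agent, 1)])
print(vectors.shape)  # (5, 5): 3 lane segments + 2 agent segments
# A polyline subgraph would pool these per instance; a global graph then
# models interactions between the pooled polyline nodes.
```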