VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic
Planning
- URL: http://arxiv.org/abs/2402.13243v1
- Date: Tue, 20 Feb 2024 18:55:09 GMT
- Title: VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic
Planning
- Authors: Shaoyu Chen, Bo Jiang, Hao Gao, Bencheng Liao, Qing Xu, Qian Zhang,
Chang Huang, Wenyu Liu, Xinggang Wang
- Abstract summary: VADv2 is an end-to-end driving model based on probabilistic planning.
It runs stably in a fully end-to-end manner, even without the rule-based wrapper.
- Score: 42.681012361021224
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Learning a human-like driving policy from large-scale driving demonstrations
is promising, but the uncertainty and non-deterministic nature of planning make
it challenging. In this work, to cope with the uncertainty problem, we propose
VADv2, an end-to-end driving model based on probabilistic planning. VADv2 takes
multi-view image sequences as input in a streaming manner, transforms sensor
data into environmental token embeddings, outputs a probabilistic distribution
over actions, and samples one action to control the vehicle. Using only camera
sensors, VADv2 achieves state-of-the-art closed-loop performance on
the CARLA Town05 benchmark, significantly outperforming all existing methods.
It runs stably in a fully end-to-end manner, even without the rule-based
wrapper. Closed-loop demos are presented at https://hgao-cv.github.io/VADv2.
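
As the abstract describes, the planner maps environmental token embeddings to a distribution over a discretized set of candidate driving actions and samples from it at each step. The snippet below is a minimal sketch of such a probabilistic planning head in PyTorch; the vocabulary size, dimensions, and module structure are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a probabilistic planning head (assumed design, not the
# authors' code): score a fixed vocabulary of candidate trajectories against
# environmental token embeddings and sample one action from the distribution.
import torch
import torch.nn as nn

class ProbabilisticPlanner(nn.Module):
    def __init__(self, vocab_size=4096, dim=256, num_heads=8):
        super().__init__()
        # One learnable embedding per candidate trajectory in the action vocabulary.
        self.action_embed = nn.Embedding(vocab_size, dim)
        # Cross-attention: action queries attend to environmental tokens.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.score_head = nn.Linear(dim, 1)

    def forward(self, env_tokens):
        # env_tokens: (B, N, dim) token embeddings from the perception encoder.
        B = env_tokens.shape[0]
        queries = self.action_embed.weight.unsqueeze(0).expand(B, -1, -1)
        attended, _ = self.cross_attn(queries, env_tokens, env_tokens)
        logits = self.score_head(attended).squeeze(-1)   # (B, vocab_size)
        return torch.softmax(logits, dim=-1)             # distribution over actions

planner = ProbabilisticPlanner()
probs = planner(torch.randn(1, 300, 256))                # fake environmental tokens
action_idx = torch.multinomial(probs, num_samples=1)     # sample one action
```

At deployment, the sampled index (or the argmax) would select the corresponding candidate trajectory to hand to the vehicle controller.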
Related papers
- Imagine-2-Drive: High-Fidelity World Modeling in CARLA for Autonomous Vehicles [9.639797094021988]
We introduce Imagine-2-Drive, a framework consisting of two components: VISTAPlan and DPA.
DPA is a diffusion-based policy that models multi-modal behaviors for trajectory prediction.
We significantly outperform state-of-the-art (SOTA) world models on standard driving metrics, by 15% and 20% on Route Completion and Success Rate, respectively.
arXiv Detail & Related papers (2024-11-15T13:17:54Z)
- Conformal Trajectory Prediction with Multi-View Data Integration in Cooperative Driving [4.628774934971078]
Current research on trajectory prediction primarily relies on data collected by onboard sensors of an ego vehicle.
We introduce V2INet, a novel trajectory prediction framework designed to model multi-view data by extending existing single-view models.
Our results demonstrate superior performance in terms of Final Displacement Error (FDE) and Miss Rate (MR) using a single GPU.
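
For reference, Final Displacement Error and Miss Rate are standard trajectory-prediction metrics. A minimal sketch of how they are commonly computed (the 2 m miss threshold is an assumption; conventions differ across benchmarks):

```python
import numpy as np

def fde_and_miss_rate(pred, gt, miss_threshold=2.0):
    """pred, gt: (N, T, 2) arrays of N predicted / ground-truth trajectories.
    FDE is the mean endpoint error; Miss Rate is the fraction of trajectories
    whose endpoint error exceeds the threshold (2 m assumed here)."""
    endpoint_err = np.linalg.norm(pred[:, -1] - gt[:, -1], axis=-1)  # (N,)
    return endpoint_err.mean(), (endpoint_err > miss_threshold).mean()
```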
arXiv Detail & Related papers (2024-08-01T08:32:03Z)
- BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space [57.68134574076005]
We present BEVWorld, a novel approach that tokenizes multimodal sensor inputs into a unified and compact Bird's Eye View latent space for environment modeling.
Experiments demonstrate the effectiveness of BEVWorld in autonomous driving tasks, showcasing its capability in generating future scenes and benefiting downstream tasks such as perception and motion prediction.
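
The key idea is compressing heterogeneous sensor observations into a single compact BEV latent grid that a world model can roll forward. A toy encoder in that spirit, with assumed shapes and layers rather than BEVWorld's actual architecture, might look like:

```python
import torch
import torch.nn as nn

class ToyBEVTokenizer(nn.Module):
    """Compress a fused BEV feature map into a compact grid of latent tokens
    (illustrative only; not BEVWorld's tokenizer)."""
    def __init__(self, in_ch=64, latent_ch=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, latent_ch, 3, stride=2, padding=1),
        )

    def forward(self, bev_features):
        # bev_features: (B, in_ch, H, W), rasterized from camera/LiDAR branches.
        z = self.encoder(bev_features)           # (B, latent_ch, H/4, W/4)
        return z.flatten(2).transpose(1, 2)      # (B, H/4 * W/4, latent_ch) tokens

tokens = ToyBEVTokenizer()(torch.randn(1, 64, 128, 128))
```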
arXiv Detail & Related papers (2024-07-08T07:26:08Z)
- Planning with Adaptive World Models for Autonomous Driving [50.4439896514353]
Motion planners (MPs) are crucial for safe navigation in complex urban environments, but they have historically been evaluated in procedurally generated simulations that do not capture real-world multi-agent interactions.
nuPlan, a recently released MP benchmark, addresses this limitation by augmenting real-world driving logs with closed-loop simulation logic.
We present AdaptiveDriver, a model-predictive control (MPC) based planner that unrolls different world models conditioned on BehaviorNet's predictions.
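
The general MPC pattern behind such planners is to roll candidate ego action sequences through a learned world model and keep the lowest-cost one. Below is a schematic sketch; the world-model interface (step/cost) and the dummy dynamics are assumptions for illustration, not AdaptiveDriver's design.

```python
import numpy as np

def mpc_plan(world_model, state, candidates, horizon=8):
    """Roll each candidate action sequence through a world model and return
    the one with the lowest accumulated cost (generic MPC-style selection)."""
    best_cost, best_seq = float("inf"), None
    for seq in candidates:                       # seq: (horizon, action_dim)
        s, cost = state, 0.0
        for t in range(horizon):
            s = world_model.step(s, seq[t])      # predicted next scene state
            cost += world_model.cost(s)          # e.g. collision + progress terms
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq

class DummyWorldModel:
    """Placeholder dynamics: state is (x, y), action is (dx, dy)."""
    def step(self, state, action):
        return (state[0] + action[0], state[1] + action[1])
    def cost(self, state):
        return abs(state[1])                     # penalize lateral deviation

plan = mpc_plan(DummyWorldModel(), (0.0, 0.0),
                [np.zeros((8, 2)), np.ones((8, 2)) * 0.1])
```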
arXiv Detail & Related papers (2024-06-15T18:53:45Z)
- DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving [76.29141888408265]
We propose a large-scale dataset containing diverse accident scenarios that frequently occur in real-world driving.
The proposed DeepAccident dataset includes 57K annotated frames and 285K annotated samples, approximately 7 times more than the large-scale nuScenes dataset.
arXiv Detail & Related papers (2023-04-03T17:37:00Z)
- VAD: Vectorized Scene Representation for Efficient Autonomous Driving [44.070636456960045]
VAD is an end-to-end vectorized paradigm for autonomous driving.
VAD exploits the vectorized agent motion and map elements as explicit instance-level planning constraints.
VAD runs much faster than previous end-to-end planning methods.
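
Instance-level vectorized constraints of this kind can be pictured as margin penalties between planned ego waypoints and vectorized agent trajectories or map polylines. The function below is an illustrative sketch under that assumption, not VAD's actual loss terms.

```python
import numpy as np

def instance_level_constraints(ego_traj, agent_trajs, map_polylines,
                               agent_margin=1.5, boundary_margin=0.5):
    """ego_traj: (T, 2); agent_trajs: (A, T, 2); map_polylines: (M, P, 2).
    Penalize ego waypoints that come closer than a margin to predicted agent
    waypoints or to map boundary points (toy margin penalties)."""
    agent_dist = np.linalg.norm(ego_traj[None] - agent_trajs, axis=-1)     # (A, T)
    boundary_pts = map_polylines.reshape(-1, 2)
    boundary_dist = np.linalg.norm(
        ego_traj[:, None] - boundary_pts[None], axis=-1).min(axis=1)       # (T,)
    return {
        "agent_violation": float(np.clip(agent_margin - agent_dist, 0, None).sum()),
        "boundary_violation": float(np.clip(boundary_margin - boundary_dist, 0, None).sum()),
    }

costs = instance_level_constraints(np.zeros((6, 2)),
                                   np.ones((3, 6, 2)), np.full((2, 4, 2), 5.0))
```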
arXiv Detail & Related papers (2023-03-21T17:59:22Z)
- Generating Evidential BEV Maps in Continuous Driving Space [13.073542165482566]
We propose a complete probabilistic model named GevBEV.
It interprets the 2D driving space as a probabilistic Bird's Eye View (BEV) map with point-based spatial Gaussian distributions.
GevBEV helps reduce communication overhead by selecting only the most important information to share from the learned uncertainty.
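
The underlying idea is to treat each observed point as a 2D spatial Gaussian and accumulate these contributions into a continuous BEV occupancy belief. A toy version of that construction (grid size, resolution, and squashing are chosen for illustration, not GevBEV's formulation):

```python
import numpy as np

def gaussian_bev_map(points, grid_size=100, resolution=0.5, sigma=0.5):
    """Accumulate per-point 2D Gaussians into a BEV occupancy belief in (0, 1)."""
    half = grid_size * resolution / 2
    centers = np.arange(grid_size) * resolution - half + resolution / 2
    gx, gy = np.meshgrid(centers, centers, indexing="ij")    # cell centers (m)
    bev = np.zeros((grid_size, grid_size))
    for px, py in points:
        d2 = (gx - px) ** 2 + (gy - py) ** 2
        bev += np.exp(-d2 / (2 * sigma ** 2))
    return 1.0 - np.exp(-bev)                                # squash to (0, 1)

prob_map = gaussian_bev_map(np.array([[1.0, 2.0], [-3.0, 0.5]]))
```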
arXiv Detail & Related papers (2023-02-06T17:05:50Z)
- Policy Pre-training for End-to-end Autonomous Driving via Self-supervised Geometric Modeling [96.31941517446859]
We propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework for policy pretraining in visuomotor driving.
We aim at learning policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos.
In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input.
In the second stage, the visual encoder learns driving policy representation by predicting the future ego-motion and optimizing with the photometric error based on current visual observation only.
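
The second stage trains the visual encoder with a photometric reconstruction objective: an adjacent frame is warped into the current view using the predicted depth and ego-motion, and the reconstruction error supervises the representation. A compressed sketch of such a loss, assuming the warped frame is produced elsewhere and using a crude local-mean comparison as a stand-in for full SSIM:

```python
import torch
import torch.nn.functional as F

def photometric_loss(target, warped, alpha=0.85):
    """target, warped: (B, 3, H, W) images; `warped` is the source frame
    reprojected into the target view using predicted depth and ego-motion.
    Weighted mix of an L1 term and a crude structural (local-mean) term."""
    l1 = (target - warped).abs().mean()
    mu_t = F.avg_pool2d(target, 3, 1, 1)
    mu_w = F.avg_pool2d(warped, 3, 1, 1)
    struct = (mu_t - mu_w).abs().mean()
    return alpha * struct + (1 - alpha) * l1

loss = photometric_loss(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
```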
arXiv Detail & Related papers (2023-01-03T08:52:49Z)
- IntentNet: Learning to Predict Intention from Raw Sensor Data [86.74403297781039]
In this paper, we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor and dynamic maps of the environment.
Our multi-task model achieves better accuracy than the respective separate modules while saving computation, which is critical to reducing reaction time in self-driving applications.
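
A single-stage detector-forecaster of this kind can be pictured as a shared backbone feeding separate detection, intention, and trajectory heads. The toy module below illustrates that multi-task layout; channel sizes and head designs are assumptions, not IntentNet's architecture.

```python
import torch
import torch.nn as nn

class ToyDetectorForecaster(nn.Module):
    """Shared BEV backbone with detection, intention, and trajectory heads."""
    def __init__(self, in_ch=64, num_intentions=8, horizon=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
        )
        self.det_head = nn.Conv2d(128, 1 + 4, 1)                # objectness + box
        self.intent_head = nn.Conv2d(128, num_intentions, 1)    # intention logits
        self.traj_head = nn.Conv2d(128, horizon * 2, 1)         # future waypoints

    def forward(self, bev):
        feats = self.backbone(bev)
        return self.det_head(feats), self.intent_head(feats), self.traj_head(feats)

det, intent, traj = ToyDetectorForecaster()(torch.randn(1, 64, 100, 100))
```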
arXiv Detail & Related papers (2021-01-20T00:31:52Z)
- PillarFlow: End-to-end Birds-eye-view Flow Estimation for Autonomous Driving [42.8479177012748]
We propose an end-to-end deep learning framework for LIDAR-based flow estimation in bird's eye view (BeV).
Our method takes consecutive point cloud pairs as input and produces a 2-D BeV flow grid describing the dynamic state of each cell.
The experimental results show that the proposed method not only estimates 2-D BeV flow accurately but also improves tracking performance of both dynamic and static objects.
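
A per-cell BeV flow grid stores, for each cell, a 2-D displacement between consecutive sweeps. The toy function below warps an occupancy grid with such a flow field (nearest-neighbor, for illustration only), the kind of operation a downstream tracker can build on.

```python
import numpy as np

def warp_bev_with_flow(occupancy, flow):
    """occupancy: (H, W) grid; flow: (H, W, 2) per-cell displacement in cells.
    Move each occupied cell along its flow vector (nearest-neighbor warp)."""
    H, W = occupancy.shape
    warped = np.zeros_like(occupancy)
    for y, x in zip(*np.nonzero(occupancy)):
        ny = int(round(y + flow[y, x, 0]))
        nx = int(round(x + flow[y, x, 1]))
        if 0 <= ny < H and 0 <= nx < W:
            warped[ny, nx] = occupancy[y, x]
    return warped
```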
arXiv Detail & Related papers (2020-08-03T20:36:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.