Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving
- URL: http://arxiv.org/abs/2405.17426v1
- Date: Mon, 27 May 2024 17:59:39 GMT
- Title: Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving
- Authors: Shaoyuan Xie, Lingdong Kong, Wenwei Zhang, Jiawei Ren, Liang Pan, Kai Chen, Ziwei Liu
- Abstract summary: We present RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms.
We assess 33 state-of-the-art BEV-based perception models spanning tasks like detection, map segmentation, depth estimation, and occupancy prediction.
Our experimental results also underline the efficacy of strategies like pre-training and depth-free BEV transformations in enhancing robustness against out-of-distribution data.
- Score: 55.93813178692077
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recent advancements in bird's eye view (BEV) representations have shown remarkable promise for in-vehicle 3D perception. However, while these methods have achieved impressive results on standard benchmarks, their robustness in varied conditions remains insufficiently assessed. In this study, we present RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms. This suite incorporates a diverse set of camera corruption types, each examined over three severity levels. Our benchmarks also consider the impact of complete sensor failures that occur when using multi-modal models. Through RoboBEV, we assess 33 state-of-the-art BEV-based perception models spanning tasks like detection, map segmentation, depth estimation, and occupancy prediction. Our analyses reveal a noticeable correlation between the model's performance on in-distribution datasets and its resilience to out-of-distribution challenges. Our experimental results also underline the efficacy of strategies like pre-training and depth-free BEV transformations in enhancing robustness against out-of-distribution data. Furthermore, we observe that leveraging extensive temporal information significantly improves the model's robustness. Based on our observations, we design an effective robustness enhancement strategy based on the CLIP model. The insights from this study pave the way for the development of future BEV models that seamlessly combine accuracy with real-world robustness.
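To make the corruption protocol concrete, below is a minimal sketch of severity-scaled camera corruptions in the spirit of RoboBEV's Bright, Dark, and Color Quant types. The benchmark's exact parameterization is not given in the abstract, so the scaling values here are purely illustrative.
```python
# Minimal sketch of severity-scaled camera corruptions (illustrative values;
# RoboBEV's exact parameterization is not specified in the abstract).
import numpy as np

def corrupt(image: np.ndarray, kind: str, severity: int) -> np.ndarray:
    """Apply a camera corruption at severity 1-3 to an HxWx3 uint8 image."""
    assert severity in (1, 2, 3)
    img = image.astype(np.float32) / 255.0
    if kind == "bright":
        img = np.clip(img + 0.1 * severity, 0.0, 1.0)          # global brightening
    elif kind == "dark":
        img = np.clip(img * (1.0 - 0.2 * severity), 0.0, 1.0)  # global dimming
    elif kind == "color_quant":
        levels = {1: 32, 2: 16, 3: 8}[severity]                # coarser color levels
        img = np.round(img * (levels - 1)) / (levels - 1)
    else:
        raise ValueError(f"unknown corruption: {kind}")
    return (img * 255.0).astype(np.uint8)

frame = np.random.randint(0, 256, (256, 448, 3), dtype=np.uint8)
for s in (1, 2, 3):
    print("dark, severity", s, "-> mean intensity:", corrupt(frame, "dark", s).mean())
```
Robustness is then measured by re-running a trained model on the corrupted copies of the validation set and comparing against its clean performance.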
Related papers
- Uncertainty Quantification for Bird's Eye View Semantic Segmentation: Methods and Benchmarks [10.193504550494486]
This paper introduces a benchmark for predictive uncertainty quantification in BEV segmentation.
It focuses on the effectiveness of predicted uncertainty in identifying misclassified and out-of-distribution pixels, as well as calibration.
We propose the Uncertainty-Focal-Cross-Entropy loss, designed for highly imbalanced data, which consistently improves the segmentation quality and calibration.
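The summary does not spell out the loss, but the name suggests a focal-style reweighting of cross-entropy. Below is a hedged sketch of the standard focal cross-entropy baseline that the proposed loss presumably extends with an uncertainty term:
```python
# Hedged sketch: the paper's Uncertainty-Focal-Cross-Entropy loss is not
# specified in the summary; this is the standard focal cross-entropy it
# presumably builds on, which down-weights easy pixels in imbalanced data.
import torch
import torch.nn.functional as F

def focal_cross_entropy(logits: torch.Tensor, target: torch.Tensor,
                        gamma: float = 2.0) -> torch.Tensor:
    """logits: (N, C, H, W) per-pixel class scores; target: (N, H, W) labels."""
    ce = F.cross_entropy(logits, target, reduction="none")  # (N, H, W)
    p_correct = torch.exp(-ce)                              # prob of the true class
    return ((1.0 - p_correct) ** gamma * ce).mean()         # focus on hard pixels

logits = torch.randn(2, 4, 8, 8, requires_grad=True)
target = torch.randint(0, 4, (2, 8, 8))
loss = focal_cross_entropy(logits, target)
loss.backward()
print(float(loss))
```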
arXiv Detail & Related papers (2024-05-31T16:32:46Z)
- RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
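The rendering step itself is not described in this summary; as a rough sketch, the cross-modality distillation idea can be reduced to matching a camera student's per-voxel occupancy distribution against a LiDAR teacher's, e.g. with a temperature-scaled KL term. All shapes and names below are illustrative:
```python
# Generic cross-modality distillation sketch (RadOcc's rendering-assisted
# step is not detailed in the summary; 'teacher' stands in for the LiDAR branch).
import torch
import torch.nn.functional as F

def distill_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
                 temperature: float = 2.0) -> torch.Tensor:
    """Soft-label KL between per-voxel class distributions, shape (N, C, X, Y, Z)."""
    t = temperature
    s = F.log_softmax(student_logits / t, dim=1)
    p = F.softmax(teacher_logits / t, dim=1)
    return F.kl_div(s, p, reduction="batchmean") * (t * t)

student = torch.randn(1, 17, 16, 16, 4, requires_grad=True)  # camera branch
teacher = torch.randn(1, 17, 16, 16, 4)                      # LiDAR branch (frozen)
print(float(distill_loss(student, teacher)))
```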
arXiv Detail & Related papers (2023-12-19T03:39:56Z)
- Diffusion-Based Particle-DETR for BEV Perception [94.88305708174796]
Bird's-Eye-View (BEV) is one of the most widely used scene representations for visual perception in Autonomous Vehicles (AVs).
Recent diffusion-based methods offer a promising approach to uncertainty modeling for visual perception but struggle to detect small objects across the large spatial coverage of the BEV.
Here, we address this problem by combining the diffusion paradigm with current state-of-the-art 3D object detectors in BEV.
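The combination can be pictured in DiffusionDet style: ground-truth BEV boxes are noised during training, and a detection head learns to denoise random boxes at inference. A hedged sketch of the forward noising step, with an illustrative schedule:
```python
# Sketch of the diffusion-over-proposals idea (DiffusionDet-style): BEV box
# parameters are noised at training time and iteratively denoised at inference.
# The schedule and shapes here are illustrative, not the paper's settings.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

def q_sample(boxes0: torch.Tensor, t: int) -> torch.Tensor:
    """Forward-noise clean BEV boxes (N, 4: x, y, w, l), normalized to [-1, 1]."""
    eps = torch.randn_like(boxes0)
    return alpha_bar[t].sqrt() * boxes0 + (1.0 - alpha_bar[t]).sqrt() * eps

clean = torch.rand(8, 4) * 2 - 1   # toy normalized ground-truth boxes
noisy = q_sample(clean, t=500)     # the detection head learns to recover `clean`
print(noisy.shape)
```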
arXiv Detail & Related papers (2023-12-18T09:52:14Z)
- Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning [93.71280187657831]
The camera-based bird's-eye-view (BEV) perception paradigm has made significant progress in the autonomous driving field.
We propose IA-BEV, which integrates image-plane instance awareness into the depth estimation process within a BEV-based detector.
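The summary gives no equations, so the following is only a plausible illustration of "instance awareness in depth estimation": sharpening the per-pixel depth-bin distribution on pixels covered by an instance mask. The mechanism and parameters are hypothetical:
```python
# Illustrative sketch of injecting instance awareness into per-pixel depth
# bins (IA-BEV's actual mechanism is not spelled out in the summary).
import torch

def instance_weighted_depth(depth_logits: torch.Tensor,
                            instance_mask: torch.Tensor,
                            boost: float = 2.0) -> torch.Tensor:
    """depth_logits: (N, D, H, W) depth-bin scores; instance_mask: (N, 1, H, W) in {0,1}.
    Sharpen the depth distribution on foreground-instance pixels."""
    scale = 1.0 + (boost - 1.0) * instance_mask   # lower temperature on instances
    return torch.softmax(depth_logits * scale, dim=1)

logits = torch.randn(1, 64, 32, 88)
mask = (torch.rand(1, 1, 32, 88) > 0.8).float()
print(instance_weighted_depth(logits, mask).sum(dim=1).mean())  # ~1.0 per pixel
```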
arXiv Detail & Related papers (2023-12-13T09:24:42Z)
- Exploring the Physical World Adversarial Robustness of Vehicle Detection [13.588120545886229]
Adversarial attacks can compromise the robustness of real-world detection models.
We propose an innovative instant-level data generation pipeline using the CARLA simulator.
Our findings highlight diverse model performances under adversarial conditions.
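The pipeline details are not in this summary, but the data-capture core of any CARLA-based setup looks roughly like the sketch below, which assumes a CARLA server is already running on localhost:2000:
```python
# Minimal CARLA data-capture sketch (assumes a CARLA server on localhost:2000;
# the paper's actual instant-level pipeline is not described in the summary).
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()
blueprints = world.get_blueprint_library()

# Spawn a vehicle at a predefined spawn point.
vehicle_bp = blueprints.filter("vehicle.tesla.model3")[0]
spawn = world.get_map().get_spawn_points()[0]
vehicle = world.spawn_actor(vehicle_bp, spawn)

# Attach an RGB camera and dump frames to disk for later evaluation.
cam_bp = blueprints.find("sensor.camera.rgb")
cam_tf = carla.Transform(carla.Location(x=1.5, z=2.4))
camera = world.spawn_actor(cam_bp, cam_tf, attach_to=vehicle)
camera.listen(lambda image: image.save_to_disk(f"out/{image.frame:06d}.png"))
```
The captured frames (optionally with adversarial textures applied in the simulator) then feed the detection models under test.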
arXiv Detail & Related papers (2023-08-07T11:09:12Z)
- OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images [59.51657161097337]
OOD-CV-v2 is a benchmark dataset that includes out-of-distribution examples of 10 object categories in terms of pose, shape, texture, context, and weather conditions.
In addition to this novel dataset, we contribute extensive experiments using popular baseline methods.
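The value of such a benchmark is the per-nuisance breakdown it enables; a small sketch of that bookkeeping, with hypothetical data fields:
```python
# Sketch of per-nuisance evaluation, the kind of breakdown OOD-CV-v2 enables
# (the data fields here are hypothetical; the real loader may differ).
from collections import defaultdict

def accuracy_by_nuisance(predictions, labels, nuisances):
    """Group accuracy by nuisance factor (pose, shape, texture, context, weather)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for pred, label, nuisance in zip(predictions, labels, nuisances):
        totals[nuisance] += 1
        hits[nuisance] += int(pred == label)
    return {n: hits[n] / totals[n] for n in totals}

print(accuracy_by_nuisance(["car", "bus"], ["car", "car"], ["pose", "weather"]))
```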
arXiv Detail & Related papers (2023-04-17T20:39:25Z)
- RoboBEV: Towards Robust Bird's Eye View Perception under Corruptions [34.111443808494506]
We introduce RoboBEV, a comprehensive benchmark suite that encompasses eight distinct corruptions, including Bright, Dark, Fog, Snow, Motion Blur, Color Quant, Camera Crash, and Frame Lost.
Based on it, we undertake extensive evaluations across a wide range of BEV-based models to understand their resilience and reliability.
Our findings provide valuable insights for designing future BEV models that can achieve both accuracy and robustness in real-world deployments.
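The two sensor-failure corruptions are easy to picture in code; a minimal sketch of Camera Crash (whole views blanked) and Frame Lost (individual frames dropped), with illustrative defaults:
```python
# Sketch of the two sensor-failure corruptions named in RoboBEV: Camera Crash
# drops whole camera views, Frame Lost drops individual frames (defaults illustrative).
import torch

def camera_crash(images: torch.Tensor, n_dead: int = 2) -> torch.Tensor:
    """images: (N_cams, 3, H, W). Blank out n_dead randomly chosen cameras."""
    out = images.clone()
    dead = torch.randperm(images.shape[0])[:n_dead]
    out[dead] = 0.0
    return out

def frame_lost(images: torch.Tensor, p: float = 0.5) -> torch.Tensor:
    """Independently drop each camera's current frame with probability p."""
    keep = (torch.rand(images.shape[0], 1, 1, 1) > p).float()
    return images * keep

views = torch.rand(6, 3, 224, 400)   # six surround-view cameras, as on nuScenes
print(camera_crash(views).abs().sum(dim=(1, 2, 3)))  # two cameras sum to zero
```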
arXiv Detail & Related papers (2023-04-13T17:59:46Z)
- Robo3D: Towards Robust and Reliable 3D Perception against Corruptions [58.306694836881235]
We present Robo3D, the first comprehensive benchmark that probes the robustness of 3D detectors and segmentors under out-of-distribution scenarios.
We consider eight corruption types stemming from severe weather conditions, external disturbances, and internal sensor failure.
We propose a density-insensitive training framework along with a simple yet flexible voxelization strategy to enhance model resiliency.
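The exact training framework is not given in the summary; in its spirit, a density-insensitivity augmentation can be as simple as randomly subsampling each point cloud so the model cannot rely on a fixed LiDAR density. A hedged sketch:
```python
# Sketch of a density-insensitivity augmentation in the spirit of Robo3D's
# training framework: randomly subsample points so the model cannot overfit
# to one LiDAR density (the paper's exact scheme is not in the summary).
import torch

def random_point_drop(points: torch.Tensor, keep_min: float = 0.3) -> torch.Tensor:
    """points: (N, 4) LiDAR points (x, y, z, intensity). Keep a random fraction."""
    keep_ratio = keep_min + (1.0 - keep_min) * torch.rand(1).item()
    n_keep = max(1, int(points.shape[0] * keep_ratio))
    idx = torch.randperm(points.shape[0])[:n_keep]
    return points[idx]

cloud = torch.randn(120_000, 4)
print(random_point_drop(cloud).shape)
```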
arXiv Detail & Related papers (2023-03-30T17:59:17Z)
- Understanding the Robustness of 3D Object Detection with Bird's-Eye-View Representations in Autonomous Driving [31.98600806479808]
Bird's-Eye-View (BEV) representations have significantly improved the performance of 3D detectors with camera inputs on popular benchmarks.
We evaluate the natural and adversarial robustness of various representative models under extensive settings.
We propose a 3D consistent patch attack by applying adversarial patches in the temporal 3D space to guarantee consistency across frames.
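The geometric core of such a temporally consistent attack is anchoring the patch in 3D and reprojecting it into every frame with the camera parameters; a sketch with illustrative intrinsics:
```python
# Sketch of the geometry behind a 3D-consistent patch attack: anchor the patch
# in world coordinates and reproject it into each frame so it stays consistent
# over time (the intrinsics and pose below are illustrative).
import numpy as np

def project(points_3d: np.ndarray, K: np.ndarray, T_cam_from_world: np.ndarray):
    """points_3d: (N, 3) world points -> (N, 2) pixel coordinates."""
    homo = np.hstack([points_3d, np.ones((points_3d.shape[0], 1))])  # (N, 4)
    cam = (T_cam_from_world @ homo.T).T[:, :3]                       # camera frame
    pix = (K @ cam.T).T
    return pix[:, :2] / pix[:, 2:3]                                  # perspective divide

K = np.array([[1266.0, 0, 816.0], [0, 1266.0, 491.0], [0, 0, 1.0]])
T = np.eye(4)  # camera at the world origin, for illustration only
patch_corners = np.array([[-0.5, 0.0, 8.0], [0.5, 0.0, 8.0],
                          [0.5, 1.0, 8.0], [-0.5, 1.0, 8.0]])
print(project(patch_corners, K, T))
```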
arXiv Detail & Related papers (2023-03-30T11:16:58Z)
- DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception [14.968177102647783]
We propose an end-to-end framework, named DiffBEV, to exploit the potential of diffusion models to generate a more comprehensive BEV representation.
In practice, we design three types of conditions to guide the training of the diffusion model which denoises the coarse samples and refines the semantic feature.
We show that DiffBEV achieves a 25.9% mIoU on the nuScenes dataset, which is 6.2% higher than the best-performing existing approach.
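At its core, the idea is a denoising network that refines a coarse BEV feature sample under a guidance condition; the tiny stand-in below only shows the conditioning interface, not the paper's actual architecture or its three condition types:
```python
# Stand-in for conditional denoising on BEV features, the core idea of DiffBEV
# (the network and condition below are illustrative placeholders).
import torch
import torch.nn as nn

class CondDenoiser(nn.Module):
    """Predict a refined BEV feature map from a noisy sample plus a condition."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, noisy_bev: torch.Tensor, condition: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([noisy_bev, condition], dim=1))

bev = torch.randn(1, 64, 50, 50)        # coarse BEV feature sample
cond = torch.randn(1, 64, 50, 50)       # e.g. a semantic guidance map
print(CondDenoiser()(bev, cond).shape)  # refined feature: (1, 64, 50, 50)
```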
arXiv Detail & Related papers (2023-03-15T02:42:48Z)