BroadBEV: Collaborative LiDAR-camera Fusion for Broad-sighted Bird's Eye View Map Construction
- URL: http://arxiv.org/abs/2309.11119v4
- Date: Wed, 8 Nov 2023 11:18:24 GMT
- Title: BroadBEV: Collaborative LiDAR-camera Fusion for Broad-sighted Bird's Eye View Map Construction
- Authors: Minsu Kim, Giseop Kim, Kyong Hwan Jin, Sunwook Choi
- Abstract summary: We propose Broad BEV Fusion (BroadBEV), which addresses the problems with a cross-modal spatial synchronization approach.
Our strategy enhances camera BEV estimation for broad-sighted perception while simultaneously improving the completion of LiDAR's sparsity over the entire BEV space.
- Score: 31.664613321775516
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent sensor fusion in a Bird's Eye View (BEV) space has shown its
utility in various tasks such as 3D detection and map segmentation. However,
the approach struggles with inaccurate camera BEV estimation and with
perceiving distant areas due to the sparsity of LiDAR points. In this paper,
we propose Broad BEV Fusion (BroadBEV), which addresses these problems with a
cross-modal spatial synchronization approach. Our strategy enhances camera BEV
estimation for broad-sighted perception while simultaneously improving the
completion of LiDAR's sparsity over the entire BEV space. To that end, we
devise Point-scattering, which scatters the LiDAR BEV distribution into the
camera depth distribution. The method boosts the camera branch's depth
estimation and places dense camera features at accurate locations in BEV
space. For effective BEV fusion between the spatially synchronized features,
we propose ColFusion, which applies the self-attention weights of the LiDAR
and camera BEV features to each other. Our extensive experiments demonstrate
that BroadBEV provides broad-sighted BEV perception with remarkable
performance gains.
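A minimal PyTorch sketch of one plausible reading of Point-scattering: LiDAR points projected into the camera view are scattered into the per-pixel categorical depth distribution used for LSS-style lifting. The function name, tensor shapes, and the 50/50 blending scheme are illustrative assumptions, not the authors' implementation.

```python
# Sketch of Point-scattering as described in the abstract (assumptions noted
# in the lead-in): LiDAR evidence sharpens the camera depth distribution.
import torch

def scatter_lidar_to_depth(points_cam, depth_logits, depth_bins):
    """Scatter LiDAR depths into a camera depth distribution.

    points_cam:   (N, 3) LiDAR points in camera coordinates (u, v, z), with
                  u, v in pixels and z the metric depth.
    depth_logits: (D, H, W) per-pixel depth logits from the camera branch.
    depth_bins:   (D,) sorted centers of the discrete depth bins.
    Returns blended per-pixel depth probabilities of shape (D, H, W).
    """
    D, H, W = depth_logits.shape
    u = points_cam[:, 0].long().clamp(0, W - 1)
    v = points_cam[:, 1].long().clamp(0, H - 1)
    z = points_cam[:, 2]

    # Assign each LiDAR point to its nearest depth bin.
    bin_idx = torch.bucketize(z, depth_bins).clamp(0, D - 1)

    # Build a sparse LiDAR depth distribution (one-hot per observed pixel).
    lidar_prior = torch.zeros_like(depth_logits)
    lidar_prior[bin_idx, v, u] = 1.0

    # Where LiDAR evidence exists, pull the learned camera distribution
    # toward it; elsewhere keep the learned distribution unchanged.
    cam_prob = depth_logits.softmax(dim=0)
    observed = lidar_prior.sum(dim=0, keepdim=True) > 0
    blended = torch.where(observed, 0.5 * cam_prob + 0.5 * lidar_prior, cam_prob)
    return blended / blended.sum(dim=0, keepdim=True).clamp_min(1e-6)
```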
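Likewise, a minimal sketch of the ColFusion idea as stated in the abstract: self-attention weights computed from each modality's BEV features are applied to the other modality's values. Layer names, shapes, and the concatenation head are assumptions; dense attention over a full BEV grid is expensive, and the paper may use a cheaper variant.

```python
# Sketch of ColFusion (see lead-in for assumptions): each modality's
# self-attention weights reweight the *other* modality's BEV features.
import torch
import torch.nn as nn

class ColFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.qk_lidar = nn.Linear(dim, 2 * dim)
        self.qk_cam = nn.Linear(dim, 2 * dim)
        self.v_lidar = nn.Linear(dim, dim)
        self.v_cam = nn.Linear(dim, dim)
        self.out = nn.Linear(2 * dim, dim)
        self.scale = dim ** -0.5

    def forward(self, lidar_bev, cam_bev):
        # lidar_bev, cam_bev: (B, N, C) flattened BEV grids, N = H * W.
        q_l, k_l = self.qk_lidar(lidar_bev).chunk(2, dim=-1)
        q_c, k_c = self.qk_cam(cam_bev).chunk(2, dim=-1)

        # Self-attention weights computed within each modality...
        attn_l = torch.softmax(q_l @ k_l.transpose(-2, -1) * self.scale, dim=-1)
        attn_c = torch.softmax(q_c @ k_c.transpose(-2, -1) * self.scale, dim=-1)

        # ...applied to the other modality's values, then fused.
        cam_refined = attn_l @ self.v_cam(cam_bev)
        lidar_refined = attn_c @ self.v_lidar(lidar_bev)
        return self.out(torch.cat([lidar_refined, cam_refined], dim=-1))
```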
Related papers
- Robust Bird's Eye View Segmentation by Adapting DINOv2 [3.236198583140341]
We adapt a vision foundation model, DINOv2, to BEV estimation using Low-Rank Adaptation (LoRA); a minimal LoRA sketch follows this entry.
Our experiments show increased robustness of BEV perception under various corruptions.
We also showcase the effectiveness of the adapted representations in terms of fewer learnable parameters and faster convergence during training.
arXiv Detail & Related papers (2024-09-16T12:23:35Z)
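A minimal LoRA sketch for the adaptation described above: a frozen backbone linear layer (e.g., inside DINOv2) is augmented with a trainable low-rank update. The rank and scaling are illustrative defaults; this is not the paper's code.

```python
# Standard LoRA pattern: freeze the base weight, learn a low-rank residual.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # backbone stays frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # starts as a zero (identity) update
        self.scaling = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))
```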
- OE-BevSeg: An Object Informed and Environment Aware Multimodal Framework for Bird's-eye-view Vehicle Semantic Segmentation [57.2213693781672]
Bird's-eye-view (BEV) semantic segmentation is becoming crucial in autonomous driving systems.
We propose OE-BevSeg, an end-to-end multimodal framework that enhances BEV segmentation performance.
Our approach achieves state-of-the-art results by a large margin on the nuScenes dataset for vehicle segmentation.
arXiv Detail & Related papers (2024-07-18T03:48:22Z)
- TempBEV: Improving Learned BEV Encoders with Combined Image and BEV Space Temporal Aggregation [9.723276622743473]
We develop a novel temporal BEV encoder, TempBEV, which integrates aggregated temporal information from both latent spaces; a minimal warp-and-concatenate sketch follows this entry.
Empirical evaluation on the nuScenes dataset shows a significant improvement by TempBEV over the baseline for 3D object detection and BEV segmentation.
arXiv Detail & Related papers (2024-04-17T23:49:00Z)
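A minimal sketch of temporal aggregation in BEV space, in the spirit of the entry above: a past BEV feature map is warped into the current frame using ego-motion, then concatenated with the current features. The warp parameterization and cell resolution are illustrative assumptions.

```python
# Ego-motion-aligned temporal BEV aggregation (assumptions in the lead-in).
import torch
import torch.nn.functional as F

def warp_bev(prev_bev, yaw, tx, ty, grid_res):
    """Warp a past BEV map (B, C, H, W) into the current ego frame.

    yaw: (B,) rotation in radians; tx, ty: (B,) translation in meters of the
    ego vehicle between frames; grid_res: meters per BEV cell.
    """
    B, C, H, W = prev_bev.shape
    cos, sin = torch.cos(yaw), torch.sin(yaw)
    # 2x3 affine matrices in normalized grid coordinates.
    theta = torch.zeros(B, 2, 3, device=prev_bev.device)
    theta[:, 0, 0], theta[:, 0, 1] = cos, -sin
    theta[:, 1, 0], theta[:, 1, 1] = sin, cos
    theta[:, 0, 2] = 2.0 * tx / (W * grid_res)
    theta[:, 1, 2] = 2.0 * ty / (H * grid_res)
    grid = F.affine_grid(theta, prev_bev.shape, align_corners=False)
    return F.grid_sample(prev_bev, grid, align_corners=False)

def aggregate(curr_bev, prev_bev, yaw, tx, ty, grid_res=0.5):
    # Concatenate current features with motion-aligned past features.
    return torch.cat([curr_bev, warp_bev(prev_bev, yaw, tx, ty, grid_res)], dim=1)
```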
- GraphBEV: Towards Robust BEV Feature Alignment for Multi-Modal 3D Object Detection [18.21607858133675]
We propose a robust fusion framework called GraphBEV to integrate LiDAR and camera BEV features.
Our framework outperforms BEV Fusion by 8.3% under conditions with misalignment noise.
arXiv Detail & Related papers (2024-03-18T15:00:38Z)
- DA-BEV: Unsupervised Domain Adaptation for Bird's Eye View Perception [104.87876441265593]
Camera-only Bird's Eye View (BEV) has demonstrated great potential in environment perception in a 3D space.
Unsupervised domain adaptive BEV, which enables effective learning from unlabelled target data, remains far under-explored.
We design DA-BEV, the first domain adaptive camera-only BEV framework that addresses domain adaptive BEV challenges by exploiting the complementary nature of image-view features and BEV features.
arXiv Detail & Related papers (2024-01-13T04:21:24Z)
- U-BEV: Height-aware Bird's-Eye-View Segmentation and Neural Map-based Relocalization [81.76044207714637]
Relocalization is essential for intelligent vehicles when GPS reception is insufficient or sensor-based localization fails.
Recent advances in Bird's-Eye-View (BEV) segmentation allow for accurate estimation of local scene appearance.
This paper presents U-BEV, a U-Net inspired architecture that extends the current state of the art by allowing the BEV to reason about the scene on multiple height layers before flattening the BEV features; a minimal sketch follows this entry.
arXiv Detail & Related papers (2023-10-20T18:57:38Z)
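A minimal sketch of the height-layer idea described above: BEV features are kept as a stack of height slices, reasoned over per layer, and only then flattened. The layer count, shared convolution, and learned soft flattening are assumptions for illustration.

```python
# Height-aware BEV flattening sketch (assumptions in the lead-in).
import torch
import torch.nn as nn

class HeightAwareFlatten(nn.Module):
    def __init__(self, channels: int, num_layers: int):
        super().__init__()
        # Per-height-layer reasoning: a shared 2D conv applied to each slice.
        self.layer_conv = nn.Conv2d(channels, channels, 3, padding=1)
        # Learned soft weights for collapsing the height axis.
        self.height_logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, voxels):
        # voxels: (B, C, Z, H, W) multi-height BEV features.
        B, C, Z, H, W = voxels.shape
        slices = voxels.permute(0, 2, 1, 3, 4).reshape(B * Z, C, H, W)
        slices = self.layer_conv(slices).reshape(B, Z, C, H, W)
        weights = self.height_logits.softmax(0).view(1, Z, 1, 1, 1)
        return (slices * weights).sum(dim=1)   # flattened (B, C, H, W) BEV
```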
- Leveraging BEV Representation for 360-degree Visual Place Recognition [14.497501941931759]
This paper investigates the advantages of using a Bird's Eye View representation in 360-degree visual place recognition (VPR).
We propose a novel network architecture that utilizes the BEV representation in feature extraction, feature aggregation, and vision-LiDAR fusion.
The proposed BEV-based method is evaluated in ablation and comparative studies on two datasets.
arXiv Detail & Related papers (2023-05-23T08:29:42Z)
- Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe [115.31507979199564]
Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending and drawing extensive attention both from industry and academia.
As sensor configurations grow more complex, integrating multi-source information from different sensors and representing features in a unified view become vitally important.
The core problems for BEV perception lie in (a) how to reconstruct the lost 3D information via view transformation from perspective view to BEV; (b) how to acquire ground-truth annotations in the BEV grid; and (c) how to adapt and generalize algorithms as sensor configurations vary across different scenarios.
arXiv Detail & Related papers (2022-09-12T15:29:13Z) - BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation [105.96557764248846]
We introduce BEVFusion, a generic multi-task multi-sensor fusion framework.
It unifies multi-modal features in the shared bird's-eye-view representation space; a minimal fusion sketch follows this entry.
It achieves 1.3% higher mAP and NDS on 3D object detection and 13.6% higher mIoU on BEV map segmentation, with 1.9x lower cost.
arXiv Detail & Related papers (2022-05-26T17:59:35Z)
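A minimal sketch of fusing camera and LiDAR features in a shared BEV space, in the spirit of the entry above: both modalities are first rendered onto the same BEV grid, then fused with a small convolutional head. The exact fuser in the paper may differ; this is illustrative.

```python
# Shared-BEV-space fusion sketch (assumptions in the lead-in).
import torch
import torch.nn as nn

class BEVFuser(nn.Module):
    def __init__(self, cam_ch: int, lidar_ch: int, out_ch: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(cam_ch + lidar_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, cam_bev, lidar_bev):
        # cam_bev: (B, cam_ch, H, W); lidar_bev: (B, lidar_ch, H, W).
        # Both must share the same BEV grid resolution and spatial extent.
        return self.fuse(torch.cat([cam_bev, lidar_bev], dim=1))
```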
- M^2BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation [145.6041893646006]
M^2BEV is a unified framework that jointly performs 3D object detection and map segmentation.
It infers both tasks with a single model and improves efficiency.
arXiv Detail & Related papers (2022-04-11T13:43:25Z)
- BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers [39.253627257740085]
3D visual perception tasks, including 3D detection and map segmentation based on multi-camera images, are essential for autonomous driving systems.
We present a new framework termed BEVFormer, which learns unified BEV representations with transformers to support multiple autonomous driving perception tasks; a simplified sketch follows this entry.
We show that BEVFormer remarkably improves the accuracy of velocity estimation and recall of objects under low visibility conditions.
arXiv Detail & Related papers (2022-03-31T17:59:01Z)
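A simplified sketch of the BEV-query idea behind the entry above: a grid of learnable BEV queries cross-attends to multi-camera image features to build a unified BEV representation. BEVFormer itself uses deformable spatial cross-attention and temporal self-attention; plain multi-head attention is used here only to keep the sketch short.

```python
# BEV queries attending to flattened multi-camera features (see lead-in).
import torch
import torch.nn as nn

class BEVQueryEncoder(nn.Module):
    def __init__(self, bev_h: int, bev_w: int, dim: int, heads: int = 8):
        super().__init__()
        self.bev_queries = nn.Parameter(torch.randn(bev_h * bev_w, dim) * 0.02)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_feats):
        # img_feats: (B, N_cams * H_img * W_img, dim) flattened camera features.
        B = img_feats.shape[0]
        q = self.bev_queries.unsqueeze(0).expand(B, -1, -1)
        bev, _ = self.cross_attn(q, img_feats, img_feats)
        return self.norm(bev + q)   # (B, bev_h * bev_w, dim) BEV representation
```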