BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework
- URL: http://arxiv.org/abs/2205.13790v1
- Date: Fri, 27 May 2022 06:58:30 GMT
- Title: BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework
- Authors: Tingting Liang, Hongwei Xie, Kaicheng Yu, Zhongyu Xia, Zhiwei Lin,
Yongtao Wang, Tao Tang, Bing Wang, Zhi Tang
- Abstract summary: Current methods rely on point clouds from the LiDAR sensor as queries to leverage features from the image space.
We propose a surprisingly simple yet novel fusion framework, dubbed BEVFusion, whose camera stream does not depend on the input of LiDAR data.
We empirically show that our framework surpasses the state-of-the-art methods under the normal training settings.
- Score: 20.842800465250775
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Fusing the camera and LiDAR information has become a de-facto standard for 3D
object detection tasks. Current methods rely on point clouds from the LiDAR
sensor as queries to leverage features from the image space. However, this
underlying assumption makes current fusion frameworks unable to produce any
prediction when the LiDAR malfunctions, whether the failure is minor or
severe. This fundamentally limits their deployment in realistic autonomous
driving scenarios. In contrast, we propose a
surprisingly simple yet novel fusion framework, dubbed BEVFusion, whose camera
stream does not depend on the input of LiDAR data, thus addressing the downside
of previous methods. We empirically show that our framework surpasses the
state-of-the-art methods under the normal training settings. Under the
robustness training settings that simulate various LiDAR malfunctions, our
framework significantly surpasses the state-of-the-art methods by 15.7% to
28.9% mAP. To the best of our knowledge, ours is the first framework to handle
realistic LiDAR malfunctions, and it can be deployed in realistic scenarios
without any post-processing. The code is available at
https://github.com/ADLab-AutoDrive/BEVFusion.
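To make the decoupling concrete, the following is a minimal sketch (in PyTorch, not the authors' code) of a two-stream BEV fusion head: the camera and LiDAR streams each produce their own BEV feature map, and the fusion head still emits a prediction when the LiDAR stream is missing. All module names, channel sizes, and the zero-filled fallback are illustrative assumptions; the actual BEVFusion streams use image backbones with view transforms and voxelized point-cloud backbones.

```python
import torch
import torch.nn as nn


class TwoStreamBEVFusion(nn.Module):
    """Toy two-stream fusion: camera BEV and LiDAR BEV are encoded
    independently, then concatenated and fused. The LiDAR stream is optional."""

    def __init__(self, cam_ch=64, lidar_ch=64, out_ch=128):
        super().__init__()
        # Stand-ins for the real streams (image backbone + view transform,
        # voxelized point-cloud backbone); here both inputs are already in BEV.
        self.cam_encoder = nn.Conv2d(3, cam_ch, kernel_size=3, padding=1)
        self.lidar_encoder = nn.Conv2d(1, lidar_ch, kernel_size=3, padding=1)
        self.fuse = nn.Conv2d(cam_ch + lidar_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, cam_bev, lidar_bev=None):
        cam_feat = self.cam_encoder(cam_bev)
        if lidar_bev is None:
            # Simulated LiDAR malfunction: the camera stream alone still feeds
            # the fusion head, so a prediction is always produced.
            lidar_feat = cam_feat.new_zeros(
                cam_feat.shape[0], self.lidar_encoder.out_channels, *cam_feat.shape[2:])
        else:
            lidar_feat = self.lidar_encoder(lidar_bev)
        return self.fuse(torch.cat([cam_feat, lidar_feat], dim=1))


model = TwoStreamBEVFusion()
cam_bev = torch.randn(1, 3, 128, 128)      # camera features rasterized to BEV
lidar_bev = torch.randn(1, 1, 128, 128)    # LiDAR occupancy rasterized to BEV
fused = model(cam_bev, lidar_bev)          # normal operation
camera_only = model(cam_bev)               # LiDAR dropped: still a valid output
```

The design point mirrored here is that no LiDAR-derived queries gate the camera stream, so a LiDAR malfunction degrades the output rather than disabling it.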
Related papers
- LiDAR-GS: Real-time LiDAR Re-Simulation using Gaussian Splatting [50.808933338389686]
LiDAR simulation plays a crucial role in closed-loop simulation for autonomous driving.
We present LiDAR-GS, the first LiDAR Gaussian Splatting method, for real-time high-fidelity re-simulation of LiDAR sensor scans in public urban road scenes.
Our approach succeeds in simultaneously re-simulating depth, intensity, and ray-drop channels, achieving state-of-the-art results in both rendering frame rate and quality on publicly available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z) - Zero-Reference Low-Light Enhancement via Physical Quadruple Priors [58.77377454210244]
We propose a new zero-reference low-light enhancement framework trainable solely with normal light images.
This framework is able to restore our illumination-invariant prior back to images, automatically achieving low-light enhancement.
arXiv Detail & Related papers (2024-03-19T17:36:28Z) - UltraLiDAR: Learning Compact Representations for LiDAR Completion and
Generation [51.443788294845845]
We present UltraLiDAR, a data-driven framework for scene-level LiDAR completion, LiDAR generation, and LiDAR manipulation.
We show that by aligning the representation of a sparse point cloud to that of a dense point cloud, we can densify the sparse point clouds.
By learning a prior over the discrete codebook, we can generate diverse, realistic LiDAR point clouds for self-driving.
arXiv Detail & Related papers (2023-11-02T17:57:03Z) - LiDAR-UDA: Self-ensembling Through Time for Unsupervised LiDAR Domain
Adaptation [22.206488779765234]
We introduce LiDAR-UDA, a novel two-stage self-training-based Unsupervised Domain Adaptation (UDA) method for LiDAR segmentation.
We propose two techniques to reduce sensor discrepancy and improve pseudo label quality.
We evaluate our method on several public LiDAR datasets and show that it outperforms the state-of-the-art methods by more than 3.9% mIoU on average.
arXiv Detail & Related papers (2023-09-24T02:02:00Z) - LiDAR View Synthesis for Robust Vehicle Navigation Without Expert Labels [50.40632021583213]
We propose synthesizing additional LiDAR point clouds from novel viewpoints without physically driving at dangerous positions.
We train a deep learning model, which takes a LiDAR scan as input and predicts the future trajectory as output.
A waypoint controller is then applied to this predicted trajectory to determine the throttle and steering labels of the ego-vehicle (an illustrative controller sketch appears after this list).
arXiv Detail & Related papers (2023-08-02T20:46:43Z) - MaskedFusion360: Reconstruct LiDAR Data by Querying Camera Features [11.28654979274464]
In self-driving applications, LiDAR data provides accurate information about distances in 3D but lacks the semantic richness of camera data.
We introduce a novel self-supervised method to fuse LiDAR and camera data for self-driving applications.
arXiv Detail & Related papers (2023-06-12T13:01:33Z) - Online LiDAR-Camera Extrinsic Parameters Self-checking [12.067216966113708]
This paper proposes a self-checking algorithm to judge whether the extrinsic parameters are well-calibrated by introducing a binary classification network.
The code is open-sourced on GitHub at https://github.com/OpenCalib/LiDAR2camera_self-check.
arXiv Detail & Related papers (2022-10-19T13:17:48Z) - Multi-modal Streaming 3D Object Detection [20.01800869678355]
We propose an innovative camera-LiDAR streaming 3D object detection framework.
It uses camera images instead of past LiDAR slices to provide an up-to-date, dense, and wide context for streaming perception.
Our method is shown to be robust to missing camera images, narrow LiDAR slices, and small camera-LiDAR miscalibration.
arXiv Detail & Related papers (2022-09-12T00:30:52Z) - Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object
Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z) - LIF-Seg: LiDAR and Camera Image Fusion for 3D LiDAR Semantic
Segmentation [78.74202673902303]
We propose a coarse-to-fine LiDAR and camera fusion-based network (termed LIF-Seg) for LiDAR segmentation.
The proposed method fully utilizes the contextual information of images and introduces a simple but effective early-fusion strategy.
The cooperation of these two components leads to effective camera-LiDAR fusion.
arXiv Detail & Related papers (2021-08-17T08:53:11Z) - DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation
Using Monocular Camera and Sparse LiDAR [10.303618438296981]
Scene flow is the dense 3D reconstruction of motion and geometry of a scene.
Most state-of-the-art methods use a pair of stereo images as input for full scene reconstruction.
DeepLiDARFlow is a novel deep learning architecture which fuses high level RGB and LiDAR features at multiple scales.
arXiv Detail & Related papers (2020-08-18T19:51:08Z)
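Relating to the LiDAR View Synthesis entry above, the sketch below illustrates how a waypoint controller can turn a predicted ego trajectory into throttle and steering commands. The pure-pursuit-style control law, gains, and lookahead are assumptions for illustration, not the paper's controller.

```python
import math


def waypoint_control(trajectory, current_speed,
                     lookahead=5.0, target_speed=8.0,
                     k_steer=1.0, k_throttle=0.3):
    """trajectory: list of (x, y) waypoints in the ego frame (x forward, meters)."""
    # Pick the first waypoint at least `lookahead` meters ahead of the vehicle.
    target = trajectory[-1]
    for x, y in trajectory:
        if math.hypot(x, y) >= lookahead:
            target = (x, y)
            break
    # Steering proportional to the heading error toward the target point.
    heading_error = math.atan2(target[1], target[0])
    steer = max(-1.0, min(1.0, k_steer * heading_error))
    # Throttle proportional to the speed error, clamped to [0, 1].
    throttle = max(0.0, min(1.0, k_throttle * (target_speed - current_speed)))
    return throttle, steer


# Example: a gently left-curving predicted trajectory at 6 m/s.
traj = [(1.0, 0.0), (3.0, 0.2), (6.0, 0.8), (9.0, 1.8)]
print(waypoint_control(traj, current_speed=6.0))
```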
This list is automatically generated from the titles and abstracts of the papers on this site.