Improving Robustness of LiDAR-Camera Fusion Model against Weather
Corruption from Fusion Strategy Perspective
- URL: http://arxiv.org/abs/2402.02738v1
- Date: Mon, 5 Feb 2024 05:38:50 GMT
- Title: Improving Robustness of LiDAR-Camera Fusion Model against Weather
Corruption from Fusion Strategy Perspective
- Authors: Yihao Huang, Kaiyuan Yu, Qing Guo, Felix Juefei-Xu, Xiaojun Jia,
Tianlin Li, Geguang Pu, Yang Liu
- Abstract summary: LiDAR-camera fusion models have advanced 3D object detection tasks in autonomous driving.
However, their robustness against common weather corruptions such as fog, rain, snow, and sunlight in the intricate physical world remains underexplored.
We propose a concise yet practical fusion strategy to enhance the robustness of the fusion models.
- Score: 26.391161934274876
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, LiDAR-camera fusion models have markedly advanced 3D object
detection tasks in autonomous driving. However, their robustness against common
weather corruptions such as fog, rain, snow, and sunlight in the intricate
physical world remains underexplored. In this paper, we evaluate the robustness
of fusion models from the perspective of fusion strategies on the corrupted
dataset. Based on this evaluation, we further propose a concise yet practical
fusion strategy to enhance the robustness of fusion models: flexibly weighting
the fused features from the LiDAR and camera sources to adapt to varying
weather scenarios. Experiments conducted on four types of fusion models, each
with two distinct lightweight implementations, confirm the broad applicability
and effectiveness of the approach.
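The flexibly weighted fusion described in the abstract can be illustrated with a minimal sketch. Note this is an assumption-laden toy, not the paper's implementation: the per-weather weight values, the `WEATHER_WEIGHTS` table, and the function names are all hypothetical, standing in for whatever weighting scheme the authors actually learn or tune.

```python
import numpy as np

def weighted_fusion(lidar_feat: np.ndarray, camera_feat: np.ndarray,
                    w_lidar: float) -> np.ndarray:
    """Convex combination of per-modality features.

    w_lidar in [0, 1] is the weight on the LiDAR branch; the camera
    branch receives (1 - w_lidar). The intuition from the paper: in
    weather that degrades one sensor (e.g. fog scatters LiDAR returns),
    shift weight toward the more reliable modality.
    """
    return w_lidar * lidar_feat + (1.0 - w_lidar) * camera_feat

# Hypothetical per-condition weights; illustrative values only.
WEATHER_WEIGHTS = {"clear": 0.5, "fog": 0.3, "rain": 0.4,
                   "snow": 0.3, "sunlight": 0.6}

lidar_feat = np.ones(4)    # stand-in for a LiDAR feature vector
camera_feat = np.zeros(4)  # stand-in for a camera feature vector
fused = weighted_fusion(lidar_feat, camera_feat, WEATHER_WEIGHTS["fog"])
```

In this toy, fog down-weights the LiDAR branch, so the fused vector sits closer to the camera features; in a real model the features would be high-dimensional BEV or voxel tensors and the weights could be predicted per scene.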
Related papers
- ContextualFusion: Context-Based Multi-Sensor Fusion for 3D Object Detection in Adverse Operating Conditions [1.7537812081430004]
We propose a technique called ContextualFusion to incorporate the domain knowledge about cameras and lidars behaving differently across lighting and weather variations into 3D object detection models.
Our approach yields an mAP improvement of 6.2% over state-of-the-art methods on our context-balanced synthetic dataset.
Our method enhances state-of-the-art 3D object detection performance at night on the real-world nuScenes dataset with a significant mAP improvement of 11.7%.
arXiv Detail & Related papers (2024-04-23T06:37:54Z) - SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection [22.683446326326898]
SupFusion provides auxiliary feature-level supervision for effective LiDAR-Camera fusion.
The deep fusion module consistently gains superior performance compared with previous fusion methods.
We gain around 2% 3D mAP improvement on the KITTI benchmark based on multiple LiDAR-Camera 3D detectors.
arXiv Detail & Related papers (2023-09-13T16:52:23Z) - MLF-DET: Multi-Level Fusion for Cross-Modal 3D Object Detection [54.52102265418295]
We propose a novel and effective Multi-Level Fusion network, named MLF-DET, for high-performance cross-modal 3D object DETection.
For the feature-level fusion, we present the Multi-scale Voxel Image fusion (MVI) module, which densely aligns multi-scale voxel features with image features.
For the decision-level fusion, we propose the lightweight Feature-cued Confidence Rectification (FCR) module, which exploits image semantics to rectify the confidence of detection candidates.
arXiv Detail & Related papers (2023-07-18T11:26:02Z) - Fusion is Not Enough: Single Modal Attacks on Fusion Models for 3D
Object Detection [33.0406308223244]
We propose an attack framework that targets advanced camera-LiDAR fusion-based 3D object detection models through camera-only adversarial attacks.
Our approach employs a two-stage optimization-based strategy that first thoroughly evaluates vulnerable image areas under adversarial attacks.
arXiv Detail & Related papers (2023-04-28T03:39:00Z) - CrossFusion: Interleaving Cross-modal Complementation for
Noise-resistant 3D Object Detection [7.500487420385808]
We propose a more robust and noise-resistant scheme that makes full use of the camera and LiDAR features with the designed cross-modal complementation strategy.
Our method not only outperforms state-of-the-art methods under the noisy setting but also demonstrates noise resistance without re-training for specific malfunction scenarios.
arXiv Detail & Related papers (2023-04-19T14:35:16Z) - MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth
Seeds for 3D Object Detection [89.26380781863665]
Fusing LiDAR and camera information is essential for achieving accurate and reliable 3D object detection in autonomous driving systems.
Recent approaches aim at exploring the semantic densities of camera features through lifting points in 2D camera images into 3D space for fusion.
We propose a novel framework that focuses on the multi-scale progressive interaction of the multi-granularity LiDAR and camera features.
arXiv Detail & Related papers (2022-09-07T12:29:29Z) - Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object
Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z) - TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with
Transformers [49.689566246504356]
We propose TransFusion, a robust solution to LiDAR-camera fusion with a soft-association mechanism to handle inferior image conditions.
TransFusion achieves state-of-the-art performance on large-scale datasets.
We extend the proposed method to the 3D tracking task and achieve 1st place on the nuScenes tracking leaderboard.
arXiv Detail & Related papers (2022-03-22T07:15:13Z) - EPNet++: Cascade Bi-directional Fusion for Multi-Modal 3D Object
Detection [56.03081616213012]
We propose EPNet++ for multi-modal 3D object detection by introducing a novel Cascade Bi-directional Fusion(CB-Fusion) module.
The proposed CB-Fusion module boosts the plentiful semantic information of point features with the image features in a cascade bi-directional interaction fusion manner.
The experiment results on the KITTI, JRDB and SUN-RGBD datasets demonstrate the superiority of EPNet++ over the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-21T10:48:34Z) - LIF-Seg: LiDAR and Camera Image Fusion for 3D LiDAR Semantic
Segmentation [78.74202673902303]
We propose a coarse-to-fine LiDAR and camera fusion-based network (termed LIF-Seg) for LiDAR segmentation.
The proposed method fully utilizes the contextual information of images and introduces a simple but effective early-fusion strategy.
The cooperation of these two components leads to effective camera-LiDAR fusion.
arXiv Detail & Related papers (2021-08-17T08:53:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.