Reliability-Driven LiDAR-Camera Fusion for Robust 3D Object Detection
- URL: http://arxiv.org/abs/2502.01856v1
- Date: Mon, 03 Feb 2025 22:07:14 GMT
- Title: Reliability-Driven LiDAR-Camera Fusion for Robust 3D Object Detection
- Authors: Reza Sadeghian, Niloofar Hooshyaripour, Chris Joslin, WonSook Lee
- Abstract summary: We propose ReliFusion, a LiDAR-camera fusion framework operating in the bird's-eye view (BEV) space.
ReliFusion integrates three key components: the Spatio-Temporal Feature Aggregation (STFA) module, the Reliability module, and the Confidence-Weighted Mutual Cross-Attention (CW-MCA) module.
Experiments on the nuScenes dataset show that ReliFusion significantly outperforms state-of-the-art methods, achieving superior robustness and accuracy in scenarios with limited LiDAR fields of view and severe sensor malfunctions.
- Abstract: Accurate and robust 3D object detection is essential for autonomous driving, where fusing data from sensors like LiDAR and camera enhances detection accuracy. However, sensor malfunctions such as corruption or disconnection can degrade performance, and existing fusion models often struggle to maintain reliability when one modality fails. To address this, we propose ReliFusion, a novel LiDAR-camera fusion framework operating in the bird's-eye view (BEV) space. ReliFusion integrates three key components: the Spatio-Temporal Feature Aggregation (STFA) module, which captures dependencies across frames to stabilize predictions over time; the Reliability module, which assigns confidence scores to quantify the dependability of each modality under challenging conditions; and the Confidence-Weighted Mutual Cross-Attention (CW-MCA) module, which dynamically balances information from LiDAR and camera modalities based on these confidence scores. Experiments on the nuScenes dataset show that ReliFusion significantly outperforms state-of-the-art methods, achieving superior robustness and accuracy in scenarios with limited LiDAR fields of view and severe sensor malfunctions.
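The paper does not include code, but the abstract's description of CW-MCA suggests a concrete structure: each modality attends to the other, and the Reliability module's confidence scores gate how much each contributes to the fused BEV features. Below is a minimal sketch of that idea; the class name, tensor shapes, and the normalized weighting scheme are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of confidence-weighted mutual cross-attention (CW-MCA).
# Shapes, module names, and the weighting scheme are assumptions for
# illustration; they are not taken from the authors' implementation.
import torch
import torch.nn as nn

class CWMCASketch(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        # One cross-attention block per direction: each modality queries the other.
        self.lidar_from_cam = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cam_from_lidar = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(
        self,
        lidar_bev: torch.Tensor,   # (B, N, C) flattened LiDAR BEV tokens
        cam_bev: torch.Tensor,     # (B, N, C) flattened camera BEV tokens
        conf_lidar: torch.Tensor,  # (B, 1, 1) reliability score in [0, 1]
        conf_cam: torch.Tensor,    # (B, 1, 1) reliability score in [0, 1]
    ) -> torch.Tensor:
        # Mutual cross-attention: each modality is enhanced by the other.
        lidar_enh, _ = self.lidar_from_cam(lidar_bev, cam_bev, cam_bev)
        cam_enh, _ = self.cam_from_lidar(cam_bev, lidar_bev, lidar_bev)
        # Normalize the two confidence scores so they sum to one, then use
        # them to weight each modality's contribution to the fused BEV map.
        total = conf_lidar + conf_cam + 1e-6
        w_lidar, w_cam = conf_lidar / total, conf_cam / total
        return w_lidar * lidar_enh + w_cam * cam_enh
```

Under this weighting, a sensor malfunction that drives one modality's confidence toward zero makes the fused features fall back to the healthy modality, which is the failure mode the abstract targets.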
Related papers
- MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception [9.575044300747061]
Multi-sensor fusion models play a crucial role in autonomous driving perception, particularly in tasks like 3D object detection and HD map construction.
These models provide essential and comprehensive static environmental information for autonomous driving systems.
While camera-LiDAR fusion methods have shown promising results, they often depend on complete sensor inputs.
This reliance can lead to low robustness and potential failures when sensors are corrupted or missing, raising significant safety concerns.
To tackle this challenge, we introduce the Multi-Sensor Corruption Benchmark (MSC-Bench), the first comprehensive benchmark aimed at evaluating the robustness of multi-sensor autonomous driving perception models against various sensor corruptions.
arXiv Detail & Related papers (2025-01-02T03:38:46Z)
- ReliOcc: Towards Reliable Semantic Occupancy Prediction via Uncertainty Learning [26.369237406972577]
Vision-centric semantic occupancy prediction plays a crucial role in autonomous driving.
There has been little research effort to explore the reliability of predicting semantic occupancy from cameras.
We propose ReliOcc, a method designed to enhance the reliability of camera-based occupancy networks.
arXiv Detail & Related papers (2024-09-26T16:33:16Z)
- Robust Multimodal 3D Object Detection via Modality-Agnostic Decoding and Proximity-based Modality Ensemble [15.173314907900842]
Existing 3D object detection methods rely heavily on the LiDAR sensor.
We propose MEFormer to address the LiDAR over-reliance problem.
Our MEFormer achieves state-of-the-art performance of 73.9% NDS and 71.5% mAP on the nuScenes validation set.
arXiv Detail & Related papers (2024-07-27T03:21:44Z)
- Towards Stable 3D Object Detection [64.49059005467817]
Stability Index (SI) is a new metric that can comprehensively evaluate the stability of 3D detectors in terms of confidence, box localization, extent, and heading.
To help models improve their stability, we introduce a general and effective training strategy called Prediction Consistency Learning (PCL).
PCL essentially encourages the prediction consistency of the same objects under different timestamps and augmentations, leading to enhanced detection stability (see the sketch below).
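The summary gives only the high-level idea of PCL. A minimal sketch of a consistency-style loss in that spirit follows; the pairing of predictions across views and the choice of distance terms are assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch of a prediction-consistency loss in the spirit of PCL.
# How predictions are matched across timestamps/augmentations and which
# distances are used are assumptions; the paper defines the real objective.
import torch
import torch.nn.functional as F

def consistency_loss(
    boxes_a: torch.Tensor,   # (M, 7) boxes for matched objects, view A
    boxes_b: torch.Tensor,   # (M, 7) same objects, view B (other timestamp or
                             # augmentation), transformed into view A's frame
    scores_a: torch.Tensor,  # (M,) confidences, view A
    scores_b: torch.Tensor,  # (M,) confidences, view B
) -> torch.Tensor:
    # Penalize disagreement in both localization and confidence for the
    # same physical object observed under different conditions.
    box_term = F.smooth_l1_loss(boxes_a, boxes_b)
    score_term = F.mse_loss(scores_a, scores_b)
    return box_term + score_term
```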
arXiv Detail & Related papers (2024-07-05T07:17:58Z)
- ShaSTA-Fuse: Camera-LiDAR Sensor Fusion to Model Shape and Spatio-Temporal Affinities for 3D Multi-Object Tracking [26.976216624424385]
3D multi-object tracking (MOT) is essential for an autonomous mobile agent to safely navigate a scene.
We aim to develop a 3D MOT framework that fuses camera and LiDAR sensor information.
arXiv Detail & Related papers (2023-10-04T02:17:59Z)
- Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection.
With the learned assignments between 3D and 2D object proposals, fusion for detection can be effectively performed by combining their RoI features.
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
- Robo3D: Towards Robust and Reliable 3D Perception against Corruptions [58.306694836881235]
We present Robo3D, the first comprehensive benchmark heading toward probing the robustness of 3D detectors and segmentors under out-of-distribution scenarios.
We consider eight corruption types stemming from severe weather conditions, external disturbances, and internal sensor failure.
We propose a density-insensitive training framework along with a simple yet flexible voxelization strategy to enhance model resiliency.
arXiv Detail & Related papers (2023-03-30T17:59:17Z)
- Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z)
- TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers [49.689566246504356]
We propose TransFusion, a robust solution to LiDAR-camera fusion with a soft-association mechanism to handle inferior image conditions.
TransFusion achieves state-of-the-art performance on large-scale datasets.
We extend the proposed method to the 3D tracking task and achieve 1st place on the nuScenes tracking leaderboard.
arXiv Detail & Related papers (2022-03-22T07:15:13Z)
- LIF-Seg: LiDAR and Camera Image Fusion for 3D LiDAR Semantic Segmentation [78.74202673902303]
We propose a coarse-to-fine LiDAR and camera fusion-based network (termed LIF-Seg) for LiDAR segmentation.
The proposed method fully utilizes the contextual information of images and introduces a simple but effective early-fusion strategy (see the sketch after this entry).
The cooperation of these two components leads to effective camera-LiDAR fusion.
arXiv Detail & Related papers (2021-08-17T08:53:11Z)
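The LIF-Seg summary mentions early fusion without detail. A common form of early camera-LiDAR fusion decorates each LiDAR point with image features sampled at its projected pixel; the sketch below shows that pattern under assumptions. The projection helper, channel sizes, and the simple nearest-pixel sampling are illustrative, not LIF-Seg's actual design.

```python
# Hypothetical sketch of early camera-LiDAR fusion for point-wise tasks:
# each LiDAR point is decorated with image features sampled at its projected
# pixel. Projection math and channel sizes are illustrative assumptions.
import torch

def early_fuse(
    points: torch.Tensor,     # (N, 3) LiDAR points in camera coordinates
    img_feats: torch.Tensor,  # (C, H, W) 2D feature map from the image branch
    intrinsics: torch.Tensor, # (3, 3) camera intrinsic matrix
) -> torch.Tensor:
    C, H, W = img_feats.shape
    # Project points onto the image plane with the pinhole model.
    # A real implementation would also mask points behind the camera.
    uvw = (intrinsics @ points.T).T                  # (N, 3)
    uv = uvw[:, :2] / uvw[:, 2:3].clamp(min=1e-6)    # perspective divide
    u = uv[:, 0].round().long().clamp(0, W - 1)
    v = uv[:, 1].round().long().clamp(0, H - 1)
    # Gather per-point image features and concatenate with xyz.
    point_img_feats = img_feats[:, v, u].T           # (N, C)
    return torch.cat([points, point_img_feats], dim=1)  # (N, 3 + C)
```

Concatenating image features this early lets the LiDAR branch exploit image context from the first layer onward, at the cost of sensitivity to calibration error between the two sensors.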