LiRaFusion: Deep Adaptive LiDAR-Radar Fusion for 3D Object Detection
- URL: http://arxiv.org/abs/2402.11735v1
- Date: Sun, 18 Feb 2024 23:29:28 GMT
- Title: LiRaFusion: Deep Adaptive LiDAR-Radar Fusion for 3D Object Detection
- Authors: Jingyu Song, Lingjun Zhao, Katherine A. Skinner
- Abstract summary: We propose LiRaFusion to tackle LiDAR-radar fusion for 3D object detection.
We design an early fusion module for joint voxel feature encoding, and a middle fusion module to adaptively fuse feature maps.
We perform extensive evaluation on nuScenes to demonstrate that LiRaFusion achieves notable improvement over existing methods.
- Score: 7.505655376776177
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose LiRaFusion to tackle LiDAR-radar fusion for 3D object detection to
fill the performance gap of existing LiDAR-radar detectors. To improve the
feature extraction capabilities from these two modalities, we design an early
fusion module for joint voxel feature encoding, and a middle fusion module to
adaptively fuse feature maps via a gated network. We perform extensive
evaluation on nuScenes to demonstrate that LiRaFusion leverages the
complementary information of LiDAR and radar effectively and achieves notable
improvement over existing methods.
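The abstract describes the middle fusion module only at a high level: a gated network adaptively weights the LiDAR and radar feature maps. A minimal sketch of that idea, assuming a per-location, per-channel gate predicted by a 1x1 convolution over the concatenated BEV features (all names, shapes, and the gate design here are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fuse(f_lidar, f_radar, w, b):
    """Adaptively fuse two (C, H, W) BEV feature maps.

    A learned 1x1 convolution (here a channel-mixing matrix `w` of
    shape (C, 2C) plus bias `b`) predicts a gate from the concatenated
    inputs; the gate convexly combines the two modalities per cell.
    """
    stacked = np.concatenate([f_lidar, f_radar], axis=0)           # (2C, H, W)
    logits = np.einsum('oc,chw->ohw', w, stacked) + b[:, None, None]
    gate = sigmoid(logits)                                         # (C, H, W), in (0, 1)
    return gate * f_lidar + (1.0 - gate) * f_radar

# Toy example: C=4 channels on an 8x8 BEV grid with random weights.
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
fl = rng.normal(size=(C, H, W))
fr = rng.normal(size=(C, H, W))
fused = gated_fuse(fl, fr, rng.normal(size=(C, 2 * C)) * 0.1, np.zeros(C))
print(fused.shape)  # (4, 8, 8)
```

Because the gate lies in (0, 1), each fused value is a convex combination of the corresponding LiDAR and radar features, so a degenerate gate can fall back to either single modality.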
Related papers
- LiCROcc: Teach Radar for Accurate Semantic Occupancy Prediction using LiDAR and Camera [22.974481709303927]
3D radar is gradually replacing LiDAR in autonomous driving applications.
We propose a three-stage tight fusion approach on BEV to realize a fusion framework for point clouds and images.
Our approach enhances the performance in both radar-only (R-LiCROcc) and radar-camera (RC-LiCROcc) settings.
arXiv Detail & Related papers (2024-07-23T05:53:05Z)
- Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving [58.16024314532443]
We introduce LaserMix++, a framework that integrates laser beam manipulations from disparate LiDAR scans and incorporates LiDAR-camera correspondences to assist data-efficient learning.
Results demonstrate that LaserMix++ outperforms fully supervised alternatives, achieving comparable accuracy with five times fewer annotations.
This substantial advancement underscores the potential of semi-supervised approaches in reducing the reliance on extensive labeled data in LiDAR-based 3D scene understanding systems.
arXiv Detail & Related papers (2024-05-08T17:59:53Z)
- Better Monocular 3D Detectors with LiDAR from the Past [64.6759926054061]
Camera-based 3D detectors often suffer inferior performance compared to LiDAR-based counterparts due to inherent depth ambiguities in images.
In this work, we seek to improve monocular 3D detectors by leveraging unlabeled historical LiDAR data.
We show consistent and significant performance gain across multiple state-of-the-art models and datasets with a negligible additional latency of 9.66 ms and a small storage cost.
arXiv Detail & Related papers (2024-04-08T01:38:43Z)
- SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection [22.683446326326898]
SupFusion provides auxiliary feature-level supervision for effective LiDAR-Camera fusion.
The deep fusion module consistently gains superior performance compared with previous fusion methods.
We gain around 2% 3D mAP improvements on KITTI benchmark based on multiple LiDAR-Camera 3D detectors.
arXiv Detail & Related papers (2023-09-13T16:52:23Z)
- RCM-Fusion: Radar-Camera Multi-Level Fusion for 3D Object Detection [15.686167262542297]
We propose Radar-Camera Multi-level fusion (RCM-Fusion), which attempts to fuse both modalities at both feature and instance levels.
For feature-level fusion, we propose a Radar Guided BEV which transforms camera features into precise BEV representations.
For instance-level fusion, we propose a Radar Grid Point Refinement module that reduces localization error.
arXiv Detail & Related papers (2023-07-17T07:22:25Z)
- Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection [78.59426158981108]
We introduce a bi-directional LiDAR-Radar fusion framework, termed Bi-LRFusion, to tackle the challenges and improve 3D detection for dynamic objects.
We conduct extensive experiments on nuScenes and ORR datasets, and show that our Bi-LRFusion achieves state-of-the-art performance for detecting dynamic objects.
arXiv Detail & Related papers (2023-06-02T10:57:41Z)
- Boosting 3D Object Detection by Simulating Multimodality on Point Clouds [51.87740119160152]
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate features and responses that follow a multi-modality (LiDAR-image) detector.
The approach needs LiDAR-image data only when training the single-modality detector, and once well-trained, it only needs LiDAR data at inference.
Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors.
arXiv Detail & Related papers (2022-06-30T01:44:30Z)
- Dense Voxel Fusion for 3D Object Detection [10.717415797194896]
Dense Voxel Fusion (DVF) is a sequential fusion method that generates multi-scale dense voxel feature representations.
We train directly with ground truth 2D bounding box labels, avoiding noisy, detector-specific, 2D predictions.
We show that our proposed multi-modal training strategy results in better generalization compared to training using erroneous 2D predictions.
arXiv Detail & Related papers (2022-03-02T04:51:31Z)
- Perception-aware Multi-sensor Fusion for 3D LiDAR Semantic Segmentation [59.42262859654698]
3D semantic segmentation is important in scene understanding for many applications, such as auto-driving and robotics.
Existing fusion-based methods may not achieve promising performance due to the vast difference between the two modalities.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF) to exploit perceptual information from two modalities.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
- Volumetric Propagation Network: Stereo-LiDAR Fusion for Long-Range Depth Estimation [81.08111209632501]
We propose a geometry-aware stereo-LiDAR fusion network for long-range depth estimation.
We exploit sparse and accurate point clouds as a cue for guiding correspondences of stereo images in a unified 3D volume space.
Our network achieves state-of-the-art performance on the KITTI and the Virtual-KITTI datasets.
arXiv Detail & Related papers (2021-03-24T03:24:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.