LiRaFusion: Deep Adaptive LiDAR-Radar Fusion for 3D Object Detection
- URL: http://arxiv.org/abs/2402.11735v1
- Date: Sun, 18 Feb 2024 23:29:28 GMT
- Title: LiRaFusion: Deep Adaptive LiDAR-Radar Fusion for 3D Object Detection
- Authors: Jingyu Song, Lingjun Zhao, Katherine A. Skinner
- Abstract summary: We propose LiRaFusion to tackle LiDAR-radar fusion for 3D object detection.
We design an early fusion module for joint voxel feature encoding, and a middle fusion module to adaptively fuse feature maps.
We perform extensive evaluation on nuScenes to demonstrate that LiRaFusion achieves notable improvement over existing methods.
- Score: 7.505655376776177
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose LiRaFusion to tackle LiDAR-radar fusion for 3D object detection to
fill the performance gap of existing LiDAR-radar detectors. To improve the
feature extraction capabilities from these two modalities, we design an early
fusion module for joint voxel feature encoding, and a middle fusion module to
adaptively fuse feature maps via a gated network. We perform extensive
evaluation on nuScenes to demonstrate that LiRaFusion leverages the
complementary information of LiDAR and radar effectively and achieves notable
improvement over existing methods.
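The adaptive middle fusion can be pictured as a learned, per-location blending of the LiDAR and radar BEV feature maps. Below is a minimal PyTorch sketch of such a gated network; the module name, channel sizes, and the single-sigmoid gate are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Minimal sketch of adaptive gated fusion of two BEV feature maps.

    A small conv network predicts a per-location gate from the concatenated
    LiDAR and radar features, then blends the two maps accordingly.
    Channel sizes and the sigmoid gate are illustrative assumptions.
    """

    def __init__(self, channels: int = 256):
        super().__init__()
        # Gate network: concatenated features -> one blending weight per location.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, lidar_feat: torch.Tensor, radar_feat: torch.Tensor) -> torch.Tensor:
        # lidar_feat, radar_feat: (B, C, H, W) BEV feature maps.
        w = self.gate(torch.cat([lidar_feat, radar_feat], dim=1))  # (B, 1, H, W)
        return w * lidar_feat + (1.0 - w) * radar_feat

# Example: fuse dummy 256-channel BEV maps on a 128x128 grid.
fusion = GatedFusion(channels=256)
fused = fusion(torch.randn(2, 256, 128, 128), torch.randn(2, 256, 128, 128))
print(fused.shape)  # torch.Size([2, 256, 128, 128])
```

The appeal of a learned gate is that the network can lean on radar features where LiDAR returns are sparse (e.g., at long range or in adverse weather) and on LiDAR elsewhere.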
Related papers
- Multistream Network for LiDAR and Camera-based 3D Object Detection in Outdoor Scenes [59.78696921486972]
Fusion of LiDAR and RGB data has the potential to enhance outdoor 3D object detection accuracy. We propose a MultiStream Detection (MuStD) network that meticulously extracts task-relevant information from both data modalities.
arXiv Detail & Related papers (2025-07-25T14:20:16Z)
- A Multimodal Hybrid Late-Cascade Fusion Network for Enhanced 3D Object Detection [6.399439052541506]
We present a new way to detect 3D objects from multimodal inputs, leveraging both LiDAR and RGB cameras in a hybrid late-cascade scheme.
We exploit late fusion principles to reduce LiDAR False Positives, matching LiDAR detections with RGB ones by projecting the LiDAR bounding boxes on the image.
We evaluate our results on the KITTI object detection benchmark, showing significant performance improvements.
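A minimal sketch of the matching step described above: project each LiDAR 3D box into the image with a standard 3x4 camera projection matrix and keep it only if it overlaps some RGB detection. The function names and the IoU threshold are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def project_box_to_image(corners_3d: np.ndarray, P: np.ndarray) -> np.ndarray:
    """Project 8 box corners (8, 3) into the image; return a 2D box (x1, y1, x2, y2)."""
    pts = np.hstack([corners_3d, np.ones((8, 1))])  # homogeneous coordinates
    uvw = (P @ pts.T).T                             # (8, 3) image-plane coordinates
    uv = uvw[:, :2] / uvw[:, 2:3]                   # perspective divide
    return np.array([uv[:, 0].min(), uv[:, 1].min(), uv[:, 0].max(), uv[:, 1].max()])

def iou_2d(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two axis-aligned 2D boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def filter_lidar_detections(lidar_corners, P, rgb_boxes, iou_thresh=0.3):
    """Keep LiDAR detections whose image projection matches some RGB detection."""
    kept = []
    for corners in lidar_corners:
        proj = project_box_to_image(corners, P)
        if any(iou_2d(proj, rgb) >= iou_thresh for rgb in rgb_boxes):
            kept.append(corners)
    return kept
```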
arXiv Detail & Related papers (2025-04-25T15:28:53Z)
- LiCROcc: Teach Radar for Accurate Semantic Occupancy Prediction using LiDAR and Camera [22.974481709303927]
3D radar is gradually replacing LiDAR in autonomous driving applications.
We propose a three-stage tight fusion approach on BEV to realize a fusion framework for point clouds and images.
Our approach enhances the performance in both radar-only (R-LiCROcc) and radar-camera (RC-LiCROcc) settings.
arXiv Detail & Related papers (2024-07-23T05:53:05Z)
- Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving [58.16024314532443]
We introduce LaserMix++, a framework that integrates laser beam manipulations from disparate LiDAR scans and incorporates LiDAR-camera correspondences to assist data-efficient learning.
Results demonstrate that LaserMix++ outperforms fully supervised alternatives, achieving comparable accuracy with five times fewer annotations.
This substantial advancement underscores the potential of semi-supervised approaches in reducing the reliance on extensive labeled data in LiDAR-based 3D scene understanding systems.
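As a rough picture of the laser-beam manipulation, the LaserMix idea partitions two scans into inclination bands and swaps alternating bands to produce mixed training scans. The sketch below illustrates that idea under assumed parameters (equal-width bands, a hypothetical band count); it is not the LaserMix++ code.

```python
import numpy as np

def lasermix_pair(points_a: np.ndarray, points_b: np.ndarray, num_bands: int = 6):
    """Mix two LiDAR scans (N, 4: x, y, z, intensity) by swapping inclination bands.

    Inclination is the elevation angle of each point; bands are equal-width
    slices of the joint inclination range. The band count is an assumption.
    """
    def inclination(pts):
        r = np.linalg.norm(pts[:, :2], axis=1)
        return np.arctan2(pts[:, 2], r)  # elevation angle per point

    inc_a, inc_b = inclination(points_a), inclination(points_b)
    lo = min(inc_a.min(), inc_b.min())
    hi = max(inc_a.max(), inc_b.max())
    edges = np.linspace(lo, hi, num_bands + 1)

    band_a = np.clip(np.digitize(inc_a, edges) - 1, 0, num_bands - 1)
    band_b = np.clip(np.digitize(inc_b, edges) - 1, 0, num_bands - 1)

    # Even bands come from one scan, odd bands from the other (and vice versa).
    mixed_ab = np.vstack([points_a[band_a % 2 == 0], points_b[band_b % 2 == 1]])
    mixed_ba = np.vstack([points_b[band_b % 2 == 0], points_a[band_a % 2 == 1]])
    return mixed_ab, mixed_ba
```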
arXiv Detail & Related papers (2024-05-08T17:59:53Z)
- Better Monocular 3D Detectors with LiDAR from the Past [64.6759926054061]
Camera-based 3D detectors often suffer inferior performance compared to LiDAR-based counterparts due to inherent depth ambiguities in images.
In this work, we seek to improve monocular 3D detectors by leveraging unlabeled historical LiDAR data.
We show consistent and significant performance gain across multiple state-of-the-art models and datasets with a negligible additional latency of 9.66 ms and a small storage cost.
arXiv Detail & Related papers (2024-04-08T01:38:43Z)
- SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection [22.683446326326898]
SupFusion provides auxiliary feature-level supervision for effective LiDAR-Camera fusion.
The proposed deep fusion module consistently achieves superior performance compared with previous fusion methods.
We gain around 2% 3D mAP improvement on the KITTI benchmark across multiple LiDAR-camera 3D detectors.
arXiv Detail & Related papers (2023-09-13T16:52:23Z)
- RCM-Fusion: Radar-Camera Multi-Level Fusion for 3D Object Detection [15.686167262542297]
We propose Radar-Camera Multi-level fusion (RCM-Fusion), which attempts to fuse both modalities at both feature and instance levels.
For feature-level fusion, we propose a Radar Guided BEV encoder that transforms camera features into precise BEV representations.
For instance-level fusion, we propose a Radar Grid Point Refinement module that reduces localization error.
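As one simplified intuition for a radar-guided view transform, camera features can be scattered into BEV cells at radar-measured positions: project each radar point into the image, sample the feature there, and drop it at the point's ground-plane cell. Everything below (names, grid extent, cell size) is an assumption for illustration, not the RCM-Fusion module.

```python
import torch

def radar_guided_bev(cam_feat, radar_xyz, P, bev_size=128, cell=0.8):
    """Scatter image features into BEV using radar points as depth anchors.

    cam_feat:  (C, H, W) image feature map.
    radar_xyz: (N, 3) radar points in the ego frame.
    P:         (3, 4) camera projection matrix.
    Returns (C, bev_size, bev_size); grid extent and cell size are assumptions.
    """
    C, H, W = cam_feat.shape
    bev = torch.zeros(C, bev_size, bev_size)

    # Project radar points into the image plane.
    ones = torch.ones(radar_xyz.shape[0], 1)
    uvw = (P @ torch.cat([radar_xyz, ones], dim=1).T).T  # (N, 3)
    u = (uvw[:, 0] / uvw[:, 2]).long()
    v = (uvw[:, 1] / uvw[:, 2]).long()

    # BEV indices from each radar point's ground-plane position.
    gx = (radar_xyz[:, 0] / cell + bev_size / 2).long()
    gy = (radar_xyz[:, 1] / cell + bev_size / 2).long()

    valid = (
        (uvw[:, 2] > 0)
        & (u >= 0) & (u < W) & (v >= 0) & (v < H)
        & (gx >= 0) & (gx < bev_size) & (gy >= 0) & (gy < bev_size)
    )
    # Sample the image feature at each projection and place it in BEV.
    bev[:, gy[valid], gx[valid]] = cam_feat[:, v[valid], u[valid]]
    return bev
```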
arXiv Detail & Related papers (2023-07-17T07:22:25Z)
- Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection [78.59426158981108]
We introduce a bi-directional LiDAR-Radar fusion framework, termed Bi-LRFusion, to tackle the challenges and improve 3D detection for dynamic objects.
We conduct extensive experiments on nuScenes and ORR datasets, and show that our Bi-LRFusion achieves state-of-the-art performance for detecting dynamic objects.
arXiv Detail & Related papers (2023-06-02T10:57:41Z)
- Boosting 3D Object Detection by Simulating Multimodality on Point Clouds [51.87740119160152]
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate features and responses that follow a multi-modality (LiDAR-image) detector.
The approach needs LiDAR-image data only when training the single-modality detector, and once well-trained, it only needs LiDAR data at inference.
Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors.
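The teaching scheme amounts to cross-modal distillation: during training, the LiDAR-only student regresses toward the frozen multimodal teacher's intermediate features and responses; at inference the teacher is dropped. The loss below is a generic sketch with assumed terms and weighting, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def imitation_loss(student_feat, teacher_feat, student_logits, teacher_logits):
    """Generic cross-modal distillation loss (illustrative, assumed weighting).

    student_feat / teacher_feat: (B, C, H, W) BEV features from the LiDAR-only
    student and the frozen LiDAR-image teacher, respectively.
    """
    # Feature imitation: match the teacher's intermediate representation.
    feat_term = F.mse_loss(student_feat, teacher_feat.detach())
    # Response imitation: match the teacher's output distribution.
    resp_term = F.kl_div(
        F.log_softmax(student_logits, dim=1),
        F.softmax(teacher_logits.detach(), dim=1),
        reduction="batchmean",
    )
    return feat_term + 0.5 * resp_term  # weighting is an assumption
```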
arXiv Detail & Related papers (2022-06-30T01:44:30Z)
- TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers [49.689566246504356]
We propose TransFusion, a robust solution to LiDAR-camera fusion with a soft-association mechanism to handle inferior image conditions.
TransFusion achieves state-of-the-art performance on large-scale datasets.
We extend the proposed method to the 3D tracking task and achieve first place on the nuScenes tracking leaderboard.
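The soft association can be read as replacing hard point-to-pixel matching with cross-attention: each object query attends over all image features, so unreliable pixels simply receive low attention weight. A generic single-layer sketch follows (dimensions assumed); it is not the TransFusion decoder.

```python
import torch
import torch.nn as nn

class SoftAssociationLayer(nn.Module):
    """One cross-attention layer: object queries attend over image features."""

    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, queries: torch.Tensor, image_feat: torch.Tensor) -> torch.Tensor:
        # queries:    (B, Q, dim) object queries from the LiDAR branch.
        # image_feat: (B, H*W, dim) flattened image features.
        fused, _ = self.attn(queries, image_feat, image_feat)
        return self.norm(queries + fused)  # residual update of the queries

# Example: 200 queries attend over a 32x88 image feature map.
layer = SoftAssociationLayer()
out = layer(torch.randn(2, 200, 256), torch.randn(2, 32 * 88, 256))
print(out.shape)  # torch.Size([2, 200, 256])
```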
arXiv Detail & Related papers (2022-03-22T07:15:13Z)
- Dense Voxel Fusion for 3D Object Detection [10.717415797194896]
Dense Voxel Fusion (DVF) is a sequential fusion method that generates multi-scale dense voxel feature representations.
We train directly with ground truth 2D bounding box labels, avoiding noisy, detector-specific, 2D predictions.
We show that our proposed multi-modal training strategy results in better generalization compared to training using erroneous 2D predictions.
arXiv Detail & Related papers (2022-03-02T04:51:31Z)
- Volumetric Propagation Network: Stereo-LiDAR Fusion for Long-Range Depth Estimation [81.08111209632501]
We propose a geometry-aware stereo-LiDAR fusion network for long-range depth estimation.
We exploit sparse and accurate point clouds as a cue for guiding correspondences of stereo images in a unified 3D volume space.
Our network achieves state-of-the-art performance on the KITTI and Virtual KITTI datasets.
arXiv Detail & Related papers (2021-03-24T03:24:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.