Related papers: RCM-Fusion: Radar-Camera Multi-Level Fusion for 3D Object Detection

RCM-Fusion: Radar-Camera Multi-Level Fusion for 3D Object Detection

URL: http://arxiv.org/abs/2307.10249v5
Date: Thu, 16 May 2024 05:16:37 GMT
Title: RCM-Fusion: Radar-Camera Multi-Level Fusion for 3D Object Detection
Authors: Jisong Kim, Minjae Seong, Geonho Bang, Dongsuk Kum, Jun Won Choi,
Abstract summary: We propose Radar-Camera Multi-level fusion (RCM-Fusion), which attempts to fuse both modalities at both feature and instance levels. For feature-level fusion, we propose a Radar Guided BEV which transforms camera features into precise BEV representations. For instance-level fusion, we propose a Radar Grid Point Refinement module that reduces localization error.
Score: 15.686167262542297
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While LiDAR sensors have been successfully applied to 3D object detection, the affordability of radar and camera sensors has led to a growing interest in fusing radars and cameras for 3D object detection. However, previous radar-camera fusion models were unable to fully utilize the potential of radar information. In this paper, we propose Radar-Camera Multi-level fusion (RCM-Fusion), which attempts to fuse both modalities at both feature and instance levels. For feature-level fusion, we propose a Radar Guided BEV Encoder which transforms camera features into precise BEV representations using the guidance of radar Bird's-Eye-View (BEV) features and combines the radar and camera BEV features. For instance-level fusion, we propose a Radar Grid Point Refinement module that reduces localization error by accounting for the characteristics of the radar point clouds. The experiments conducted on the public nuScenes dataset demonstrate that our proposed RCM-Fusion achieves state-of-the-art performances among single frame-based radar-camera fusion methods in the nuScenes 3D object detection benchmark. Code will be made publicly available.

Related papers

RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection [68.99784784185019]
Poor lighting or adverse weather conditions degrade camera performance. Radar suffers from noise and positional ambiguity. We propose RobuRCDet, a robust object detection model in BEV.
arXiv Detail & Related papers (2025-02-18T17:17:38Z)
A Resource Efficient Fusion Network for Object Detection in Bird's-Eye View using Camera and Raw Radar Data [7.2508100569856975]
We use the raw range-Doppler spectrum of radar data to process camera images. We extract the corresponding features with our camera encoder-decoder architecture. The resultant feature maps are fused with Range-Azimuth features, recovered from the RD spectrum input to perform object detection.
arXiv Detail & Related papers (2024-11-20T13:26:13Z)
RCBEVDet++: Toward High-accuracy Radar-Camera Fusion 3D Perception Network [34.45694077040797]
We present a radar-camera fusion 3D object detection framework called BEEVDet. RadarBEVNet encodes sparse radar points into a dense bird's-eye-view feature. Our method achieves state-of-the-art radar-camera fusion results in 3D object detection, BEV semantic segmentation, and 3D multi-object tracking tasks.
arXiv Detail & Related papers (2024-09-08T05:14:27Z)
RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection [33.07575082922186]
Three-dimensional object detection is one of the key tasks in autonomous driving. relying solely on cameras is difficult to achieve highly accurate and robust 3D object detection. radar-camera fusion 3D object detection method in the bird's eye view (BEV) RadarBEVNet consists of a dual-stream radar backbone and a Radar Cross-Section (RC) aware BEV encoder.
arXiv Detail & Related papers (2024-03-25T06:02:05Z)
Echoes Beyond Points: Unleashing the Power of Raw Radar Data in Multi-modality Fusion [74.84019379368807]
We propose a novel method named EchoFusion to skip the existing radar signal processing pipeline. Specifically, we first generate the Bird's Eye View (BEV) queries and then take corresponding spectrum features from radar to fuse with other sensors.
arXiv Detail & Related papers (2023-07-31T09:53:50Z)
Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection [78.59426158981108]
We introduce a bi-directional LiDAR-Radar fusion framework, termed Bi-LRFusion, to tackle the challenges and improve 3D detection for dynamic objects. We conduct extensive experiments on nuScenes and ORR datasets, and show that our Bi-LRFusion achieves state-of-the-art performance for detecting dynamic objects.
arXiv Detail & Related papers (2023-06-02T10:57:41Z)
RC-BEVFusion: A Plug-In Module for Radar-Camera Bird's Eye View Feature Fusion [11.646949644683755]
We present RC-BEVFusion, a modular radar-camera fusion network on the BEV plane. We show significant performance gains of up to 28% increase in the nuScenes detection score.
arXiv Detail & Related papers (2023-05-25T09:26:04Z)
Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection. With the learned assignments between 3D and 2D object proposals, the fusion for detection can be effectively performed by combing their ROI features.
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
MVFusion: Multi-View 3D Object Detection with Semantic-aligned Radar and Camera Fusion [6.639648061168067]
Multi-view radar-camera fused 3D object detection provides a farther detection range and more helpful features for autonomous driving. Current radar-camera fusion methods deliver kinds of designs to fuse radar information with camera data. We present MVFusion, a novel Multi-View radar-camera Fusion method to achieve semantic-aligned radar features.
arXiv Detail & Related papers (2023-02-21T08:25:50Z)
Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images [96.66271207089096]
FCOS-LiDAR is a fully convolutional one-stage 3D object detector for LiDAR point clouds of autonomous driving scenes. We show that an RV-based 3D detector with standard 2D convolutions alone can achieve comparable performance to state-of-the-art BEV-based detectors.
arXiv Detail & Related papers (2022-05-27T05:42:16Z)
Depth Estimation from Monocular Images and Sparse Radar Data [93.70524512061318]
In this paper, we explore the possibility of achieving a more accurate depth estimation by fusing monocular images and Radar points using a deep neural network. We find that the noise existing in Radar measurements is one of the main key reasons that prevents one from applying the existing fusion methods. The experiments are conducted on the nuScenes dataset, which is one of the first datasets which features Camera, Radar, and LiDAR recordings in diverse scenes and weather conditions.
arXiv Detail & Related papers (2020-09-30T19:01:33Z)
RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects [73.80316195652493]
We tackle the problem of exploiting Radar for perception in the context of self-driving cars. We propose a new solution that exploits both LiDAR and Radar sensors for perception. Our approach, dubbed RadarNet, features a voxel-based early fusion and an attention-based late fusion.
arXiv Detail & Related papers (2020-07-28T17:15:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.