RC-BEVFusion: A Plug-In Module for Radar-Camera Bird's Eye View Feature
Fusion
- URL: http://arxiv.org/abs/2305.15883v2
- Date: Thu, 28 Sep 2023 08:07:36 GMT
- Title: RC-BEVFusion: A Plug-In Module for Radar-Camera Bird's Eye View Feature
Fusion
- Authors: Lukas Stäcker, Shashank Mishra, Philipp Heidenreich, Jason Rambach,
Didier Stricker
- Abstract summary: We present RC-BEVFusion, a modular radar-camera fusion network on the BEV plane.
We show significant performance gains of up to 28% in the nuScenes detection score.
- Score: 11.646949644683755
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Radars and cameras belong to the most frequently used sensors for advanced
driver assistance systems and automated driving research. However, there has
been surprisingly little research on radar-camera fusion with neural networks.
One of the reasons is a lack of large-scale automotive datasets with radar and
unmasked camera data, with the exception of the nuScenes dataset. Another
reason is the difficulty of effectively fusing the sparse radar point cloud on
the bird's eye view (BEV) plane with the dense images on the perspective plane.
The recent trend of camera-based 3D object detection using BEV features has
enabled a new type of fusion, which is better suited for radars. In this work,
we present RC-BEVFusion, a modular radar-camera fusion network on the BEV
plane. We propose BEVFeatureNet, a novel radar encoder branch, and show that it
can be incorporated into several state-of-the-art camera-based architectures.
We show significant performance gains of up to 28% increase in the nuScenes
detection score, which is an important step in radar-camera fusion research.
Without tuning our model for the nuScenes benchmark, we achieve the best result
among all published methods in the radar-camera fusion category.
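The abstract does not say how the radar branch encodes points or how the two feature maps are combined, so the following is only a minimal sketch of the general idea of fusing on the BEV plane, not the authors' BEVFeatureNet. It assumes hand-crafted per-cell radar features (point count and mean radar cross-section), a placeholder camera BEV map, and channel concatenation as the fusion step. The names `radar_to_bev` and `fuse_bev`, the grid extents, and the cell size are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical radar point: (x, y) position in metres plus one feature (RCS).
Point = Tuple[float, float, float]
Grid = List[List[List[float]]]  # indexed as grid[row_y][col_x][channel]


def radar_to_bev(points: List[Point],
                 x_range: Tuple[float, float] = (-8.0, 8.0),
                 y_range: Tuple[float, float] = (-8.0, 8.0),
                 cell: float = 4.0) -> Grid:
    """Rasterise sparse radar points into a dense BEV grid.

    Each cell holds two assumed channels: [point count, mean RCS].
    Cells without any radar point stay at zero.
    """
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    grid = [[[0.0, 0.0] for _ in range(nx)] for _ in range(ny)]
    for x, y, rcs in points:
        inside = x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]
        if not inside:
            continue  # drop points that fall outside the grid
        ix = int((x - x_range[0]) / cell)
        iy = int((y - y_range[0]) / cell)
        grid[iy][ix][0] += 1.0
        grid[iy][ix][1] += rcs  # accumulate here, convert to a mean below
    for row in grid:
        for channels in row:
            if channels[0] > 0:
                channels[1] /= channels[0]
    return grid


def fuse_bev(camera_bev: Grid, radar_bev: Grid) -> Grid:
    """Concatenate camera and radar channels cell by cell."""
    if len(camera_bev) != len(radar_bev) or len(camera_bev[0]) != len(radar_bev[0]):
        raise ValueError("camera and radar BEV grids must have the same shape")
    return [[cam + rad for cam, rad in zip(cam_row, rad_row)]
            for cam_row, rad_row in zip(camera_bev, radar_bev)]


radar_points = [(1.0, 1.0, 10.0), (2.0, 3.0, 20.0), (-5.0, 6.0, 4.0), (50.0, 0.0, 1.0)]
radar_bev = radar_to_bev(radar_points)
# Placeholder for the output of a camera-to-BEV view transform (3 channels per cell).
camera_bev = [[[0.5, 0.5, 0.5] for _ in range(4)] for _ in range(4)]
fused = fuse_bev(camera_bev, radar_bev)
```

Because both modalities share one grid, the radar branch can be attached to any camera pipeline that already produces BEV features, which is the sense in which such a module is "plug-in". A learned encoder would replace the hand-crafted channels used here.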
Related papers
- A Resource Efficient Fusion Network for Object Detection in Bird's-Eye View using Camera and Raw Radar Data [7.2508100569856975]
We use the raw range-Doppler (RD) spectrum of radar data to process camera images.
We extract the corresponding features with our camera encoder-decoder architecture.
The resultant feature maps are fused with Range-Azimuth features, recovered from the RD spectrum input, to perform object detection.
arXiv Detail & Related papers (2024-11-20T13:26:13Z)
- RCBEVDet++: Toward High-accuracy Radar-Camera Fusion 3D Perception Network [34.45694077040797]
We present a radar-camera fusion 3D object detection framework called RCBEVDet.
RadarBEVNet encodes sparse radar points into a dense bird's-eye-view feature.
Our method achieves state-of-the-art radar-camera fusion results in 3D object detection, BEV semantic segmentation, and 3D multi-object tracking tasks.
arXiv Detail & Related papers (2024-09-08T05:14:27Z)
- Radar Fields: Frequency-Space Neural Scene Representations for FMCW Radar [62.51065633674272]
We introduce Radar Fields - a neural scene reconstruction method designed for active radar imagers.
Our approach unites an explicit, physics-informed sensor model with an implicit neural geometry and reflectance model to directly synthesize raw radar measurements.
We validate the effectiveness of the method across diverse outdoor scenarios, including urban scenes with dense vehicles and infrastructure.
arXiv Detail & Related papers (2024-05-07T20:44:48Z)
- RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection [33.07575082922186]
Three-dimensional object detection is one of the key tasks in autonomous driving.
Relying solely on cameras makes it difficult to achieve highly accurate and robust 3D object detection.
RCBEVDet is a radar-camera fusion 3D object detection method in the bird's eye view (BEV).
RadarBEVNet consists of a dual-stream radar backbone and a Radar Cross-Section (RCS) aware BEV encoder.
arXiv Detail & Related papers (2024-03-25T06:02:05Z)
- Cross-Dataset Experimental Study of Radar-Camera Fusion in Bird's-Eye View [12.723455775659414]
Radar and camera fusion systems have the potential to provide a highly robust and reliable perception system.
Recent advances in camera-based object detection offer new radar-camera fusion possibilities with bird's eye view feature maps.
We propose a novel and flexible fusion network and evaluate its performance on two datasets.
arXiv Detail & Related papers (2023-09-27T08:02:58Z)
- Echoes Beyond Points: Unleashing the Power of Raw Radar Data in Multi-modality Fusion [74.84019379368807]
We propose a novel method named EchoFusion to skip the existing radar signal processing pipeline.
Specifically, we first generate the Bird's Eye View (BEV) queries and then take corresponding spectrum features from radar to fuse with other sensors.
arXiv Detail & Related papers (2023-07-31T09:53:50Z)
- RCM-Fusion: Radar-Camera Multi-Level Fusion for 3D Object Detection [15.686167262542297]
We propose Radar-Camera Multi-level fusion (RCM-Fusion), which attempts to fuse both modalities at both feature and instance levels.
For feature-level fusion, we propose a Radar Guided BEV which transforms camera features into precise BEV representations.
For instance-level fusion, we propose a Radar Grid Point Refinement module that reduces localization error.
arXiv Detail & Related papers (2023-07-17T07:22:25Z)
- Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection [78.59426158981108]
We introduce a bi-directional LiDAR-Radar fusion framework, termed Bi-LRFusion, to tackle the challenges and improve 3D detection for dynamic objects.
We conduct extensive experiments on nuScenes and ORR datasets, and show that our Bi-LRFusion achieves state-of-the-art performance for detecting dynamic objects.
arXiv Detail & Related papers (2023-06-02T10:57:41Z)
- Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images [96.66271207089096]
FCOS-LiDAR is a fully convolutional one-stage 3D object detector for LiDAR point clouds of autonomous driving scenes.
We show that an RV-based 3D detector with standard 2D convolutions alone can achieve comparable performance to state-of-the-art BEV-based detectors.
arXiv Detail & Related papers (2022-05-27T05:42:16Z)
- Depth Estimation from Monocular Images and Sparse Radar Data [93.70524512061318]
In this paper, we explore the possibility of achieving a more accurate depth estimation by fusing monocular images and Radar points using a deep neural network.
We find that noise in the Radar measurements is one of the main reasons preventing the application of existing fusion methods.
The experiments are conducted on the nuScenes dataset, one of the first datasets to feature Camera, Radar, and LiDAR recordings in diverse scenes and weather conditions.
arXiv Detail & Related papers (2020-09-30T19:01:33Z)
- RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects [73.80316195652493]
We tackle the problem of exploiting Radar for perception in the context of self-driving cars.
We propose a new solution that exploits both LiDAR and Radar sensors for perception.
Our approach, dubbed RadarNet, features a voxel-based early fusion and an attention-based late fusion.
arXiv Detail & Related papers (2020-07-28T17:15:02Z)