RadarCam-Depth: Radar-Camera Fusion for Depth Estimation with Learned Metric Scale
- URL: http://arxiv.org/abs/2401.04325v2
- Date: Tue, 19 Mar 2024 04:45:47 GMT
- Title: RadarCam-Depth: Radar-Camera Fusion for Depth Estimation with Learned Metric Scale
- Authors: Han Li, Yukai Ma, Yaqing Gu, Kewei Hu, Yong Liu, Xingxing Zuo,
- Abstract summary: We present a novel approach for metric dense depth estimation based on the fusion of a single-view image and a sparse, noisy Radar point cloud.
Our proposed method significantly outperforms the state-of-the-art Radar-Camera depth estimation methods by reducing the mean absolute error (MAE) of depth estimation by 25.6% and 40.2% on the challenging nuScenes dataset and our self-collected ZJU-4DRadarCam dataset, respectively.
- Score: 21.09258172290667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel approach for metric dense depth estimation based on the fusion of a single-view image and a sparse, noisy Radar point cloud. The direct fusion of heterogeneous Radar and image data, or their encodings, tends to yield dense depth maps with significant artifacts, blurred boundaries, and suboptimal accuracy. To circumvent this issue, we learn to augment versatile and robust monocular depth prediction with the dense metric scale induced from sparse and noisy Radar data. We propose a Radar-Camera framework for highly accurate and fine-detailed dense depth estimation with four stages, including monocular depth prediction, global scale alignment of monocular depth with sparse Radar points, quasi-dense scale estimation through learning the association between Radar points and image patches, and local scale refinement of dense depth using a scale map learner. Our proposed method significantly outperforms the state-of-the-art Radar-Camera depth estimation methods by reducing the mean absolute error (MAE) of depth estimation by 25.6% and 40.2% on the challenging nuScenes dataset and our self-collected ZJU-4DRadarCam dataset, respectively. Our code and dataset will be released at \url{https://github.com/MMOCKING/RadarCam-Depth}.
Related papers
- GET-UP: GEomeTric-aware Depth Estimation with Radar Points UPsampling [7.90238039959534]
Existing algorithms process radar data by projecting 3D points onto the image plane for pixel-level feature extraction.
We propose GET-UP, leveraging attention-enhanced Graph Neural Networks (GNN) to exchange and aggregate both 2D and 3D information from radar data.
We benchmark our proposed GET-UP on the nuScenes dataset, achieving state-of-the-art performance with a 15.3% and 14.7% improvement in MAE and RMSE over the previously best-performing model.
arXiv Detail & Related papers (2024-09-02T14:15:09Z) - CaFNet: A Confidence-Driven Framework for Radar Camera Depth Estimation [6.9404362058736995]
This paper introduces a two-stage, end-to-end trainable Confidence-aware Fusion Net (CaFNet) for dense depth estimation.
The first stage addresses radar-specific challenges, such as ambiguous elevation and noisy measurements.
For the final depth estimation, we innovate a confidence-aware gated fusion mechanism to integrate radar and image features effectively.
arXiv Detail & Related papers (2024-06-30T13:39:29Z) - RIDERS: Radar-Infrared Depth Estimation for Robust Sensing [22.10378524682712]
Adverse weather conditions pose significant challenges to accurate dense depth estimation.
We present a novel approach for robust metric depth estimation by fusing a millimeter-wave Radar and a monocular infrared thermal camera.
Our method achieves exceptional visual quality and accurate metric estimation by addressing the challenges of ambiguity and misalignment.
arXiv Detail & Related papers (2024-02-03T07:14:43Z) - Echoes Beyond Points: Unleashing the Power of Raw Radar Data in
Multi-modality Fusion [74.84019379368807]
We propose a novel method named EchoFusion to skip the existing radar signal processing pipeline.
Specifically, we first generate the Bird's Eye View (BEV) queries and then take corresponding spectrum features from radar to fuse with other sensors.
arXiv Detail & Related papers (2023-07-31T09:53:50Z) - Monocular Visual-Inertial Depth Estimation [66.71452943981558]
We present a visual-inertial depth estimation pipeline that integrates monocular depth estimation and visual-inertial odometry.
Our approach performs global scale and shift alignment against sparse metric depth, followed by learning-based dense alignment.
We evaluate on the TartanAir and VOID datasets, observing up to 30% reduction in RMSE with dense scale alignment.
arXiv Detail & Related papers (2023-03-21T18:47:34Z) - R4Dyn: Exploring Radar for Self-Supervised Monocular Depth Estimation of
Dynamic Scenes [69.6715406227469]
Self-supervised monocular depth estimation in driving scenarios has achieved comparable performance to supervised approaches.
We present R4Dyn, a novel set of techniques to use cost-efficient radar data on top of a self-supervised depth estimation framework.
arXiv Detail & Related papers (2021-08-10T17:57:03Z) - Depth Estimation from Monocular Images and Sparse radar using Deep
Ordinal Regression Network [2.0446891814677692]
We integrate sparse radar data into a monocular depth estimation model and introduce a novel preprocessing method for reducing the sparseness and limited field of view provided by radar.
We propose a novel method for estimating dense depth maps from monocular 2D images and sparse radar measurements using deep learning based on the deep ordinal regression network by Fu et al.
arXiv Detail & Related papers (2021-07-15T20:17:48Z) - Multi-Modal Depth Estimation Using Convolutional Neural Networks [0.8701566919381223]
This paper addresses the problem of dense depth predictions from sparse distance sensor data and a single camera image on challenging weather conditions.
It explores the significance of different sensor modalities such as camera, Radar, and Lidar for estimating depth by applying Deep Learning approaches.
arXiv Detail & Related papers (2020-12-17T15:31:49Z) - LiRaNet: End-to-End Trajectory Prediction using Spatio-Temporal Radar
Fusion [52.59664614744447]
We present LiRaNet, a novel end-to-end trajectory prediction method which utilizes radar sensor information along with widely used lidar and high definition (HD) maps.
automotive radar provides rich, complementary information, allowing for longer range vehicle detection as well as instantaneous velocity measurements.
arXiv Detail & Related papers (2020-10-02T00:13:00Z) - Depth Estimation from Monocular Images and Sparse Radar Data [93.70524512061318]
In this paper, we explore the possibility of achieving a more accurate depth estimation by fusing monocular images and Radar points using a deep neural network.
We find that the noise existing in Radar measurements is one of the main key reasons that prevents one from applying the existing fusion methods.
The experiments are conducted on the nuScenes dataset, which is one of the first datasets which features Camera, Radar, and LiDAR recordings in diverse scenes and weather conditions.
arXiv Detail & Related papers (2020-09-30T19:01:33Z) - RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects [73.80316195652493]
We tackle the problem of exploiting Radar for perception in the context of self-driving cars.
We propose a new solution that exploits both LiDAR and Radar sensors for perception.
Our approach, dubbed RadarNet, features a voxel-based early fusion and an attention-based late fusion.
arXiv Detail & Related papers (2020-07-28T17:15:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.