RaCalNet: Radar Calibration Network for Sparse-Supervised Metric Depth Estimation
- URL: http://arxiv.org/abs/2506.15560v2
- Date: Sat, 05 Jul 2025 14:04:22 GMT
- Title: RaCalNet: Radar Calibration Network for Sparse-Supervised Metric Depth Estimation
- Authors: Xingrui Qin, Wentao Zhao, Chuan Cao, Yihe Niu, Tianchen Deng, Houcheng Jiang, Rui Guo, Jingchuan Wang,
- Abstract summary: RaCalNet is a novel framework that eliminates the need for dense supervision by using sparse LiDAR to supervise the learning of refined radar measurements.<n>RaCalNet produces depth maps with clear object contours and fine-grained textures, demonstrating superior visual quality compared to state-of-the-art dense-supervised methods.
- Score: 14.466573808593887
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dense depth estimation using millimeter-wave radar typically requires dense LiDAR supervision, generated via multi-frame projection and interpolation, for guiding the learning of accurate depth from sparse radar measurements and RGB images. However, this paradigm is both costly and data-intensive. To address this, we propose RaCalNet, a novel framework that eliminates the need for dense supervision by using sparse LiDAR to supervise the learning of refined radar measurements, resulting in a supervision density of merely around 1\% compared to dense-supervised methods. RaCalNet is composed of two key modules. The Radar Recalibration module performs radar point screening and pixel-wise displacement refinement, producing accurate and reliable depth priors from sparse radar inputs. These priors are then used by the Metric Depth Optimization module, which learns to infer scene-level scale priors and fuses them with monocular depth predictions to achieve metrically accurate outputs. This modular design enhances structural consistency and preserves fine-grained geometric details. Despite relying solely on sparse supervision, RaCalNet produces depth maps with clear object contours and fine-grained textures, demonstrating superior visual quality compared to state-of-the-art dense-supervised methods. Quantitatively, it achieves performance comparable to existing methods on the ZJU-4DRadarCam dataset and yields a 34.89\% RMSE reduction in real-world deployment scenarios. We plan to gradually release the code and models in the future at https://github.com/818slam/RaCalNet.git.
Related papers
- TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion [54.46664104437454]
We propose TacoDepth, an efficient and accurate Radar-Camera depth estimation model with one-stage fusion.<n>Specifically, the graph-based Radar structure extractor and the pyramid-based Radar fusion module are designed.<n>Compared with the previous state-of-the-art approach, TacoDepth improves depth accuracy and processing speed by 12.8% and 91.8%.
arXiv Detail & Related papers (2025-04-16T05:25:04Z) - CaFNet: A Confidence-Driven Framework for Radar Camera Depth Estimation [6.9404362058736995]
This paper introduces a two-stage, end-to-end trainable Confidence-aware Fusion Net (CaFNet) for dense depth estimation.
The first stage addresses radar-specific challenges, such as ambiguous elevation and noisy measurements.
For the final depth estimation, we innovate a confidence-aware gated fusion mechanism to integrate radar and image features effectively.
arXiv Detail & Related papers (2024-06-30T13:39:29Z) - Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian [49.21866794516328]
3D Gaussian splatting has demonstrated impressive performance in real-time novel view synthesis.
Previous approaches have incorporated depth supervision into the training of 3D Gaussians to mitigate overfitting.
We introduce a novel method to supervise the depth distribution of 3D Gaussians, utilizing depth priors with integrated uncertainty estimates.
arXiv Detail & Related papers (2024-05-30T03:18:30Z) - RIDERS: Radar-Infrared Depth Estimation for Robust Sensing [22.10378524682712]
Adverse weather conditions pose significant challenges to accurate dense depth estimation.
We present a novel approach for robust metric depth estimation by fusing a millimeter-wave Radar and a monocular infrared thermal camera.
Our method achieves exceptional visual quality and accurate metric estimation by addressing the challenges of ambiguity and misalignment.
arXiv Detail & Related papers (2024-02-03T07:14:43Z) - RadarCam-Depth: Radar-Camera Fusion for Depth Estimation with Learned Metric Scale [21.09258172290667]
We present a novel approach for metric dense depth estimation based on the fusion of a single-view image and a sparse, noisy Radar point cloud.
Our proposed method significantly outperforms the state-of-the-art Radar-Camera depth estimation methods by reducing the mean absolute error (MAE) of depth estimation by 25.6% and 40.2% on the challenging nuScenes dataset and our self-collected ZJU-4DRadarCam dataset, respectively.
arXiv Detail & Related papers (2024-01-09T02:40:03Z) - OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments [77.0399450848749]
We propose an OccNeRF method for training occupancy networks without 3D supervision.
We parameterize the reconstructed occupancy fields and reorganize the sampling strategy to align with the cameras' infinite perceptive range.
For semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
arXiv Detail & Related papers (2023-12-14T18:58:52Z) - Sparse Beats Dense: Rethinking Supervision in Radar-Camera Depth Completion [18.0877558432168]
We present a new method with sparse LiDAR supervision to outperform previous dense LiDAR supervision methods in both accuracy and speed.
We find that depth completion models usually output depth maps containing significant stripe-like artifacts when trained by sparse LiDAR supervision.
Our framework with sparse supervision outperforms the state-of-the-art dense supervision methods with 11.6% improvement in Mean Absolute Error (MAE) and 1.6x speedup in Frame Per Second (FPS).
arXiv Detail & Related papers (2023-12-01T06:04:49Z) - Metrically Scaled Monocular Depth Estimation through Sparse Priors for
Underwater Robots [0.0]
We formulate a deep learning model that fuses sparse depth measurements from triangulated features to improve the depth predictions.
The network is trained in a supervised fashion on the forward-looking underwater dataset, FLSea.
The method achieves real-time performance, running at 160 FPS on a laptop GPU and 7 FPS on a single CPU core.
arXiv Detail & Related papers (2023-10-25T16:32:31Z) - Semantic Segmentation of Radar Detections using Convolutions on Point
Clouds [59.45414406974091]
We introduce a deep-learning based method to convolve radar detections into point clouds.
We adapt this algorithm to radar-specific properties through distance-dependent clustering and pre-processing of input point clouds.
Our network outperforms state-of-the-art approaches that are based on PointNet++ on the task of semantic segmentation of radar point clouds.
arXiv Detail & Related papers (2023-05-22T07:09:35Z) - TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view
Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense framework.
For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of alignments.
TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z) - Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR [22.202192422883122]
We propose a novel two-stage network to advance the self-supervised monocular dense depth learning.
Our model fuses monocular image features and sparse LiDAR features to predict initial depth maps.
Our model outperforms the state-of-the-art sparse-LiDAR-based method (Pseudo-LiDAR++) by more than 68% for the downstream task monocular 3D object detection.
arXiv Detail & Related papers (2021-09-20T15:28:36Z) - R4Dyn: Exploring Radar for Self-Supervised Monocular Depth Estimation of
Dynamic Scenes [69.6715406227469]
Self-supervised monocular depth estimation in driving scenarios has achieved comparable performance to supervised approaches.
We present R4Dyn, a novel set of techniques to use cost-efficient radar data on top of a self-supervised depth estimation framework.
arXiv Detail & Related papers (2021-08-10T17:57:03Z) - Depth Estimation from Monocular Images and Sparse radar using Deep
Ordinal Regression Network [2.0446891814677692]
We integrate sparse radar data into a monocular depth estimation model and introduce a novel preprocessing method for reducing the sparseness and limited field of view provided by radar.
We propose a novel method for estimating dense depth maps from monocular 2D images and sparse radar measurements using deep learning based on the deep ordinal regression network by Fu et al.
arXiv Detail & Related papers (2021-07-15T20:17:48Z) - CodeVIO: Visual-Inertial Odometry with Learned Optimizable Dense Depth [83.77839773394106]
We present a lightweight, tightly-coupled deep depth network and visual-inertial odometry system.
We provide the network with previously marginalized sparse features from VIO to increase the accuracy of initial depth prediction.
We show that it can run in real-time with single-thread execution while utilizing GPU acceleration only for the network and code Jacobian.
arXiv Detail & Related papers (2020-12-18T09:42:54Z) - Depth Estimation from Monocular Images and Sparse Radar Data [93.70524512061318]
In this paper, we explore the possibility of achieving a more accurate depth estimation by fusing monocular images and Radar points using a deep neural network.
We find that the noise existing in Radar measurements is one of the main key reasons that prevents one from applying the existing fusion methods.
The experiments are conducted on the nuScenes dataset, which is one of the first datasets which features Camera, Radar, and LiDAR recordings in diverse scenes and weather conditions.
arXiv Detail & Related papers (2020-09-30T19:01:33Z) - End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection [62.34374949726333]
Pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras.
PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs.
We introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end.
arXiv Detail & Related papers (2020-04-07T02:18:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.