Structure-Aware Radar-Camera Depth Estimation
- URL: http://arxiv.org/abs/2506.05008v3
- Date: Sun, 29 Jun 2025 03:21:04 GMT
- Title: Structure-Aware Radar-Camera Depth Estimation
- Authors: Fuyi Zhang, Zhu Yu, Chunhao Li, Runmin Zhang, Xiaokai Bai, Zili Zhou, Si-Yuan Cao, Fang Wang, Hui-Liang Shen
- Abstract summary: We propose a structure-aware radar-camera depth estimation framework, named SA-RCD. Our SA-RCD achieves state-of-the-art performance on the nuScenes dataset.
- Score: 10.13373000424379
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Radar has gained much attention in autonomous driving due to its accessibility and robustness. However, its standalone application for depth perception is constrained by issues of sparsity and noise. Radar-camera depth estimation offers a more promising complementary solution. Despite significant progress, current approaches fail to produce satisfactory dense depth maps because they handle the sparse and noisy radar data poorly: they constrain the regions of interest for radar points to rigid rectangles, which can introduce unexpected errors and confusion. To address these issues, we develop a structure-aware strategy for radar depth enhancement that provides more targeted regions of interest by leveraging the structural priors of RGB images. Furthermore, we design a Multi-Scale Structure Guided Network to enhance radar features and preserve detailed structures, achieving accurate and structure-detailed dense metric depth estimation. Building on these components, we propose a structure-aware radar-camera depth estimation framework, named SA-RCD. Extensive experiments demonstrate that SA-RCD achieves state-of-the-art performance on the nuScenes dataset. Our code will be available at https://github.com/FreyZhangYeh/SA-RCD.
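As a rough illustration of the idea, the hypothetical sketch below spreads each projected radar depth only over pixels that share an image-derived structure label (e.g., a superpixel or segmentation mask), rather than over a fixed rectangle. The function, its parameters, and the use of a precomputed label map are assumptions for illustration only and are not taken from the SA-RCD implementation.

```python
import numpy as np

def expand_radar_depth(sparse_depth: np.ndarray,
                       segments: np.ndarray,
                       max_radius: int = 20) -> np.ndarray:
    """Spread each sparse radar depth over nearby pixels that share the same
    image-derived segment, instead of a fixed rectangular region.

    sparse_depth : (H, W) metric depths, 0 where there is no radar return.
    segments     : (H, W) integer labels from any image-structure cue
                   (superpixels, semantic masks, edge-bounded regions).
    """
    h, w = sparse_depth.shape
    enhanced = np.zeros_like(sparse_depth)
    ys, xs = np.nonzero(sparse_depth)
    for y, x in zip(ys, xs):
        d, seg = sparse_depth[y, x], segments[y, x]
        # Candidate region: a local window around the radar point ...
        y0, y1 = max(0, y - max_radius), min(h, y + max_radius + 1)
        x0, x1 = max(0, x - max_radius), min(w, x + max_radius + 1)
        # ... restricted to pixels that belong to the same image structure.
        mask = segments[y0:y1, x0:x1] == seg
        region = enhanced[y0:y1, x0:x1]
        # Keep the nearest depth when several radar points compete for a pixel.
        region[mask & ((region == 0) | (region > d))] = d
    return enhanced
```

In a full pipeline such an enhanced map would feed a learned network (e.g., the Multi-Scale Structure Guided Network named in the abstract) rather than being used directly as the output depth.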
Related papers
- TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion [54.46664104437454]
We propose TacoDepth, an efficient and accurate Radar-Camera depth estimation model with one-stage fusion. Specifically, the graph-based Radar structure extractor and the pyramid-based Radar fusion module are designed. Compared with the previous state-of-the-art approach, TacoDepth improves depth accuracy and processing speed by 12.8% and 91.8%, respectively.
arXiv Detail & Related papers (2025-04-16T05:25:04Z)
- RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection [68.99784784185019]
Poor lighting or adverse weather conditions degrade camera performance, while radar suffers from noise and positional ambiguity. We propose RobuRCDet, a robust object detection model in BEV.
arXiv Detail & Related papers (2025-02-18T17:17:38Z)
- DepthLab: From Partial to Complete [80.58276388743306]
Missing values remain a common challenge for depth data across its wide range of applications. This work bridges this gap with DepthLab, a foundation depth inpainting model powered by image diffusion priors. Our approach proves its worth in various downstream tasks, including 3D scene inpainting, text-to-3D scene generation, sparse-view reconstruction with DUST3R, and LiDAR depth completion.
arXiv Detail & Related papers (2024-12-24T04:16:38Z)
- A Simple yet Effective Test-Time Adaptation for Zero-Shot Monocular Metric Depth Estimation [46.037640130193566]
We propose a new method to rescale Depth Anything predictions using 3D points provided by sensors or techniques such as low-resolution LiDAR or structure-from-motion with poses given by an IMU. Our experiments highlight improvements over zero-shot monocular metric depth estimation methods, competitive results compared to fine-tuned approaches, and better robustness than depth completion approaches.
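The rescaling step described above is often realised as a closed-form least-squares fit of a scale and shift between the network's relative depth and a few metric anchor points. The snippet below is a generic sketch of that idea under those assumptions, not the paper's implementation; robust variants (RANSAC, fitting in inverse depth) are common refinements.

```python
import numpy as np

def rescale_relative_depth(pred: np.ndarray,
                           anchor_px: np.ndarray,
                           anchor_depth: np.ndarray) -> np.ndarray:
    """Fit depth = s * pred + t to sparse metric anchors (e.g. low-resolution
    LiDAR or SfM points) and apply the fit to the whole prediction.

    pred         : (H, W) relative / affine-invariant depth prediction.
    anchor_px    : (N, 2) integer (row, col) pixel locations of the anchors.
    anchor_depth : (N,) metric depths at those pixels.
    """
    p = pred[anchor_px[:, 0], anchor_px[:, 1]]
    A = np.stack([p, np.ones_like(p)], axis=1)           # (N, 2) design matrix
    (s, t), *_ = np.linalg.lstsq(A, anchor_depth, rcond=None)
    return s * pred + t
```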
arXiv Detail & Related papers (2024-12-18T17:50:15Z)
- Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion [51.69876947593144]
Existing methods for depth completion operate in tightly constrained settings. Inspired by advances in monocular depth estimation, we reframe depth completion as image-conditional depth map generation. Marigold-DC builds on a pretrained latent diffusion model for monocular depth estimation and injects the depth observations as test-time guidance.
arXiv Detail & Related papers (2024-12-18T00:06:41Z)
- GET-UP: GEomeTric-aware Depth Estimation with Radar Points UPsampling [7.90238039959534]
Existing algorithms process radar data by projecting 3D points onto the image plane for pixel-level feature extraction.
We propose GET-UP, leveraging attention-enhanced Graph Neural Networks (GNN) to exchange and aggregate both 2D and 3D information from radar data.
We benchmark our proposed GET-UP on the nuScenes dataset, achieving state-of-the-art performance with a 15.3% and 14.7% improvement in MAE and RMSE over the previously best-performing model.
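The pixel-level projection mentioned in the first sentence of this summary is standard pinhole geometry. Below is a minimal sketch of projecting radar points into a sparse depth map; the names (K, T_cam_radar) are illustrative assumptions, not identifiers from the GET-UP code.

```python
import numpy as np

def project_radar_to_image(points_radar: np.ndarray,
                           K: np.ndarray,
                           T_cam_radar: np.ndarray,
                           hw: tuple) -> np.ndarray:
    """Project 3D radar points into a sparse depth map.

    points_radar : (N, 3) xyz points in the radar frame.
    K            : (3, 3) camera intrinsics.
    T_cam_radar  : (4, 4) radar-to-camera extrinsic transform.
    hw           : (H, W) image size.
    """
    h, w = hw
    # Transform into the camera frame using homogeneous coordinates.
    pts_h = np.hstack([points_radar, np.ones((len(points_radar), 1))])
    cam = (T_cam_radar @ pts_h.T)[:3]                 # (3, N)
    cam = cam[:, cam[2] > 0]                          # keep points in front of the camera
    uvw = K @ cam
    u = np.round(uvw[0] / uvw[2]).astype(int)
    v = np.round(uvw[1] / uvw[2]).astype(int)
    depth = np.zeros((h, w), dtype=np.float32)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    # Later points overwrite earlier ones at shared pixels; keeping the
    # nearest depth instead is a common refinement.
    depth[v[valid], u[valid]] = cam[2, valid]
    return depth
```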
arXiv Detail & Related papers (2024-09-02T14:15:09Z)
- CaFNet: A Confidence-Driven Framework for Radar Camera Depth Estimation [6.9404362058736995]
This paper introduces a two-stage, end-to-end trainable Confidence-aware Fusion Net (CaFNet) for dense depth estimation.
The first stage addresses radar-specific challenges, such as ambiguous elevation and noisy measurements.
For the final depth estimation, we innovate a confidence-aware gated fusion mechanism to integrate radar and image features effectively.
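A confidence-aware gated fusion can be expressed compactly as a learned per-pixel gate that blends radar and image features. The module below is a generic PyTorch sketch of that pattern, not the CaFNet architecture; the layer sizes are placeholder choices.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Generic confidence-gated fusion of radar and image feature maps."""

    def __init__(self, channels: int):
        super().__init__()
        # Predict a per-pixel confidence for the radar branch.
        self.conf = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 1),
            nn.Sigmoid(),
        )

    def forward(self, radar_feat: torch.Tensor, image_feat: torch.Tensor) -> torch.Tensor:
        c = self.conf(torch.cat([radar_feat, image_feat], dim=1))  # (B, 1, H, W)
        # Down-weight radar features wherever they are judged unreliable.
        return c * radar_feat + (1.0 - c) * image_feat
```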
arXiv Detail & Related papers (2024-06-30T13:39:29Z)
- RIDERS: Radar-Infrared Depth Estimation for Robust Sensing [22.10378524682712]
Adverse weather conditions pose significant challenges to accurate dense depth estimation.
We present a novel approach for robust metric depth estimation by fusing a millimeter-wave Radar and a monocular infrared thermal camera.
Our method achieves exceptional visual quality and accurate metric estimation by addressing the challenges of ambiguity and misalignment.
arXiv Detail & Related papers (2024-02-03T07:14:43Z)
- RadarCam-Depth: Radar-Camera Fusion for Depth Estimation with Learned Metric Scale [21.09258172290667]
We present a novel approach for metric dense depth estimation based on the fusion of a single-view image and a sparse, noisy Radar point cloud.
Our proposed method significantly outperforms the state-of-the-art Radar-Camera depth estimation methods by reducing the mean absolute error (MAE) of depth estimation by 25.6% and 40.2% on the challenging nuScenes dataset and our self-collected ZJU-4DRadarCam dataset, respectively.
arXiv Detail & Related papers (2024-01-09T02:40:03Z)
- AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation [51.143540967290114]
We propose a method that unlocks a wide range of previously-infeasible geometric augmentations for unsupervised depth computation and estimation.
This is achieved by reversing, or "undo"-ing, geometric transformations to the coordinates of the output depth, warping the depth map back to the original reference frame.
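For a simple augmentation such as a horizontal flip, the undo step amounts to applying the inverse transform to the predicted depth so losses are computed in the original frame; crops and resizes would instead warp the depth coordinates back. The snippet below is a toy sketch of that pattern, not the AugUndo code.

```python
import numpy as np

def augment_image(image: np.ndarray, flip: bool) -> np.ndarray:
    """Apply a simple geometric augmentation (horizontal flip) to the input."""
    return image[:, ::-1] if flip else image

def undo_on_depth(pred_depth: np.ndarray, flip: bool) -> np.ndarray:
    """Invert the same transformation on the predicted depth map so that the
    loss is evaluated in the original, un-augmented reference frame."""
    return pred_depth[:, ::-1] if flip else pred_depth
```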
arXiv Detail & Related papers (2023-10-15T05:15:45Z)
- Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering [93.94371335579321]
We propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.
Comprehensive experiments underscore our framework's superior generalization capabilities.
Our innovative loss functions empower the model to autonomously recover domain-specific scale-and-shift coefficients.
arXiv Detail & Related papers (2023-09-18T12:36:39Z)
- Semantic Segmentation of Radar Detections using Convolutions on Point Clouds [59.45414406974091]
We introduce a deep-learning based method that applies convolutions to point clouds of radar detections.
We adapt this algorithm to radar-specific properties through distance-dependent clustering and pre-processing of input point clouds.
Our network outperforms state-of-the-art approaches that are based on PointNet++ on the task of semantic segmentation of radar point clouds.
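Distance-dependent clustering can be approximated, for example, by letting the neighbourhood radius grow with range, since radar returns spread out angularly at larger distances. The snippet below sketches this with per-range-band DBSCAN; it is purely illustrative and not the paper's pipeline.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_radar_detections(points: np.ndarray,
                             base_eps: float = 0.5,
                             eps_per_meter: float = 0.02) -> np.ndarray:
    """Cluster radar detections with a neighbourhood radius that grows with range.

    points : (N, 2) x/y detections in the ego frame (metres).
    Returns per-point cluster labels (-1 marks noise).
    """
    ranges = np.linalg.norm(points, axis=1)
    labels = np.full(len(points), -1, dtype=int)
    next_label = 0
    # Cluster each 10 m range band separately with a band-specific radius.
    for lo in np.arange(0.0, ranges.max() + 10.0, 10.0):
        band = (ranges >= lo) & (ranges < lo + 10.0)
        if band.sum() == 0:
            continue
        eps = base_eps + eps_per_meter * (lo + 5.0)
        band_labels = DBSCAN(eps=eps, min_samples=2).fit_predict(points[band])
        keep = band_labels >= 0
        band_labels[keep] += next_label
        labels[band] = band_labels
        next_label = labels.max() + 1
    return labels
```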
arXiv Detail & Related papers (2023-05-22T07:09:35Z)
- Self-Supervised Learning based Depth Estimation from Monocular Images [0.0]
The goal of Monocular Depth Estimation is to predict the depth map, given a 2D monocular RGB image as input.
We plan to incorporate intrinsic camera parameters during training and apply weather augmentations to further generalize our model.
arXiv Detail & Related papers (2023-04-14T07:14:08Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
It relies on the multi-view consistency assumption for training networks, which, however, is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model for generating single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly-dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- R4Dyn: Exploring Radar for Self-Supervised Monocular Depth Estimation of Dynamic Scenes [69.6715406227469]
Self-supervised monocular depth estimation in driving scenarios has achieved comparable performance to supervised approaches.
We present R4Dyn, a novel set of techniques to use cost-efficient radar data on top of a self-supervised depth estimation framework.
arXiv Detail & Related papers (2021-08-10T17:57:03Z)
- RigNet: Repetitive Image Guided Network for Depth Completion [20.66405067066299]
Recent approaches mainly focus on image guided learning to predict dense results.
However, blurry image guidance and object structures in depth still impede the performance of image guided frameworks.
We explore a repetitive design in our image guided network to sufficiently and gradually recover depth values.
Our method achieves state-of-the-art result on the NYUv2 dataset and ranks 1st on the KITTI benchmark at the time of submission.
arXiv Detail & Related papers (2021-07-29T08:00:33Z)
- Rethinking of Radar's Role: A Camera-Radar Dataset and Systematic Annotator via Coordinate Alignment [38.24705460170415]
We propose a new dataset, named CRUW, with a systematic annotator and performance evaluation system.
CRUW aims to classify and localize the objects in 3D purely from radar's radio frequency (RF) images.
To the best of our knowledge, CRUW is the first public large-scale dataset with a systematic annotation and evaluation system.
arXiv Detail & Related papers (2021-05-11T17:13:45Z)
- ADAADepth: Adapting Data Augmentation and Attention for Self-Supervised Monocular Depth Estimation [8.827921242078881]
We propose ADAA, utilising depth augmentation as depth supervision for learning accurate and robust depth.
We propose a relational self-attention module that learns rich contextual features and further enhances depth results.
We evaluate our predicted depth on the KITTI driving dataset and achieve state-of-the-art results.
arXiv Detail & Related papers (2021-03-01T09:06:55Z)
- LiRaNet: End-to-End Trajectory Prediction using Spatio-Temporal Radar Fusion [52.59664614744447]
We present LiRaNet, a novel end-to-end trajectory prediction method which utilizes radar sensor information along with widely used lidar and high definition (HD) maps.
Automotive radar provides rich, complementary information, allowing for longer range vehicle detection as well as instantaneous velocity measurements.
arXiv Detail & Related papers (2020-10-02T00:13:00Z)
- RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects [73.80316195652493]
We tackle the problem of exploiting Radar for perception in the context of self-driving cars.
We propose a new solution that exploits both LiDAR and Radar sensors for perception.
Our approach, dubbed RadarNet, features a voxel-based early fusion and an attention-based late fusion.
arXiv Detail & Related papers (2020-07-28T17:15:02Z)
- Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)