Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems
- URL: http://arxiv.org/abs/2512.08229v1
- Date: Tue, 09 Dec 2025 04:14:05 GMT
- Title: Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems
- Authors: Tony Salloom, Dandi Zhou, Xinhai Sun
- Abstract summary: RGB-D and stereo vision sensors are widely used in industrial robotic systems that perform manipulation, inspection, and navigation tasks. The depth maps these sensors produce are often noisy, incomplete, or biased due to sensor limitations and environmental conditions. We propose a normal-guided sparse depth sampling strategy that leverages PCA-based surface normal estimation on the RGB-D point cloud to compute a per-pixel depth reliability measure. Experiments show that our geometry-aware sparse depth improves accuracy, reduces artifacts near edges and discontinuities, and produces more realistic training conditions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate three-dimensional perception is essential for modern industrial robotic systems that perform manipulation, inspection, and navigation tasks. RGB-D and stereo vision sensors are widely used for this purpose, but the depth maps they produce are often noisy, incomplete, or biased due to sensor limitations and environmental conditions. Depth completion methods aim to generate dense, reliable depth maps from RGB images and sparse depth input. However, a key limitation in current depth completion pipelines is the unrealistic generation of sparse depth: sparse pixels are typically selected uniformly at random from dense ground-truth depth, ignoring the fact that real sensors exhibit geometry-dependent and spatially nonuniform reliability. In this work, we propose a normal-guided sparse depth sampling strategy that leverages PCA-based surface normal estimation on the RGB-D point cloud to compute a per-pixel depth reliability measure. The sparse depth samples are then drawn according to this reliability distribution. We integrate this sampling method with the Marigold-DC diffusion-based depth completion model and evaluate it on NYU Depth v2 using the standard metrics. Experiments show that our geometry-aware sparse depth improves accuracy, reduces artifacts near edges and discontinuities, and produces more realistic training conditions that better reflect real sensor behavior.
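The pipeline described in the abstract (back-project the RGB-D frame to a point cloud, estimate per-pixel surface normals with local PCA, derive a reliability map, then draw sparse depth samples from that distribution) can be sketched as below. This is a minimal illustration, not the paper's implementation: the reliability measure here is assumed to be the alignment between the surface normal and the viewing ray, and all function names, the PCA window size, and the sampling budget are illustrative choices.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth map (H, W) into a camera-frame point cloud (H, W, 3)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

def pca_normals(points, k=3):
    """Estimate surface normals via PCA over a (2k+1)^2 pixel window.
    The normal is the eigenvector of the local covariance matrix with the
    smallest eigenvalue, oriented toward the camera."""
    h, w, _ = points.shape
    normals = np.zeros_like(points)
    for i in range(k, h - k):
        for j in range(k, w - k):
            patch = points[i - k:i + k + 1, j - k:j + k + 1].reshape(-1, 3)
            patch = patch - patch.mean(axis=0)
            eigvals, eigvecs = np.linalg.eigh(patch.T @ patch)
            n = eigvecs[:, 0]           # smallest-eigenvalue direction
            if n[2] > 0:                # camera looks along +z; flip toward it
                n = -n
            normals[i, j] = n
    return normals

def reliability_map(points, normals, eps=1e-6):
    """Per-pixel reliability as |cos| between normal and viewing ray
    (assumed proxy: grazing surfaces get low weight), normalized to sum to 1."""
    view = -points / (np.linalg.norm(points, axis=-1, keepdims=True) + eps)
    rel = np.abs(np.sum(normals * view, axis=-1))
    return rel / (rel.sum() + eps)

def sample_sparse_depth(depth, rel, n_samples=500, seed=None):
    """Draw sparse depth pixels with probability proportional to reliability."""
    rng = np.random.default_rng(seed)
    p = rel.ravel() / rel.sum()
    idx = rng.choice(p.size, size=n_samples, replace=False, p=p)
    sparse = np.zeros_like(depth)
    sparse.ravel()[idx] = depth.ravel()[idx]
    return sparse
```

Compared with uniform random sampling, this concentrates sparse samples on camera-facing surfaces and suppresses them at grazing angles and depth discontinuities, which is the kind of geometry-dependent, spatially nonuniform reliability the paper attributes to real sensors.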
Related papers
- Masked Depth Modeling for Spatial Perception [44.0326843862591]
LingBot-Depth is a depth completion model that refines depth maps through masked depth modeling. It outperforms top-tier RGB-D cameras in both depth precision and pixel coverage. We release the code, checkpoint, and 3M RGB-depth pairs to the spatial perception community.
arXiv Detail & Related papers (2026-01-25T16:13:49Z) - FreqPDE: Rethinking Positional Depth Embedding for Multi-View 3D Object Detection Transformers [91.59069344768858]
We introduce Frequency-aware Positional Depth Embedding (FreqPDE) to equip 2D image features with spatial information for the 3D detection transformer decoder. FreqPDE combines the 2D image features and 3D position embeddings to generate 3D depth-aware features for query decoding.
arXiv Detail & Related papers (2025-10-17T07:36:54Z) - Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion [57.08169927189237]
Existing methods for depth completion operate in tightly constrained settings. Inspired by advances in monocular depth estimation, we reframe depth completion as image-conditional depth map generation. Marigold-DC builds on a pretrained latent diffusion model for monocular depth estimation and injects the depth observations as test-time guidance.
arXiv Detail & Related papers (2024-12-18T00:06:41Z) - Robust Depth Enhancement via Polarization Prompt Fusion Tuning [112.88371907047396]
We present a framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors.
Our method first adopts a learning-based strategy where a neural network is trained to estimate a dense and complete depth map from polarization data and a sensor depth map from different sensors.
To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets.
arXiv Detail & Related papers (2024-04-05T17:55:33Z) - RDFC-GAN: RGB-Depth Fusion CycleGAN for Indoor Depth Completion [28.634851863097953]
We propose a novel two-branch end-to-end fusion network named RDFC-GAN.
It takes a pair of RGB and incomplete depth images as input to predict a dense and completed depth map.
The first branch employs an encoder-decoder structure, by adhering to the Manhattan world assumption.
The other branch applies an RGB-depth fusion CycleGAN, adept at translating RGB imagery into detailed, textured depth maps.
arXiv Detail & Related papers (2023-06-06T11:03:05Z) - Symmetric Uncertainty-Aware Feature Transmission for Depth Super-Resolution [52.582632746409665]
We propose a novel Symmetric Uncertainty-aware Feature Transmission (SUFT) for color-guided DSR.
Our method achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-06-01T06:35:59Z) - RGB-Depth Fusion GAN for Indoor Depth Completion [29.938869342958125]
In this paper, we design a novel two-branch end-to-end fusion network, which takes a pair of RGB and incomplete depth images as input to predict a dense and completed depth map.
In one branch, we propose an RGB-depth fusion GAN to transfer the RGB image to the fine-grained textured depth map.
In the other branch, we adopt adaptive fusion modules named W-AdaIN to propagate the features across the two branches.
arXiv Detail & Related papers (2022-03-21T10:26:38Z) - Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism promotes the model to learn the task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z) - Sparse Depth Completion with Semantic Mesh Deformation Optimization [4.03103540543081]
We propose a neural network with post-optimization, which takes an RGB image and sparse depth samples as input and predicts the complete depth map.
Our evaluation results outperform the existing work consistently on both indoor and outdoor datasets.
arXiv Detail & Related papers (2021-12-10T13:01:06Z) - Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Networks (SANs), a new module enabling monodepth networks to perform both the tasks of depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.