Training Self-Supervised Depth Completion Using Sparse Measurements and a Single Image
- URL: http://arxiv.org/abs/2507.14845v1
- Date: Sun, 20 Jul 2025 07:24:09 GMT
- Title: Training Self-Supervised Depth Completion Using Sparse Measurements and a Single Image
- Authors: Rizhao Fan, Zhigen Li, Heping Li, Ning An
- Abstract summary: We propose a novel self-supervised depth completion paradigm that requires only sparse depth measurements and their corresponding image for training. By leveraging the characteristics of depth distribution, we design novel loss functions that effectively propagate depth information from observed points to unobserved regions.
- Score: 2.3874115898130865
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Depth completion is an important vision task, and many efforts have been made to enhance the quality of depth maps from sparse depth measurements. Despite significant advances, training these models to recover dense depth from sparse measurements remains a challenging problem. Supervised learning methods rely on dense depth labels to predict unobserved regions, while self-supervised approaches require image sequences to enforce geometric constraints and photometric consistency between frames. However, acquiring dense annotations is costly, and multi-frame dependencies limit the applicability of self-supervised methods in static or single-frame scenarios. To address these challenges, we propose a novel self-supervised depth completion paradigm that requires only sparse depth measurements and their corresponding image for training. Unlike existing methods, our approach eliminates the need for dense depth labels or additional images captured from neighboring viewpoints. By leveraging the characteristics of depth distribution, we design novel loss functions that effectively propagate depth information from observed points to unobserved regions. Additionally, we incorporate segmentation maps generated by vision foundation models to further enhance depth estimation. Extensive experiments demonstrate the effectiveness of our proposed method.
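The core training signal described above can be illustrated with a minimal sketch: supervise the network only at pixels where sparse measurements exist, and add an edge-aware smoothness term so depth spreads from observed points into unobserved regions. This is an assumption-laden illustration, not the paper's actual loss; the function names and weighting are invented for clarity.

```python
import numpy as np

def sparse_depth_loss(pred, sparse):
    """Mean absolute error evaluated only at observed points (depth > 0).

    Illustrative stand-in for a loss that needs no dense labels."""
    mask = sparse > 0
    if not mask.any():
        return 0.0
    return float(np.abs(pred[mask] - sparse[mask]).mean())

def edge_aware_smoothness(pred, gray):
    """Penalize horizontal depth gradients, down-weighted at image edges,
    so observed depth values can propagate into unobserved regions."""
    d = np.abs(np.diff(pred, axis=1))          # depth gradients
    w = np.exp(-np.abs(np.diff(gray, axis=1))) # small weight at image edges
    return float((w * d).mean())

# Toy example: a constant prediction, one observed LiDAR point, a flat image.
pred = np.full((4, 4), 2.0)
sparse = np.zeros((4, 4))
sparse[1, 2] = 1.5  # single sparse measurement
gray = np.ones((4, 4))
loss = sparse_depth_loss(pred, sparse) + 0.1 * edge_aware_smoothness(pred, gray)
# Only the one observed pixel contributes: |2.0 - 1.5| = 0.5
```

In the toy example the smoothness term is zero (constant depth), so the loss reduces to the single observed-point error of 0.5; in practice the smoothness weight would trade off fidelity at measured points against propagation into unmeasured areas.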
Related papers
- Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion [51.69876947593144]
Existing methods for depth completion operate in tightly constrained settings. Inspired by advances in monocular depth estimation, we reframe depth completion as image-conditional depth map generation. Marigold-DC builds on a pretrained latent diffusion model for monocular depth estimation and injects the depth observations as test-time guidance.
arXiv Detail & Related papers (2024-12-18T00:06:41Z)
- Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging Scenarios [103.72094710263656]
This paper presents a novel approach that identifies and integrates dominant cross-modality depth features with a learning-based framework.
We propose a novel confidence loss that steers a confidence predictor network to yield a confidence map identifying latent potential depth areas.
With the resulting confidence map, we propose a multi-modal fusion network that fuses the final depth in an end-to-end manner.
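The fusion step described here can be sketched as a per-pixel convex combination of two modality-specific depth maps steered by a confidence map. This is a hypothetical simplification of the paper's end-to-end fusion network; the function name and the interpretation of the confidence values are assumptions.

```python
import numpy as np

def fuse_depths(depth_a, depth_b, confidence):
    """Per-pixel convex combination of two depth maps.

    confidence in [0, 1]: 1 means fully trust modality A at that pixel."""
    return confidence * depth_a + (1.0 - confidence) * depth_b

# Toy example: two constant depth maps fused with 75% trust in modality A.
depth_a = np.full((2, 2), 4.0)
depth_b = np.full((2, 2), 2.0)
confidence = np.full((2, 2), 0.75)
fused = fuse_depths(depth_a, depth_b, confidence)  # 0.75*4 + 0.25*2 = 3.5
```

In the actual framework the combination is learned end-to-end rather than a fixed blend, but the confidence map plays the same gating role.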
arXiv Detail & Related papers (2024-02-19T04:39:16Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
However, it relies on the multi-view consistency assumption for training networks, which is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model for generating single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly-dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- Learning Occlusion-Aware Coarse-to-Fine Depth Map for Self-supervised Monocular Depth Estimation [11.929584800629673]
We propose a novel network to learn an Occlusion-aware Coarse-to-Fine Depth map for self-supervised monocular depth estimation.
The proposed OCFD-Net not only employs a discrete depth constraint for learning a coarse-level depth map, but also employs a continuous depth constraint for learning a scene depth residual.
arXiv Detail & Related papers (2022-03-21T12:43:42Z)
- An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z)
- Self-Guided Instance-Aware Network for Depth Completion and Enhancement [6.319531161477912]
Existing methods directly interpolate the missing depth measurements based on pixel-wise image content and the corresponding neighboring depth values.
We propose a novel self-guided instance-aware network (SG-IANet) that utilizes a self-guided mechanism to extract the instance-level features needed for depth restoration.
arXiv Detail & Related papers (2021-05-25T19:41:38Z)
- Progressive Depth Learning for Single Image Dehazing [56.71963910162241]
Existing dehazing methods often ignore the depth cues and fail in distant areas where heavier haze disturbs the visibility.
We propose a deep end-to-end model that iteratively estimates image depths and transmission maps.
Our approach benefits from explicitly modeling the inner relationship of image depth and transmission map, which is especially effective for distant hazy areas.
arXiv Detail & Related papers (2021-02-21T05:24:18Z)
- Semantic-Guided Representation Enhancement for Self-supervised Monocular Trained Depth Estimation [39.845944724079814]
Self-supervised depth estimation has shown its great effectiveness in producing high quality depth maps given only image sequences as input.
However, its performance usually drops on border areas and objects with thin structures, due to limited depth representation ability.
We propose a semantic-guided depth representation enhancement method, which promotes both local and global depth feature representations.
arXiv Detail & Related papers (2020-12-15T02:24:57Z)
- SAFENet: Self-Supervised Monocular Depth Estimation with Semantic-Aware Feature Extraction [27.750031877854717]
We propose SAFENet, which is designed to leverage semantic information to overcome the limitations of the photometric loss.
Our key idea is to exploit semantic-aware depth features that integrate the semantic and geometric knowledge.
Experiments on the KITTI dataset demonstrate that our methods compete with or even outperform the state-of-the-art methods.
arXiv Detail & Related papers (2020-10-06T17:22:25Z)
- Adaptive confidence thresholding for monocular depth estimation [83.06265443599521]
We propose a new approach to leverage pseudo ground truth depth maps of stereo images generated from self-supervised stereo matching methods.
The confidence map of the pseudo ground truth depth map is estimated to mitigate the performance degradation caused by inaccurate pseudo depth maps.
Experimental results demonstrate superior performance to state-of-the-art monocular depth estimation methods.
arXiv Detail & Related papers (2020-09-27T13:26:16Z)
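The confidence-filtering idea in the last entry above can be sketched as masking pseudo ground-truth depth wherever the estimated confidence falls below a threshold, so that only trusted pseudo labels supervise training. The function name and threshold value are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def threshold_pseudo_depth(pseudo_depth, confidence, tau=0.5):
    """Keep pseudo ground-truth depth only where confidence exceeds tau.

    Zeroed pixels are treated as unlabeled and excluded from the loss."""
    return np.where(confidence > tau, pseudo_depth, 0.0)

# Toy example: two of four pseudo labels survive the confidence threshold.
pseudo = np.array([[1.0, 2.0],
                   [3.0, 4.0]])
conf = np.array([[0.9, 0.3],
                 [0.8, 0.1]])
valid = threshold_pseudo_depth(pseudo, conf)  # -> [[1.0, 0.0], [3.0, 0.0]]
```

An adaptive variant, as the title suggests, would learn or adjust `tau` per image rather than fixing it globally.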
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.