PacGDC: Label-Efficient Generalizable Depth Completion with Projection Ambiguity and Consistency
- URL: http://arxiv.org/abs/2507.07374v1
- Date: Thu, 10 Jul 2025 01:56:30 GMT
- Title: PacGDC: Label-Efficient Generalizable Depth Completion with Projection Ambiguity and Consistency
- Authors: Haotian Wang, Aoran Xiao, Xiaoqin Zhang, Meng Yang, Shijian Lu
- Abstract summary: PacGDC is a label-efficient technique that enhances data diversity with minimal annotation effort for generalizable depth completion. We propose a new data synthesis pipeline that uses multiple depth foundation models as scale manipulators. Experiments show that PacGDC achieves remarkable generalizability across multiple benchmarks.
- Score: 63.74016242995453
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalizable depth completion enables the acquisition of dense metric depth maps for unseen environments, offering robust perception capabilities for various downstream tasks. However, training such models typically requires large-scale datasets with metric depth labels, which are often labor-intensive to collect. This paper presents PacGDC, a label-efficient technique that enhances data diversity with minimal annotation effort for generalizable depth completion. PacGDC builds on novel insights into inherent ambiguities and consistencies in object shapes and positions during 2D-to-3D projection, allowing the synthesis of numerous pseudo geometries for the same visual scene. This process greatly broadens available geometries by manipulating scene scales of the corresponding depth maps. To leverage this property, we propose a new data synthesis pipeline that uses multiple depth foundation models as scale manipulators. These models robustly provide pseudo depth labels with varied scene scales, affecting both local objects and global layouts, while ensuring projection consistency that supports generalization. To further diversify geometries, we incorporate interpolation and relocation strategies, as well as unlabeled images, extending the data coverage beyond the individual use of foundation models. Extensive experiments show that PacGDC achieves remarkable generalizability across multiple benchmarks, excelling in diverse scene semantics/scales and depth sparsity/patterns under both zero-shot and few-shot settings. Code: https://github.com/Wang-xjtu/PacGDC.
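The projection ambiguity mentioned in the abstract can be illustrated with a minimal sketch. It assumes an ideal pinhole camera with hypothetical intrinsics (`FX`, `FY`, `CX`, `CY`) and made-up pixel and depth values, and it covers only the simplest case of a single global scale factor. It is not the authors' pipeline, which uses depth foundation models to vary scales of both local objects and global layouts; all function names here are illustrative.

```python
# Illustrative sketch only: an ideal pinhole camera with hypothetical intrinsics.
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0


def backproject(u, v, depth):
    """Lift pixel (u, v) with a given depth to a 3D point in the camera frame."""
    x = (u - CX) * depth / FX
    y = (v - CY) * depth / FY
    return (x, y, depth)


def project(point):
    """Project a 3D camera-frame point back to pixel coordinates."""
    x, y, z = point
    return (FX * x / z + CX, FY * y / z + CY)


def scale_scene(points, s):
    """Scale every 3D point (and therefore every depth value) by the factor s."""
    return [(s * x, s * y, s * z) for (x, y, z) in points]


# Made-up sparse pixels and depths standing in for one visual scene.
pixels = [(100.0, 50.0), (320.0, 240.0), (600.0, 400.0)]
depths = [2.0, 5.0, 1.5]

points = [backproject(u, v, d) for (u, v), d in zip(pixels, depths)]
scaled = scale_scene(points, 3.0)  # a pseudo geometry, three times larger

# The scaled geometry lands on the same pixels, so the image cannot tell them apart.
reprojected = [project(p) for p in scaled]
max_pixel_error = max(
    max(abs(u0 - u1), abs(v0 - v1))
    for (u0, v0), (u1, v1) in zip(pixels, reprojected)
)
print("max reprojection error in pixels:", max_pixel_error)
```

Because the scale factor cancels in the perspective division, the same image is consistent with many depth maps, which is the property the abstract describes as the basis for synthesizing multiple pseudo geometries per scene.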
Related papers
- Dens3R: A Foundation Model for 3D Geometry Prediction [44.13431776180547]
Dens3R is a 3D foundation model designed for joint geometric dense prediction. By integrating image-pair matching features with intrinsic invariance modeling, Dens3R accurately regresses multiple geometric quantities.
arXiv Detail & Related papers (2025-07-22T07:22:30Z) - Depth Anything with Any Prior [64.39991799606146]
Prior Depth Anything is a framework that combines incomplete but precise metric information in depth measurement with relative but complete geometric structures in depth prediction. We develop a conditioned monocular depth estimation (MDE) model to refine the inherent noise of depth priors. Our model showcases impressive zero-shot generalization across depth completion, super-resolution, and inpainting over 7 real-world datasets.
arXiv Detail & Related papers (2025-05-15T17:59:50Z) - Metric-Solver: Sliding Anchored Metric Depth Estimation from a Single Image [51.689871870692194]
Metric-Solver is a novel sliding anchor-based metric depth estimation method. Our design enables a unified and adaptive depth representation across diverse environments.
arXiv Detail & Related papers (2025-04-16T14:12:25Z) - DepthLab: From Partial to Complete [80.58276388743306]
Missing values remain a common challenge for depth data across its wide range of applications. This work bridges this gap with DepthLab, a foundation depth inpainting model powered by image diffusion priors. Our approach proves its worth in various downstream tasks, including 3D scene inpainting, text-to-3D scene generation, sparse-view reconstruction with DUST3R, and LiDAR depth completion.
arXiv Detail & Related papers (2024-12-24T04:16:38Z) - Scale Propagation Network for Generalizable Depth Completion [16.733495588009184]
We propose a novel scale propagation normalization (SP-Norm) method to propagate scales from input to output.
We also develop a new network architecture based on SP-Norm and the ConvNeXt V2 backbone.
Our model consistently achieves the best accuracy with faster speed and lower memory when compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-10-24T03:53:06Z) - Transferring to Real-World Layouts: A Depth-aware Framework for Scene Adaptation [34.786268652516355]
Scene segmentation via unsupervised domain adaptation (UDA) enables the transfer of knowledge acquired from source synthetic data to real-world target data.
We propose a depth-aware framework to explicitly leverage depth estimation to mix the categories and facilitate the two complementary tasks, i.e., segmentation and depth learning.
In particular, the framework contains a Depth-guided Contextual Filter (DCF) for data augmentation and a cross-task encoder for contextual learning.
arXiv Detail & Related papers (2023-11-21T15:39:21Z) - CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion [83.30168660888913]
We present CommonScenes, a fully generative model that converts scene graphs into corresponding controllable 3D scenes.
Our pipeline consists of two branches, one predicting the overall scene layout via a variational auto-encoder and the other generating compatible shapes.
The generated scenes can be manipulated by editing the input scene graph and sampling the noise in the diffusion model.
arXiv Detail & Related papers (2023-05-25T17:39:13Z) - SwinDepth: Unsupervised Depth Estimation using Monocular Sequences via Swin Transformer and Densely Cascaded Network [29.798579906253696]
It is challenging to acquire dense ground truth depth labels for supervised training, and the unsupervised depth estimation using monocular sequences emerges as a promising alternative.
In this paper, we employ a convolution-free Swin Transformer as an image feature extractor so that the network can capture both local geometric features and global semantic features for depth estimation.
Also, we propose a Densely Cascaded Multi-scale Network (DCMNet) that connects every feature map directly with another from different scales via a top-down cascade pathway.
arXiv Detail & Related papers (2023-01-17T06:01:46Z) - Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z) - Towards Domain-agnostic Depth Completion [28.25756709062647]
Existing depth completion methods are often targeted at a specific sparse depth type and generalize poorly across task domains.
We present a method to complete sparse/semi-dense, noisy, and potentially low-resolution depth maps obtained by various range sensors.
Our method shows superior cross-domain generalization ability against state-of-the-art depth completion methods.
arXiv Detail & Related papers (2022-07-29T04:10:22Z) - Depth Completion using Geometry-Aware Embedding [22.333381291860498]
This paper proposes an efficient method to learn geometry-aware embedding.
It encodes the local and global geometric structure information from 3D points, e.g., scene layout, object's sizes and shapes, to guide dense depth estimation.
arXiv Detail & Related papers (2022-03-21T12:06:27Z) - DiverseDepth: Affine-invariant Depth Prediction Using Diverse Data [110.29043712400912]
We present a method for depth estimation with monocular images, which can predict high-quality depth on diverse scenes up to an affine transformation.
Experiments show that our method outperforms previous methods on 8 datasets by a large margin with the zero-shot test setting.
arXiv Detail & Related papers (2020-02-03T05:38:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.