Related papers: Learning Pixel-wise Continuous Depth Representation via Clustering for Depth Completion

URL: http://arxiv.org/abs/2402.13579v1
Date: Wed, 21 Feb 2024 07:18:23 GMT
Title: Learning Pixel-wise Continuous Depth Representation via Clustering for Depth Completion
Authors: Chen Shenglun, Zhang Hong, Ma XinZhu, Wang Zhihui, Li Haojie
Abstract summary: We propose a novel clustering-based framework called CluDe to learn the pixel-wise and continuous depth representation. CluDe successfully reduces depth smearing around object boundaries by utilizing pixel-wise and continuous depth representation. CluDe achieves state-of-the-art performance on the VOID datasets and outperforms classification-based methods on the KITTI dataset.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Depth completion is a long-standing challenge in computer vision, where classification-based methods have made tremendous progress in recent years. However, most existing classification-based methods rely on pre-defined pixel-shared and discrete depth values as depth categories. This representation fails to capture the continuous depth values that conform to the real depth distribution, leading to depth smearing in boundary regions. To address this issue, we revisit depth completion from the clustering perspective and propose a novel clustering-based framework called CluDe which focuses on learning the pixel-wise and continuous depth representation. The key idea of CluDe is to iteratively update the pixel-shared and discrete depth representation to its corresponding pixel-wise and continuous counterpart, driven by the real depth distribution. Specifically, CluDe first utilizes depth value clustering to learn a set of depth centers as the depth representation. While these depth centers are pixel-shared and discrete, they are more in line with the real depth distribution compared to pre-defined depth categories. Then, CluDe estimates offsets for these depth centers, enabling their dynamic adjustment along the depth axis of the depth distribution to generate the pixel-wise and continuous depth representation. Extensive experiments demonstrate that CluDe successfully reduces depth smearing around object boundaries by utilizing pixel-wise and continuous depth representation. Furthermore, CluDe achieves state-of-the-art performance on the VOID datasets and outperforms classification-based methods on the KITTI dataset.
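The two steps described in the abstract (clustering depth values into shared centers, then shifting those centers per pixel) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names `cluster_depth_centers` and `pixelwise_depth` are hypothetical, plain 1-D k-means stands in for whatever clustering CluDe actually uses, the offsets and logits would come from a network that is not shown here, and the probability-weighted sum used to produce the final depth is an assumption borrowed from classification-based depth completion in general.

```python
import numpy as np


def cluster_depth_centers(depths, k, iters=20):
    """1-D k-means over depth values; returns k sorted, pixel-shared centers."""
    depths = np.asarray(depths, dtype=np.float64).ravel()
    # Initialise centers at evenly spaced quantiles so they follow the data.
    centers = np.quantile(depths, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        # Assign every depth value to its nearest center.
        labels = np.abs(depths[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            members = depths[labels == j]
            if members.size:  # keep the old center if a cluster is empty
                centers[j] = members.mean()
    return np.sort(centers)


def pixelwise_depth(centers, offsets, logits):
    """Shift shared centers per pixel, then combine them with softmax weights.

    centers: (K,) shared depth centers.
    offsets, logits: (H, W, K) per-pixel values (assumed network outputs).
    """
    # Adding offsets turns the shared, discrete centers into
    # pixel-wise, continuous depth candidates.
    candidates = centers[None, None, :] + offsets
    # Numerically stable softmax over the K candidates.
    logits = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Assumption: final depth is the probability-weighted sum of candidates.
    return (probs * candidates).sum(axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy depth samples concentrated around two surfaces.
    samples = np.concatenate([rng.normal(2.0, 0.05, 200),
                              rng.normal(8.0, 0.05, 200)])
    centers = cluster_depth_centers(samples, k=2)
    offsets = rng.normal(0.0, 0.1, size=(4, 4, 2))
    logits = rng.normal(0.0, 1.0, size=(4, 4, 2))
    print(centers)
    print(pixelwise_depth(centers, offsets, logits).shape)
```

With all offsets set to zero, every pixel shares the same candidate depths, which corresponds to the pixel-shared and discrete representation the abstract starts from; non-zero offsets are what make the representation pixel-wise and continuous.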

Related papers

Blurry-Edges: Photon-Limited Depth Estimation from Defocused Boundaries [9.723762227632378]
We present a novel approach to robustly measure object depths from photon-limited images along the defocused boundaries. It is based on a new image patch representation, Blurry-Edges, that explicitly stores and visualizes a rich set of low-level patch information, including boundaries, color, and smoothness.
arXiv Detail & Related papers (2025-03-30T22:17:00Z)
Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion [51.69876947593144]
Existing methods for depth completion operate in tightly constrained settings. Inspired by advances in monocular depth estimation, we reframe depth completion as an image-conditional depth map generation. Marigold-DC builds on a pretrained latent diffusion model for monocular depth estimation and injects the depth observations as test-time guidance.
arXiv Detail & Related papers (2024-12-18T00:06:41Z)
Depth-guided Texture Diffusion for Image Semantic Segmentation [47.46257473475867]
We introduce a Depth-guided Texture Diffusion approach that effectively tackles the outlined challenge. Our method extracts low-level features from edges and textures to create a texture image. By integrating this enriched depth map with the original RGB image into a joint feature embedding, our method effectively bridges the disparity between the depth map and the image.
arXiv Detail & Related papers (2024-08-17T04:55:03Z)
Progressive Depth Decoupling and Modulating for Flexible Depth Completion [28.693100885012008]
Image-guided depth completion aims at generating a dense depth map from sparse LiDAR data and an RGB image. Recent methods have shown promising performance by reformulating it as a classification problem with two sub-tasks: depth discretization and probability prediction. We propose a progressive depth decoupling and modulating network, which incrementally decouples the depth range into bins and adaptively generates multi-scale dense depth maps.
arXiv Detail & Related papers (2024-05-15T13:45:33Z)
RigNet++: Semantic Assisted Repetitive Image Guided Network for Depth Completion [31.70022495622075]
We explore a repetitive design in our image guided network to gradually and sufficiently recover depth values. In the former branch, we design a dense repetitive hourglass network (DRHN) to extract discriminative image features of complex environments. In the latter branch, we present a repetitive guidance (RG) module based on dynamic convolution, in which an efficient convolution factorization is proposed to reduce the complexity. In addition, we propose a region-aware spatial propagation network (RASPN) for further depth refinement based on the semantic prior constraint.
arXiv Detail & Related papers (2023-09-01T09:11:20Z)
Depth Completion using Plane-Residual Representation [84.63079529738924]
We introduce a novel way of interpreting depth information with the closest depth plane label $p$ and a residual value $r$, as we call it, Plane-Residual (PR) representation. By interpreting depth information in PR representation and using our corresponding depth completion network, we were able to acquire improved depth completion performance with faster computation.
arXiv Detail & Related papers (2021-04-15T10:17:53Z)
Learning Depth via Leveraging Semantics: Self-supervised Monocular Depth Estimation with Both Implicit and Explicit Semantic Guidance [34.62415122883441]
We propose a Semantic-aware Spatial Feature Alignment scheme to align implicit semantic features with depth features for scene-aware depth estimation. We also propose a semantic-guided ranking loss to explicitly constrain the estimated depth maps to be consistent with real scene contextual properties. Our method produces high-quality depth maps that are consistently superior on both complex scenes and diverse semantic categories.
arXiv Detail & Related papers (2021-02-11T14:29:51Z)
Semantic-Guided Representation Enhancement for Self-supervised Monocular Trained Depth Estimation [39.845944724079814]
Self-supervised depth estimation has shown its great effectiveness in producing high quality depth maps given only image sequences as input. However, its performance usually drops when estimating on border areas or objects with thin structures due to the limited depth representation ability. We propose a semantic-guided depth representation enhancement method, which promotes both local and global depth feature representations.
arXiv Detail & Related papers (2020-12-15T02:24:57Z)
Efficient Depth Completion Using Learned Bases [94.0808155168311]
We propose a new global geometry constraint for depth completion. By assuming depth maps often lie on low-dimensional subspaces, a dense depth map can be approximated by a weighted sum of full-resolution principal depth bases.
arXiv Detail & Related papers (2020-12-02T11:57:37Z)
Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration [77.1056200937214]
We study the formation of the DP pair which links the blur and the depth information. We propose an end-to-end DDDNet (DP-based Depth and Deblur Network) to jointly estimate the depth and restore the image.
arXiv Detail & Related papers (2020-12-01T06:53:57Z)
Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video. Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)
Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision. In this work, instead of relying on different views, we rely on depth-from-focus cues. We present results that are on par with supervised methods on the KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.