StructDepth: Leveraging the structural regularities for self-supervised
indoor depth estimation
- URL: http://arxiv.org/abs/2108.08574v1
- Date: Thu, 19 Aug 2021 09:26:13 GMT
- Title: StructDepth: Leveraging the structural regularities for self-supervised
indoor depth estimation
- Authors: Boying Li, Yuan Huang, Zeyu Liu, Danping Zou, and Wenxian Yu
- Abstract summary: Self-supervised monocular depth estimation has achieved impressive performance on outdoor datasets.
However, its performance degrades notably in indoor environments because of the lack of textures.
We leverage the structural regularities exhibited in indoor scenes to train a better depth network.
- Score: 7.028319464940422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised monocular depth estimation has achieved impressive
performance on outdoor datasets. Its performance, however, degrades notably in
indoor environments because of the lack of textures. Without rich textures, the
photometric consistency is too weak to train a good depth network. Inspired by
the early works on indoor modeling, we leverage the structural regularities
exhibited in indoor scenes, to train a better depth network. Specifically, we
adopt two extra supervisory signals for self-supervised training: 1) the
Manhattan normal constraint and 2) the co-planar constraint. The Manhattan
normal constraint enforces the major surfaces (the floor, ceiling, and walls)
to be aligned with dominant directions. The co-planar constraint states that
3D points should be well fitted by a plane if they lie within the same
planar region. To generate the supervisory signals, we adopt two components to
classify the major surface normal into dominant directions and detect the
planar regions on the fly during training. As the predicted depth becomes more
accurate after more training epochs, the supervisory signals also improve and
in turn feed back to yield a better depth model. Extensive experiments on
indoor benchmark datasets show that our network outperforms state-of-the-art
methods. The source code is available at
https://github.com/SJTU-ViSYS/StructDepth .
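The two supervisory signals above can be expressed as simple losses. Below is a minimal NumPy sketch, assuming axis-aligned dominant directions and SVD-based plane fitting; the function names and exact loss forms are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def coplanar_loss(points):
    """Mean point-to-plane distance for 3D points (N, 3) assumed to lie
    within one detected planar region (the co-planar constraint)."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    # The right-singular vector for the smallest singular value of the
    # centered point cloud is the least-squares plane normal.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]
    return np.mean(np.abs(centered @ normal))

def manhattan_normal_loss(normals):
    """Penalize unit surface normals (N, 3) of major surfaces that deviate
    from the nearest dominant direction (the Manhattan normal constraint)."""
    axes = np.eye(3)  # dominant directions, assumed axis-aligned here
    # |cos| similarity to each axis; keep the best-matching axis per normal.
    cos = np.abs(normals @ axes.T)
    return np.mean(1.0 - cos.max(axis=1))

# Points sampled from the plane z = 2 incur ~zero co-planar loss.
pts = np.column_stack([np.random.rand(100), np.random.rand(100),
                       np.full(100, 2.0)])
print(coplanar_loss(pts))  # ~0 (numerical precision)
# A normal aligned with the y axis incurs zero Manhattan loss.
print(manhattan_normal_loss(np.array([[0.0, 1.0, 0.0]])))  # 0.0
```

In training, both terms would be evaluated on points and normals derived from the predicted depth, so the gradients flow back into the depth network alongside the photometric loss.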
Related papers
- GAM-Depth: Self-Supervised Indoor Depth Estimation Leveraging a Gradient-Aware Mask and Semantic Constraints [12.426365333096264]
We propose GAM-Depth, developed upon two novel components: gradient-aware mask and semantic constraints.
The gradient-aware mask enables adaptive and robust supervision for both key areas and textureless regions.
The incorporation of semantic constraints for indoor self-supervised depth estimation improves depth discrepancies at object boundaries.
arXiv Detail & Related papers (2024-02-22T07:53:34Z)
- Deeper into Self-Supervised Monocular Indoor Depth Estimation [7.30562653023176]
Self-supervised learning of indoor depth from monocular sequences is quite challenging.
In this work, our proposed method, named IndoorDepth, consists of two innovations.
Experiments on the NYUv2 benchmark demonstrate that our IndoorDepth outperforms the previous state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2023-12-03T04:55:32Z)
- NDDepth: Normal-Distance Assisted Monocular Depth Estimation and Completion [18.974297347310287]
We introduce novel physics (geometry)-driven deep learning frameworks for monocular depth estimation and completion.
Our method outperforms prior state-of-the-art monocular depth estimation and completion methods.
arXiv Detail & Related papers (2023-11-13T09:01:50Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
It relies on the multi-view consistency assumption to train networks; however, this assumption is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model for generating single-image depth prior.
Our model can predict sharp and accurate depth maps, even when trained on monocular videos of highly dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- DevNet: Self-supervised Monocular Depth Learning via Density Volume Construction [51.96971077984869]
Self-supervised depth learning from monocular images normally relies on the 2D pixel-wise photometric relation between temporally adjacent image frames.
This work proposes Density Volume Construction Network (DevNet), a novel self-supervised monocular depth learning framework.
arXiv Detail & Related papers (2022-09-14T00:08:44Z)
- Joint Prediction of Monocular Depth and Structure using Planar and Parallax Geometry [4.620624344434533]
Supervised depth estimation methods can achieve good performance when trained on high-quality ground truth, such as LiDAR data.
We propose a novel approach combining structure information from a promising Plane and Parallax geometry pipeline with depth information into a U-Net supervised learning network.
Our model performs well on depth prediction for thin objects and edges, and is more robust than the structure-prediction baseline.
arXiv Detail & Related papers (2022-07-13T17:04:05Z)
- P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior [133.76192155312182]
We propose a method that learns to selectively leverage information from coplanar pixels to improve the predicted depth.
An extensive evaluation of our method shows that we set the new state of the art in supervised monocular depth estimation.
arXiv Detail & Related papers (2022-04-05T10:03:52Z)
- PLNet: Plane and Line Priors for Unsupervised Indoor Depth Estimation [15.751045404065465]
This paper proposes PLNet that leverages the plane and line priors to enhance the depth estimation.
Experiments on NYU Depth V2 and ScanNet show that PLNet outperforms existing methods.
arXiv Detail & Related papers (2021-10-12T09:02:24Z)
- MonoIndoor: Towards Good Practice of Self-Supervised Monocular Depth Estimation for Indoor Environments [55.05401912853467]
Self-supervised depth estimation for indoor environments is more challenging than its outdoor counterpart.
The depth range of indoor sequences varies considerably across frames, making it difficult for the depth network to induce consistent depth cues.
The maximum distance in outdoor scenes mostly stays the same as the camera usually sees the sky.
The motions of outdoor sequences are predominantly translational, especially for driving datasets such as KITTI.
arXiv Detail & Related papers (2021-07-26T18:45:14Z)
- Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)
- Don't Forget The Past: Recurrent Depth Estimation from Monocular Video [92.84498980104424]
We put three different types of depth estimation into a common framework.
Our method produces a time series of depth maps.
It can be applied to monocular videos only or be combined with different types of sparse depth patterns.
arXiv Detail & Related papers (2020-01-08T16:50:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.