Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering
- URL: http://arxiv.org/abs/2309.09724v1
- Date: Mon, 18 Sep 2023 12:36:39 GMT
- Title: Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering
- Authors: Chi Zhang, Wei Yin, Gang Yu, Zhibin Wang, Tao Chen, Bin Fu, Joey Tianyi Zhou, Chunhua Shen
- Abstract summary: We propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.
Comprehensive experiments underscore our framework's superior generalization capabilities.
Our innovative loss functions empower the model to autonomously recover domain-specific scale-and-shift coefficients.
- Score: 93.94371335579321
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this study, we address the challenge of 3D scene structure recovery from
monocular depth estimation. While traditional depth estimation methods leverage
labeled datasets to directly predict absolute depth, recent advancements
advocate for mix-dataset training, enhancing generalization across diverse
scenes. However, such mixed dataset training yields depth predictions only up
to an unknown scale and shift, hindering accurate 3D reconstructions. Existing
solutions necessitate extra 3D datasets or geometry-complete depth annotations,
constraints that limit their versatility. In this paper, we propose a learning
framework that trains models to predict geometry-preserving depth without
requiring extra data or annotations. To produce realistic 3D structures, we
render novel views of the reconstructed scenes and design loss functions to
promote depth estimation consistency across different views. Comprehensive
experiments underscore our framework's superior generalization capabilities,
surpassing existing state-of-the-art methods on several benchmark datasets
without leveraging extra training information. Moreover, our innovative loss
functions empower the model to autonomously recover domain-specific
scale-and-shift coefficients using solely unlabeled images.
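To make the scale-and-shift ambiguity concrete: a mix-dataset model predicts depth d such that the true depth is approximately s * d + t for some unknown per-domain scale s and shift t. The sketch below shows a generic least-squares recovery of these two coefficients against a reference depth map. It is only an illustration of the ambiguity, not the paper's training loss (which requires no reference depth); the function name and toy data are ours.

```python
import numpy as np

def recover_scale_shift(pred, ref, mask=None):
    """Least-squares scale s and shift t such that s * pred + t ~= ref.

    A generic affine alignment illustrating the scale-and-shift
    ambiguity of mix-dataset depth models; hypothetical helper, not
    the paper's loss (which uses no reference depth).
    """
    if mask is None:
        mask = np.isfinite(ref) & (ref > 0)
    d = pred[mask].ravel()
    g = ref[mask].ravel()
    # Two-column design matrix [depth, 1]; the normal equations are 2x2.
    A = np.stack([d, np.ones_like(d)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, g, rcond=None)
    return s, t

# Toy check: an affine-distorted copy of the true depth is recovered exactly.
rng = np.random.default_rng(0)
true_depth = rng.uniform(1.0, 10.0, size=(48, 64))
pred_depth = 0.5 * true_depth - 0.2          # unknown scale/shift applied
s, t = recover_scale_shift(pred_depth, true_depth)
aligned = s * pred_depth + t
print(f"s={s:.3f}, t={t:.3f}, max err={np.abs(aligned - true_depth).max():.2e}")
```

Because the design matrix has only two columns, the fit is exact for a purely affine distortion, as the toy example verifies.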
Related papers
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments consistently demonstrates our method's superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- A Fusion of Variational Distribution Priors and Saliency Map Replay for Continual 3D Reconstruction [1.3812010983144802]
Single-image 3D reconstruction is a research challenge focused on predicting 3D object shapes from single-view images.
This task requires significant data acquisition to predict both visible and occluded portions of the shape.
We propose a continual learning-based 3D reconstruction method that uses Variational Priors so the model can still reconstruct previously seen classes reasonably well even after training on new classes.
arXiv Detail & Related papers (2023-08-17T06:48:55Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image [91.71077190961688]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then exploit 3D point cloud data to predict the depth shift and the camera's focal length, which allow us to recover 3D scene shapes.
We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot evaluation.
arXiv Detail & Related papers (2022-08-28T16:20:14Z)
- Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction.
By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation (a minimal sketch of this idea appears after this list).
We show state-of-the-art results for learning metric depth on NYU Depth-V2 and KITTI.
arXiv Detail & Related papers (2021-03-07T00:08:21Z)
- Learning to Recover 3D Scene Shape from a Single Image [98.20106822614392]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then use 3D point cloud encoders to predict the missing depth shift and focal length that allow us to recover a realistic 3D scene shape.
arXiv Detail & Related papers (2020-12-17T02:35:13Z)
- Improving Monocular Depth Estimation by Leveraging Structural Awareness and Complementary Datasets [21.703238902823937]
First, we propose a structure-aware neural network with spatial attention blocks to exploit the spatial relationship of visual features.
Second, we introduce a global focal relative loss for uniform point pairs to enhance spatial constraint in the prediction.
Third, based on analysis of failure cases for prior methods, we collect a new Hard Case (HC) Depth dataset of challenging scenes.
arXiv Detail & Related papers (2020-07-22T08:21:02Z)
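The Virtual Normal entry above hinges on a higher-order geometric constraint. Below is a minimal sketch of one plausible reading of that idea, assuming the common formulation: back-project depth maps to point clouds, sample random point triplets, and penalize the difference between the plane normals they span under the prediction and the ground truth. The helper names (backproject, virtual_normal_loss) and the triplet sampling are illustrative, not the authors' exact implementation.

```python
import numpy as np

def backproject(depth, K):
    """Lift a depth map to a 3D point cloud with pinhole intrinsics K."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

def virtual_normal_loss(pred_depth, gt_depth, K, n_triplets=2000, seed=0):
    """Mean L1 gap between plane normals spanned by random point triplets
    in the predicted vs. ground-truth point clouds. Real implementations
    also filter degenerate (near-colinear or too-close) triplets."""
    rng = np.random.default_rng(seed)
    p_pred = backproject(pred_depth, K)
    p_gt = backproject(gt_depth, K)
    idx = rng.integers(0, len(p_gt), size=(n_triplets, 3))

    def normals(p):
        a, b, c = p[idx[:, 0]], p[idx[:, 1]], p[idx[:, 2]]
        n = np.cross(b - a, c - a)
        return n / (np.linalg.norm(n, axis=1, keepdims=True) + 1e-8)

    return np.abs(normals(p_pred) - normals(p_gt)).mean()

# A depth shift bends planes, so the normals disagree and the loss is > 0.
K = np.array([[500.0, 0.0, 32.0], [0.0, 500.0, 24.0], [0.0, 0.0, 1.0]])
gt = np.random.default_rng(1).uniform(1.0, 5.0, size=(48, 64))
print(virtual_normal_loss(1.3 * gt + 0.1, gt, K))
```

Note that a pure depth scaling is a 3D similarity and leaves normals unchanged, while a depth shift distorts the point cloud; constraints of this kind therefore penalize exactly the shift component that breaks 3D structure, which is also the ambiguity targeted by the main paper above.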
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.