Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on
Dataset Mixtures with Uncalibrated Stereo Data
- URL: http://arxiv.org/abs/2306.02878v1
- Date: Mon, 5 Jun 2023 13:49:24 GMT
- Title: Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on
Dataset Mixtures with Uncalibrated Stereo Data
- Authors: Nikolay Patakin, Mikhail Romanov, Anna Vorontsova, Mikhail Artemyev,
Anton Konushin
- Abstract summary: We propose GP$^{2}$, a General-Purpose and Geometry-Preserving training scheme for single-view depth estimation.
We show that GP$^{2}$-trained models outperform methods relying on PCM in both accuracy and speed.
We also show that SVDE models can learn to predict geometrically correct depth even when geometrically complete data comprises only a minor part of the training set.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Nowadays, robotics, AR, and 3D modeling applications attract considerable
attention to single-view depth estimation (SVDE) as it allows estimating scene
geometry from a single RGB image. Recent works have demonstrated that the
accuracy of an SVDE method depends heavily on the diversity and volume of the
training data. However, RGB-D datasets obtained via depth capturing or 3D
reconstruction are typically small, synthetic datasets are not photorealistic
enough, and all these datasets lack diversity. Large-scale and diverse data
can instead be sourced from stereo images and stereo videos on the web. Being
typically uncalibrated, such stereo data provides disparities only up to an
unknown shift (geometrically incomplete data), so stereo-trained SVDE methods cannot recover correct
3D geometry. It was recently shown that the distorted point clouds obtained
with a stereo-trained SVDE method can be corrected with additional point cloud
modules (PCM) trained separately on geometrically complete data. In
contrast, we propose GP$^{2}$, a General-Purpose and Geometry-Preserving training
scheme, and show that conventional SVDE models can learn correct shifts
themselves without any post-processing, thus benefiting from stereo data even
in the geometry-preserving setting. Through experiments on different dataset
mixtures, we demonstrate that GP$^{2}$-trained models outperform methods relying on
PCM in both accuracy and speed, and report state-of-the-art results in
general-purpose geometry-preserving SVDE. Moreover, we show that SVDE models
can learn to predict geometrically correct depth even when geometrically
complete data comprises only a minor part of the training set.
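
To make the "geometrically incomplete" nature of stereo supervision concrete: uncalibrated stereo yields disparity only up to an unknown scale and shift, and inverting such disparity distorts depth non-linearly. The sketch below is a minimal illustration with assumed camera values and function names (not the authors' code); it contrasts naive depth recovery from a shifted disparity prediction with the least-squares shift-and-scale alignment commonly used to evaluate such models.

    import numpy as np

    # Uncalibrated stereo supervises disparity only up to an unknown scale s
    # and shift t: d_obs = s * (b * f / z) + t. Inverting d_obs as if it were
    # metric disparity distorts depth non-linearly, which is why a purely
    # stereo-trained SVDE model cannot preserve 3D geometry.

    def depth_from_disparity(disp, baseline=0.54, focal=721.0):
        # Pinhole stereo relation z = b * f / d (camera values are assumptions).
        return baseline * focal / np.clip(disp, 1e-6, None)

    def align_shift_scale(pred, target):
        # Least-squares fit of s, t such that s * pred + t approximates target:
        # the standard alignment for up-to-shift-and-scale predictions.
        A = np.stack([pred, np.ones_like(pred)], axis=1)
        (s, t), *_ = np.linalg.lstsq(A, target, rcond=None)
        return s * pred + t

    rng = np.random.default_rng(0)
    gt_disp = rng.uniform(5.0, 50.0, size=1000)  # toy ground-truth disparities
    pred_disp = 0.7 * gt_disp + 3.0              # prediction off by unknown s, t

    naive_err = np.abs(depth_from_disparity(pred_disp)
                       - depth_from_disparity(gt_disp)).mean()
    aligned_err = np.abs(depth_from_disparity(align_shift_scale(pred_disp, gt_disp))
                         - depth_from_disparity(gt_disp)).mean()
    print(f"naive: {naive_err:.2f} m, shift/scale-aligned: {aligned_err:.2f} m")

A geometry-preserving model, by contrast, must get the shift right on its own, so that only a single global scale remains unknown at test time and the predicted point cloud is undistorted.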
Related papers
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding (2024-06-17)
  We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
  Experiments consistently demonstrate our method's superiority over existing state-of-the-art pre-training approaches.
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models (2023-08-10)
  We propose a novel test-time optimization approach for 3D scene reconstruction.
  Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
- Normal Transformer: Extracting Surface Geometry from LiDAR Points Enhanced by Visual Semantics (2022-11-19)
  This paper presents a technique for estimating surface normals from 3D point clouds and 2D colour images.
  We have developed a transformer neural network that learns to utilise hybrid visual-semantic and 3D geometric information.
- Geometry-Contrastive Transformer for Generalized 3D Pose Transfer (2021-12-14)
  The intuition of this work is to perceive the geometric inconsistency between given meshes using a powerful self-attention mechanism.
  We propose a novel geometry-contrastive Transformer that efficiently perceives global geometric inconsistencies in 3D structure.
  We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo (2021-10-11)
  We present a modern solution to the multi-view photometric stereo (MVPS) problem.
  We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry.
  Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.
- Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction (2021-03-07)
  We show the importance of high-order 3D geometric constraints for depth prediction.
  By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation; a simplified sketch of such a constraint follows this list.
  We report state-of-the-art results for learning metric depth on NYU Depth-V2 and KITTI.
- Towards General Purpose Geometry-Preserving Single-View Depth Estimation (2020-09-25)
  Single-view depth estimation (SVDE) plays a crucial role in scene understanding for AR applications, 3D modeling, and robotics.
  Recent works have shown that a successful solution strongly relies on the diversity and volume of training data.
  Our work shows that a model trained on such data along with conventional datasets can gain accuracy while predicting correct scene geometry.
- DiverseDepth: Affine-invariant Depth Prediction Using Diverse Data (2020-02-03)
  We present a method for depth estimation from monocular images that predicts high-quality depth on diverse scenes up to an affine transformation.
  Experiments show that our method outperforms previous methods on 8 datasets by a large margin in the zero-shot test setting.
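
To illustrate the kind of geometric constraint referenced by the Virtual Normal entry above, here is a simplified, hypothetical sketch (assumed pinhole intrinsics and helper names, not the authors' implementation): normals of planes spanned by random point triplets, back-projected from predicted and ground-truth depth, are compared directly, penalizing 3D-geometry distortions that per-pixel depth losses miss.

    import numpy as np

    def backproject(depth, fx=500.0, fy=500.0):
        # Lift a depth map to 3D points with a pinhole model (assumed intrinsics).
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        x = (u - (w - 1) / 2) * depth / fx
        y = (v - (h - 1) / 2) * depth / fy
        return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

    def virtual_normal_loss(depth_pred, depth_gt, n_triplets=1000, seed=0):
        # Compare normals of planes through random point triplets (L1 distance):
        # a high-order 3D constraint that per-pixel depth losses cannot capture.
        rng = np.random.default_rng(seed)
        idx = rng.choice(depth_gt.size, size=(n_triplets, 3))

        def normals(pts):
            a, b, c = pts[idx[:, 0]], pts[idx[:, 1]], pts[idx[:, 2]]
            n = np.cross(b - a, c - a)
            return n / np.clip(np.linalg.norm(n, axis=1, keepdims=True), 1e-8, None)

        return np.abs(normals(backproject(depth_pred))
                      - normals(backproject(depth_gt))).mean()

    # A constant depth offset leaves relative depth intact but changes surface
    # orientation after back-projection, which this loss detects.
    gt = np.fromfunction(lambda i, j: 2.0 + 0.01 * j, (64, 64))
    print(virtual_normal_loss(gt + 0.5, gt))   # > 0: geometry distorted
    print(virtual_normal_loss(gt.copy(), gt))  # == 0: geometry preserved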