Towards General Purpose Geometry-Preserving Single-View Depth Estimation
- URL: http://arxiv.org/abs/2009.12419v2
- Date: Tue, 9 Feb 2021 20:30:35 GMT
- Title: Towards General Purpose Geometry-Preserving Single-View Depth Estimation
- Authors: Mikhail Romanov, Nikolay Patatkin, Anna Vorontsova, Sergey Nikolenko,
Anton Konushin, Dmitry Senyushkin
- Abstract summary: Single-view depth estimation (SVDE) plays a crucial role in scene understanding for AR applications, 3D modeling, and robotics.
Recent works have shown that a successful solution strongly relies on the diversity and volume of training data.
Our work shows that a model trained on this data along with conventional datasets can gain accuracy while predicting correct scene geometry.
- Score: 1.9573380763700712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single-view depth estimation (SVDE) plays a crucial role in scene
understanding for AR applications, 3D modeling, and robotics, providing the
geometry of a scene based on a single image. Recent works have shown that a
successful solution strongly relies on the diversity and volume of training
data. This data can be sourced from stereo movies and photos. However, they do
not provide geometrically complete depth maps (as disparities contain unknown
shift value). Therefore, existing models trained on this data are not able to
recover correct 3D representations. Our work shows that a model trained on this
data along with conventional datasets can gain accuracy while predicting
correct scene geometry. Surprisingly, only a small portion of geometrically
correct depth maps are required to train a model that performs equally to a
model trained on the full geometrically correct dataset. After that, we train
computationally efficient models on a mixture of datasets using the proposed
method. Through quantitative comparison on completely unseen datasets and
qualitative comparison of 3D point clouds, we show that our model defines the
new state of the art in general-purpose SVDE.
Related papers
- Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation [32.30055363306321]
We propose a paradigm for seamlessly unifying different human pose and shape-related tasks and datasets.
Our formulation is centered on the ability - both at training and test time - to query any arbitrary point of the human volume.
We can naturally exploit differently annotated data sources including mesh, 2D/3D skeleton and dense pose, without having to convert between them.
arXiv Detail & Related papers (2024-07-10T10:44:18Z) - Robust Geometry-Preserving Depth Estimation Using Differentiable
Rendering [93.94371335579321]
We propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.
Comprehensive experiments underscore our framework's superior generalization capabilities.
Our innovative loss functions empower the model to autonomously recover domain-specific scale-and-shift coefficients.
arXiv Detail & Related papers (2023-09-18T12:36:39Z) - FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z) - Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on
Dataset Mixtures with Uncalibrated Stereo Data [4.199844472131922]
We propose GP$2$, General-Purpose and Geometry-Preserving training scheme for single-view depth estimation.
We show that GP$2$-trained models outperform methods relying on PCM in both accuracy and speed.
We also show that SVDE models can learn to predict geometrically correct depth even when geometrically complete data comprises the minor part of the training set.
arXiv Detail & Related papers (2023-06-05T13:49:24Z) - Robust Category-Level 3D Pose Estimation from Synthetic Data [17.247607850702558]
We introduce SyntheticP3D, a new synthetic dataset for object pose estimation generated from CAD models.
We propose a novel approach (CC3D) for training neural mesh models that perform pose estimation via inverse rendering.
arXiv Detail & Related papers (2023-05-25T14:56:03Z) - Learning 3D Human Pose Estimation from Dozens of Datasets using a
Geometry-Aware Autoencoder to Bridge Between Skeleton Formats [80.12253291709673]
We propose a novel affine-combining autoencoder (ACAE) method to perform dimensionality reduction on the number of landmarks.
Our approach scales to an extreme multi-dataset regime, where we use 28 3D human pose datasets to supervise one model.
arXiv Detail & Related papers (2022-12-29T22:22:49Z) - Normal Transformer: Extracting Surface Geometry from LiDAR Points
Enhanced by Visual Semantics [6.516912796655748]
This paper presents a technique for estimating the normal from 3D point clouds and 2D colour images.
We have developed a transformer neural network that learns to utilise the hybrid information of visual semantic and 3D geometric data.
arXiv Detail & Related papers (2022-11-19T03:55:09Z) - Towards Accurate Reconstruction of 3D Scene Shape from A Single
Monocular Image [91.71077190961688]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then exploits 3D point cloud data to predict the depth shift and the camera's focal length that allow us to recover 3D scene shapes.
We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot evaluation.
arXiv Detail & Related papers (2022-08-28T16:20:14Z) - RandomRooms: Unsupervised Pre-training from Synthetic Shapes and
Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training exhibits failure when transfer features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z) - Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust
Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction.
By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation.
We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI.
arXiv Detail & Related papers (2021-03-07T00:08:21Z) - Generating synthetic photogrammetric data for training deep learning
based 3D point cloud segmentation models [0.0]
At I/ITSEC 2019, the authors presented a fully-automated workflow to segment 3D photogrammetric point-clouds/meshes and extract object information.
The ultimate goal is to create realistic virtual environments and provide the necessary information for simulation.
arXiv Detail & Related papers (2020-08-21T18:50:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.