Learning to Recover 3D Scene Shape from a Single Image
- URL: http://arxiv.org/abs/2012.09365v1
- Date: Thu, 17 Dec 2020 02:35:13 GMT
- Title: Learning to Recover 3D Scene Shape from a Single Image
- Authors: Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen
- Abstract summary: We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then use 3D point cloud encoders to predict the missing depth shift and focal length that allow us to recover a realistic 3D scene shape.
- Score: 98.20106822614392
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Despite significant progress in monocular depth estimation in the wild,
recent state-of-the-art methods cannot be used to recover accurate 3D scene
shape due to an unknown depth shift induced by shift-invariant reconstruction
losses used in mixed-data depth prediction training, and possible unknown
camera focal length. We investigate this problem in detail, and propose a
two-stage framework that first predicts depth up to an unknown scale and shift
from a single monocular image, and then uses 3D point cloud encoders to predict
the missing depth shift and focal length that allow us to recover a realistic
3D scene shape. In addition, we propose an image-level normalized regression
loss and a normal-based geometry loss to enhance depth prediction models
trained on mixed datasets. We test our depth model on nine unseen datasets and
achieve state-of-the-art performance on zero-shot dataset generalization. Code
is available at: https://git.io/Depth
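The two-stage idea above reduces, at inference time, to a pinhole back-projection once the missing shift and focal length have been estimated. The sketch below illustrates that unprojection step only (not the authors' released code); the function name, the centered principal point, and the unit scale are all assumptions for illustration.

```python
import numpy as np

def unproject_depth(depth, shift, focal_length):
    """Unproject an affine-invariant depth map to a 3D point cloud.

    `depth` is a network prediction known only up to scale and shift;
    `shift` and `focal_length` stand in for the quantities the point cloud
    encoders estimate. Scale only resizes the scene, so it is left at 1.
    """
    h, w = depth.shape
    cx, cy = w / 2.0, h / 2.0            # assume principal point at image center
    z = depth + shift                    # undo the unknown depth shift
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * z / focal_length      # pinhole back-projection
    y = (v - cy) * z / focal_length
    return np.stack([x, y, z], axis=-1)  # (H, W, 3) point cloud

points = unproject_depth(np.ones((4, 4)), shift=0.5, focal_length=100.0)
```

A wrong shift bends the recovered scene (walls curve, planes tilt), which is exactly the distortion cue the paper's point cloud encoders learn to detect.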
Related papers
- Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering [93.94371335579321]
We propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.
Comprehensive experiments underscore our framework's superior generalization capabilities.
Our innovative loss functions empower the model to autonomously recover domain-specific scale-and-shift coefficients.
arXiv Detail & Related papers (2023-09-18T12:36:39Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- 3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data [24.97027425606138]
Reconstructing the underlying 3D surface of an object from a single image is a challenging problem.
We present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image.
Our approach achieves state-of-the-art reconstruction performance across several real-world datasets.
arXiv Detail & Related papers (2023-02-24T20:37:27Z)
- Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image [91.71077190961688]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then exploit 3D point cloud data to predict the depth shift and the camera's focal length, which allow us to recover 3D scene shapes.
We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot evaluation.
arXiv Detail & Related papers (2022-08-28T16:20:14Z)
- Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular Video Depth [90.33296913575818]
In some video-based scenarios such as video depth estimation and 3D scene reconstruction from a video, the unknown scale and shift residing in per-frame predictions may cause depth inconsistency.
We propose a locally weighted linear regression method to recover the scale and shift with very sparse anchor points.
Our method can boost the performance of existing state-of-the-art approaches by up to 50% over several zero-shot benchmarks.
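Recovering scale and shift from sparse anchor points, as described above, amounts to an affine least-squares fit. The sketch below shows a global weighted variant under assumed names; the paper's method fits such regressions locally rather than once per frame.

```python
import numpy as np

def recover_scale_shift(pred, anchors_pred, anchors_metric, weights=None):
    """Fit scale s and shift t so that s * pred + t matches sparse metric
    anchor depths in a (weighted) least-squares sense, then apply the fit
    to the full prediction. A global sketch of the locally weighted idea."""
    A = np.stack([anchors_pred, np.ones_like(anchors_pred)], axis=1)
    b = anchors_metric
    if weights is not None:
        sw = np.sqrt(weights)            # weighted least squares via row scaling
        A = A * sw[:, None]
        b = b * sw
    (s, t), *_ = np.linalg.lstsq(A, b, rcond=None)
    return s * pred + t

# Two anchors (0 -> 2, 1 -> 4) determine scale 2 and shift 2 exactly.
aligned = recover_scale_shift(
    np.array([0.0, 0.5, 1.0]),
    anchors_pred=np.array([0.0, 1.0]),
    anchors_metric=np.array([2.0, 4.0]),
)
```

Because only two parameters are estimated, a handful of anchor depths (e.g. from sparse SfM points) suffices to align a whole depth map.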
arXiv Detail & Related papers (2022-02-03T08:52:54Z)
- Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction.
By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation.
We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI.
arXiv Detail & Related papers (2021-03-07T00:08:21Z)
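The high-order geometric constraint in the Virtual Normal paper compares normals of planes spanned by randomly sampled point triplets in the predicted and ground-truth point clouds. A minimal sketch of that sampling step, with illustrative names (the paper additionally rejects near-collinear triplets, which is omitted here):

```python
import numpy as np

def virtual_normals(points, n_triplets=1000, rng=None):
    """Sample random triplets of 3D points and return the unit normals of
    the 'virtual' planes they span. The virtual normal loss penalizes the
    difference between such normals from predicted and ground-truth clouds."""
    rng = np.random.default_rng(rng)
    pts = points.reshape(-1, 3)
    idx = rng.choice(len(pts), size=(n_triplets, 3))
    a, b, c = pts[idx[:, 0]], pts[idx[:, 1]], pts[idx[:, 2]]
    n = np.cross(b - a, c - a)           # plane normal for each triplet
    return n / (np.linalg.norm(n, axis=1, keepdims=True) + 1e-8)

# Points lying on the z = 0 plane: every virtual normal should be along z.
plane = np.zeros((100, 3))
plane[:, :2] = np.random.default_rng(0).normal(size=(100, 2))
vn = virtual_normals(plane, n_triplets=50, rng=1)
```

Because the triplets span long ranges across the scene, this constraint captures global surface structure that per-pixel depth losses miss.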
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.