Towards Accurate Reconstruction of 3D Scene Shape from A Single
Monocular Image
- URL: http://arxiv.org/abs/2208.13241v1
- Date: Sun, 28 Aug 2022 16:20:14 GMT
- Title: Towards Accurate Reconstruction of 3D Scene Shape from A Single
Monocular Image
- Authors: Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen,
Yifan Liu, Chunhua Shen
- Abstract summary: We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then exploit 3D point cloud data to predict the depth shift and the camera's focal length, which allow us to recover 3D scene shapes.
We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot evaluation.
- Score: 91.71077190961688
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Despite significant progress made in the past few years, challenges remain
for depth estimation using a single monocular image. First, it is nontrivial to
train a metric-depth prediction model that can generalize well to diverse
scenes mainly due to limited training data. Thus, researchers have built
large-scale relative depth datasets that are much easier to collect. However,
existing relative depth estimation models often fail to recover accurate 3D
scene shapes due to the unknown depth shift caused by training with the
relative depth data. We tackle this problem here and attempt to estimate
accurate scene shapes by training on large-scale relative depth data, and
estimating the depth shift. To do so, we propose a two-stage framework that
first predicts depth up to an unknown scale and shift from a single monocular
image, and then exploits 3D point cloud data to predict the depth shift and the
camera's focal length that allow us to recover 3D scene shapes. As the two
modules are trained separately, we do not need strictly paired training data.
In addition, we propose an image-level normalized regression loss and a
normal-based geometry loss to improve training with relative depth annotation.
We test our depth model on nine unseen datasets and achieve state-of-the-art
performance on zero-shot evaluation. Code is available at: https://git.io/Depth
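The second stage above recovers metric structure from an affine-invariant depth map: once the unknown depth shift and the focal length are predicted, each pixel can be unprojected into 3D. A minimal sketch of that unprojection, assuming a centered principal point and square pixels (the function name and interface are illustrative, not the authors' API):

```python
import numpy as np

def unproject(depth, shift, focal_length):
    """Recover a 3D point cloud from affine-invariant depth.

    `depth` is a prediction known only up to an unknown scale and
    shift; `shift` and `focal_length` are the quantities the second
    stage predicts. The remaining unknown scale only resizes the
    scene uniformly, so it does not distort the scene shape.
    """
    h, w = depth.shape
    metric_depth = depth + shift          # undo the unknown depth shift
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    cx, cy = w / 2.0, h / 2.0             # assumed centered principal point
    x = (u - cx) * metric_depth / focal_length
    y = (v - cy) * metric_depth / focal_length
    return np.stack([x, y, metric_depth], axis=-1)  # (H, W, 3) point cloud
```

A wrong shift bends planar surfaces in the unprojected cloud, which is why the point-cloud module can detect and correct it from shape cues alone.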
Related papers
- Robust Geometry-Preserving Depth Estimation Using Differentiable
Rendering [93.94371335579321]
We propose a learning framework that trains models to predict geometry-preserving depth without requiring extra data or annotations.
Comprehensive experiments underscore our framework's superior generalization capabilities.
Our innovative loss functions empower the model to autonomously recover domain-specific scale-and-shift coefficients.
arXiv Detail & Related papers (2023-09-18T12:36:39Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
It relies on a multi-view consistency assumption to train networks; however, this assumption is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model for generating single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly-dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- Depth Is All You Need for Monocular 3D Detection [29.403235118234747]
We propose to align the depth representation with the target domain in an unsupervised fashion.
Our methods leverage commonly available LiDAR or RGB videos during training time to fine-tune the depth representation, which leads to improved 3D detectors.
arXiv Detail & Related papers (2022-10-05T18:12:30Z)
- Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction.
By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation.
We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI.
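The high-order constraint described here can be sketched as follows: sample random triplets of points from the predicted and ground-truth point clouds (each of shape (N, 3)), form the normal of the "virtual plane" each triplet spans, and penalize the difference between the two normals. This is only an illustration of the idea; the published loss uses additional sampling thresholds and weighting that are omitted here:

```python
import numpy as np

def virtual_normal_loss(pred_pts, gt_pts, n_triplets=1000, seed=0):
    """Sketch of a virtual-normal style loss on (N, 3) point clouds."""
    rng = np.random.default_rng(seed)
    n = pred_pts.shape[0]
    idx = rng.integers(0, n, size=(n_triplets, 3))
    a, b, c = idx[:, 0], idx[:, 1], idx[:, 2]

    def normals(pts):
        # Normal of the virtual plane spanned by each point triplet.
        nrm = np.cross(pts[b] - pts[a], pts[c] - pts[a])
        return nrm / (np.linalg.norm(nrm, axis=1, keepdims=True) + 1e-8)

    # Penalize disagreement between predicted and ground-truth normals.
    return np.mean(np.linalg.norm(normals(pred_pts) - normals(gt_pts), axis=1))
```

Because the normals depend only on point differences, the loss is invariant to global translation of the scene, which makes it a purely geometric (shape) constraint.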
arXiv Detail & Related papers (2021-03-07T00:08:21Z)
- Learning to Recover 3D Scene Shape from a Single Image [98.20106822614392]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then use 3D point cloud encoders to predict the missing depth shift and focal length that allow us to recover a realistic 3D scene shape.
arXiv Detail & Related papers (2020-12-17T02:35:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.