Layout-Guided Novel View Synthesis from a Single Indoor Panorama
- URL: http://arxiv.org/abs/2103.17022v1
- Date: Wed, 31 Mar 2021 12:12:22 GMT
- Title: Layout-Guided Novel View Synthesis from a Single Indoor Panorama
- Authors: Jiale Xu and Jia Zheng and Yanyu Xu and Rui Tang and Shenghua Gao
- Abstract summary: We make the first attempt to generate novel views from a single indoor panorama.
CNNs are used to extract deep features and estimate the depth map from the source-view image.
We also constrain the room layout of the generated target-view images to enforce geometric consistency.
- Score: 41.627708450356614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing view synthesis methods mainly focus on perspective images and
have shown promising results. However, due to the limited field of view of the
pinhole camera, their performance degrades quickly under large camera
movements. In this paper, we make the first attempt to generate novel views
from a single indoor panorama while taking large camera translations into
consideration. To tackle this challenging problem, we first use Convolutional
Neural Networks (CNNs) to extract deep features and estimate the depth map
from the source-view image. Then, we leverage the room layout prior, a strong
structural constraint of the indoor scene, to guide the generation of target
views. More concretely, we estimate the room layout in the source view and
transform it into the target viewpoint as guidance. Meanwhile, we also
constrain the room layout of the generated target-view images to enforce
geometric consistency. To validate the effectiveness of our method, we further
build a large-scale photo-realistic dataset containing both small and large
camera translations. The experimental results on our challenging dataset
demonstrate that our method achieves state-of-the-art performance. The project
page is at https://github.com/bluestyle97/PNVS.
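To make the geometric core of this pipeline concrete, the sketch below lifts an equirectangular panorama to 3D points with a per-pixel depth map and splats it into a translated target viewpoint. This is a generic panoramic-reprojection illustration under our own conventions (y-up camera frame, naive z-buffered nearest-pixel splatting), not the authors' learned pipeline, which warps deep features and adds room-layout guidance; all names here are hypothetical.

```python
import numpy as np

def pano_to_points(depth):
    """Lift an equirectangular depth map (H, W) to 3D points in the source
    camera frame (x right, y up, z forward)."""
    H, W = depth.shape
    u = (np.arange(W) + 0.5) / W                    # [0, 1) across the width
    v = (np.arange(H) + 0.5) / H                    # [0, 1) down the height
    lon = (u - 0.5) * 2.0 * np.pi                   # azimuth in [-pi, pi)
    lat = (0.5 - v) * np.pi                         # elevation in (-pi/2, pi/2)
    lon, lat = np.meshgrid(lon, lat)                # both (H, W)
    dirs = np.stack([np.cos(lat) * np.sin(lon),     # unit ray directions
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)
    return dirs * depth[..., None]                  # (H, W, 3)

def points_to_pano(points, H, W):
    """Project 3D points in the target camera frame back to pixel coords."""
    x, y, z = points[..., 0], points[..., 1], points[..., 2]
    r = np.linalg.norm(points, axis=-1) + 1e-8
    lon = np.arctan2(x, z)
    lat = np.arcsin(np.clip(y / r, -1.0, 1.0))
    return (lon / (2.0 * np.pi) + 0.5) * W - 0.5, (0.5 - lat / np.pi) * H - 0.5, r

def forward_warp(rgb, depth, t):
    """Naively splat a source panorama to a camera translated by t (3,)."""
    H, W, _ = rgb.shape
    pts = pano_to_points(depth) - t[None, None, :]  # points in the target frame
    us, vs, r = points_to_pano(pts, H, W)
    ui = np.round(us).astype(int) % W               # longitude wraps around
    vi = np.clip(np.round(vs).astype(int), 0, H - 1)
    out = np.zeros_like(rgb)
    zbuf = np.full((H, W), np.inf)
    for i in range(H):                              # z-buffered nearest-pixel splat
        for j in range(W):
            if r[i, j] < zbuf[vi[i, j], ui[i, j]]:
                zbuf[vi[i, j], ui[i, j]] = r[i, j]
                out[vi[i, j], ui[i, j]] = rgb[i, j]
    return out
```

A naive splat like this leaves disocclusion holes wherever the translated camera sees past foreground geometry; in a learned system those holes are filled by an image decoder, which is where a layout constraint can supply useful structure.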
Related papers
- SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments [4.2603120588176635]
SPVLoc is a global indoor localization method that accurately determines the 6D camera pose of a query image.
Our approach employs a novel matching procedure to localize the perspective camera's viewport.
It achieves superior localization accuracy compared to state-of-the-art methods and also estimates more degrees of freedom of the camera pose.
arXiv Detail & Related papers (2024-04-16T12:55:15Z)
- MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and Rendering [91.76893697171117]
We propose a method for efficient and high-quality geometry recovery and novel view synthesis given very sparse or even a single view of the human.
Our key idea is to meta-learn the radiance field weights solely from potentially sparse multi-view videos; a generic version of such a meta-learning loop is sketched below.
We collect a new dataset, WildDynaCap, which contains subjects captured both in a dense camera dome and with in-the-wild sparse camera rigs.
arXiv Detail & Related papers (2024-03-27T17:59:54Z)
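As a rough illustration of what meta-learning network weights as a prior can look like, the following is a generic Reptile-style outer loop that nudges a shared initialization toward per-scene adapted weights. It is a hypothetical sketch, not MetaCap's algorithm (which may differ substantially); `sample_scene` and `render_loss` are placeholder callables, and a buffer-free model (e.g., an MLP radiance field) is assumed.

```python
import copy
import torch

def reptile_step(model, scenes, sample_scene, render_loss,
                 inner_steps=8, inner_lr=1e-3, outer_lr=0.1):
    """One Reptile outer step: adapt a copy of the weights on each scene,
    then move the shared (meta) weights toward the average adapted weights.
    Assumes `model` has only float parameters (e.g., an MLP radiance field)."""
    meta_state = copy.deepcopy(model.state_dict())
    deltas = {k: torch.zeros_like(v) for k, v in meta_state.items()}
    for scene in scenes:
        model.load_state_dict(meta_state)              # start from the prior
        opt = torch.optim.SGD(model.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                   # per-scene adaptation
            batch = sample_scene(scene)                # e.g., a batch of rays
            loss = render_loss(model, batch)           # e.g., photometric loss
            opt.zero_grad()
            loss.backward()
            opt.step()
        with torch.no_grad():
            for k, v in model.state_dict().items():
                deltas[k] += (v - meta_state[k]) / len(scenes)
    # Outer update: the meta-weights become the new shared initialization.
    with torch.no_grad():
        model.load_state_dict({k: meta_state[k] + outer_lr * deltas[k]
                               for k in meta_state})
```

The point of such a prior is that adapting from it at capture time needs far fewer views and optimization steps than training from scratch.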
- GasMono: Geometry-Aided Self-Supervised Monocular Depth Estimation for Indoor Scenes [47.76269541664071]
This paper tackles the challenges of self-supervised monocular depth estimation in indoor scenes caused by large rotation between frames and low texture.
To deal with the former, we obtain coarse camera poses from monocular sequences through multi-view geometry; a minimal two-view version of this step is sketched below.
To soften the effect of low texture, we combine the global reasoning of vision transformers with an overfitting-aware, iterative self-distillation mechanism.
arXiv Detail & Related papers (2023-09-26T17:59:57Z)
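The classical two-view version of "coarse camera poses through multi-view geometry" can be sketched with OpenCV's essential-matrix tools. This is a generic illustration assuming known intrinsics `K` and ORB features, not GasMono's actual implementation, and the recovered translation is defined only up to scale.

```python
import cv2
import numpy as np

def coarse_relative_pose(img1, img2, K):
    """Estimate a coarse relative pose (R, unit-norm t) between two
    grayscale frames via feature matching and the essential matrix."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:500]
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    # RANSAC essential matrix; translation is recovered only up to scale.
    E, inliers = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=inliers)
    return R, t
```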
- SparseGNV: Generating Novel Views of Indoor Scenes with Sparse Input Views [16.72880076920758]
We present SparseGNV, a learning framework that incorporates 3D structures and image generative models to generate novel views.
SparseGNV is trained across a large indoor scene dataset to learn generalizable priors.
It can efficiently generate novel views of an unseen indoor scene in a feed-forward manner.
arXiv Detail & Related papers (2023-05-11T17:58:37Z)
- Free View Synthesis [100.86844680362196]
We present a method for novel view synthesis from input images that are freely distributed around a scene.
Our method does not rely on a regular arrangement of input views, can synthesize images for free camera movement through the scene, and works for general scenes with unconstrained geometric layouts.
arXiv Detail & Related papers (2020-08-12T18:16:08Z)
- Shape and Viewpoint without Keypoints [63.26977130704171]
We present a learning framework that recovers the 3D shape, pose, and texture from a single image.
It is trained on an image collection without any ground-truth 3D shape, multi-view, camera-viewpoint, or keypoint supervision.
We obtain state-of-the-art camera prediction results and show that we can learn to predict diverse shapes and textures across objects.
arXiv Detail & Related papers (2020-07-21T17:58:28Z)
- Single-View View Synthesis with Multiplane Images [64.46556656209769]
Recent methods apply deep learning to generate multiplane images given two or more input images at known viewpoints.
Our method instead learns to predict a multiplane image directly from a single image input; the compositing step that renders novel views from it is sketched below.
It additionally generates reasonable depth maps and fills in content behind the edges of foreground objects in background layers.
arXiv Detail & Related papers (2020-04-23T17:59:19Z)
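For reference, rendering a novel view from a multiplane image reduces to warping each fronto-parallel RGBA plane into the target view (a per-plane homography induced by its depth) and then compositing back-to-front with the standard "over" operator. The sketch below shows only the compositing step and is our own minimal illustration, not the paper's code.

```python
import numpy as np

def composite_mpi(layers):
    """Over-composite MPI layers given back-to-front as a (D, H, W, 4) RGBA
    array with alpha in [0, 1]; each plane is assumed already warped into
    the target view by its depth-induced homography."""
    out = np.zeros(layers.shape[1:3] + (3,), dtype=np.float32)
    for rgba in layers:                  # farthest plane first
        rgb, a = rgba[..., :3], rgba[..., 3:4]
        out = rgb * a + out * (1.0 - a)  # standard "over" operator
    return out
```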
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.