Novel View Synthesis from a Single Image via Unsupervised learning
- URL: http://arxiv.org/abs/2110.15569v1
- Date: Fri, 29 Oct 2021 06:32:49 GMT
- Title: Novel View Synthesis from a Single Image via Unsupervised learning
- Authors: Bingzheng Liu, Jianjun Lei, Bo Peng, Chuanbo Yu, Wanqing Li, Nam Ling
- Abstract summary: We propose an unsupervised network to learn such a pixel transformation from a single source viewpoint.
The learned transformation allows us to synthesize a novel view from any single source viewpoint image of unknown pose.
- Score: 27.639536023956122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: View synthesis aims to generate novel views from one or more given source
views. Although existing methods have achieved promising performance, they
usually require paired views of different poses to learn a pixel
transformation. This paper proposes an unsupervised network to learn such a
pixel transformation from a single source viewpoint. In particular, the network
consists of a token transformation module (TTM), which facilitates the
transformation of features extracted from a source viewpoint image into an
intrinsic representation with respect to a pre-defined reference pose, and a
view generation module (VGM), which synthesizes an arbitrary view from that
representation. The learned transformation allows us to synthesize a novel view
from any single source viewpoint image of unknown pose. Experiments on the
widely used view synthesis datasets demonstrate that the proposed network
produces results comparable to those of state-of-the-art methods, even though
learning is unsupervised and only a single source viewpoint image is required
to generate a novel view. The code will be available soon.
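The abstract describes only the forward pipeline (the TTM maps source-view features to an intrinsic representation tied to a pre-defined reference pose, and the VGM decodes that representation into a view at an arbitrary target pose) without implementation details. Below is a minimal PyTorch-style sketch of that pipeline under stated assumptions: the transformer-encoder TTM, the pose-conditioned convolutional VGM, all layer sizes, and the 6-D pose encoding are hypothetical illustrations, not the authors' implementation, and the unsupervised training objective is not covered here.

```python
import torch
import torch.nn as nn

class TokenTransformationModule(nn.Module):
    """Hypothetical TTM: maps source-view feature tokens into an intrinsic
    representation with respect to a pre-defined reference pose."""
    def __init__(self, dim=256, num_heads=4, depth=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, tokens):           # tokens: (B, N, dim) from a source image
        return self.encoder(tokens)      # intrinsic representation: (B, N, dim)

class ViewGenerationModule(nn.Module):
    """Hypothetical VGM: decodes the intrinsic representation into an image
    for an arbitrary target pose (a simple pose-conditioned CNN decoder)."""
    def __init__(self, dim=256, pose_dim=6):
        super().__init__()
        self.pose_proj = nn.Linear(pose_dim, dim)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(dim, 128, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Sigmoid(),
        )

    def forward(self, intrinsic, target_pose):
        b, n, d = intrinsic.shape
        h = w = int(n ** 0.5)                                      # assumes a square token grid
        x = intrinsic + self.pose_proj(target_pose).unsqueeze(1)   # condition on target pose
        x = x.transpose(1, 2).reshape(b, d, h, w)                  # tokens -> feature map
        return self.decoder(x)                                     # synthesized view: (B, 3, 8h, 8w)

# Example usage with an 8x8 grid of 256-d tokens extracted from a source image:
ttm, vgm = TokenTransformationModule(), ViewGenerationModule()
tokens = torch.randn(2, 64, 256)
novel_view = vgm(ttm(tokens), torch.randn(2, 6))                   # (2, 3, 64, 64)
```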
Related papers
- View-Invariant Policy Learning via Zero-Shot Novel View Synthesis [26.231630397802785]
We investigate how knowledge from large-scale visual data of the world may be used to address one axis of variation for generalizable manipulation: observational viewpoint.
We study single-image novel view synthesis models, which learn 3D-aware scene-level priors by rendering images of the same scene from alternate camera viewpoints.
For practical application to diverse robotic data, these models must operate zero-shot, performing view synthesis on unseen tasks and environments.
arXiv Detail & Related papers (2024-09-05T16:39:21Z)
- FSViewFusion: Few-Shots View Generation of Novel Objects [75.81872204650807]
We introduce a pretrained Stable Diffusion model for view synthesis without explicit 3D priors.
Specifically, we base our method on a personalized text-to-image model, DreamBooth, given its strong ability to adapt to specific novel objects with a few shots.
We establish that the concept of a view can be disentangled and transferred to a novel object irrespective of the identity of the original object from which the views are learnt.
arXiv Detail & Related papers (2024-03-11T02:59:30Z)
- UpFusion: Novel View Diffusion from Unposed Sparse View Observations [66.36092764694502]
UpFusion can perform novel view synthesis and infer 3D representations for an object given a sparse set of reference images.
We show that this mechanism allows generating high-fidelity novel views while improving the synthesis quality given additional (unposed) images.
arXiv Detail & Related papers (2023-12-11T18:59:55Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering (a generic sketch of this volume-rendering step is given after this list).
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- Self-Supervised Visibility Learning for Novel View Synthesis [79.53158728483375]
Conventional rendering methods estimate scene geometry and synthesize novel views in two separate steps, so errors in the estimated geometry propagate to the synthesized views.
We propose an end-to-end NVS framework to eliminate this error propagation issue.
Our network is trained in an end-to-end self-supervised fashion, thus significantly alleviating error accumulation in view synthesis.
arXiv Detail & Related papers (2021-03-29T08:11:25Z)
- Unsupervised Novel View Synthesis from a Single Image [47.37120753568042]
Novel view synthesis from a single image aims at generating novel views from a single input image of an object.
This work aims at relaxing the need for pose supervision, enabling a conditional generative model for novel view synthesis to be trained in a completely unsupervised manner.
arXiv Detail & Related papers (2021-02-05T16:56:04Z)
- Deep View Synthesis via Self-Consistent Generative Network [41.34461086700849]
View synthesis aims to produce unseen views from a set of views captured by two or more cameras at different positions.
Most existing methods address this task by exploiting geometric information to match pixels across views.
We propose a novel deep generative model, called Self-Consistent Generative Network (SCGN), which synthesizes novel views without explicitly exploiting the geometric information.
arXiv Detail & Related papers (2021-01-19T10:56:00Z)
- AUTO3D: Novel view synthesis through unsupervisely learned variational viewpoint and global 3D representation [27.163052958878776]
This paper targets learning-based novel view synthesis from a single or a limited number of 2D images without pose supervision.
We construct an end-to-end trainable conditional variational framework to disentangle the unsupervisedly learned relative pose/rotation and an implicit global 3D representation.
Our system can achieve implicit 3D understanding without explicit 3D reconstruction.
arXiv Detail & Related papers (2020-07-13T18:51:27Z)
- Single-View View Synthesis with Multiplane Images [64.46556656209769]
Recent work applies deep learning to generate multiplane images given two or more input images at known viewpoints.
Our method learns to predict a multiplane image directly from a single image input.
It additionally generates reasonable depth maps and fills in content behind the edges of foreground objects in background layers.
arXiv Detail & Related papers (2020-04-23T17:59:19Z)
- Sequential View Synthesis with Transformer [13.200139959163574]
We introduce a sequential rendering decoder to predict an image sequence, including the target view, based on the learned representations.
We evaluate our model on various challenging datasets and demonstrate that it not only gives consistent predictions but also requires no retraining for fine-tuning.
arXiv Detail & Related papers (2020-04-09T14:15:27Z)
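As referenced in the Vision Transformer entry above, NeRF-style methods composite the per-sample colour and density predicted by a conditioned MLP using the standard volume-rendering quadrature. The sketch below is a generic, minimal PyTorch implementation of that compositing step only (not code from any of the listed papers); the tensor shapes and the small stabilising constant are conventional choices.

```python
import torch

def volume_render(rgb_sigma, deltas):
    """Composite per-sample predictions along each ray into a pixel colour.
    rgb_sigma: (..., S, 4) per-sample RGB and density from the conditioned MLP
    deltas:    (..., S)    distances between consecutive samples along the ray
    """
    rgb, sigma = rgb_sigma[..., :3], rgb_sigma[..., 3].relu()
    alpha = 1.0 - torch.exp(-sigma * deltas)            # per-sample opacity
    trans = torch.cumprod(                              # accumulated transmittance T_i
        torch.cat([torch.ones_like(alpha[..., :1]), 1.0 - alpha + 1e-10], dim=-1),
        dim=-1)[..., :-1]
    weights = alpha * trans                             # w_i = T_i * (1 - exp(-sigma_i * delta_i))
    return (weights.unsqueeze(-1) * rgb).sum(dim=-2)    # (..., 3) rendered pixel colour
```

In a single-image setting, rgb_sigma would come from an MLP queried at sample points along each target-view ray and conditioned on features extracted from the input image.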