Unsupervised Novel View Synthesis from a Single Image
- URL: http://arxiv.org/abs/2102.03285v1
- Date: Fri, 5 Feb 2021 16:56:04 GMT
- Title: Unsupervised Novel View Synthesis from a Single Image
- Authors: Pierluigi Zama Ramirez, Alessio Tonioni, Federico Tombari
- Abstract summary: Novel view synthesis from a single image aims at generating novel views from a single input image of an object.
This work aims at relaxing this assumption, enabling the training of a conditional generative model for novel view synthesis in a completely unsupervised manner.
- Score: 47.37120753568042
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Novel view synthesis from a single image aims at generating novel views from
a single input image of an object. Several works have recently achieved
remarkable results, though they require some form of multi-view supervision at
training time, which limits their deployment in real scenarios. This work aims
at relaxing this assumption, enabling the training of a conditional generative
model for
novel view synthesis in a completely unsupervised manner. We first pre-train a
purely generative decoder model using a GAN formulation while at the same time
training an encoder network to invert the mapping from latent code to images.
Then we swap the encoder and decoder and train the network as a conditional
GAN with a mixture of an auto-encoder-like objective and self-distillation. At test
time, given a view of an object, our model first embeds the image content in a
latent code and regresses its pose w.r.t. a canonical reference system, then
generates novel views of it by keeping the code and varying the pose. We show
that our framework achieves results comparable to the state of the art on
ShapeNet and that it can be employed on unconstrained collections of natural
images, where no competing method can be trained.
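To make the test-time procedure concrete, here is a minimal PyTorch-style sketch of the encode-then-resynthesize loop. The `Encoder` and `Decoder` modules, the layer sizes, and the (cos, sin) azimuth pose encoding are all illustrative assumptions rather than the paper's implementation; only the overall flow (embed the input view once, keep the content code, vary the pose, decode) follows the abstract.
```python
import math

import torch
import torch.nn as nn

# Hypothetical modules illustrating the interfaces implied by the abstract.
# The paper's real architectures, latent size, and pose parameterization are
# not specified here and will differ.
class Encoder(nn.Module):
    """Embeds an image into a content code and regresses its pose."""
    def __init__(self, latent_dim=256, pose_dim=2):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_code = nn.Linear(128, latent_dim)  # image content
        self.to_pose = nn.Linear(128, pose_dim)    # pose w.r.t. canonical frame

    def forward(self, image):
        feats = self.backbone(image)
        return self.to_code(feats), self.to_pose(feats)


class Decoder(nn.Module):
    """Renders a view from a (content code, pose) pair."""
    def __init__(self, latent_dim=256, pose_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + pose_dim, 128 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (128, 8, 8)),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, code, pose):
        return self.net(torch.cat([code, pose], dim=1))


@torch.no_grad()
def synthesize_novel_views(encoder, decoder, image, target_poses):
    """Embed the input view once, then decode the same content code
    under each requested pose."""
    code, _estimated_pose = encoder(image)
    return [decoder(code, pose) for pose in target_poses]


if __name__ == "__main__":
    enc, dec = Encoder().eval(), Decoder().eval()
    view = torch.randn(1, 3, 32, 32)  # placeholder for a real input image
    # Sweep the azimuth, encoding each pose as (cos a, sin a) -- an assumption.
    poses = [torch.tensor([[math.cos(a), math.sin(a)]]) for a in (0.0, 0.8, 1.6)]
    novel_views = synthesize_novel_views(enc, dec, view, poses)
    print([v.shape for v in novel_views])  # three 1x3x32x32 images
```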
Related papers
- Image Generation from Image Captioning -- Invertible Approach [0.0]
We train an invertible model that learns a one-to-one mapping between the image and text embeddings.
Once the invertible model is efficiently trained on one task, image captioning, the same model can generate new images for a given text.
arXiv Detail & Related papers (2024-10-26T13:02:58Z)
- UpFusion: Novel View Diffusion from Unposed Sparse View Observations [66.36092764694502]
UpFusion can perform novel view synthesis and infer 3D representations for an object given a sparse set of reference images.
We show that this mechanism allows generating high-fidelity novel views while improving the synthesis quality given additional (unposed) images.
arXiv Detail & Related papers (2023-12-11T18:59:55Z)
- Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation [78.13793505707952]
Existing autoregressive models follow the two-stage generation paradigm that first learns a codebook in the latent space for image reconstruction and then completes the image generation autoregressively based on the learned codebook.
We propose a novel two-stage framework, consisting of a Masked Quantization VAE (MQ-VAE) and a Stackformer, to relieve the model from modeling redundancy.
arXiv Detail & Related papers (2023-05-23T02:15:53Z)
- im2nerf: Image to Neural Radiance Field in the Wild [47.18702901448768]
im2nerf is a learning framework that predicts a continuous neural object representation given a single input image in the wild.
We show that im2nerf achieves state-of-the-art performance for novel view synthesis from a single-view unposed image in the wild.
arXiv Detail & Related papers (2022-09-08T23:28:56Z)
- Neural Rendering of Humans in Novel View and Pose from Monocular Video [68.37767099240236]
We introduce a new method that generates photo-realistic humans under novel views and poses given a monocular video as input.
Our method significantly outperforms existing approaches under unseen poses and novel views given monocular videos as input.
arXiv Detail & Related papers (2022-04-04T03:09:20Z)
- Novel View Synthesis from a Single Image via Unsupervised Learning [27.639536023956122]
We propose an unsupervised network to learn such a pixel transformation from a single source viewpoint.
The learned transformation allows us to synthesize a novel view from any single source viewpoint image of unknown pose.
arXiv Detail & Related papers (2021-10-29T06:32:49Z)
- Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework where a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z)
- Augmentation-Interpolative AutoEncoders for Unsupervised Few-Shot Image Generation [45.380129419065746]
Augmentation-Interpolative AutoEncoders synthesize realistic images of novel objects from only a few reference images.
Our procedure is simple and lightweight, generalizes broadly, and requires no category labels or other supervision during training.
arXiv Detail & Related papers (2020-11-25T21:18:55Z)
- Sequential View Synthesis with Transformer [13.200139959163574]
We introduce a sequential rendering decoder to predict an image sequence, including the target view, based on the learned representations.
We evaluate our model on various challenging datasets and demonstrate that it not only gives consistent predictions but also requires no retraining for fine-tuning.
arXiv Detail & Related papers (2020-04-09T14:15:27Z)