PolyOculus: Simultaneous Multi-view Image-based Novel View Synthesis
- URL: http://arxiv.org/abs/2402.17986v3
- Date: Thu, 25 Jul 2024 18:47:03 GMT
- Title: PolyOculus: Simultaneous Multi-view Image-based Novel View Synthesis
- Authors: Jason J. Yu, Tristan Aumentado-Armstrong, Fereshteh Forghani, Konstantinos G. Derpanis, Marcus A. Brubaker
- Abstract summary: We propose a set-based generative model that can simultaneously generate multiple, self-consistent new views.
Our approach is not limited to generating a single image at a time and can condition on a variable number of views.
We show that the model is capable of generating sets of views that have no natural ordering, like loops and binocular trajectories, and significantly outperforms other methods on such tasks.
- Score: 23.967904337714234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper considers the problem of generative novel view synthesis (GNVS), generating novel, plausible views of a scene given a limited number of known views. Here, we propose a set-based generative model that can simultaneously generate multiple, self-consistent new views, conditioned on any number of views. Our approach is not limited to generating a single image at a time and can condition on a variable number of views. As a result, when generating a large number of views, our method is not restricted to a low-order autoregressive generation approach and is better able to maintain generated image quality over large sets of images. We evaluate our model on standard NVS datasets and show that it outperforms the state-of-the-art image-based GNVS baselines. Further, we show that the model is capable of generating sets of views that have no natural sequential ordering, like loops and binocular trajectories, and significantly outperforms other methods on such tasks.
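To make the set-based idea in the abstract concrete, below is a minimal, hypothetical sketch of jointly sampling a set of target views conditioned on a variable number of known views, in contrast to generating one image at a time autoregressively. The `SetDenoiser` module, its arguments, and the simple refinement schedule are illustrative assumptions for exposition only, not the paper's actual architecture or sampler.
```python
import torch
import torch.nn as nn

class SetDenoiser(nn.Module):
    """Toy stand-in for a pose-conditioned set denoiser (e.g., a multi-view U-Net)."""

    def __init__(self, channels: int = 3):
        super().__init__()
        self.net = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, noisy_targets, known_views, poses):
        # A real set model would attend across every view in the set; this toy
        # version just refines each target view independently.
        b, k, c, h, w = noisy_targets.shape
        out = self.net(noisy_targets.reshape(b * k, c, h, w))
        return out.reshape(b, k, c, h, w)


@torch.no_grad()
def sample_view_set(model, known_views, poses, num_targets, num_steps=50):
    """Jointly sample `num_targets` new views conditioned on any number of known views."""
    b, _, c, h, w = known_views.shape
    x = torch.randn(b, num_targets, c, h, w)   # start every target view from noise
    for t in range(num_steps):                 # simple iterative refinement loop
        pred = model(x, known_views, poses)
        step = 1.0 / (num_steps - t)           # toy schedule, not a real diffusion sampler
        x = x + step * (pred - x)              # all target views are updated together
    return x


# Usage: condition on 2 known views and generate 6 new views in one pass.
model = SetDenoiser()
known = torch.randn(1, 2, 3, 64, 64)           # (batch, views, channels, H, W)
poses = torch.randn(1, 8, 6)                   # dummy camera parameters for 2 + 6 views
views = sample_view_set(model, known, poses, num_targets=6)
print(views.shape)                             # torch.Size([1, 6, 3, 64, 64])
```
The point of the sketch is the sampling interface: because all target views are refined in one joint pass, quality does not depend on a low-order autoregressive chain when generating large view sets.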
Related papers
- MultiDiff: Consistent Novel View Synthesis from a Single Image [60.04215655745264]
MultiDiff is a novel approach for consistent novel view synthesis of scenes from a single RGB image.
Our results demonstrate that MultiDiff outperforms state-of-the-art methods on the challenging, real-world datasets RealEstate10K and ScanNet.
arXiv Detail & Related papers (2024-06-26T17:53:51Z)
- GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping [47.38125925469167]
We propose a semantic-preserving generative warping framework to generate novel views from a single image.
Our approach addresses the limitations of existing methods by conditioning the generative model on source view images.
Our model outperforms existing methods in both in-domain and out-of-domain scenarios.
arXiv Detail & Related papers (2024-05-27T15:07:04Z)
- ViewFusion: Learning Composable Diffusion Models for Novel View Synthesis [47.57948804514928]
This work introduces ViewFusion, a state-of-the-art end-to-end generative approach to novel view synthesis.
ViewFusion simultaneously applies a diffusion denoising step to any number of input views of a scene.
arXiv Detail & Related papers (2024-02-05T11:22:14Z)
- SyncDreamer: Generating Multiview-consistent Images from a Single-view Image [59.75474518708409]
A novel diffusion model called SyncDreamer generates multiview-consistent images from a single-view image.
Experiments show that SyncDreamer generates images with high consistency across different views.
arXiv Detail & Related papers (2023-09-07T02:28:04Z)
- Multi-object Video Generation from Single Frame Layouts [84.55806837855846]
We propose a video generative framework capable of synthesizing global scenes with local objects.
Our framework is a non-trivial adaptation of image generation methods and is new to this field.
Our model has been evaluated on two widely-used video recognition benchmarks.
arXiv Detail & Related papers (2023-05-06T09:07:01Z)
- InfiniteNature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images [83.37640073416749]
We present a method for learning to generate flythrough videos of natural scenes starting from a single view.
This capability is learned from a collection of single photographs, without requiring camera poses or even multiple views of each scene.
arXiv Detail & Related papers (2022-07-22T15:41:06Z)
- Deep View Synthesis via Self-Consistent Generative Network [41.34461086700849]
View synthesis aims to produce unseen views from a set of views captured by two or more cameras at different positions.
To address this, most existing methods exploit geometric information to match pixels across views.
We propose a novel deep generative model, called Self-Consistent Generative Network (SCGN), which synthesizes novel views without explicitly exploiting the geometric information.
arXiv Detail & Related papers (2021-01-19T10:56:00Z)
- Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image [73.56631858393148]
We introduce the problem of perpetual view generation -- long-range generation of novel views corresponding to an arbitrarily long camera trajectory given a single image.
We take a hybrid approach that integrates both geometry and image synthesis in an iterative render, refine, and repeat framework.
Our approach can be trained from a set of monocular video sequences without any manual annotation.
arXiv Detail & Related papers (2020-12-17T18:59:57Z)
- Single-View View Synthesis with Multiplane Images [64.46556656209769]
Recent work uses deep learning to generate multiplane images given two or more input images at known viewpoints.
Our method learns to predict a multiplane image directly from a single image input.
It additionally generates reasonable depth maps and fills in content behind the edges of foreground objects in background layers (see the rendering sketch after this list).
arXiv Detail & Related papers (2020-04-23T17:59:19Z)
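For the multiplane-image entry above, here is a minimal sketch of the generic MPI rendering step: the scene is a stack of fronto-parallel RGBA layers, and a view is rendered by alpha-compositing the layers from back to front (for a novel viewpoint, each layer would first be warped by a plane-induced homography, which is omitted here). This illustrates the MPI representation in general, not the specific network of that paper.
```python
import numpy as np

def composite_mpi(rgba_layers):
    """Alpha-composite MPI layers, ordered farthest to nearest, into an RGB image."""
    rgb_out = np.zeros_like(rgba_layers[0][..., :3])
    for layer in rgba_layers:                          # back-to-front "over" compositing
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        rgb_out = rgb * alpha + rgb_out * (1.0 - alpha)
    return rgb_out

# Usage: composite 8 random 64x64 RGBA layers into one image.
layers = [np.random.rand(64, 64, 4) for _ in range(8)]
image = composite_mpi(layers)
print(image.shape)                                     # (64, 64, 3)
```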
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.