PixelSynth: Generating a 3D-Consistent Experience from a Single Image
- URL: http://arxiv.org/abs/2108.05892v1
- Date: Thu, 12 Aug 2021 17:59:31 GMT
- Title: PixelSynth: Generating a 3D-Consistent Experience from a Single Image
- Authors: Chris Rockwell, David F. Fouhey, Justin Johnson
- Abstract summary: We present an approach that fuses 3D reasoning with autoregressive modeling to outpaint large view changes in a 3D-consistent manner.
We demonstrate considerable improvement in single image large-angle view synthesis results compared to a variety of methods and possible variants.
- Score: 30.64117903216323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in differentiable rendering and 3D reasoning have driven
exciting results in novel view synthesis from a single image. Despite realistic
results, methods are limited to relatively small view change. In order to
synthesize immersive scenes, models must also be able to extrapolate. We
present an approach that fuses 3D reasoning with autoregressive modeling to
outpaint large view changes in a 3D-consistent manner, enabling scene
synthesis. We demonstrate considerable improvement in single image large-angle
view synthesis results compared to a variety of methods and possible variants
across simulated and real datasets. In addition, we show increased 3D
consistency compared to alternative accumulation methods. Project website:
https://crockwell.github.io/pixelsynth/
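To make the abstract's description concrete, here is a minimal sketch of a PixelSynth-style pipeline: lift the input image to a colored point cloud with predicted depth, re-render it from the target camera, and autoregressively outpaint the unseen regions. The components depth_fn, render_fn, and outpaint_fn are hypothetical placeholders, not the paper's actual code.

```python
import numpy as np

def synthesize_view(image, K, target_pose, depth_fn, render_fn, outpaint_fn):
    """Sketch of a PixelSynth-style pipeline (hypothetical components).

    image:       (H, W, 3) source RGB image
    K:           (3, 3) camera intrinsics
    target_pose: (4, 4) source-to-target camera transform
    depth_fn:    predicts per-pixel depth from a single image
    render_fn:   splats a colored point cloud into the target view and
                 returns an RGB image plus a visibility mask
    outpaint_fn: autoregressive generator that fills pixels where mask == 0
    """
    # 1. Lift the source image to a colored 3D point cloud using predicted depth.
    depth = depth_fn(image)                         # (H, W)
    H, W = depth.shape
    ys, xs = np.mgrid[0:H, 0:W]
    pixels = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3)
    rays = pixels @ np.linalg.inv(K).T              # back-project pixel rays
    points = rays * depth.reshape(-1, 1)            # scale rays by depth
    colors = image.reshape(-1, 3)

    # 2. Transform points into the target camera frame and re-render them.
    points_h = np.concatenate([points, np.ones((points.shape[0], 1))], axis=1)
    points_tgt = (points_h @ target_pose.T)[:, :3]
    partial_rgb, visible_mask = render_fn(points_tgt, colors, K, (H, W))

    # 3. Autoregressively outpaint the regions the source view never observed.
    return outpaint_fn(partial_rgb, visible_mask)
```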
Related papers
- Hybrid bundle-adjusting 3D Gaussians for view consistent rendering with pose optimization [2.8990883469500286]
We introduce a hybrid bundle-adjusting 3D Gaussians model that enables view-consistent rendering with pose optimization.
This model jointly extracts image-based and neural 3D representations to simultaneously generate view-consistent images and camera poses within forward-facing scenes.
arXiv Detail & Related papers (2024-10-17T07:13:00Z) - Denoising Diffusion via Image-Based Rendering [54.20828696348574]
We introduce the first diffusion model able to perform fast, detailed reconstruction and generation of real-world 3D scenes.
First, we introduce a new neural scene representation, IB-planes, that can efficiently and accurately represent large 3D scenes.
Second, we propose a denoising-diffusion framework to learn a prior over this novel 3D scene representation, using only 2D images.
arXiv Detail & Related papers (2024-02-05T19:00:45Z) - WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space [77.92350895927922]
We propose WildFusion, a new approach to 3D-aware image synthesis based on latent diffusion models (LDMs).
Our 3D-aware LDM is trained without any direct supervision from multiview images or 3D geometry.
This opens up promising research avenues for scalable 3D-aware image synthesis and 3D content creation from in-the-wild image data.
arXiv Detail & Related papers (2023-11-22T18:25:51Z) - Generative Novel View Synthesis with 3D-Aware Diffusion Models [96.78397108732233]
We present a diffusion-based model for 3D-aware generative novel view synthesis from as few as a single input image.
Our method makes use of existing 2D diffusion backbones but, crucially, incorporates geometry priors in the form of a 3D feature volume.
In addition to generating novel views, our method has the ability to autoregressively synthesize 3D-consistent sequences.
arXiv Detail & Related papers (2023-04-05T17:15:47Z) - High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z) - Novel View Synthesis with Diffusion Models [56.55571338854636]
We present 3DiM, a diffusion model for 3D novel view synthesis.
It is able to translate a single input view into consistent and sharp completions across many views.
3DiM can generate multiple views that are 3D consistent using a novel technique called stochastic conditioning (see the sketch after this list).
arXiv Detail & Related papers (2022-10-06T16:59:56Z) - VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids [42.74658047803192]
State-of-the-art 3D-aware generative models rely on coordinate-based MLPs to parameterize 3D radiance fields.
Existing approaches often render low-resolution feature maps and process them with an upsampling network to obtain the final image.
In contrast to existing approaches, our method requires only a single forward pass to generate a full 3D scene.
arXiv Detail & Related papers (2022-06-15T17:44:22Z) - A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis [163.96778522283967]
We propose a shading-guided generative implicit model that is able to learn a starkly improved shape representation.
An accurate 3D shape should also yield a realistic rendering under different lighting conditions.
Our experiments on multiple datasets show that the proposed approach achieves photorealistic 3D-aware image synthesis.
arXiv Detail & Related papers (2021-10-29T10:53:12Z) - Geometry-Free View Synthesis: Transformers and no 3D Priors [16.86600007830682]
We show that a transformer-based model can synthesize entirely novel views without any hand-engineered 3D biases.
This is achieved by a global attention mechanism that implicitly learns long-range 3D correspondences between source and target views.
arXiv Detail & Related papers (2021-04-15T17:58:05Z)
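The 3DiM entry above mentions stochastic conditioning. A rough sketch of that sampling idea, assuming a hypothetical denoise_step(noisy, cond_view, cond_pose, target_pose, t) that performs one reverse-diffusion step, could look like the following; the function names are illustrative, not from the paper's code.

```python
import random
import numpy as np

def sample_views_stochastic_conditioning(input_view, input_pose, target_poses,
                                         denoise_step, num_steps=256):
    """Sketch of 3DiM-style stochastic conditioning (hypothetical denoise_step).

    While generating each new frame, the conditioning frame is re-drawn at
    random at every denoising step from the pool of already available views,
    so information from all previous frames can influence the sample.
    """
    views = [input_view]
    poses = [input_pose]
    for target_pose in target_poses:
        x = np.random.randn(*input_view.shape)      # start from pure noise
        for t in reversed(range(num_steps)):
            idx = random.randrange(len(views))      # pick a random conditioner
            x = denoise_step(x, views[idx], poses[idx], target_pose, t)
        views.append(x)
        poses.append(target_pose)
    return views[1:]
```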