NeRF-VAE: A Geometry Aware 3D Scene Generative Model
- URL: http://arxiv.org/abs/2104.00587v1
- Date: Thu, 1 Apr 2021 16:16:31 GMT
- Title: NeRF-VAE: A Geometry Aware 3D Scene Generative Model
- Authors: Adam R. Kosiorek, Heiko Strathmann, Daniel Zoran, Pol Moreno, Rosalia
Schneider, Soňa Mokrá, Danilo J. Rezende
- Abstract summary: We propose NeRF-VAE, a 3D scene generative model that incorporates geometric structure via NeRF and differentiable volume rendering.
NeRF-VAE's explicit 3D rendering process contrasts with previous generative models, which rely on convolution-based rendering.
We show that, once trained, NeRF-VAE is able to infer and render geometrically-consistent scenes from previously unseen 3D environments.
- Score: 14.593550382914767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose NeRF-VAE, a 3D scene generative model that incorporates geometric
structure via NeRF and differentiable volume rendering. In contrast to NeRF,
our model takes into account shared structure across scenes, and is able to
infer the structure of a novel scene -- without the need to re-train -- using
amortized inference. NeRF-VAE's explicit 3D rendering process further contrasts
with previous generative models, which rely on convolution-based rendering and
thus lack geometric structure. Our model is a VAE that learns a distribution over
radiance fields by conditioning them on a latent scene representation. We show
that, once trained, NeRF-VAE is able to infer and render
geometrically-consistent scenes from previously unseen 3D environments using
very few input images. We further demonstrate that NeRF-VAE generalizes well to
out-of-distribution cameras, while convolutional models do not. Finally, we
introduce and study an attention-based conditioning mechanism of NeRF-VAE's
decoder, which improves model performance.
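To make the abstract's description concrete, the following is a minimal, hypothetical PyTorch sketch of a NeRF-VAE-style forward pass: an amortized encoder maps a few input views to a posterior over a latent scene code z, a radiance-field MLP is conditioned on z, pixels are produced by differentiable volume rendering, and training minimizes a negative ELBO. All module names, layer sizes, and the latent dimension are illustrative assumptions rather than the paper's exact architecture, and the attention-based decoder conditioning is omitted.

# Hypothetical sketch of a NeRF-VAE-style model (not the authors' code).
import torch
import torch.nn as nn

class SceneEncoder(nn.Module):
    """Amortized inference network: maps a few posed views to q(z | views)."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_stats = nn.Linear(64, 2 * latent_dim)

    def forward(self, views):                        # views: (B, V, 3, H, W)
        b, v = views.shape[:2]
        feats = self.cnn(views.flatten(0, 1))        # (B*V, 64)
        feats = feats.view(b, v, -1).mean(dim=1)     # aggregate over input views
        mu, log_sigma = self.to_stats(feats).chunk(2, dim=-1)
        return mu, log_sigma

class ConditionalNeRF(nn.Module):
    """Radiance-field MLP conditioned on the latent scene code z."""
    def __init__(self, latent_dim=128, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),                    # (r, g, b, density)
        )

    def forward(self, points, z):                    # points: (B, R, S, 3)
        z_exp = z[:, None, None, :].expand(*points.shape[:3], -1)
        out = self.mlp(torch.cat([points, z_exp], dim=-1))
        rgb, sigma = torch.sigmoid(out[..., :3]), torch.relu(out[..., 3])
        return rgb, sigma

def volume_render(rgb, sigma, deltas):
    """Standard volume-rendering quadrature along each ray."""
    alpha = 1.0 - torch.exp(-sigma * deltas)                        # (B, R, S)
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=-1)
    trans = torch.cat([torch.ones_like(trans[..., :1]), trans[..., :-1]], dim=-1)
    weights = alpha * trans
    return (weights[..., None] * rgb).sum(dim=-2)                   # (B, R, 3)

def negative_elbo(views, points, deltas, target_rgb, encoder, decoder):
    """Pixel reconstruction error plus KL to a standard-normal prior."""
    mu, log_sigma = encoder(views)
    z = mu + torch.exp(log_sigma) * torch.randn_like(mu)            # reparameterization
    rgb, sigma = decoder(points, z)
    recon = ((volume_render(rgb, sigma, deltas) - target_rgb) ** 2).mean()
    kl = 0.5 * (torch.exp(2 * log_sigma) + mu ** 2 - 1 - 2 * log_sigma).sum(-1).mean()
    return recon + kl

Because inference is amortized in the encoder, a novel scene only requires a forward pass over its few input views to obtain z; no per-scene optimization is needed, which is the key difference from fitting a NeRF per scene.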
Related papers
- DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features [65.8738034806085]
DistillNeRF is a self-supervised learning framework for understanding 3D environments in autonomous driving scenes.
Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs.
arXiv Detail & Related papers (2024-06-17T21:15:13Z)
- ReconFusion: 3D Reconstruction with Diffusion Priors [104.73604630145847]
We present ReconFusion to reconstruct real-world scenes using only a few photos.
Our approach leverages a diffusion prior for novel view synthesis, trained on synthetic and multiview datasets.
Our method synthesizes realistic geometry and texture in underconstrained regions while preserving the appearance of observed regions.
arXiv Detail & Related papers (2023-12-05T18:59:58Z)
- Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction [77.69363640021503]
3D-aware image synthesis encompasses a variety of tasks, such as scene generation and novel view synthesis from images.
We present SSDNeRF, a unified approach that employs an expressive diffusion model to learn a generalizable prior of neural radiance fields (NeRF) from multi-view images of diverse objects.
arXiv Detail & Related papers (2023-04-13T17:59:01Z)
- NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions [97.27105725738016]
The integration of Neural Radiance Fields (NeRFs) and generative models, such as Generative Adversarial Networks (GANs), has transformed 3D-aware generation from single-view images.
We propose a simple and effective method, based on re-using the well-disentangled latent space of a pre-trained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations.
arXiv Detail & Related papers (2023-03-22T18:59:48Z)
- DehazeNeRF: Multiple Image Haze Removal and 3D Shape Reconstruction using Neural Radiance Fields [56.30120727729177]
We introduce DehazeNeRF as a framework that robustly operates in hazy conditions.
We demonstrate successful multi-view haze removal, novel view synthesis, and 3D shape reconstruction where existing approaches fail.
arXiv Detail & Related papers (2023-03-20T18:03:32Z)
- NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion [107.67277084886929]
Novel view synthesis from a single image requires inferring occluded regions of objects and scenes whilst simultaneously maintaining semantic and physical consistency with the input.
We propose NerfDiff, which addresses this issue by distilling the knowledge of a 3D-aware conditional diffusion model (CDM) into NeRF through synthesizing and refining a set of virtual views at test time.
We further propose a novel NeRF-guided distillation algorithm that simultaneously generates 3D consistent virtual views from the CDM samples, and finetunes the NeRF based on the improved virtual views.
arXiv Detail & Related papers (2023-02-20T17:12:00Z)
- NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors [24.05480789681139]
We propose NeRDi, a single-view NeRF synthesis framework with general image priors from 2D diffusion models.
We leverage off-the-shelf vision-language models and introduce a two-section language guidance as conditioning inputs to the diffusion model.
We also demonstrate our generalizability in zero-shot NeRF synthesis for in-the-wild images.
arXiv Detail & Related papers (2022-12-06T19:00:07Z)
- FDNeRF: Few-shot Dynamic Neural Radiance Fields for Face Reconstruction and Expression Editing [27.014582934266492]
We propose a Few-shot Dynamic Neural Radiance Field (FDNeRF), the first NeRF-based method capable of reconstruction and expression editing of 3D faces.
Unlike existing dynamic NeRFs that require dense images as input and can only be modeled for a single identity, our method enables face reconstruction across different persons with few-shot inputs.
arXiv Detail & Related papers (2022-08-11T11:05:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.