GIRAFFE: Representing Scenes as Compositional Generative Neural Feature
Fields
- URL: http://arxiv.org/abs/2011.12100v2
- Date: Thu, 29 Apr 2021 14:46:36 GMT
- Title: GIRAFFE: Representing Scenes as Compositional Generative Neural Feature
Fields
- Authors: Michael Niemeyer, Andreas Geiger
- Abstract summary: Deep generative models allow for photorealistic image synthesis at high resolutions.
But for many applications, this is not enough: content creation also needs to be controllable.
Our key hypothesis is that incorporating a compositional 3D scene representation into the generative model leads to more controllable image synthesis.
- Score: 45.21191307444531
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep generative models allow for photorealistic image synthesis at high
resolutions. But for many applications, this is not enough: content creation
also needs to be controllable. While several recent works investigate how to
disentangle underlying factors of variation in the data, most of them operate
in 2D and hence ignore that our world is three-dimensional. Further, only a few
works consider the compositional nature of scenes. Our key hypothesis is that
incorporating a compositional 3D scene representation into the generative model
leads to more controllable image synthesis. Representing scenes as
compositional generative neural feature fields allows us to disentangle one or
multiple objects from the background as well as individual objects' shapes and
appearances while learning from unstructured and unposed image collections
without any additional supervision. Combining this scene representation with a
neural rendering pipeline yields a fast and realistic image synthesis model. As
evidenced by our experiments, our model is able to disentangle individual
objects and allows for translating and rotating them in the scene as well as
changing the camera pose.
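The following is a minimal, self-contained sketch (NumPy, with toy stand-ins for the learned MLPs; all names, shapes, and constants are illustrative assumptions, not the authors' code) of the core idea described in the abstract: each object is a feature field returning a density and a feature vector, the fields are composited by summing densities and density-weighting the features, and the composite is volume-rendered along camera rays into a feature image that a 2D neural rendering network would then upsample to the final RGB output.

```python
# Hedged sketch of compositional generative neural feature fields:
# per-object fields -> density-weighted composition -> volume rendering of features.
import numpy as np

def toy_feature_field(shape_code, appearance_code):
    """Stand-in for a learned MLP: returns (density, feature) at 3D points x."""
    def field(x):                                    # x: (N, 3)
        density = np.exp(-np.sum((x - shape_code[:3]) ** 2, axis=-1))  # (N,)
        feature = np.tanh(x @ appearance_code.reshape(3, -1))          # (N, F)
        return density, feature
    return field

def composite(fields, x):
    """Composition operator: densities add, features are density-weighted."""
    densities, features = zip(*(f(x) for f in fields))
    densities = np.stack(densities)                  # (K, N)
    features = np.stack(features)                    # (K, N, F)
    total_density = densities.sum(axis=0)            # (N,)
    weights = densities / np.clip(total_density, 1e-8, None)
    combined_feature = (weights[..., None] * features).sum(axis=0)     # (N, F)
    return total_density, combined_feature

def render_ray(fields, origin, direction, near=0.5, far=4.0, n_samples=32):
    """Volume-render one ray into a feature vector via alpha compositing."""
    t = np.linspace(near, far, n_samples)
    x = origin + t[:, None] * direction              # (n_samples, 3)
    density, feature = composite(fields, x)
    delta = t[1] - t[0]
    alpha = 1.0 - np.exp(-density * delta)
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = transmittance * alpha
    return (weights[:, None] * feature).sum(axis=0)  # (F,) per-pixel feature

# Two "objects" plus a background, each with its own shape/appearance code.
rng = np.random.default_rng(0)
fields = [toy_feature_field(rng.normal(size=3), rng.normal(size=6)) for _ in range(3)]
pixel_feature = render_ray(fields, origin=np.zeros(3),
                           direction=np.array([0.0, 0.0, 1.0]))
print(pixel_feature.shape)
```

Because every object contributes through its own field, swapping an object's latent codes or transforming its sample points changes that object without disturbing the rest of the scene, which is what enables the disentangled control reported in the experiments.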
Related papers
- DisCoScene: Spatially Disentangled Generative Radiance Fields for
Controllable 3D-aware Scene Synthesis [90.32352050266104]
DisCoScene is a 3D-aware generative model for high-quality and controllable scene synthesis.
It disentangles the whole scene into object-centric generative fields, each placed in the scene by its own pose (see the sketch after this list), learned from only 2D images with global-local discrimination.
We demonstrate state-of-the-art performance on many scene datasets, including a challenging outdoor dataset.
arXiv Detail & Related papers (2022-12-22T18:59:59Z)
- gCoRF: Generative Compositional Radiance Fields [80.45269080324677]
3D generative models of objects enable photorealistic image synthesis with 3D control.
Existing methods model the scene as a single global representation, ignoring its compositional structure.
We present a compositional generative model, where each semantic part of the object is represented as an independent 3D representation.
arXiv Detail & Related papers (2022-10-31T14:10:44Z)
- Neural Groundplans: Persistent Neural Scene Representations from a Single Image [90.04272671464238]
We present a method to map 2D image observations of a scene to a persistent 3D scene representation.
We propose conditional neural groundplans as persistent and memory-efficient scene representations.
arXiv Detail & Related papers (2022-07-22T17:41:24Z)
- Advances in Neural Rendering [115.05042097988768]
This report focuses on methods that combine classical rendering with learned 3D scene representations.
A key advantage of these methods is that they are 3D-consistent by design, enabling applications such as novel-viewpoint synthesis of a captured scene.
In addition to methods that handle static scenes, we cover neural scene representations for modeling non-rigidly deforming objects.
arXiv Detail & Related papers (2021-11-10T18:57:01Z)
- Weakly Supervised Learning of Multi-Object 3D Scene Decompositions Using Deep Shape Priors [69.02332607843569]
PriSMONet is a novel approach for learning multi-object 3D scene decomposition and representations from single images.
A recurrent encoder regresses a latent representation of 3D shape, pose and texture of each object from an input RGB image.
We evaluate the accuracy of our model in inferring 3D scene layout, demonstrate its generative capabilities, assess its generalization to real images, and point out benefits of the learned representation.
arXiv Detail & Related papers (2020-10-08T14:49:23Z)
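Several of the entries above (DisCoScene, gCoRF, and GIRAFFE itself) place object-centric fields into a shared scene via per-object poses. The sketch below is a hedged illustration of that common mechanism under assumed names and conventions, not any particular paper's implementation: scene-space sample points are mapped into an object's canonical frame before its field is evaluated, so translating or rotating the object only changes its pose parameters.

```python
# Hedged sketch: evaluating an object-centric canonical field under a scene pose.
import numpy as np

def rotation_z(angle):
    """Rotation about the z-axis (illustrative; a full method would use all of SO(3))."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def to_object_frame(x, scale, rotation, translation):
    """Map scene-space points x (N, 3) into the object's canonical frame
    (inverse of scale -> rotate -> translate)."""
    return ((x - translation) @ rotation) / scale

def evaluate_object(field, x, scale, rotation, translation):
    """Evaluate an object's canonical field at scene-space points."""
    x_local = to_object_frame(x, scale, rotation, translation)
    return field(x_local)

def blob_field(x_local):
    """Toy canonical field: a unit density blob at the object's origin."""
    return np.exp(-np.sum(x_local ** 2, axis=-1))

points = np.random.default_rng(0).normal(size=(5, 3))
# Translating the object by +1 along x and rotating it by 45 degrees only
# changes the pose arguments; the canonical field itself is untouched.
density = evaluate_object(blob_field, points,
                          scale=1.0,
                          rotation=rotation_z(np.pi / 4),
                          translation=np.array([1.0, 0.0, 0.0]))
print(density.shape)   # (5,)
```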