BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images
- URL: http://arxiv.org/abs/2002.08988v4
- Date: Wed, 2 Dec 2020 11:57:32 GMT
- Title: BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images
- Authors: Thu Nguyen-Phuoc, Christian Richardt, Long Mai, Yong-Liang Yang, Niloy Mitra
- Abstract summary: We present BlockGAN, an image generative model that learns object-aware 3D scene representations directly from unlabelled 2D images.
Inspired by the computer graphics pipeline, we design BlockGAN to learn to first generate 3D features of background and foreground objects, then combine them into 3D features for the whole scene.
- Score: 38.952307525311625
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present BlockGAN, an image generative model that learns object-aware 3D
scene representations directly from unlabelled 2D images. Current work on scene
representation learning either ignores scene background or treats the whole
scene as one object. Meanwhile, work that considers scene compositionality
treats scene objects only as image patches or 2D layers with alpha maps.
Inspired by the computer graphics pipeline, we design BlockGAN to learn to
first generate 3D features of background and foreground objects, then combine
them into 3D features for the whole scene, and finally render them into
realistic images. This allows BlockGAN to reason over occlusion and interaction
between objects' appearance, such as shadow and lighting, and provides control
over each object's 3D pose and identity, while maintaining image realism.
BlockGAN is trained end-to-end, using only unlabelled single images, without
the need for 3D geometry, pose labels, object masks, or multiple views of the
same scene. Our experiments show that using explicit 3D features to represent
objects allows BlockGAN to learn disentangled representations both in terms of
objects (foreground and background) and their properties (pose and identity).
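The generator pipeline outlined in the abstract (per-object 3D features, composition into a single scene feature, then rendering to a 2D image) can be sketched in code. The following is a minimal PyTorch sketch only; the module names, feature sizes, and the element-wise max used for composition are illustrative assumptions, not the authors' implementation.

# Minimal PyTorch sketch of a BlockGAN-style compositional generator.
# Module names, feature sizes and the element-wise max composition are
# illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn


class ObjectFeatureGenerator(nn.Module):
    """Maps a per-object noise code to a 3D feature grid (C x D x H x W)."""
    def __init__(self, z_dim=128, channels=64, grid=16):
        super().__init__()
        self.grid = grid
        self.channels = channels
        self.fc = nn.Linear(z_dim, channels * grid ** 3)

    def forward(self, z):
        feat = self.fc(z).view(-1, self.channels, self.grid, self.grid, self.grid)
        return torch.relu(feat)


class SceneRenderer(nn.Module):
    """Collapses the depth axis of the scene features and decodes a 2D image."""
    def __init__(self, channels=64, grid=16, img_channels=3):
        super().__init__()
        self.decode = nn.Sequential(
            nn.Conv2d(channels * grid, 256, 3, padding=1), nn.ReLU(),
            nn.Conv2d(256, img_channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, scene_feat):
        b, c, d, h, w = scene_feat.shape
        return self.decode(scene_feat.reshape(b, c * d, h, w))


def generate(z_background, z_foregrounds):
    """Compose per-object 3D features into one scene feature and render it."""
    gen, render = ObjectFeatureGenerator(), SceneRenderer()
    feats = [gen(z_background)] + [gen(z) for z in z_foregrounds]
    # Combine per-object features into a single scene feature; an element-wise
    # max over objects stands in for BlockGAN's composition step here.
    scene = torch.stack(feats, dim=0).max(dim=0).values
    return render(scene)


img = generate(torch.randn(1, 128), [torch.randn(1, 128), torch.randn(1, 128)])
print(img.shape)  # torch.Size([1, 3, 16, 16])

In the full model, each object's features also depend on its 3D pose (which provides the per-object pose and identity control described above), and the rendered images are trained adversarially against unlabelled single images; the sketch only shows the compositional structure.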
Related papers
- Disentangled 3D Scene Generation with Layout Learning [109.03233745767062]
We introduce a method to generate 3D scenes that are disentangled into their component objects.
Our key insight is that objects can be discovered by finding parts of a 3D scene that, when rearranged spatially, still produce valid configurations of the same scene.
We show that, despite its simplicity, our approach successfully generates 3D scenes decomposed into individual objects.
arXiv Detail & Related papers (2024-02-26T18:54:15Z)
- SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections [49.802462165826554]
We present SceneDreamer, an unconditional generative model for unbounded 3D scenes.
Our framework is learned from in-the-wild 2D image collections only, without any 3D annotations.
arXiv Detail & Related papers (2023-02-02T18:59:16Z)
- DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis [90.32352050266104]
DisCoScene is a 3D-aware generative model for high-quality and controllable scene synthesis.
It disentangles the whole scene into object-centric generative fields by learning on only 2D images with the global-local discrimination.
We demonstrate state-of-the-art performance on many scene datasets, including the challenging outdoor dataset.
arXiv Detail & Related papers (2022-12-22T18:59:59Z)
- gCoRF: Generative Compositional Radiance Fields [80.45269080324677]
3D generative models of objects enable photorealistic image synthesis with 3D control.
Existing methods model the scene as a global scene representation, ignoring the compositional aspect of the scene.
We present a compositional generative model, where each semantic part of the object is represented as an independent 3D representation.
arXiv Detail & Related papers (2022-10-31T14:10:44Z)
- Volumetric Disentanglement for 3D Scene Manipulation [22.22326242219791]
We propose a volumetric framework for disentangling, or separating, the volumetric representation of a given foreground object from the background, and for semantically manipulating the foreground object as well as the background.
Our framework takes as input a set of 2D masks specifying the desired foreground object for training views, together with the associated 2D views and poses, and produces a foreground-background disentanglement.
We subsequently demonstrate the applicability of our framework on a number of downstream manipulation tasks including object camouflage, non-negative 3D object inpainting, 3D object translation, 3D object inpainting, and 3D text-based
arXiv Detail & Related papers (2022-06-06T17:57:07Z)
- Disentangling 3D Prototypical Networks For Few-Shot Concept Learning [29.02523358573336]
We present neural architectures that disentangle RGB-D images into objects' shapes and styles and a map of the background scene.
Our networks incorporate architectural biases that reflect the image formation process, 3D geometry of the world scene, and shape-style interplay.
arXiv Detail & Related papers (2020-11-06T14:08:27Z)
- Weakly Supervised Learning of Multi-Object 3D Scene Decompositions Using Deep Shape Priors [69.02332607843569]
PriSMONet is a novel approach for learning Multi-Object 3D scene decomposition and representations from single images.
A recurrent encoder regresses a latent representation of 3D shape, pose and texture of each object from an input RGB image.
We evaluate the accuracy of our model in inferring 3D scene layout, demonstrate its generative capabilities, assess its generalization to real images, and point out benefits of the learned representation.
arXiv Detail & Related papers (2020-10-08T14:49:23Z)
- ROOTS: Object-Centric Representation and Rendering of 3D Scenes [28.24758046060324]
A crucial ability of human intelligence is to build up models of individual 3D objects from partial scene observations.
Recent works achieve object-centric generation but without the ability to infer the representation.
We propose a probabilistic generative model for learning to build modular and compositional 3D object models.
arXiv Detail & Related papers (2020-06-11T00:42:56Z)