Disentangled3D: Learning a 3D Generative Model with Disentangled
Geometry and Appearance from Monocular Images
- URL: http://arxiv.org/abs/2203.15926v1
- Date: Tue, 29 Mar 2022 22:03:18 GMT
- Title: Disentangled3D: Learning a 3D Generative Model with Disentangled
Geometry and Appearance from Monocular Images
- Authors: Ayush Tewari, Mallikarjun B R, Xingang Pan, Ohad Fried, Maneesh
Agrawala, Christian Theobalt
- Abstract summary: State-of-the-art 3D generative models are GANs which use neural 3D volumetric representations for synthesis.
In this paper, we design a 3D GAN which can learn a disentangled model of objects, just from monocular observations.
- Score: 94.49117671450531
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning 3D generative models from a dataset of monocular images enables
self-supervised 3D reasoning and controllable synthesis. State-of-the-art 3D
generative models are GANs which use neural 3D volumetric representations for
synthesis. Images are synthesized by rendering the volumes from a given camera.
These models can disentangle the 3D scene from the camera viewpoint in any
generated image. However, most models do not disentangle other factors of image
formation, such as geometry and appearance. In this paper, we design a 3D GAN
which can learn a disentangled model of objects, just from monocular
observations. Our model can disentangle the geometry and appearance variations
in the scene, i.e., we can independently sample from the geometry and
appearance spaces of the generative model. This is achieved using a novel
non-rigid deformable scene formulation. A 3D volume which represents an object
instance is computed as a non-rigidly deformed canonical 3D volume. Our method
learns the canonical volume, as well as its deformations, jointly during
training. This formulation also helps us improve the disentanglement between
the 3D scene and the camera viewpoints using a novel pose regularization loss
defined on the 3D deformation field. In addition, we further model the inverse
deformations, enabling the computation of dense correspondences between images
generated by our model. Finally, we design an approach to embed real images
into the latent space of our disentangled generative model, enabling editing of
real images.
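To make the formulation concrete, below is a minimal, hypothetical PyTorch sketch of the deformed-canonical-volume idea described in the abstract. It is not the authors' implementation: the module names (DeformationField, CanonicalRadianceField), network sizes, and the quadratic form of the pose regularizer are all assumptions for illustration; the paper only states that its pose regularization loss is defined on the 3D deformation field.

```python
# Minimal sketch of a deformed-canonical-volume generator, assuming PyTorch.
# All names, layer sizes, and the quadratic pose regularizer are illustrative
# assumptions, not the authors' released code.
import torch
import torch.nn as nn

class DeformationField(nn.Module):
    """Maps instance-space points to offsets toward the shared canonical volume."""
    def __init__(self, geo_dim=64, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + geo_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # per-point 3D offset
        )

    def forward(self, x, z_geo):
        # x: (B, N, 3) sample points, z_geo: (B, geo_dim) geometry code
        z = z_geo.unsqueeze(1).expand(-1, x.shape[1], -1)
        return self.mlp(torch.cat([x, z], dim=-1))

class CanonicalRadianceField(nn.Module):
    """Canonical volume: density depends on position only; color is also
    conditioned on the appearance code, which separates the two factors."""
    def __init__(self, app_dim=64, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.density = nn.Linear(hidden, 1)
        self.color = nn.Sequential(
            nn.Linear(hidden + app_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, x_canonical, z_app):
        h = self.trunk(x_canonical)
        z = z_app.unsqueeze(1).expand(-1, x_canonical.shape[1], -1)
        sigma = self.density(h)
        rgb = torch.sigmoid(self.color(torch.cat([h, z], dim=-1)))
        return sigma, rgb

def query_volume(x, z_geo, z_app, deform, canonical):
    """Query density/color for instance-space points sampled along camera rays."""
    delta = deform(x, z_geo)                  # instance -> canonical offsets
    sigma, rgb = canonical(x + delta, z_app)  # sample the canonical volume
    # Assumed pose regularizer: penalizing large deformations encourages global
    # rotations/translations to be explained by the camera, not the deformation.
    pose_reg = delta.pow(2).sum(dim=-1).mean()
    return sigma, rgb, pose_reg

# Example: 2 latent samples, 1024 ray points each.
deform, canonical = DeformationField(), CanonicalRadianceField()
sigma, rgb, reg = query_volume(torch.randn(2, 1024, 3), torch.randn(2, 64),
                               torch.randn(2, 64), deform, canonical)
```

Sampling z_geo and z_app independently corresponds to the independent geometry and appearance sampling described above; a separate inverse-deformation network (not shown) would map canonical points back into each instance, which is what enables the dense correspondences between generated images.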
Related papers
- 3D Geometry-aware Deformable Gaussian Splatting for Dynamic View Synthesis [49.352765055181436]
We propose a 3D geometry-aware deformable Gaussian Splatting method for dynamic view synthesis.
Our solution achieves 3D geometry-aware deformation modeling, which enables improved dynamic view synthesis and 3D dynamic reconstruction.
arXiv Detail & Related papers (2024-04-09T12:47:30Z)
- WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space [77.92350895927922]
We propose WildFusion, a new approach to 3D-aware image synthesis based on latent diffusion models (LDMs).
Our 3D-aware LDM is trained without any direct supervision from multiview images or 3D geometry.
This opens up promising research avenues for scalable 3D-aware image synthesis and 3D content creation from in-the-wild image data.
arXiv Detail & Related papers (2023-11-22T18:25:51Z)
- 3inGAN: Learning a 3D Generative Model from Images of a Self-similar Scene [34.2144933185175]
3inGAN is an unconditional 3D generative model trained from 2D images of a single self-similar 3D scene.
We show results on semi-stochastic scenes of varying scale and complexity, obtained from real and synthetic sources.
arXiv Detail & Related papers (2022-11-27T18:03:21Z)
- Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars [36.4402388864691]
3D-aware generative adversarial networks (GANs) synthesize high-fidelity and multi-view-consistent facial images using only collections of single-view 2D imagery.
Recent efforts incorporate the 3D Morphable Face Model (3DMM) to describe deformation in generative radiance fields, either explicitly or implicitly.
We propose a novel 3D GAN framework for unsupervised learning of generative, high-quality and 3D-consistent facial avatars from unstructured 2D images.
arXiv Detail & Related papers (2022-11-21T06:40:46Z)
- Generative Deformable Radiance Fields for Disentangled Image Synthesis of Topology-Varying Objects [52.46838926521572]
3D-aware generative models have demonstrated superb performance in generating 3D neural radiance fields (NeRF) from a collection of monocular 2D images.
We propose a generative model for synthesizing radiance fields of topology-varying objects with disentangled shape and appearance variations.
arXiv Detail & Related papers (2022-09-09T08:44:06Z)
- Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian [58.704089101826774]
We present a 3D-aware image deformation method with minimal restrictions on shape category and deformation type.
We take a supervised learning-based approach to predict the shape Laplacian of the underlying volume of a 3D reconstruction represented as a point cloud.
In the experiments, we present our results of deforming 2D character and clothed human images.
arXiv Detail & Related papers (2022-03-29T04:57:18Z)
- Learning Canonical 3D Object Representation for Fine-Grained Recognition [77.33501114409036]
We propose a novel framework for fine-grained object recognition that learns to recover object variation in 3D space from a single image.
We represent an object as a composition of 3D shape and its appearance, while eliminating the effect of camera viewpoint.
By incorporating 3D shape and appearance jointly in a deep representation, our method learns the discriminative representation of the object.
arXiv Detail & Related papers (2021-08-10T12:19:34Z)
- Building 3D Morphable Models from a Single Scan [3.472931603805115]
We propose a method for constructing generative models of 3D objects from a single 3D mesh.
Our method produces a 3D morphable model that represents shape and albedo in terms of Gaussian processes.
We show that our approach can be used to perform face recognition using only a single 3D scan.
arXiv Detail & Related papers (2020-11-24T23:08:14Z)
- Cycle-Consistent Generative Rendering for 2D-3D Modality Translation [21.962725416347855]
We learn a module that generates a realistic rendering of a 3D object and infers a realistic 3D shape from an image.
By leveraging generative domain translation methods, we are able to define a learning algorithm that requires only weak supervision, with unpaired data.
The resulting model is able to perform 3D shape, pose, and texture inference from 2D images, but can also generate novel textured 3D shapes and renders.
arXiv Detail & Related papers (2020-11-16T15:23:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.