Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator
- URL: http://arxiv.org/abs/2209.15637v1
- Date: Fri, 30 Sep 2022 17:59:37 GMT
- Title: Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator
- Authors: Zifan Shi, Yinghao Xu, Yujun Shen, Deli Zhao, Qifeng Chen, Dit-Yan
Yeung
- Abstract summary: 3D-aware image synthesis aims at learning a generative model that can render photo-realistic 2D images while capturing decent underlying 3D shapes.
Despite advances in synthesis quality, existing methods fail to recover even moderately accurate 3D shapes.
We propose GeoD, a geometry-aware discriminator that improves 3D-aware GANs.
- Score: 68.0533826852601
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D-aware image synthesis aims at learning a generative model that can render
photo-realistic 2D images while capturing decent underlying 3D shapes. A
popular solution is to adopt the generative adversarial network (GAN) and
replace the generator with a 3D renderer, where volume rendering with neural
radiance field (NeRF) is commonly used. Despite the advancement of synthesis
quality, existing methods fail to obtain even moderately accurate 3D shapes. We
argue that, considering the two-player game in the formulation of GANs, only
making the generator 3D-aware is not enough. In other words, replacing the
generative mechanism with a 3D renderer offers the capability, but not the
guarantee, of producing 3D-aware images, because the supervision of the
generator primarily comes from the discriminator. To address this issue, we
propose GeoD, which learns a geometry-aware discriminator to improve 3D-aware
GANs. Concretely, besides
differentiating real and fake samples from the 2D image space, the
discriminator is additionally asked to derive the geometry information from the
inputs, which is then applied as the guidance of the generator. Such a simple
yet effective design facilitates learning substantially more accurate 3D
shapes. Extensive experiments on various generator architectures and training
datasets verify the superiority of GeoD over state-of-the-art alternatives.
Moreover, our approach serves as a general framework, such that a more capable
discriminator (i.e., with a third task of novel view synthesis beyond domain
classification and geometry extraction) can further assist the generator toward
better multi-view consistency.
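A minimal sketch of the idea described in the abstract, assuming PyTorch, a depth map as the geometry signal, and a non-saturating GAN loss; the module layout, the `depth_head` branch, `generator_loss`, and the loss weighting are hypothetical illustrations, not the paper's exact formulation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeometryAwareDiscriminator(nn.Module):
    """Two-branch discriminator: a real/fake score plus a geometry
    (here: depth) prediction derived from the same image features."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        # Domain-classification head: is the input real or generated?
        self.realness_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1))
        # Auxiliary geometry head: infer a coarse depth map from the input.
        self.depth_head = nn.Conv2d(128, 1, 3, padding=1)

    def forward(self, img):
        feats = self.backbone(img)
        return self.realness_head(feats), self.depth_head(feats)


def generator_loss(disc, fake_img, rendered_depth, geo_weight=1.0):
    """Adversarial loss plus geometry guidance: the depth produced by the
    generator's volume renderer should agree with the depth the
    discriminator infers from the synthesized image."""
    score, inferred_depth = disc(fake_img)
    adv = F.softplus(-score).mean()  # non-saturating GAN loss
    # Resize the renderer's depth to the discriminator's feature resolution.
    target = F.interpolate(rendered_depth, size=inferred_depth.shape[-2:])
    # Detach the discriminator's prediction so the gradient steers the
    # generator's rendered geometry toward it, not the other way around.
    geo = F.l1_loss(inferred_depth.detach(), target)
    return adv + geo_weight * geo
```

In this reading, the depth branch is what turns 2D supervision into geometry guidance: the generator can no longer satisfy the discriminator with photo-realistic but geometrically implausible images.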
Related papers
- GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach that predicts high-quality assets with 512K Gaussians from 21 input images in only 11 GB of GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z)
- IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation [96.32684334038278]
In this paper, we explore the design space of text-to-3D models.
We significantly improve multi-view generation by using video generators instead of image generators.
Our new method, IM-3D, reduces the number of evaluations of the 2D generator network by 10-100x.
arXiv Detail & Related papers (2024-02-13T18:59:51Z)
- NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions [97.27105725738016]
The integration of Neural Radiance Fields (NeRFs) and generative models, such as Generative Adversarial Networks (GANs), has transformed 3D-aware generation from single-view images.
We propose a simple and effective method, based on re-using the well-disentangled latent space of a pre-trained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations.
arXiv Detail & Related papers (2023-03-22T18:59:48Z)
- XDGAN: Multi-Modal 3D Shape Generation in 2D Space [60.46777591995821]
We propose a novel method to convert 3D shapes into compact 1-channel geometry images and leverage StyleGAN3 and image-to-image translation networks to generate 3D objects in 2D space.
The generated geometry images are quick to convert to 3D meshes, enabling real-time 3D object synthesis, visualization and interactive editing.
We show both quantitatively and qualitatively that our method is highly effective at various tasks such as 3D shape generation, single view reconstruction and shape manipulation, while being significantly faster and more flexible compared to recent 3D generative models.
arXiv Detail & Related papers (2022-10-06T15:54:01Z)
- 3D-Aware Indoor Scene Synthesis with Depth Priors [62.82867334012399]
Existing methods fail to model indoor scenes due to the large diversity of room layouts and the objects inside them.
We argue that indoor scenes do not have a shared intrinsic structure, and hence only using 2D images cannot adequately guide the model with the 3D geometry.
arXiv Detail & Related papers (2022-02-17T09:54:29Z)
- 3D-aware Image Synthesis via Learning Structural and Textural Representations [39.681030539374994]
We propose VolumeGAN for high-fidelity 3D-aware image synthesis; it explicitly learns a structural representation and a textural representation (see the sketch after this list).
Our approach achieves substantially higher image quality and better 3D control than previous methods.
arXiv Detail & Related papers (2021-12-20T18:59:40Z)
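The VolumeGAN entry above pairs a structural representation (a 3D feature volume) with a textural representation (a 2D synthesis network). Below is a minimal sketch of that split, assuming PyTorch; the channel sizes, the grid_sample query, and the simplified alpha compositing are illustrative assumptions, not the paper's actual architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructuralTexturalSketch(nn.Module):
    """Hypothetical split of 3D-aware synthesis into a structural
    feature volume and a textural 2D network (VolumeGAN-style idea)."""

    def __init__(self, z_dim=128, vol_ch=32, vol_res=16):
        super().__init__()
        # Structural branch: map the latent code to a coarse 3D feature volume.
        self.to_volume = nn.Linear(z_dim, vol_ch * vol_res ** 3)
        self.vol_ch, self.vol_res = vol_ch, vol_res
        # Per-point MLP: turn a sampled volume feature into density + feature.
        self.point_mlp = nn.Sequential(
            nn.Linear(vol_ch, 64), nn.ReLU(),
            nn.Linear(64, 1 + 16),  # 1 density + 16-dim appearance feature
        )
        # Textural branch: refine the rendered 2D feature map into RGB.
        self.texture_net = nn.Sequential(
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, z, points):
        # points: (B, H, W, S, 3) camera-ray samples in [-1, 1]^3.
        B, H, W, S, _ = points.shape
        vol = self.to_volume(z).view(B, self.vol_ch, self.vol_res,
                                     self.vol_res, self.vol_res)
        # Query the structural volume at the sampled 3D points.
        grid = points.view(B, H, W * S, 1, 3)
        feats = F.grid_sample(vol, grid, align_corners=True)  # (B, C, H, W*S, 1)
        feats = feats.squeeze(-1).permute(0, 2, 3, 1).view(B, H, W, S, self.vol_ch)
        out = self.point_mlp(feats)
        density, appearance = out[..., :1], out[..., 1:]
        # Alpha-composite along each ray (simplified volume rendering).
        alpha = 1.0 - torch.exp(-F.relu(density))
        trans = torch.cumprod(
            torch.cat([torch.ones_like(alpha[..., :1, :]),
                       1.0 - alpha + 1e-10], dim=-2), dim=-2)[..., :-1, :]
        feat_map = (trans * alpha * appearance).sum(dim=-2)  # (B, H, W, 16)
        # Textural branch renders the final RGB image.
        return self.texture_net(feat_map.permute(0, 3, 1, 2))
```

The point is the division of labor: coarse 3D structure lives in the feature volume, while fine appearance is produced by a 2D network, which is what the entry means by "structural and textural representations."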