Self-supervised Learning for Enhancing Geometrical Modeling in 3D-Aware
Generative Adversarial Network
- URL: http://arxiv.org/abs/2312.11856v1
- Date: Tue, 19 Dec 2023 04:55:33 GMT
- Title: Self-supervised Learning for Enhancing Geometrical Modeling in 3D-Aware
Generative Adversarial Network
- Authors: Jiarong Guo, Xiaogang Xu, Hengshuang Zhao
- Abstract summary: 3D-GANs exhibit artifacts in their 3D geometrical modeling, such as mesh imperfections and holes.
These shortcomings are primarily attributed to the limited availability of annotated 3D data.
We present a Self-Supervised Learning technique tailored as an auxiliary loss for any 3D-GAN.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D-aware Generative Adversarial Networks (3D-GANs) currently exhibit
artifacts in their 3D geometrical modeling, such as mesh imperfections and
holes. These shortcomings are primarily attributed to the limited availability
of annotated 3D data, leading to a constrained "valid latent area" for
satisfactory modeling. To address this, we present a Self-Supervised Learning
(SSL) technique tailored as an auxiliary loss for any 3D-GAN, designed to
improve its 3D geometrical modeling capabilities. Our approach pioneers an
inversion technique for 3D-GANs, integrating an encoder that performs adaptive
spatially-varying range operations. Utilizing this inversion, we introduce the
Cyclic Generative Constraint (CGC), aiming to densify the valid latent space.
The CGC operates via augmented local latent vectors that maintain the same
geometric form, and it imposes constraints on the cycle path outputs,
specifically the generator-encoder-generator sequence. This SSL methodology
seamlessly integrates with the inherent GAN loss, ensuring the integrity of
pre-existing 3D-GAN architectures without necessitating alterations. We
validate our approach with comprehensive experiments across various datasets
and architectures, underscoring its efficacy. Our project website:
https://3dgan-ssl.github.io
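
To make the cycle constraint concrete, here is a minimal, hypothetical sketch of the generator-encoder-generator path that the Cyclic Generative Constraint (CGC) penalizes. The `Generator` and `Encoder` below are toy stand-ins, not the paper's architecture, and the encoder's adaptive spatially-varying range operations are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins for a 3D-GAN generator and its inversion encoder.
class Generator(nn.Module):
    def __init__(self, z_dim=64, img_ch=3):
        super().__init__()
        self.img_ch = img_ch
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, img_ch * 32 * 32), nn.Tanh())
    def forward(self, z):
        return self.net(z).view(-1, self.img_ch, 32, 32)

class Encoder(nn.Module):
    def __init__(self, z_dim=64, img_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(), nn.Linear(img_ch * 32 * 32, 256), nn.ReLU(),
            nn.Linear(256, z_dim))
    def forward(self, x):
        return self.net(x)

def cgc_loss(G, E, z, sigma=0.05):
    """Cyclic Generative Constraint: a latent z and a local perturbation
    z' = z + eps should share the same geometry, so the cycle
    G -> E -> G is constrained to reproduce G(z)."""
    z_aug = z + sigma * torch.randn_like(z)   # augmented local latent
    x = G(z)                                  # first generator pass
    z_rec = E(G(z_aug))                       # invert the augmented sample
    x_cycle = G(z_rec)                        # second generator pass
    return F.l1_loss(x_cycle, x)

G, E = Generator(), Encoder()
z = torch.randn(4, 64)
loss = cgc_loss(G, E, z)   # added as an auxiliary term to the usual GAN loss
loss.backward()
```

In training, this auxiliary term would simply be added to the inherent adversarial loss, leaving the pre-existing 3D-GAN architecture untouched, as the abstract describes.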
Related papers
- Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation [27.43973967994717]
MT3D is a text-to-3D generative model that leverages a high-fidelity 3D object to overcome viewpoint bias.
We employ depth maps derived from a high-quality 3D model as control signals to guarantee that the generated 2D images preserve the fundamental shape and structure.
By incorporating geometric details from a 3D asset, MT3D enables the creation of diverse and geometrically consistent objects.
arXiv Detail & Related papers (2024-08-12T06:25:44Z)
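
A hedged sketch of the control-signal idea in the MT3D summary above: a depth map rendered from a high-quality 3D asset constrains generation so the 2D outputs preserve the asset's shape. The loss below is illustrative only; its name and masking scheme are assumptions, not the paper's method.

```python
import torch

def depth_control_loss(pred_depth, ref_depth, mask=None):
    """Hypothetical control-signal loss: keep the depth predicted from a
    generated view close to the depth rendered from the reference 3D asset,
    so generated 2D images preserve the asset's shape and structure."""
    if mask is None:
        mask = (ref_depth > 0).float()        # ignore background pixels
    diff = (pred_depth - ref_depth).abs() * mask
    return diff.sum() / mask.sum().clamp(min=1.0)

ref = torch.rand(1, 1, 64, 64)    # depth rendered from the reference asset
pred = torch.rand(1, 1, 64, 64, requires_grad=True)
depth_control_loss(pred, ref).backward()
```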
- GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians from 21 input images in only 11 GB of GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z)
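
The GeoLRM summary above hinges on letting 3D points attend directly to 2D image features. A minimal, assumption-laden sketch of such deformable cross-attention follows: each point is projected into a view and gathers features at learned offsets around its projection. Module and parameter names are hypothetical, and projected coordinates are assumed pre-normalized to [-1, 1].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Point2ViewCrossAttention(nn.Module):
    """Hypothetical sketch of 3D-aware deformable cross-attention: each 3D
    point is projected into an input view and samples image features at
    its projection plus a few learned 2D offsets."""
    def __init__(self, dim=32, n_offsets=4):
        super().__init__()
        self.offsets = nn.Linear(dim, n_offsets * 2)   # per-point 2D offsets
        self.weights = nn.Linear(dim, n_offsets)       # per-sample weights
        self.n_offsets = n_offsets

    def forward(self, pts, pt_feats, feat_map, K):
        # pts: (N,3), pt_feats: (N,C), feat_map: (1,C,H,W), K: (3,3) intrinsics
        uv = (K @ pts.T).T                              # pinhole projection
        uv = uv[:, :2] / uv[:, 2:].clamp(min=1e-6)      # (N,2), assumed in [-1,1]
        off = 0.1 * torch.tanh(self.offsets(pt_feats)).view(-1, self.n_offsets, 2)
        grid = (uv.unsqueeze(1) + off).view(1, -1, 1, 2)            # (1,N*k,1,2)
        sampled = F.grid_sample(feat_map, grid, align_corners=False)
        sampled = sampled.view(feat_map.shape[1], -1, self.n_offsets)  # (C,N,k)
        w = torch.softmax(self.weights(pt_feats), dim=-1)           # (N,k)
        return pt_feats + (sampled * w.unsqueeze(0)).sum(-1).T      # (N,C)

attn = Point2ViewCrossAttention()
pts = torch.randn(128, 3)
pts[:, 2] = pts[:, 2].abs() + 1.0   # keep points in front of the camera
out = attn(pts, torch.randn(128, 32), torch.randn(1, 32, 16, 16), torch.eye(3))
print(out.shape)  # torch.Size([128, 32])
```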
- LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation [73.36690511083894]
This paper introduces a novel framework called LN3Diff that establishes a unified 3D diffusion pipeline.
Our approach harnesses a 3D-aware architecture and variational autoencoder to encode the input image into a structured, compact, and 3D latent space.
It achieves state-of-the-art performance on ShapeNet for 3D generation and demonstrates superior performance in monocular 3D reconstruction and conditional 3D generation.
arXiv Detail & Related papers (2024-03-18T17:54:34Z)
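
LN3Diff's pipeline above rests on a variational autoencoder that maps an input image into a compact latent on which diffusion is then run. Below is a toy, hypothetical VAE illustrating that encode/reparameterize/decode step; the real model uses a 3D-aware architecture rather than the flat layers shown.

```python
import torch
import torch.nn as nn

class ImageTo3DLatentVAE(nn.Module):
    """Toy sketch of encoding an input image into a structured, compact
    latent (here a flat vector standing in for the 3D latent space) that
    a diffusion model could then be trained on. Purely illustrative."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 3 * 32 * 32), nn.Tanh())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        recon = self.dec(z).view_as(x)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        return recon, z, kl

vae = ImageTo3DLatentVAE()
x = torch.rand(2, 3, 32, 32) * 2 - 1
recon, z, kl = vae(x)
loss = (recon - x).pow(2).mean() + 1e-3 * kl   # reconstruction + KL terms
loss.backward()
```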
- NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions [97.27105725738016]
The integration of Neural Radiance Fields (NeRFs) and generative models, such as Generative Adversarial Networks (GANs), has transformed 3D-aware generation from single-view images.
We propose a simple and effective method, based on re-using the well-disentangled latent space of a pre-trained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations.
arXiv Detail & Related papers (2023-03-22T18:59:48Z)
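
A hedged sketch of the distillation recipe summarized above: a pose-conditioned convolutional student is trained to reproduce the renderings of a frozen NeRF-GAN teacher for the same (latent, pose) pair. The teacher here is a deterministic stub, and all module names are assumptions.

```python
import torch
import torch.nn as nn

class PoseConditionedStudent(nn.Module):
    """Hypothetical student: a plain convolutional generator conditioned on
    (latent, camera pose), distilled to mimic a pretrained NeRF-GAN's
    renderings, trading volume rendering for one conv forward pass."""
    def __init__(self, z_dim=64, pose_dim=6):
        super().__init__()
        self.fc = nn.Linear(z_dim + pose_dim, 128 * 4 * 4)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh())

    def forward(self, z, pose):
        h = self.fc(torch.cat([z, pose], dim=-1)).view(-1, 128, 4, 4)
        return self.up(h)   # (B, 3, 16, 16)

def teacher_render(z, pose):
    # Stand-in for the frozen NeRF-GAN renderer (fixed fake target).
    g = torch.Generator().manual_seed(0)
    return torch.tanh(torch.randn(z.shape[0], 3, 16, 16, generator=g))

student = PoseConditionedStudent()
z, pose = torch.randn(4, 64), torch.randn(4, 6)
loss = (student(z, pose) - teacher_render(z, pose)).pow(2).mean()
loss.backward()   # distillation: student mimics teacher per (z, pose)
```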
- Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding [25.86312557482366]
3D GAN inversion aims to achieve high reconstruction fidelity and reasonable 3D geometry simultaneously from a single image input.
We introduce a novel encoder-based inversion framework based on EG3D, one of the most widely-used 3D GAN models.
Our method achieves impressive results comparable to optimization-based methods while operating up to 500 times faster.
arXiv Detail & Related papers (2023-03-22T05:51:53Z)
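
The entry above contrasts encoder-based inversion with per-image optimization. The toy loop below shows why the encoder route is fast: inversion becomes a single feed-forward pass. EG3D specifics (tri-planes, geometry- and occlusion-aware encoding) are omitted, and the tiny generator and encoder are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical frozen 3D GAN generator and a trainable inversion encoder.
G = nn.Sequential(nn.Linear(64, 3 * 32 * 32), nn.Tanh())
E = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))
for p in G.parameters():
    p.requires_grad_(False)              # invert a fixed generator

opt = torch.optim.Adam(E.parameters(), lr=1e-4)
for step in range(3):                    # toy training loop
    z = torch.randn(8, 64)
    img = G(z).view(8, 3, 32, 32)        # samples from the generator
    z_hat = E(img)                       # one feed-forward inversion pass
    recon = G(z_hat).view(8, 3, 32, 32)
    loss = (recon - img).pow(2).mean()   # no per-image optimization needed
    opt.zero_grad()
    loss.backward()
    opt.step()
```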
- Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator [68.0533826852601]
3D-aware image synthesis aims at learning a generative model that can render photo-realistic 2D images while capturing decent underlying 3D shapes.
Existing methods, however, often fail to recover even moderately accurate underlying 3D shapes.
We propose a geometry-aware discriminator to improve 3D-aware GANs.
arXiv Detail & Related papers (2022-09-30T17:59:37Z)
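
A minimal sketch of a geometry-aware discriminator in the spirit of the summary above: the critic scores the rendered image jointly with a geometry cue (here a depth channel), so realistic pixels cannot mask a broken shape. The architecture is illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class GeometryAwareDiscriminator(nn.Module):
    """Hypothetical sketch: score the RGB image together with a geometry
    cue (a depth channel), so the generator cannot fool the critic with
    good pixels over a malformed underlying shape."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 1, 32, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Flatten(), nn.Linear(64 * 8 * 8, 1))

    def forward(self, rgb, depth):
        return self.net(torch.cat([rgb, depth], dim=1))   # joint real/fake score

D = GeometryAwareDiscriminator()
score = D(torch.rand(2, 3, 32, 32), torch.rand(2, 1, 32, 32))
print(score.shape)  # torch.Size([2, 1])
```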
- Geometry-Contrastive Transformer for Generalized 3D Pose Transfer [95.56457218144983]
The intuition of this work is to use the powerful self-attention mechanism to perceive geometric inconsistencies between the given meshes.
We propose a novel geometry-contrastive Transformer that efficiently perceives global geometric inconsistencies in 3D structure.
We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
arXiv Detail & Related papers (2021-12-14T13:14:24Z)
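
As one way to read the "latent isometric regularization module" mentioned above, the sketch below penalizes mismatch between pairwise latent distances and pairwise vertex-space distances across a batch of meshes. This interpretation, and every name in it, is an assumption rather than the paper's stated formulation.

```python
import torch
import torch.nn.functional as F

def latent_isometric_loss(z, verts):
    """Hypothetical latent isometric regularizer: pairwise distances
    between shape latents should track pairwise vertex-space distances
    between the corresponding meshes, so pose changes move roughly
    isometrically through the latent space."""
    dz = torch.cdist(z, z)                                # (B,B) latent distances
    dv = torch.cdist(verts.flatten(1), verts.flatten(1))  # (B,B) mesh distances
    dz = dz / dz.max().clamp(min=1e-6)                    # scale-normalize both
    dv = dv / dv.max().clamp(min=1e-6)
    return F.mse_loss(dz, dv)

z = torch.randn(4, 16, requires_grad=True)     # per-mesh latents
verts = torch.randn(4, 100, 3)                 # 4 meshes, 100 vertices each
latent_isometric_loss(z, verts).backward()
```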