XDGAN: Multi-Modal 3D Shape Generation in 2D Space
- URL: http://arxiv.org/abs/2210.03007v1
- Date: Thu, 6 Oct 2022 15:54:01 GMT
- Title: XDGAN: Multi-Modal 3D Shape Generation in 2D Space
- Authors: Hassan Abu Alhaija, Alara Dirik, André Knörig, Sanja Fidler, Maria Shugrina
- Abstract summary: We propose a novel method to convert 3D shapes into compact 1-channel geometry images and leverage StyleGAN3 and image-to-image translation networks to generate 3D objects in 2D space.
The generated geometry images are quick to convert to 3D meshes, enabling real-time 3D object synthesis, visualization and interactive editing.
We show both quantitatively and qualitatively that our method is highly effective at various tasks such as 3D shape generation, single view reconstruction and shape manipulation, while being significantly faster and more flexible compared to recent 3D generative models.
- Score: 60.46777591995821
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Generative models for 2D images have recently seen tremendous progress in
quality, resolution and speed as a result of the efficiency of 2D convolutional
architectures. However, it is difficult to extend this progress into the 3D
domain since most current 3D representations rely on custom network components.
This paper addresses a central question: Is it possible to directly leverage 2D
image generative models to generate 3D shapes instead? To answer this, we
propose XDGAN, an effective and fast method for applying 2D image GAN
architectures to the generation of 3D object geometry combined with additional
surface attributes, like color textures and normals. Specifically, we propose a
novel method to convert 3D shapes into compact 1-channel geometry images and
leverage StyleGAN3 and image-to-image translation networks to generate 3D
objects in 2D space. The generated geometry images are quick to convert to 3D
meshes, enabling real-time 3D object synthesis, visualization and interactive
editing. Moreover, the use of standard 2D architectures can help bring more 2D
advances into the 3D realm. We show both quantitatively and qualitatively that
our method is highly effective at various tasks such as 3D shape generation,
single view reconstruction and shape manipulation, while being significantly
faster and more flexible compared to recent 3D generative models.
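The abstract notes that geometry images are quick to convert to 3D meshes. The following is a minimal sketch of why that step is cheap, assuming the classic 3-channel geometry image in which every pixel stores an XYZ surface point. XDGAN's compact 1-channel encoding is not described in this listing, so this is not the authors' implementation. The function name `geometry_image_to_mesh` is hypothetical, and stitching the image boundary into a closed surface is omitted.

```python
import numpy as np


def geometry_image_to_mesh(gim):
    """Turn an (H, W, 3) geometry image into (vertices, faces).

    Assumption: each pixel holds the XYZ position of one surface point,
    and neighbouring pixels are neighbours on the surface.
    """
    gim = np.asarray(gim, dtype=np.float64)
    if gim.ndim != 3 or gim.shape[2] != 3:
        raise ValueError("expected an (H, W, 3) array of XYZ coordinates")
    h, w, _ = gim.shape

    # Every pixel becomes one vertex.
    vertices = gim.reshape(-1, 3)

    # Index of each pixel in the flattened vertex array.
    idx = np.arange(h * w).reshape(h, w)
    top_left = idx[:-1, :-1].ravel()
    top_right = idx[:-1, 1:].ravel()
    bottom_left = idx[1:, :-1].ravel()
    bottom_right = idx[1:, 1:].ravel()

    # Split every 2x2 pixel cell into two triangles.
    faces = np.concatenate(
        [
            np.stack([top_left, bottom_left, top_right], axis=1),
            np.stack([top_right, bottom_left, bottom_right], axis=1),
        ],
        axis=0,
    )
    return vertices, faces


if __name__ == "__main__":
    # Toy input: a flat 4x4 grid in the z = 0 plane.
    ys, xs = np.meshgrid(np.arange(4.0), np.arange(4.0), indexing="ij")
    toy = np.stack([xs, ys, np.zeros_like(xs)], axis=-1)
    v, f = geometry_image_to_mesh(toy)
    print(v.shape, f.shape)
```

Because connectivity is implied by the pixel grid, the conversion is a reshape plus index arithmetic with no per-shape search, which is consistent with the real-time use the abstract describes.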
Related papers
- Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation [2.3213238782019316]
GIMDiffusion is a novel Text-to-3D model that utilizes geometry images to efficiently represent 3D shapes using 2D images.
We exploit the rich 2D priors of existing Text-to-Image models such as Stable Diffusion.
In short, GIMDiffusion enables the generation of 3D assets at speeds comparable to current Text-to-Image models.
arXiv Detail & Related papers (2024-09-05T17:21:54Z)
- ConDense: Consistent 2D/3D Pre-training for Dense and Sparse Features from Multi-View Images [47.682942867405224]
ConDense is a framework for 3D pre-training utilizing existing 2D networks and large-scale multi-view datasets.
We propose a novel 2D-3D joint training scheme to extract co-embedded 2D and 3D features in an end-to-end pipeline.
arXiv Detail & Related papers (2024-08-30T05:57:01Z)
- DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data [50.164670363633704]
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets from text prompts.
Our model is directly trained on extensive noisy and unaligned 'in-the-wild' 3D assets.
We achieve state-of-the-art performance in both single-class generation and text-to-3D generation.
arXiv Detail & Related papers (2024-06-06T17:58:15Z)
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoints supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
- Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation [29.959223778769513]
We propose a novel learning strategy, namely 3D-to-2D imitation, which enables a 3D-aware GAN to generate high-quality images.
We also introduce 3D-aware convolutions into the generator for better 3D representation learning.
With the above strategies, our method reaches FID scores of 5.4 and 4.3 on FFHQ and AFHQ-v2 Cats, respectively, at 512x512 resolution.
arXiv Detail & Related papers (2023-03-16T02:18:41Z)
- Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion [115.82306502822412]
StyleGAN has achieved great progress in 2D face reconstruction and semantic editing via image inversion and latent editing.
A corresponding generic 3D GAN inversion framework is still missing, limiting the applications of 3D face reconstruction and semantic editing.
We study the challenging problem of 3D GAN inversion where a latent code is predicted given a single face image to faithfully recover its 3D shapes and detailed textures.
arXiv Detail & Related papers (2022-12-14T18:49:50Z)
- Learning geometry-image representation for 3D point cloud generation [5.3485743892868545]
We propose a novel geometry image based generator (GIG) to convert the 3D point cloud generation problem to a 2D geometry image generation problem.
Experiments on both rigid and non-rigid 3D object datasets have demonstrated the promising performance of our method.
arXiv Detail & Related papers (2020-11-29T05:21:10Z)
- Do 2D GANs Know 3D Shape? Unsupervised 3D shape reconstruction from 2D Image GANs [156.1209884183522]
State-of-the-art 2D generative models like GANs show unprecedented quality in modeling the natural image manifold.
We present the first attempt to directly mine 3D geometric cues from an off-the-shelf 2D GAN that is trained on RGB images only.
arXiv Detail & Related papers (2020-11-02T09:38:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.