3DGen: Triplane Latent Diffusion for Textured Mesh Generation
- URL: http://arxiv.org/abs/2303.05371v2
- Date: Mon, 27 Mar 2023 18:04:20 GMT
- Title: 3DGen: Triplane Latent Diffusion for Textured Mesh Generation
- Authors: Anchit Gupta, Wenhan Xiong, Yixin Nie, Ian Jones, Barlas Oğuz
- Abstract summary: A triplane VAE learns latent representations of textured meshes, and a conditional diffusion model generates the triplane features.
For the first time, this architecture allows conditional and unconditional generation of high-quality textured or untextured 3D meshes.
It substantially outperforms previous work on image-conditioned and unconditional generation, in both mesh quality and texture generation.
- Score: 17.178939191534994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Latent diffusion models for image generation have crossed a quality threshold
which enabled them to achieve mass adoption. Recently, a series of works have
made advancements towards replicating this success in the 3D domain,
introducing techniques such as point cloud VAE, triplane representation, neural
implicit surfaces and differentiable rendering based training. We take another
step along this direction, combining these developments in a two-step pipeline
consisting of 1) a triplane VAE which can learn latent representations of
textured meshes and 2) a conditional diffusion model which generates the
triplane features. For the first time, this architecture allows conditional and
unconditional generation of high-quality textured or untextured 3D meshes
across multiple diverse categories in a few seconds on a single GPU. It
substantially outperforms previous work on image-conditioned and unconditional
generation, in both mesh quality and texture generation. Furthermore, we
demonstrate the scalability of our model to large datasets for increased
quality and diversity. We will release our code and trained models.
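To make the two-step pipeline concrete, here is a minimal PyTorch sketch of the two interfaces involved; the module structure, layer sizes, and channel counts are illustrative stand-ins, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class TriplaneVAE(nn.Module):
    """Stage 1: compress triplane features of a textured mesh into a
    latent triplane and reconstruct them. The three axis-aligned planes
    (XY, XZ, YZ) are folded into the channel dimension."""
    def __init__(self, in_ch=32, latent_ch=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3 * in_ch, 64, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, 2 * 3 * latent_ch, 3, padding=1),  # mean, logvar
        )
        self.decoder = nn.Sequential(
            nn.Conv2d(3 * latent_ch, 64, 3, padding=1), nn.SiLU(),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(64, 3 * in_ch, 3, padding=1),
        )

    def encode(self, planes):
        mean, logvar = self.encoder(planes).chunk(2, dim=1)
        return mean + torch.randn_like(mean) * (0.5 * logvar).exp()

    def forward(self, planes):
        return self.decoder(self.encode(planes))

class LatentDenoiser(nn.Module):
    """Stage 2: a denoiser over latent triplanes (a U-Net in practice;
    a plain conv stack here just to show the interface)."""
    def __init__(self, ch=3 * 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, ch, 3, padding=1),
        )

    def forward(self, z_t, t):
        return self.net(z_t)  # predicts the noise added at step t

vae = TriplaneVAE()
planes = torch.randn(1, 3 * 32, 64, 64)          # toy triplane features
z = vae.encode(planes)                           # (1, 24, 32, 32) latent
eps = LatentDenoiser()(z + torch.randn_like(z), t=torch.tensor([10]))
```

The key design point the sketch reflects is that diffusion runs in the compact latent triplane space learned by the VAE, not over the raw mesh or the full-resolution triplane features.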
Related papers
- MeshXL: Neural Coordinate Field for Generative 3D Foundation Models [51.1972329762843]
We present a family of generative pre-trained auto-regressive models that address 3D mesh generation with modern large-language-model approaches.
MeshXL generates high-quality 3D meshes and can also serve as a foundation model for various downstream applications.
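A minimal sketch of the general recipe this suggests, assuming mesh coordinates are quantized into discrete tokens and modeled by a causal Transformer; the vocabulary size, model dimensions, and toy sequence are hypothetical, not MeshXL's actual tokenization.

```python
import torch
import torch.nn as nn

VOCAB = 256          # coordinate quantization bins (illustrative)

class CausalMeshLM(nn.Module):
    """Next-token prediction over flattened, quantized mesh coordinates."""
    def __init__(self, d_model=128, n_layers=2, n_heads=4, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(VOCAB, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, VOCAB)

    def forward(self, tokens):
        B, T = tokens.shape
        x = self.tok(tokens) + self.pos(torch.arange(T, device=tokens.device))
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(tokens.device)
        return self.head(self.blocks(x, mask=mask))

# Toy usage: 4 triangles x 3 vertices x 3 coords = 36 tokens.
tokens = torch.randint(0, VOCAB, (1, 36))
logits = CausalMeshLM()(tokens)              # (1, 36, VOCAB)
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
```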
arXiv Detail & Related papers (2024-05-31T14:35:35Z)
- 3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors [85.11117452560882]
We present a two-stage text-to-3D generation system, namely 3DTopia, which generates high-quality general 3D assets within 5 minutes using hybrid diffusion priors.
The first stage samples from a 3D diffusion prior directly learned from 3D data. Specifically, it is powered by a text-conditioned tri-plane latent diffusion model, which quickly generates coarse 3D samples for fast prototyping.
The second stage utilizes 2D diffusion priors to further refine the texture of coarse 3D models from the first stage. The refinement consists of both latent- and pixel-space optimization for high-quality texture generation.
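As a rough illustration of the stage-1 sampler, here is a generic DDPM-style reverse loop over a triplane latent; the noise schedule, tensor shapes, and the stubbed text-conditioned denoiser `eps_model` are assumptions, not 3DTopia's implementation.

```python
import torch

T = 50
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)

def eps_model(z, t, text_emb):
    # stand-in for the text-conditioned triplane latent U-Net
    return torch.randn_like(z)

text_emb = torch.randn(1, 77, 512)        # stand-in text embedding
z = torch.randn(1, 3 * 8, 32, 32)         # start from pure noise
for t in reversed(range(T)):
    eps = eps_model(z, t, text_emb)
    # standard DDPM posterior mean, then add noise except at the last step
    mean = (z - betas[t] / (1 - alpha_bar[t]).sqrt() * eps) / alphas[t].sqrt()
    z = mean + betas[t].sqrt() * torch.randn_like(z) if t > 0 else mean
# z is now a coarse 3D sample to hand off to the stage-2 texture refinement.
```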
arXiv Detail & Related papers (2024-03-04T17:26:28Z)
- Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space.
We extend auto-regressive models to 3D domains, seeking stronger 3D shape generation by simultaneously improving the capacity and scalability of auto-regressive models.
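A minimal sketch of what grid-space auto-regression looks like when lifted to 3D (illustrative only): flatten an occupancy grid in raster-scan order and model p(x) as a product of per-voxel conditionals with a causal network. The cubic growth of the sequence with resolution is exactly the scalability pressure such work addresses.

```python
import torch
import torch.nn as nn

R = 8                                       # tiny 8^3 grid for the demo
grid = (torch.rand(1, R, R, R) > 0.5).long()
seq = grid.view(1, -1)                      # (1, 512) raster-scan sequence

emb = nn.Embedding(2, 32)                   # binary occupancy tokens
rnn = nn.GRU(32, 64, batch_first=True)      # causal by construction
head = nn.Linear(64, 2)

h, _ = rnn(emb(seq[:, :-1]))                # predict voxel i from voxels < i
logits = head(h)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 2), seq[:, 1:].reshape(-1))
print(seq.shape, float(loss))
```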
arXiv Detail & Related papers (2024-02-19T15:33:09Z)
- Make-A-Shape: a Ten-Million-scale 3D Shape Model [55.34451258972251]
This paper introduces Make-A-Shape, a new 3D generative model designed for efficient training on a vast scale.
We first introduce a wavelet-tree representation that compactly encodes shapes via a subband coefficient filtering scheme.
We then derive a subband-adaptive training strategy so that our model effectively learns to generate both coarse and detail wavelet coefficients.
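A minimal sketch of the wavelet-tree idea using PyWavelets on a toy occupancy grid; the grid, wavelet choice, and threshold are illustrative, and the paper's actual filtering and training scheme is more involved.

```python
import numpy as np
import pywt  # PyWavelets

# Decompose a voxelized shape into a coarse subband plus detail subbands,
# then keep only significant detail coefficients for compactness.
grid = np.zeros((32, 32, 32), dtype=np.float32)
grid[8:24, 8:24, 8:24] = 1.0                     # toy solid cube

coeffs = pywt.wavedecn(grid, wavelet='haar', level=2)
coarse = coeffs[0]                               # low-frequency approximation
details = coeffs[1:]                             # per-level detail subbands

# Subband coefficient filtering: zero out near-zero detail coefficients.
for level in details:
    for name, band in level.items():
        band[np.abs(band) < 1e-3] = 0.0

recon = pywt.waverecn(coeffs, wavelet='haar')
print(coarse.shape, float(np.abs(recon - grid).max()))
```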
arXiv Detail & Related papers (2024-01-20T00:21:58Z)
- DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaptation by Combining 3D GANs and Diffusion Priors [26.0337715783954]
DiffusionGAN3D boosts text-guided 3D domain adaptation and generation by combining 3D GANs and diffusion priors.
The proposed framework achieves excellent results in both domain adaptation and text-to-avatar tasks.
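A minimal sketch of how such a combination can work, assuming a score-distillation-style objective: a stand-in generator is nudged so its outputs satisfy a (stubbed) pretrained 2D diffusion prior. Every component below is a hypothetical placeholder; the real framework renders a 3D-aware GAN and conditions the prior on text.

```python
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(64, 3 * 32 * 32), nn.Tanh())  # stand-in
opt = torch.optim.Adam(generator.parameters(), lr=1e-4)

def diffusion_eps(image_flat):
    # stand-in for a pretrained, text-conditioned diffusion denoiser
    return torch.randn_like(image_flat)

for step in range(10):
    z = torch.randn(4, 64)
    fake = generator(z)                      # rendered views, flattened
    noise = torch.randn_like(fake)
    noisy = fake + 0.5 * noise               # toy forward diffusion
    grad = diffusion_eps(noisy) - noise      # score-distillation-style signal
    loss = (grad.detach() * fake).sum()      # surrogate loss: d(loss)/d(fake) = grad
    opt.zero_grad(); loss.backward(); opt.step()
```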
arXiv Detail & Related papers (2023-12-28T05:46:26Z)
- Breathing New Life into 3D Assets with Generative Repainting [74.80184575267106]
Diffusion-based text-to-image models have attracted immense attention from the vision community, artists, and content creators.
Recent works have proposed various pipelines powered by the entanglement of diffusion models and neural fields.
We explore the power of pretrained 2D diffusion models and standard 3D neural radiance fields as independent, standalone tools.
Our pipeline accepts any legacy renderable geometry, such as textured or untextured meshes, and orchestrates the interaction between 2D generative refinement and 3D consistency enforcement tools.
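A minimal sketch of that orchestration, with hypothetical stubs for both standalone tools: a 2D model repaints each rendered view, and a shared 3D color field is re-fit to the repainted views to enforce cross-view consistency.

```python
import torch
import torch.nn as nn

field = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 3))  # xyz -> rgb
opt = torch.optim.Adam(field.parameters(), lr=1e-3)

def render_points(view):
    # stand-in for rasterizing the legacy mesh from this viewpoint:
    # returns the surface points (N, 3) visible in the view
    return torch.rand(1024, 3)

def repaint_2d(colors):
    # stand-in for diffusion-based repainting of the rendered image
    return colors + 0.1 * torch.randn_like(colors)

for view in range(8):                      # walk around the object
    pts = render_points(view)
    target = repaint_2d(field(pts).detach())
    for _ in range(20):                    # 3D consistency enforcement step
        loss = ((field(pts) - target) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
```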
arXiv Detail & Related papers (2023-09-15T16:34:51Z)
- Learning Versatile 3D Shape Generation with Improved AR Models [91.87115744375052]
Auto-regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in grid space.
We propose the Improved Auto-regressive Model (ImAM) for 3D shape generation, which applies discrete representation learning based on a latent vector instead of volumetric grids.
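A minimal sketch of the discrete-representation step, assuming a VQ-style codebook over compact latent vectors; the dimensions are illustrative, and this nearest-neighbor quantizer omits the straight-through estimator and codebook losses used in practice.

```python
import torch
import torch.nn as nn

K, D = 512, 64                              # codebook size, code dimension
codebook = nn.Embedding(K, D)

def quantize(latents):                      # latents: (B, T, D)
    # nearest-neighbor lookup into the codebook
    book = codebook.weight.unsqueeze(0).expand(latents.size(0), -1, -1)
    idx = torch.cdist(latents, book).argmin(dim=-1)   # (B, T) discrete codes
    return codebook(idx), idx

latents = torch.randn(2, 16, D)             # encoder output for two shapes
quantized, codes = quantize(latents)
print(codes.shape)                          # (2, 16): tokens for an AR model
```

The resulting integer codes are what a downstream auto-regressive model consumes, which is far cheaper than auto-regression over a full volumetric grid.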
arXiv Detail & Related papers (2023-03-26T12:03:18Z)
- 3D Neural Field Generation using Triplane Diffusion [37.46688195622667]
We present an efficient diffusion-based model for 3D-aware generation of neural fields.
Our approach pre-processes training data, such as ShapeNet meshes, by converting them to continuous occupancy fields.
We demonstrate state-of-the-art results on 3D generation on several object classes from ShapeNet.
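A minimal sketch of that pre-processing on an analytic shape: sample 3D points, label them inside/outside, and fit a small occupancy network. A sphere stands in for a ShapeNet mesh here, and the plain MLP ignores the per-location triplane conditioning the paper uses.

```python
import torch

def occupancy(points):                     # points: (N, 3) in [-1, 1]^3
    return (points.norm(dim=-1) <= 0.5).float()   # unit-sphere occupancy

points = torch.rand(4096, 3) * 2 - 1       # uniform samples in the cube
labels = occupancy(points)                 # supervision for the field MLP

# A tiny occupancy network fit to these samples.
net = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.ReLU(),
                          torch.nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(100):
    pred = net(points).squeeze(-1)
    loss = torch.nn.functional.binary_cross_entropy_with_logits(pred, labels)
    opt.zero_grad(); loss.backward(); opt.step()
```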
arXiv Detail & Related papers (2022-11-30T01:55:52Z)
- Convolutional Generation of Textured 3D Meshes [34.20939983046376]
We propose a framework that can generate triangle meshes and associated high-resolution texture maps, using only 2D supervision from single-view natural images.
A key contribution of our work is the encoding of the mesh and texture as 2D representations, which are semantically aligned and can be easily modeled by a 2D convolutional GAN.
We demonstrate the efficacy of our method on Pascal3D+ Cars and CUB, both in an unconditional setting and in settings where the model is conditioned on class labels, attributes, and text.
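A minimal sketch of the aligned 2D encoding: a single convolutional generator emits a displacement map (geometry over a template surface) and a texture map with matching layout. Channel counts and resolutions are illustrative; the discriminator and the differentiable renderer providing 2D supervision are omitted.

```python
import torch
import torch.nn as nn

class MeshTextureGenerator(nn.Module):
    """One 2D conv generator producing pixel-aligned geometry and texture."""
    def __init__(self, z_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 128, 4), nn.ReLU(),                    # 1 -> 4
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),  # 4 -> 8
            nn.ConvTranspose2d(64, 3 + 3, 4, stride=2, padding=1),           # 8 -> 16
        )

    def forward(self, z):
        out = self.net(z[:, :, None, None])
        displacement, texture = out[:, :3], out[:, 3:]
        return displacement, torch.sigmoid(texture)

gen = MeshTextureGenerator()
disp, tex = gen(torch.randn(2, 64))
print(disp.shape, tex.shape)    # both (2, 3, 16, 16): semantically aligned
```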
arXiv Detail & Related papers (2020-06-13T15:23:29Z)