Make-A-Shape: a Ten-Million-scale 3D Shape Model
- URL: http://arxiv.org/abs/2401.11067v1
- Date: Sat, 20 Jan 2024 00:21:58 GMT
- Title: Make-A-Shape: a Ten-Million-scale 3D Shape Model
- Authors: Ka-Hei Hui, Aditya Sanghi, Arianna Rampini, Kamal Rahimi Malekshan,
Zhengzhe Liu, Hooman Shayani, Chi-Wing Fu
- Abstract summary: This paper introduces Make-A-Shape, a new 3D generative model designed for efficient training on a vast scale.
We first introduce a wavelet-tree representation to compactly encode shapes by formulating the subband coefficient filtering scheme.
We derive the subband adaptive training strategy to train our model to effectively learn to generate coarse and detail wavelet coefficients.
- Score: 55.34451258972251
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Significant progress has been made in training large generative models for
natural language and images. Yet, the advancement of 3D generative models is
hindered by their substantial resource demands for training, along with
inefficient, non-compact, and less expressive representations. This paper
introduces Make-A-Shape, a new 3D generative model designed for efficient
training on a vast scale, capable of utilizing 10 million publicly available
shapes. Technically, we first introduce a wavelet-tree representation to
compactly encode shapes by formulating the subband coefficient filtering scheme
to efficiently exploit coefficient relations. We then make the representation
generatable by a diffusion model by devising the subband coefficients packing
scheme to lay out the representation in a low-resolution grid. Further, we
derive the subband adaptive training strategy to train our model to effectively
learn to generate coarse and detail wavelet coefficients. Finally, we extend our
framework to be controlled by additional input conditions to enable it to
generate shapes from assorted modalities, e.g., single/multi-view images, point
clouds, and low-resolution voxels. In our extensive set of experiments, we
demonstrate various applications, such as unconditional generation, shape
completion, and conditional generation on a wide range of modalities. Our
approach not only surpasses the state of the art in delivering high-quality
results but also generates shapes efficiently within a few seconds, often in
just 2 seconds for most conditions.
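To make the wavelet vocabulary in the abstract concrete, here is a minimal sketch of splitting a 3D volume into one coarse subband and seven detail subbands, then discarding small detail coefficients. It is not the authors' method: it uses a single-level orthonormal Haar transform for simplicity, and the magnitude-based `filter_details` rule, the `keep_fraction` parameter, the `sphere_tsdf` test shape, and all function names are assumptions made for illustration only.

```python
import numpy as np


def haar_1d(x, axis):
    """One level of the orthonormal Haar transform along one axis."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)


def inv_haar_1d(low, high, axis):
    """Inverse of haar_1d along the same axis."""
    low = np.moveaxis(low, axis, 0)
    high = np.moveaxis(high, axis, 0)
    out = np.empty((2 * low.shape[0],) + low.shape[1:], dtype=low.dtype)
    out[0::2] = (low + high) / np.sqrt(2.0)
    out[1::2] = (low - high) / np.sqrt(2.0)
    return np.moveaxis(out, 0, axis)


def haar_3d(volume):
    """Split a volume into 8 subbands keyed 'LLL' (coarse) to 'HHH'."""
    bands = {"": volume}
    for axis in range(3):
        nxt = {}
        for key, arr in bands.items():
            low, high = haar_1d(arr, axis)
            nxt[key + "L"] = low
            nxt[key + "H"] = high
        bands = nxt
    return bands


def inv_haar_3d(bands):
    """Rebuild the volume from the 8 subbands produced by haar_3d."""
    for axis in (2, 1, 0):
        prefixes = {key[:-1] for key in bands}
        bands = {
            p: inv_haar_1d(bands[p + "L"], bands[p + "H"], axis)
            for p in prefixes
        }
    return bands[""]


def filter_details(bands, keep_fraction):
    """Hypothetical filter: keep only the largest-magnitude detail coefficients."""
    mags = np.concatenate(
        [np.abs(v).ravel() for k, v in bands.items() if k != "LLL"]
    )
    k = max(1, int(keep_fraction * mags.size))
    threshold = np.partition(mags, -k)[-k]  # k-th largest magnitude
    return {
        key: v if key == "LLL" else np.where(np.abs(v) >= threshold, v, 0.0)
        for key, v in bands.items()
    }


def sphere_tsdf(res=32, radius=0.5, trunc=0.1):
    """Truncated signed distance field of a sphere on a res^3 grid."""
    c = np.linspace(-1.0, 1.0, res)
    x, y, z = np.meshgrid(c, c, c, indexing="ij")
    return np.clip(np.sqrt(x**2 + y**2 + z**2) - radius, -trunc, trunc)


if __name__ == "__main__":
    vol = sphere_tsdf()
    bands = haar_3d(vol)
    approx = inv_haar_3d(filter_details(bands, keep_fraction=0.05))
    print("max abs reconstruction error:", np.abs(approx - vol).max())
```

Because the Haar transform here is orthonormal, the reconstruction error after filtering equals the norm of the dropped coefficients, which is why keeping only the largest ones is a reasonable baseline for a compact encoding.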
Related papers
- Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space.
We extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.
arXiv Detail & Related papers (2024-02-19T15:33:09Z)
- Pushing the Limits of 3D Shape Generation at Scale [65.24420181727615]
We present a significant breakthrough in 3D shape generation by scaling it to unprecedented dimensions.
We have developed a model with an astounding 3.6 billion trainable parameters, establishing it as the largest 3D shape generation model to date, named Argus-3D.
arXiv Detail & Related papers (2023-06-20T13:01:19Z)
- Few-shot 3D Shape Generation [18.532357455856836]
We make the first attempt to realize few-shot 3D shape generation by adapting generative models pre-trained on large source domains to target domains using limited data.
Our approach only needs the silhouettes of few-shot target samples as training data to learn target geometry distributions.
arXiv Detail & Related papers (2023-05-19T13:30:10Z)
- Learning Versatile 3D Shape Generation with Improved AR Models [91.87115744375052]
Auto-regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.
We propose the Improved Auto-regressive Model (ImAM) for 3D shape generation, which applies discrete representation learning based on a latent vector instead of volumetric grids.
arXiv Detail & Related papers (2023-03-26T12:03:18Z)
- MeshDiffusion: Score-based Generative 3D Mesh Modeling [68.40770889259143]
We consider the task of generating realistic 3D shapes for automatic scene generation and physical simulation.
We take advantage of the graph structure of meshes and use a simple yet very effective generative modeling method to generate 3D meshes.
Specifically, we represent meshes with deformable tetrahedral grids, and then train a diffusion model on this direct parametrization.
arXiv Detail & Related papers (2023-03-14T17:59:01Z)
- 3D Neural Field Generation using Triplane Diffusion [37.46688195622667]
We present an efficient diffusion-based model for 3D-aware generation of neural fields.
Our approach pre-processes training data, such as ShapeNet meshes, by converting them to continuous occupancy fields.
We demonstrate state-of-the-art results on 3D generation on several object classes from ShapeNet.
arXiv Detail & Related papers (2022-11-30T01:55:52Z)
- Neural Wavelet-domain Diffusion for 3D Shape Generation [52.038346313823524]
This paper presents a new approach for 3D shape generation, enabling direct generative modeling on a continuous implicit representation in wavelet domain.
Specifically, we propose a compact wavelet representation with a pair of coarse and detail coefficient volumes to implicitly represent 3D shapes via truncated signed distance functions and multi-scale biorthogonal wavelets.
arXiv Detail & Related papers (2022-09-19T02:51:48Z)
- 3DILG: Irregular Latent Grids for 3D Generative Modeling [44.16807313707137]
We propose a new representation for encoding 3D shapes as neural fields.
The representation is designed to be compatible with the transformer architecture and to benefit both shape reconstruction and shape generation.
arXiv Detail & Related papers (2022-05-27T11:29:52Z)
- Discrete Point Flow Networks for Efficient Point Cloud Generation [36.03093265136374]
Generative models have proven effective at modeling 3D shapes and their statistical variations.
We introduce a latent variable model that builds on normalizing flows to generate 3D point clouds of an arbitrary size.
For single-view shape reconstruction we also obtain results on par with state-of-the-art voxel, point cloud, and mesh-based methods.
arXiv Detail & Related papers (2020-07-20T14:48:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.