EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
- URL: http://arxiv.org/abs/2409.18114v1
- Date: Thu, 26 Sep 2024 17:55:02 GMT
- Title: EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
- Authors: Jiaxiang Tang, Zhaoshuo Li, Zekun Hao, Xian Liu, Gang Zeng, Ming-Yu Liu, Qinsheng Zhang,
- Abstract summary: We propose an Auto-regressive Auto-encoder (ArAE) model capable of generating high-quality 3D meshes with up to 4,000 faces at a spatial resolution of $5123$.
We introduce a novel mesh tokenization algorithm that efficiently compresses triangular meshes into 1D token sequences, significantly enhancing training efficiency.
Our model compresses variable-length triangular meshes into a fixed-length latent space, enabling training latent diffusion models for better generalization.
- Score: 36.69567056569989
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current auto-regressive mesh generation methods suffer from issues such as incompleteness, insufficient detail, and poor generalization. In this paper, we propose an Auto-regressive Auto-encoder (ArAE) model capable of generating high-quality 3D meshes with up to 4,000 faces at a spatial resolution of $512^3$. We introduce a novel mesh tokenization algorithm that efficiently compresses triangular meshes into 1D token sequences, significantly enhancing training efficiency. Furthermore, our model compresses variable-length triangular meshes into a fixed-length latent space, enabling training latent diffusion models for better generalization. Extensive experiments demonstrate the superior quality, diversity, and generalization capabilities of our model in both point cloud and image-conditioned mesh generation tasks.
Related papers
- OctGPT: Octree-based Multiscale Autoregressive Models for 3D Shape Generation [24.980804600194062]
OctGPT is a novel multiscale autoregressive model for 3D shape generation.
It dramatically improves the efficiency and performance of prior 3D autoregressive approaches.
It offers a new paradigm for high-quality, scalable 3D content creation.
arXiv Detail & Related papers (2025-04-14T08:31:26Z) - 3D Point Cloud Generation via Autoregressive Up-sampling [60.05226063558296]
We introduce a pioneering autoregressive generative model for 3D point cloud generation.
Inspired by visual autoregressive modeling, we conceptualize point cloud generation as an autoregressive up-sampling process.
PointARU progressively refines 3D point clouds from coarse to fine scales.
arXiv Detail & Related papers (2025-03-11T16:30:45Z) - A Mesh Is Worth 512 Numbers: Spectral-domain Diffusion Modeling for High-dimension Shape Generation [4.064004858393506]
This paper introduces a novel framework, spectral-domain diffusion for high-quality shape generation SpoDify.
It efficiently encodes complex meshes into continuous implicit representations, such as encoding a 15k-vertex mesh to a 512-dimensional latent code without learning.
arXiv Detail & Related papers (2025-03-09T07:05:29Z) - MeshXL: Neural Coordinate Field for Generative 3D Foundation Models [51.1972329762843]
We present a family of generative pre-trained auto-regressive models, which addresses the process of 3D mesh generation with modern large language model approaches.
MeshXL is able to generate high-quality 3D meshes, and can also serve as foundation models for various down-stream applications.
arXiv Detail & Related papers (2024-05-31T14:35:35Z) - PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance [66.40153183581894]
We introduce a generic and scalable mesh generation framework PivotMesh.
PivotMesh makes an initial attempt to extend the native mesh generation to large-scale datasets.
We show that PivotMesh can generate compact and sharp 3D meshes across various categories.
arXiv Detail & Related papers (2024-05-27T07:13:13Z) - Compress3D: a Compressed Latent Space for 3D Generation from a Single Image [27.53099431097921]
Triplane autoencoder encodes 3D models into a compact triplane latent space to compress both the 3D geometry and texture information.
We introduce a 3D-aware cross-attention mechanism, which utilizes low-resolution latent representations to query features from a high-resolution 3D feature volume.
Our approach enables the generation of high-quality 3D assets in merely 7 seconds on a single A100 GPU.
arXiv Detail & Related papers (2024-03-20T11:51:04Z) - Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space.
We extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.
arXiv Detail & Related papers (2024-02-19T15:33:09Z) - Make-A-Shape: a Ten-Million-scale 3D Shape Model [52.701745578415796]
This paper introduces Make-A-Shape, a new 3D generative model designed for efficient training on a vast scale.
We first innovate a wavelet-tree representation to compactly encode shapes by formulating the subband coefficient filtering scheme.
We derive the subband adaptive training strategy to train our model to effectively learn to generate coarse and detail wavelet coefficients.
arXiv Detail & Related papers (2024-01-20T00:21:58Z) - Learning Versatile 3D Shape Generation with Improved AR Models [91.87115744375052]
Auto-regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.
We propose the Improved Auto-regressive Model (ImAM) for 3D shape generation, which applies discrete representation learning based on a latent vector instead of volumetric grids.
arXiv Detail & Related papers (2023-03-26T12:03:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.