A Mesh Is Worth 512 Numbers: Spectral-domain Diffusion Modeling for High-dimension Shape Generation
- URL: http://arxiv.org/abs/2503.06485v1
- Date: Sun, 09 Mar 2025 07:05:29 GMT
- Title: A Mesh Is Worth 512 Numbers: Spectral-domain Diffusion Modeling for High-dimension Shape Generation
- Authors: Jiajie Fan, Amal Trigui, Andrea Bonfanti, Felix Dietrich, Thomas Bäck, Hao Wang,
- Abstract summary: This paper introduces a novel framework, spectral-domain diffusion for high-quality shape generation SpoDify.<n>It efficiently encodes complex meshes into continuous implicit representations, such as encoding a 15k-vertex mesh to a 512-dimensional latent code without learning.
- Score: 4.064004858393506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in learning latent codes derived from high-dimensional shapes have demonstrated impressive outcomes in 3D generative modeling. Traditionally, these approaches employ a trained autoencoder to acquire a continuous implicit representation of source shapes, which can be computationally expensive. This paper introduces a novel framework, spectral-domain diffusion for high-quality shape generation SpoDify, that utilizes singular value decomposition (SVD) for shape encoding. The resulting eigenvectors can be stored for subsequent decoding, while generative modeling is performed on the eigenfeatures. This approach efficiently encodes complex meshes into continuous implicit representations, such as encoding a 15k-vertex mesh to a 512-dimensional latent code without learning. Our method exhibits significant advantages in scenarios with limited samples or GPU resources. In mesh generation tasks, our approach produces high-quality shapes that are comparable to state-of-the-art methods.
Related papers
- OctGPT: Octree-based Multiscale Autoregressive Models for 3D Shape Generation [24.980804600194062]
OctGPT is a novel multiscale autoregressive model for 3D shape generation.
It dramatically improves the efficiency and performance of prior 3D autoregressive approaches.
It offers a new paradigm for high-quality, scalable 3D content creation.
arXiv Detail & Related papers (2025-04-14T08:31:26Z) - Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization [68.07464514094299]
Existing methods encode all shapes into a fixed-size token, disregarding the inherent variations in scale and complexity across 3D data.
We introduce Octree-based Adaptive Tokenization, a novel framework that adjusts the dimension of latent representations according to shape complexity.
Our approach reduces token counts by 50% compared to fixed-size methods while maintaining comparable visual quality.
arXiv Detail & Related papers (2025-04-03T17:57:52Z) - EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation [36.69567056569989]
We propose an Auto-regressive Auto-encoder (ArAE) model capable of generating high-quality 3D meshes with up to 4,000 faces at a spatial resolution of $5123$.
We introduce a novel mesh tokenization algorithm that efficiently compresses triangular meshes into 1D token sequences, significantly enhancing training efficiency.
Our model compresses variable-length triangular meshes into a fixed-length latent space, enabling training latent diffusion models for better generalization.
arXiv Detail & Related papers (2024-09-26T17:55:02Z) - Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds [6.69660410213287]
We propose an innovative framework called Point-MGE to explore the benefits of deeply integrating 3D representation learning and generative learning.
In shape classification, Point-MGE achieved an accuracy of 94.2% (+1.0%) on the ModelNet40 dataset and 92.9% (+5.5%) on the ScanObjectNN dataset.
Experimental results also confirmed that Point-MGE can generate high-quality 3D shapes in both unconditional and conditional settings.
arXiv Detail & Related papers (2024-06-25T07:57:03Z) - MeshXL: Neural Coordinate Field for Generative 3D Foundation Models [51.1972329762843]
We present a family of generative pre-trained auto-regressive models, which addresses the process of 3D mesh generation with modern large language model approaches.
MeshXL is able to generate high-quality 3D meshes, and can also serve as foundation models for various down-stream applications.
arXiv Detail & Related papers (2024-05-31T14:35:35Z) - Part-aware Shape Generation with Latent 3D Diffusion of Neural Voxel Fields [50.12118098874321]
We introduce a latent 3D diffusion process for neural voxel fields, enabling generation at significantly higher resolutions.
A part-aware shape decoder is introduced to integrate the part codes into the neural voxel fields, guiding the accurate part decomposition.
The results demonstrate the superior generative capabilities of our proposed method in part-aware shape generation, outperforming existing state-of-the-art methods.
arXiv Detail & Related papers (2024-05-02T04:31:17Z) - Make-A-Shape: a Ten-Million-scale 3D Shape Model [52.701745578415796]
This paper introduces Make-A-Shape, a new 3D generative model designed for efficient training on a vast scale.
We first innovate a wavelet-tree representation to compactly encode shapes by formulating the subband coefficient filtering scheme.
We derive the subband adaptive training strategy to train our model to effectively learn to generate coarse and detail wavelet coefficients.
arXiv Detail & Related papers (2024-01-20T00:21:58Z) - Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models [83.35835521670955]
Surf-D is a novel method for generating high-quality 3D shapes as Surfaces with arbitrary topologies.
We use the Unsigned Distance Field (UDF) as our surface representation to accommodate arbitrary topologies.
We also propose a new pipeline that employs a point-based AutoEncoder to learn a compact and continuous latent space for accurately encoding UDF.
arXiv Detail & Related papers (2023-11-28T18:56:01Z) - Autoregressive 3D Shape Generation via Canonical Mapping [92.91282602339398]
transformers have shown remarkable performances in a variety of generative tasks such as image, audio, and text generation.
In this paper, we aim to further exploit the power of transformers and employ them for the task of 3D point cloud generation.
Our model can be easily extended to multi-modal shape completion as an application for conditional shape generation.
arXiv Detail & Related papers (2022-04-05T03:12:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.