FACE: A Face-based Autoregressive Representation for High-Fidelity and Efficient Mesh Generation
- URL: http://arxiv.org/abs/2603.01515v2
- Date: Tue, 03 Mar 2026 10:12:47 GMT
- Title: FACE: A Face-based Autoregressive Representation for High-Fidelity and Efficient Mesh Generation
- Authors: Hanxiao Wang, Yuan-Chen Guo, Ying-Tian Liu, Zi-Xin Zou, Biao Zhang, Weize Quan, Ding Liang, Yan-Pei Cao, Dong-Ming Yan,
- Abstract summary: We introduce FACE, a novel Autoregressive Autoencoder framework that generates meshes at the face level. Our one-face-one-token strategy treats each triangle face, the fundamental building block of a mesh, as a single, unified token. FACE achieves state-of-the-art reconstruction quality on standard benchmarks.
- Score: 50.71369329585773
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autoregressive models for 3D mesh generation suffer from a fundamental limitation: they flatten meshes into long vertex-coordinate sequences. This results in prohibitive computational costs, hindering the efficient synthesis of high-fidelity geometry. We argue this bottleneck stems from operating at the wrong semantic level. We introduce FACE, a novel Autoregressive Autoencoder (ARAE) framework that reconceptualizes the task by generating meshes at the face level. Our one-face-one-token strategy treats each triangle face, the fundamental building block of a mesh, as a single, unified token. This simple yet powerful design reduces the sequence length by a factor of nine, leading to an unprecedented compression ratio of 0.11, halving the previous state-of-the-art. This dramatic efficiency gain does not compromise quality; by pairing our face-level decoder with a powerful VecSet encoder, FACE achieves state-of-the-art reconstruction quality on standard benchmarks. The versatility of the learned latent space is further demonstrated by training a latent diffusion model that achieves high-fidelity, single-image-to-mesh generation. FACE provides a simple, scalable, and powerful paradigm that lowers the barrier to high-quality structured 3D content creation.
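The sequence-length arithmetic behind the abstract's claim can be made concrete. The sketch below is a hypothetical illustration, not the paper's code: it assumes a flat baseline tokenization of 3 vertices × 3 quantized coordinates = 9 tokens per triangle, versus one placeholder token per face, reproducing the factor-of-nine reduction. All function and variable names are illustrative assumptions.

```python
# Hypothetical sketch of the one-face-one-token idea: instead of
# flattening each triangle into 9 coordinate tokens (3 vertices x 3
# coordinates), each face becomes a single token. The quantization
# scheme and token ids below are toy assumptions for illustration.
import numpy as np

def vertex_coordinate_tokens(faces, vertices, n_bins=128):
    """Baseline: flatten each face into 9 quantized coordinate tokens."""
    tokens = []
    for face in faces:
        for vid in face:                 # 3 vertex indices per triangle
            for coord in vertices[vid]:  # 3 coordinates per vertex
                tokens.append(int(coord * (n_bins - 1)))  # coords in [0, 1)
    return tokens

def face_tokens(faces):
    """FACE-style: one unified token per triangle face (placeholder ids)."""
    return list(range(len(faces)))

vertices = np.random.rand(4, 3)      # 4 random vertices in [0, 1)^3
faces = [(0, 1, 2), (0, 2, 3)]       # 2 triangles sharing an edge

baseline = vertex_coordinate_tokens(faces, vertices)
compact = face_tokens(faces)
print(len(baseline), len(compact))   # 18 vs 2: a 9x shorter sequence
```

Note that the shared edge between the two triangles is encoded twice in the baseline sequence; collapsing each face to one token removes this per-coordinate redundancy entirely.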
Related papers
- HiFi-Mesh: High-Fidelity Efficient 3D Mesh Generation via Compact Autoregressive Dependence [36.403921772528236]
We introduce the Latent Autoregressive Network (LANE), which incorporates compact autoregressive dependencies in the generation process. LANE achieves a $6\times$ improvement in maximum sequence length compared to existing methods.
arXiv Detail & Related papers (2026-01-29T06:22:26Z) - LATTICE: Democratize High-Fidelity 3D Generation at Scale [27.310104395842075]
LATTICE is a new framework for high-fidelity 3D asset generation. VoxSet is a semi-structured representation that compresses 3D assets into a compact set of latent vectors anchored to a coarse voxel grid. Our method is simple at its core, but supports arbitrary-resolution decoding, low-cost training, and flexible inference schemes.
arXiv Detail & Related papers (2025-11-24T03:22:19Z) - FlashMesh: Faster and Better Autoregressive Mesh Synthesis via Structured Speculation [65.3277633028397]
FlashMesh is a fast and high-fidelity mesh generation framework. We show that FlashMesh achieves up to a $2\times$ speedup over standard autoregressive models.
arXiv Detail & Related papers (2025-11-19T17:03:49Z) - Topology Sculptor, Shape Refiner: Discrete Diffusion Model for High-Fidelity 3D Meshes Generation [14.55646181682844]
Topology Sculptor, Shape Refiner (TSSR) is a novel method for generating high-quality, artist-style 3D meshes. We leverage the parallel generation capability of discrete diffusion through three key innovations. Experiments on complex datasets demonstrate that TSSR generates high-quality artist-style 3D meshes.
arXiv Detail & Related papers (2025-10-24T08:51:48Z) - FastMesh: Efficient Artistic Mesh Generation via Component Decoupling [27.21354509059262]
Mesh generation approaches typically tokenize triangle meshes into sequences of tokens and train autoregressive models to generate these tokens sequentially. Because vertices are shared across faces, this tokenization is redundant, leading to excessively long token sequences and inefficient generation. We propose an efficient framework that generates artistic meshes by treating vertices and faces separately.
arXiv Detail & Related papers (2025-08-26T16:51:02Z) - Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation [52.261584726401686]
We present a novel direction: building an image tokenizer directly on top of a frozen vision foundation model. Our proposed image tokenizer, VFMTok, achieves substantial improvements in image reconstruction and generation quality.
arXiv Detail & Related papers (2025-07-11T09:32:45Z) - GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation [81.58846231702026]
We introduce GigaTok, the first approach to improve image reconstruction, generation, and representation learning when scaling visual tokenizers. We identify the growing complexity of the latent space as the key factor behind the reconstruction vs. generation dilemma. By scaling to 3 billion parameters, GigaTok achieves state-of-the-art performance in reconstruction, downstream AR generation, and downstream AR representation quality.
arXiv Detail & Related papers (2025-04-11T17:59:58Z) - MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs [79.45006864728893]
MeshCraft is a framework for efficient and controllable mesh generation. It uses continuous spatial diffusion to generate discrete triangle faces. It can generate an 800-face mesh in just 3.2 seconds.
arXiv Detail & Related papers (2025-03-29T09:21:50Z) - TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing [47.919057306538626]
TreeMeshGPT is an autoregressive Transformer designed to generate artistic meshes aligned with input point clouds. Our approach represents each triangular face with two tokens, achieving a compression rate of approximately 22%. Our method generates meshes with strong normal-orientation constraints, minimizing the flipped normals commonly encountered in previous methods.
arXiv Detail & Related papers (2025-03-14T17:48:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.