LATTICE: Democratize High-Fidelity 3D Generation at Scale
- URL: http://arxiv.org/abs/2512.03052v1
- Date: Mon, 24 Nov 2025 03:22:19 GMT
- Title: LATTICE: Democratize High-Fidelity 3D Generation at Scale
- Authors: Zeqiang Lai, Yunfei Zhao, Zibo Zhao, Haolin Liu, Qingxiang Lin, Jingwei Huang, Chunchao Guo, Xiangyu Yue
- Abstract summary: LATTICE is a new framework for high-fidelity 3D asset generation. VoxSet is a semi-structured representation that compresses 3D assets into a compact set of latent vectors anchored to a coarse voxel grid. Our method is simple at its core, but supports arbitrary resolution decoding, low-cost training, and flexible inference schemes.
- Score: 27.310104395842075
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present LATTICE, a new framework for high-fidelity 3D asset generation that bridges the quality and scalability gap between 3D and 2D generative models. While 2D image synthesis benefits from fixed spatial grids and well-established transformer architectures, 3D generation remains fundamentally more challenging due to the need to predict both spatial structure and detailed geometric surfaces from scratch. These challenges are exacerbated by the computational complexity of existing 3D representations and the lack of structured and scalable 3D asset encoding schemes. To address this, we propose VoxSet, a semi-structured representation that compresses 3D assets into a compact set of latent vectors anchored to a coarse voxel grid, enabling efficient and position-aware generation. VoxSet retains the simplicity and compression advantages of prior VecSet methods while introducing explicit structure into the latent space, allowing positional embeddings to guide generation and enabling strong token-level test-time scaling. Built upon this representation, LATTICE adopts a two-stage pipeline: first generating a sparse voxelized geometry anchor, then producing detailed geometry using a rectified flow transformer. Our method is simple at its core, but supports arbitrary resolution decoding, low-cost training, and flexible inference schemes, achieving state-of-the-art performance on various aspects, and offering a significant step toward scalable, high-quality 3D asset creation.
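The abstract describes latent vectors "anchored to a coarse voxel grid" but gives no implementation details. As a rough illustration of what a sparse voxel anchor set could look like, the sketch below quantizes surface points into occupied coarse voxels and returns their centers, which could then feed a positional embedding. This is not the authors' method: the function names `voxel_anchors` and `anchor_centers`, the `[-1, 1]` coordinate normalization, and the grid resolution are all assumptions made for this example.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxel_anchors(points: Iterable[Point], resolution: int) -> List[Voxel]:
    """Return the sorted, de-duplicated coarse voxel indices occupied by points.

    Assumes every coordinate lies in [-1, 1]; this normalization is an
    illustrative choice, not something stated in the abstract.
    """
    if resolution < 1:
        raise ValueError("resolution must be >= 1")
    occupied = set()
    for point in points:
        # Map [-1, 1] to [0, resolution) and clamp so that points on or
        # slightly outside the boundary still land in a valid voxel.
        index = tuple(
            min(resolution - 1, max(0, int((c + 1.0) / 2.0 * resolution)))
            for c in point
        )
        occupied.add(index)
    return sorted(occupied)


def anchor_centers(voxels: Iterable[Voxel], resolution: int) -> List[Point]:
    """Convert voxel indices back to voxel-center coordinates in [-1, 1]."""
    size = 2.0 / resolution  # edge length of one voxel
    return [tuple(-1.0 + (i + 0.5) * size for i in voxel) for voxel in voxels]


if __name__ == "__main__":
    surface_points = [
        (-1.0, -1.0, -1.0),
        (-0.9, -0.9, -0.9),  # falls in the same coarse voxel as the point above
        (0.0, 0.0, 0.0),
        (1.0, 1.0, 1.0),
    ]
    anchors = voxel_anchors(surface_points, resolution=4)
    print(anchors)
    print(anchor_centers(anchors, resolution=4))
```

In this toy setup, four input points collapse to three anchors because two of them share a coarse voxel; the number of anchors therefore follows the occupied volume rather than the full grid size.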
Related papers
- Native and Compact Structured Latents for 3D Generation [44.84214149555589]
We present an approach for learning a structured latent representation from native 3D data to address this challenge. At its core is a new sparse voxel structure called O-Voxel, an omni-voxel representation that encodes both geometry and appearance. Based on O-Voxel, we design a Sparse Compression VAE which provides a high spatial compression rate and a compact latent space.
arXiv Detail & Related papers (2025-12-16T18:58:28Z)
- IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction [82.53307702809606]
Humans naturally perceive the geometric structure and semantic content of a 3D world as intertwined dimensions. We propose Instance-Grounded Geometry Transformer (IGGT) to unify the knowledge for both spatial reconstruction and instance-level contextual understanding.
arXiv Detail & Related papers (2025-10-26T14:57:44Z)
- Light-SQ: Structure-aware Shape Abstraction with Superquadrics for Generated Meshes [60.92139345612904]
We present Light-SQ, a novel superquadric-based optimization framework. We propose a block-regrow-fill strategy guided by structure-aware volumetric decomposition. Experiments demonstrate that Light-SQ enables efficient, high-fidelity, and editable shape abstraction with superquadrics.
arXiv Detail & Related papers (2025-09-29T16:18:32Z)
- Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention [54.15345846343084]
We propose Ultra3D, an efficient 3D generation framework that significantly accelerates sparse voxel modeling without compromising quality. Part Attention is a geometry-aware localized attention mechanism that restricts attention computation within semantically consistent part regions. Experiments demonstrate that Ultra3D supports high-resolution 3D generation at 1024 resolution and achieves state-of-the-art performance in both visual fidelity and user preference.
arXiv Detail & Related papers (2025-07-23T17:57:16Z)
- Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets [90.99212668875971]
Step1X-3D is an open framework addressing challenges such as data scarcity, algorithmic limitations, and ecosystem fragmentation. We present a two-stage 3D-native architecture combining a hybrid VAE-DiT geometry generator with a diffusion-based texture synthesis module. Benchmark results demonstrate state-of-the-art performance that exceeds existing open-source methods.
arXiv Detail & Related papers (2025-05-12T16:56:30Z)
- GaussianAnything: Interactive Point Cloud Flow Matching For 3D Object Generation [75.39457097832113]
This paper introduces a novel 3D generation framework, offering scalable, high-quality 3D generation with an interactive Point Cloud-structured Latent space. Our framework employs a Variational Autoencoder with multi-view posed RGB-D(epth)-N(ormal) renderings as input, using a unique latent space design that preserves 3D shape information. The proposed method, GaussianAnything, supports multi-modal conditional 3D generation, allowing for point cloud, caption, and single image inputs.
arXiv Detail & Related papers (2024-11-12T18:59:32Z)
- GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z)