ArchComplete: Autoregressive 3D Architectural Design Generation with Hierarchical Diffusion-Based Upsampling
- URL: http://arxiv.org/abs/2412.17957v2
- Date: Thu, 13 Feb 2025 21:57:44 GMT
- Title: ArchComplete: Autoregressive 3D Architectural Design Generation with Hierarchical Diffusion-Based Upsampling
- Authors: S. Rasoulzadeh, M. Bank, I. Kovacic, K. Schinegger, S. Rutzinger, M. Wimmer,
- Abstract summary: ArchComplete is a two-stage voxel-based 3D generative pipeline consisting of a vector-quantised model.
Key to our pipeline is (i) learning a contextually rich codebook of local patch embeddings, optimised alongside a 2.5D perceptual loss.
ArchComplete autoregressively generates models at the resolution of $643$ and progressively refines them up to $5123$, with voxel sizes as small as $ approx 9textcm$.
- Score: 0.0
- License:
- Abstract: Recent advances in 3D generative models have shown promising results but often fall short in capturing the complexity of architectural geometries and topologies and fine geometric details at high resolutions. To tackle this, we present ArchComplete, a two-stage voxel-based 3D generative pipeline consisting of a vector-quantised model, whose composition is modelled with an autoregressive transformer for generating coarse shapes, followed by a hierarchical upsampling strategy for further enrichment with fine structures and details. Key to our pipeline is (i) learning a contextually rich codebook of local patch embeddings, optimised alongside a 2.5D perceptual loss that captures global spatial correspondence of projections onto three axis-aligned orthogonal planes, and (ii) redefining upsampling as a set of conditional diffusion models learning from a hierarchy of randomly cropped coarse-to-fine local volumetric patches. Trained on our introduced dataset of 3D house models with fully modelled exterior and interior, ArchComplete autoregressively generates models at the resolution of $64^{3}$ and progressively refines them up to $512^{3}$, with voxel sizes as small as $ \approx 9\text{cm}$. ArchComplete solves a variety of tasks, including genetic interpolation and variation, unconditional synthesis, shape and plan-drawing completion, as well as geometric detailisation, while achieving state-of-the-art performance in quality, diversity, and computational efficiency.
Related papers
- RigAnything: Template-Free Autoregressive Rigging for Diverse 3D Assets [47.81216915952291]
We present RigAnything, a novel autoregressive transformer-based model.
It makes 3D assets rig-ready by probabilistically generating joints, skeleton topologies, and assigning skinning weights in a template-free manner.
RigAnything demonstrates state-of-the-art performance across diverse object types, including humanoids, quadrupeds, marine creatures, insects, and many more.
arXiv Detail & Related papers (2025-02-13T18:59:13Z) - Factorized Implicit Global Convolution for Automotive Computational Fluid Dynamics Prediction [52.32698071488864]
We propose Factorized Implicit Global Convolution (FIGConv), a novel architecture that efficiently solves CFD problems for very large 3D meshes.
FIGConv achieves quadratic complexity $O(N2)$, a significant improvement over existing 3D neural CFD models.
We validate our approach on the industry-standard Ahmed body dataset and the large-scale DrivAerNet dataset.
arXiv Detail & Related papers (2025-02-06T18:57:57Z) - Structured 3D Latents for Scalable and Versatile 3D Generation [28.672494137267837]
We introduce a novel 3D generation method for versatile and high-quality 3D asset creation.
The cornerstone is a unified Structured LATent representation which allows decoding to different output formats.
This is achieved by integrating a sparsely-populated 3D grid with dense multiview visual features extracted from a powerful vision foundation model.
arXiv Detail & Related papers (2024-12-02T13:58:38Z) - GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction [52.04103235260539]
We present a diffusion model approach based on Gaussian Splatting representation for 3D object reconstruction from a single view.
The model learns to generate 3D objects represented by sets of GS ellipsoids.
The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views.
arXiv Detail & Related papers (2024-07-05T03:43:08Z) - Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata [70.9375320609781]
We aim to generate fine-grained 3D geometry from large-scale sparse LiDAR scans, abundantly captured by autonomous vehicles (AV)
We propose hierarchical Generative Cellular Automata (hGCA), a spatially scalable 3D generative model, which grows geometry with local kernels following, in a coarse-to-fine manner, equipped with a light-weight planner to induce global consistency.
arXiv Detail & Related papers (2024-06-12T14:56:56Z) - Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space.
We extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.
arXiv Detail & Related papers (2024-02-19T15:33:09Z) - DiffComplete: Diffusion-based Generative 3D Shape Completion [114.43353365917015]
We introduce a new diffusion-based approach for shape completion on 3D range scans.
We strike a balance between realism, multi-modality, and high fidelity.
DiffComplete sets a new SOTA performance on two large-scale 3D shape completion benchmarks.
arXiv Detail & Related papers (2023-06-28T16:07:36Z) - Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D
Shape Synthesis [90.26556260531707]
DMTet is a conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels.
Unlike deep 3D generative models that directly generate explicit representations such as meshes, our model can synthesize shapes with arbitrary topology.
arXiv Detail & Related papers (2021-11-08T05:29:35Z) - OctField: Hierarchical Implicit Functions for 3D Modeling [18.488778913029805]
We present a learnable hierarchical implicit representation for 3D surfaces, coded OctField, that allows high-precision encoding of intricate surfaces with low memory and computational budget.
We achieve this goal by introducing a hierarchical octree structure to adaptively subdivide the 3D space according to the surface occupancy and the richness of part geometry.
arXiv Detail & Related papers (2021-11-01T16:29:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.