Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
- URL: http://arxiv.org/abs/2111.12480v1
- Date: Wed, 24 Nov 2021 13:17:16 GMT
- Title: Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
- Authors: Moritz Ibing, Gregor Kobsik, Leif Kobbelt
- Abstract summary: Autoregressive models have proven to be very powerful in NLP text generation tasks.
We introduce an adaptive compression scheme that significantly reduces sequence lengths.
We demonstrate the performance of our model by comparing against the state-of-the-art in shape generation.
- Score: 11.09257948735229
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autoregressive models have proven to be very powerful in NLP text generation
tasks and lately have gained popularity for image generation as well. However,
they have seen limited use for the synthesis of 3D shapes so far. This is
mainly due to the lack of a straightforward way to linearize 3D data as well as
to scaling problems with the length of the resulting sequences when describing
complex shapes. In this work we address both of these problems. We use octrees
as a compact hierarchical shape representation that can be sequentialized by
traversal ordering. Moreover, we introduce an adaptive compression scheme that
significantly reduces sequence lengths and thus enables their effective
generation with a transformer, while still allowing fully autoregressive
sampling and parallel training. We demonstrate the performance of our model by
comparing against the state-of-the-art in shape generation.
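To make the serialization idea concrete: below is a minimal sketch of one plausible breadth-first octree linearization of a binary voxel grid. It is illustrative only, not the authors' exact tokenization, and it omits the paper's adaptive compression, which further shortens the sequence by merging tokens.

```python
import numpy as np
from collections import deque

# Tokens: 0 = empty, 1 = full, 2 = mixed (node subdivided at the next level).
EMPTY, FULL, MIXED = 0, 1, 2

def octree_sequence(vox):
    """Linearize a binary voxel grid (cubic, side a power of two) into a
    breadth-first octree token sequence. Only MIXED nodes spawn children,
    so sequence length adapts to shape complexity instead of growing with
    the full grid resolution."""
    seq = []
    queue = deque([vox])              # FIFO -> level-by-level traversal order
    while queue:
        node = queue.popleft()
        s = node.sum()
        if s == 0:
            seq.append(EMPTY)
        elif s == node.size:
            seq.append(FULL)
        else:
            seq.append(MIXED)
            h = node.shape[0] // 2
            # Enqueue the 8 octants in a fixed traversal order.
            for x in (0, h):
                for y in (0, h):
                    for z in (0, h):
                        queue.append(node[x:x+h, y:y+h, z:z+h])
    return seq

# Example: a 16^3 grid containing a solid 8^3 cube in one corner.
grid = np.zeros((16, 16, 16), dtype=np.uint8)
grid[:8, :8, :8] = 1
print(octree_sequence(grid))  # [2, 1, 0, 0, 0, 0, 0, 0, 0] -- 9 tokens vs. 4096 voxels
```

Because the traversal order is fixed, the sequence is a deterministic function of the shape, which is what allows fully autoregressive sampling and parallel transformer training over it.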
Related papers
- G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer [4.221298212125194]
We introduce G3PT, a scalable coarse-to-fine 3D generative model utilizing a cross-scale querying transformer.
The cross-scale querying transformer connects tokens globally across different levels of detail without requiring an ordered sequence.
Experiments demonstrate that G3PT achieves superior generation quality and generalization ability compared to previous 3D generation methods.
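As a rough illustration of the querying idea (a hypothetical sketch, not G3PT's actual architecture), a set of learnable queries for the finer level can cross-attend to all coarser-level tokens at once, so no ordering within a level is required:

```python
import torch
import torch.nn as nn

class CrossScaleQuery(nn.Module):
    """Learnable queries for the next level of detail attend globally to
    the coarser level's tokens. Illustrative stand-in, not G3PT's model."""
    def __init__(self, dim=256, heads=8, n_queries=64):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_queries, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, coarse_tokens):               # (B, N_coarse, dim)
        q = self.queries.unsqueeze(0).expand(coarse_tokens.shape[0], -1, -1)
        fine, _ = self.attn(q, coarse_tokens, coarse_tokens)
        return fine                                 # (B, n_queries, dim)

x = torch.randn(2, 16, 256)                         # 16 coarse tokens
print(CrossScaleQuery()(x).shape)                   # torch.Size([2, 64, 256])
```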
arXiv Detail & Related papers (2024-09-10T08:27:19Z)
- VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation [69.68568248073747]
We propose Pose-dependent Consistency Distillation Sampling (PCDS), a novel yet efficient objective for diffusion-based 3D generation tasks.
PCDS builds the pose-dependent consistency function within diffusion trajectories, allowing true gradients to be approximated with minimal sampling steps.
For efficient generation, we propose a coarse-to-fine optimization strategy, which first utilizes 1-step PCDS to create the basic structure of 3D objects, and then gradually increases PCDS steps to generate fine-grained details.
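A hypothetical sketch of such a step schedule is below; the ramp shape, iteration counts, and placeholder training loop are illustrative assumptions, not the paper's hyperparameters:

```python
def step_schedule(it, total_iters, max_steps=4):
    """Coarse-to-fine: 1-step PCDS for the first half of optimization
    (basic structure), then ramp the sampling-step count up to max_steps
    (fine-grained details)."""
    half = total_iters // 2
    if it < half:
        return 1
    return 1 + round((it - half) / half * (max_steps - 1))

# Hypothetical outer loop; pcds_loss, render, scene, and optimizer are
# placeholders for an actual diffusion-based 3D generation pipeline:
# for it in range(1000):
#     loss = pcds_loss(render(scene), steps=step_schedule(it, 1000))
#     loss.backward(); optimizer.step(); optimizer.zero_grad()

print([step_schedule(i, 1000) for i in (0, 499, 500, 750, 999)])  # [1, 1, 1, 3, 4]
```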
arXiv Detail & Related papers (2024-06-21T08:21:52Z)
- MeshXL: Neural Coordinate Field for Generative 3D Foundation Models [51.1972329762843]
We present MeshXL, a family of generative pre-trained auto-regressive models that addresses 3D mesh generation with modern large language model approaches.
MeshXL is able to generate high-quality 3D meshes, and can also serve as foundation models for various down-stream applications.
arXiv Detail & Related papers (2024-05-31T14:35:35Z)
- NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation [52.772319840580074]
3D shape generation aims to produce innovative 3D content adhering to specific conditions and constraints.
Existing methods often decompose 3D shapes into a sequence of localized components, treating each element in isolation.
We introduce a novel spatial-aware 3D shape generation framework that leverages 2D plane representations for enhanced 3D shape modeling.
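2D plane representations of 3D shapes are commonly realized as triplanes; the sketch below shows generic triplane feature sampling as an illustration (an assumption for clarity, not necessarily NeuSDFusion's exact layout):

```python
import torch
import torch.nn.functional as F

def sample_triplane(planes, xyz):
    """Project each 3D point onto the XY, XZ, and YZ feature planes,
    bilinearly interpolate, and sum the three features.
    planes: (3, C, H, W); xyz: (N, 3) with coordinates in [-1, 1]."""
    feats = 0
    for plane, dims in zip(planes, ((0, 1), (0, 2), (1, 2))):
        uv = xyz[:, dims].view(1, -1, 1, 2)            # (1, N, 1, 2) sample grid
        f = F.grid_sample(plane.unsqueeze(0), uv,      # -> (1, C, N, 1)
                          mode="bilinear", align_corners=True)
        feats = feats + f[0, :, :, 0].T                # (N, C)
    return feats

planes = torch.randn(3, 32, 64, 64)
pts = torch.rand(100, 3) * 2 - 1
print(sample_triplane(planes, pts).shape)              # torch.Size([100, 32])
```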
arXiv Detail & Related papers (2024-03-27T04:09:34Z)
- Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers [37.14235383028582]
We introduce a novel approach for single-view reconstruction that efficiently generates a 3D model from a single image via feed-forward inference.
Our method utilizes two transformer-based networks, namely a point decoder and a triplane decoder, to reconstruct 3D objects using a hybrid Triplane-Gaussian intermediate representation.
arXiv Detail & Related papers (2023-12-14T17:18:34Z)
- Learning Versatile 3D Shape Generation with Improved AR Models [91.87115744375052]
Auto-regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.
We propose the Improved Auto-regressive Model (ImAM) for 3D shape generation, which applies discrete representation learning based on a latent vector instead of volumetric grids.
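The move from grid-bound tokens to discrete latent codes can be sketched with a toy nearest-neighbour quantizer; this illustrates the general idea only, not ImAM's implementation:

```python
import torch
import torch.nn as nn

class LatentQuantizer(nn.Module):
    """Map a latent sequence to discrete code indices that an
    autoregressive transformer can model, decoupling the tokens from any
    volumetric grid. Toy sketch, not ImAM's implementation."""
    def __init__(self, n_codes=512, dim=64):
        super().__init__()
        self.codebook = nn.Embedding(n_codes, dim)

    def forward(self, z):                                   # z: (B, T, dim)
        # Squared L2 distance of every latent to every codebook entry.
        d = (z.unsqueeze(2) - self.codebook.weight).pow(2).sum(-1)
        idx = d.argmin(-1)                                  # (B, T) token ids
        return idx, self.codebook(idx)                      # ids + quantized latents

ids, zq = LatentQuantizer()(torch.randn(2, 16, 64))
print(ids.shape, zq.shape)  # torch.Size([2, 16]) torch.Size([2, 16, 64])
```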
arXiv Detail & Related papers (2023-03-26T12:03:18Z)
- Autoregressive 3D Shape Generation via Canonical Mapping [92.91282602339398]
Transformers have shown remarkable performance in a variety of generative tasks such as image, audio, and text generation.
In this paper, we aim to further exploit the power of transformers and employ them for the task of 3D point cloud generation.
Our model can be easily extended to multi-modal shape completion as an application for conditional shape generation.
arXiv Detail & Related papers (2022-04-05T03:12:29Z)
- Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis [90.26556260531707]
DMTet is a conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels.
Unlike deep 3D generative models that directly generate explicit representations such as meshes, our model can synthesize shapes with arbitrary topology.
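The marching-tetrahedra extraction at the core of DMTet can be sketched for a single tetrahedron; the snippet below shows only the edge-crossing step (DMTet additionally predicts SDF values and vertex offsets with a network):

```python
import numpy as np

def tet_edge_crossings(verts, sdf):
    """Wherever the signed distance changes sign along one of a
    tetrahedron's 6 edges, place a surface vertex by linear interpolation
    of the zero crossing."""
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    pts = []
    for a, b in edges:
        sa, sb = sdf[a], sdf[b]
        if sa * sb < 0:                    # sign change -> surface crosses edge
            t = sa / (sa - sb)             # linear zero-crossing parameter
            pts.append(verts[a] + t * (verts[b] - verts[a]))
    return np.array(pts)

verts = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
sdf = np.array([-0.5, 0.5, 0.5, 0.5])      # vertex 0 inside, others outside
print(tet_edge_crossings(verts, sdf))      # three crossings at edge midpoints
```

Because the sign pattern, rather than a fixed grid template, determines where the surface lies, the extracted mesh can take on arbitrary topology.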
arXiv Detail & Related papers (2021-11-08T05:29:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.