3D Shape Tokenization
- URL: http://arxiv.org/abs/2412.15618v2
- Date: Tue, 24 Dec 2024 14:07:12 GMT
- Title: 3D Shape Tokenization
- Authors: Jen-Hao Rick Chang, Yuyang Wang, Miguel Angel Bautista Martin, Jiatao Gu, Josh Susskind, Oncel Tuzel
- Abstract summary: We introduce Shape Tokens, a 3D representation that is continuous, compact, and easy to incorporate into machine learning models.
Shape Tokens act as conditioning vectors that represent shape information in a 3D flow-matching model.
By attaching Shape Tokens to various machine learning models, we can generate new shapes, convert images to 3D, align 3D shapes with text and images, and render shapes directly at a variable, user-specified resolution.
- Score: 38.408642959154925
- Abstract: We introduce Shape Tokens, a 3D representation that is continuous, compact, and easy to incorporate into machine learning models. Shape Tokens act as conditioning vectors that represent shape information in a 3D flow-matching model. The flow-matching model is trained to approximate probability density functions corresponding to delta functions concentrated on the surfaces of shapes in 3D. By attaching Shape Tokens to various machine learning models, we can generate new shapes, convert images to 3D, align 3D shapes with text and images, and render shapes directly at a variable, user-specified resolution. Moreover, Shape Tokens enable a systematic analysis of geometric properties such as normal, density, and deformation field. Across all tasks and experiments, utilizing Shape Tokens demonstrates strong performance compared to existing baselines.
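The abstract describes a flow-matching model, conditioned on Shape Tokens, whose target distribution is concentrated on a shape's surface. As a rough illustration only, the sketch below builds one training batch for a linear-path flow-matching objective, a common formulation that may differ from the paper's. The unit sphere stands in for a real surface, and every name here (`sample_sphere_surface`, `flow_matching_batch`, `flow_matching_loss`, the placeholder `shape_tokens` array, and the toy models) is hypothetical and not taken from the paper.

```python
import numpy as np

def sample_sphere_surface(n, rng):
    """Stand-in for points sampled on a shape's surface (assumption: a unit sphere)."""
    p = rng.normal(size=(n, 3))
    return p / np.linalg.norm(p, axis=1, keepdims=True)

def flow_matching_batch(surface_points, rng):
    """Build inputs and regression targets for a linear noise-to-surface path."""
    n = surface_points.shape[0]
    x0 = rng.normal(size=(n, 3))              # noise samples
    t = rng.uniform(size=(n, 1))              # one time in [0, 1) per point
    xt = (1.0 - t) * x0 + t * surface_points  # point on the straight path at time t
    target_velocity = surface_points - x0     # constant velocity along that path
    return xt, t, target_velocity

def flow_matching_loss(predict_velocity, xt, t, target_velocity, shape_tokens):
    """Mean squared error between predicted and target velocities."""
    pred = predict_velocity(xt, t, shape_tokens)
    return float(np.mean(np.sum((pred - target_velocity) ** 2, axis=1)))

rng = np.random.default_rng(0)
x1 = sample_sphere_surface(256, rng)
xt, t, v = flow_matching_batch(x1, rng)

# Placeholder conditioning vectors; a real system would obtain these from an encoder.
shape_tokens = np.zeros((4, 8))

# Two toy "models": one that always predicts zero, one that returns the exact target.
zero_model = lambda xt, t, tokens: np.zeros_like(xt)
oracle_model = lambda xt, t, tokens: v

zero_loss = flow_matching_loss(zero_model, xt, t, v, shape_tokens)
oracle_loss = flow_matching_loss(oracle_model, xt, t, v, shape_tokens)
```

In a real setup the velocity predictor would be a neural network that actually reads the conditioning vectors; the toy models here ignore them and exist only to exercise the loss.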
Related papers
- NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation [52.772319840580074]
3D shape generation aims to produce innovative 3D content adhering to specific conditions and constraints.
Existing methods often decompose 3D shapes into a sequence of localized components, treating each element in isolation.
We introduce a novel spatial-aware 3D shape generation framework that leverages 2D plane representations for enhanced 3D shape modeling.
arXiv Detail & Related papers (2024-03-27T04:09:34Z)
- Explorable Mesh Deformation Subspaces from Unstructured Generative Models [53.23510438769862]
Deep generative models of 3D shapes often feature continuous latent spaces that can be used to explore potential variations.
We present a method to explore variations among a given set of landmark shapes by constructing a mapping from an easily-navigable 2D exploration space to a subspace of a pre-trained generative model.
arXiv Detail & Related papers (2023-10-11T18:53:57Z)
- Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation [47.945556996219295]
We present a novel alignment-before-generation approach to generate 3D shapes based on 2D images or texts.
Our framework comprises two models: a Shape-Image-Text-Aligned Variational Auto-Encoder (SITA-VAE) and a conditional Aligned Shape Latent Diffusion Model (ASLDM).
arXiv Detail & Related papers (2023-06-29T17:17:57Z)
- 3D VR Sketch Guided 3D Shape Prototyping and Exploration [108.6809158245037]
We propose a 3D shape generation network that takes a 3D VR sketch as a condition.
We assume that sketches are created by novices without art training.
Our method creates multiple 3D shapes that align with the original sketch's structure.
arXiv Detail & Related papers (2023-06-19T10:27:24Z)
- Learning to Generate 3D Shapes from a Single Example [28.707149807472685]
We present a multi-scale GAN-based model designed to capture the input shape's geometric features across a range of spatial scales.
We train our generative model on a voxel pyramid of the reference shape, without the need of any external supervision or manual annotation.
The resulting shapes present variations across different scales, and at the same time retain the global structure of the reference shape.
arXiv Detail & Related papers (2022-08-05T01:05:32Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve the shape quality by leveraging cross-view information with a graph convolution network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable function for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- ShapeFormer: Transformer-based Shape Completion via Sparse Representation [41.33457875133559]
We present ShapeFormer, a network that produces a distribution of object completions conditioned on incomplete, and possibly noisy, point clouds.
The resultant distribution can then be sampled to generate likely completions, each exhibiting plausible shape details while being faithful to the input.
arXiv Detail & Related papers (2022-01-25T13:58:30Z)
- Deformed Implicit Field: Modeling 3D Shapes with Learned Dense Correspondence [30.849927968528238]
We propose a novel Deformed Implicit Field representation for modeling 3D shapes of a category.
Our neural network, dubbed DIF-Net, jointly learns a shape latent space and these fields for 3D objects belonging to a category.
Experiments show that DIF-Net not only produces high-fidelity 3D shapes but also builds high-quality dense correspondences across different shapes.
arXiv Detail & Related papers (2020-11-27T10:45:26Z)