NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation
- URL: http://arxiv.org/abs/2403.18241v2
- Date: Fri, 12 Jul 2024 07:30:00 GMT
- Title: NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation
- Authors: Ruikai Cui, Weizhe Liu, Weixuan Sun, Senbo Wang, Taizhang Shang, Yang Li, Xibin Song, Han Yan, Zhennan Wu, Shenzhou Chen, Hongdong Li, Pan Ji
- Abstract summary: 3D shape generation aims to produce innovative 3D content adhering to specific conditions and constraints.
Existing methods often decompose 3D shapes into a sequence of localized components, treating each element in isolation.
We introduce a novel spatial-aware 3D shape generation framework that leverages 2D plane representations for enhanced 3D shape modeling.
- Score: 52.772319840580074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D shape generation aims to produce innovative 3D content adhering to specific conditions and constraints. Existing methods often decompose 3D shapes into a sequence of localized components, treating each element in isolation without considering spatial consistency. As a result, these approaches exhibit limited versatility in 3D data representation and shape generation, hindering their ability to generate highly diverse 3D shapes that comply with the specified constraints. In this paper, we introduce a novel spatial-aware 3D shape generation framework that leverages 2D plane representations for enhanced 3D shape modeling. To ensure spatial coherence and reduce memory usage, we incorporate a hybrid shape representation technique that directly learns a continuous signed distance field representation of the 3D shape using orthogonal 2D planes. Additionally, we meticulously enforce spatial correspondences across distinct planes using a transformer-based autoencoder structure, promoting the preservation of spatial relationships in the generated 3D shapes. This yields an algorithm that consistently outperforms state-of-the-art 3D shape generation methods on various tasks, including unconditional shape generation, multi-modal shape completion, single-view reconstruction, and text-to-shape synthesis. Our project page is available at https://weizheliu.github.io/NeuSDFusion/ .
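The hybrid representation described in the abstract factors a continuous signed distance field over three orthogonal 2D feature planes (XY, XZ, YZ), each queried by bilinear interpolation and decoded by a shared network. The PyTorch sketch below illustrates only this query path; the module name, plane resolution, channel count, and plain MLP decoder are illustrative assumptions, and the paper's transformer-based autoencoder that enforces cross-plane correspondence is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriplaneSDF(nn.Module):
    """Minimal sketch: decode a signed distance from three orthogonal
    2D feature planes. Resolution, channels, and the MLP are assumed
    placeholders, not the paper's actual settings."""

    def __init__(self, channels: int = 32, resolution: int = 128):
        super().__init__()
        # Learnable feature planes for the XY, XZ, and YZ slices.
        self.planes = nn.Parameter(
            torch.randn(3, channels, resolution, resolution) * 0.01)
        self.mlp = nn.Sequential(
            nn.Linear(3 * channels, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),  # scalar signed distance
        )

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        # xyz: (N, 3) query points, assumed normalized to [-1, 1]^3.
        coords = [xyz[:, [0, 1]], xyz[:, [0, 2]], xyz[:, [1, 2]]]  # XY, XZ, YZ
        feats = []
        for plane, uv in zip(self.planes, coords):
            # grid_sample expects a (B, C, H, W) input and a
            # (B, H_out, W_out, 2) grid with coordinates in [-1, 1].
            grid = uv.view(1, -1, 1, 2)
            sampled = F.grid_sample(plane.unsqueeze(0), grid,
                                    mode='bilinear', align_corners=True)
            feats.append(sampled.view(plane.shape[0], -1).t())  # (N, C)
        return self.mlp(torch.cat(feats, dim=-1)).squeeze(-1)  # (N,) SDF

sdf = TriplaneSDF()
points = torch.rand(1024, 3) * 2 - 1
print(sdf(points).shape)  # torch.Size([1024])
```

The design point this illustrates is the memory saving the abstract alludes to: three O(R^2) planes stand in for an O(R^3) voxel grid, while the field remains continuous in the query coordinates.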
Related papers
- DetailGen3D: Generative 3D Geometry Enhancement via Data-Dependent Flow [44.72037991063735]
DetailGen3D is a generative approach specifically designed to enhance generated 3D shapes.
Our key insight is to model the coarse-to-fine transformation directly through data-dependent flows in latent space.
We introduce a token matching strategy that ensures accurate spatial correspondence during refinement.
arXiv Detail & Related papers (2024-11-25T17:08:17Z)
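DetailGen3D's "data-dependent flows" can be read as flow matching between paired coarse and fine latents rather than from pure noise. The sketch below is speculative: the toy velocity network, latent dimension, and loss are assumptions based only on the blurb, not the paper's method.

```python
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    # Toy velocity field over a flat latent; dimensions are made up.
    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, 256), nn.ReLU(), nn.Linear(256, dim))

    def forward(self, z_t, t):
        return self.net(torch.cat([z_t, t], dim=-1))

def flow_matching_step(model, z_coarse, z_fine):
    """Regress the straight-line velocity carrying a coarse latent to
    its refined target: flow matching with data-dependent endpoints."""
    t = torch.rand(z_coarse.shape[0], 1)
    z_t = (1 - t) * z_coarse + t * z_fine   # point on the interpolation path
    return ((model(z_t, t) - (z_fine - z_coarse)) ** 2).mean()

model = VelocityNet()
loss = flow_matching_step(model, torch.randn(8, 64), torch.randn(8, 64))
```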
- GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation [75.39457097832113]
This paper introduces a novel 3D generation framework, offering scalable, high-quality 3D generation with an interactive Point Cloud-structured Latent space.
Our framework employs a Variational Autoencoder that takes multi-view posed RGB, depth, and normal (RGB-D-N) renderings as input, with a unique latent space design that preserves 3D shape information.
The proposed method, GaussianAnything, supports multi-modal conditional 3D generation, allowing for point cloud, caption, and single/multi-view image inputs.
arXiv Detail & Related papers (2024-11-12T18:59:32Z)
- 3D Semantic Subspace Traverser: Empowering 3D Generative Model with Shape Editing Capability [13.041974495083197]
Previous studies on 3D shape generation have focused on shape quality and structure, with little or no consideration of semantic information.
We propose a novel semantic generative model named 3D Semantic Subspace Traverser.
Our method can produce plausible shapes with complex structures and enable the editing of semantic attributes.
arXiv Detail & Related papers (2023-07-26T09:04:27Z)
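Latent-space shape editing of the kind the Subspace Traverser describes usually amounts to walking a code along a semantic direction and re-decoding at each step. A generic sketch with invented names; nothing here is specific to the paper's subspace construction.

```python
import torch

def traverse(decode, z, direction, steps: int = 5, scale: float = 1.0):
    """Decode a sequence of shapes obtained by walking a latent code
    along one (hypothetical) semantic direction. `decode` is any
    latent-to-shape decoder supplied by the caller."""
    direction = direction / direction.norm()
    shapes = []
    for alpha in torch.linspace(-scale, scale, steps):
        shapes.append(decode(z + alpha * direction))
    return shapes

z = torch.randn(64)
d = torch.randn(64)
shapes = traverse(lambda latent: latent.sum(), z, d)  # dummy decoder for illustration
```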
- Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation [47.945556996219295]
We present a novel alignment-before-generation approach to generate 3D shapes based on 2D images or texts.
Our framework comprises two models: a Shape-Image-Text-Aligned Variational Auto-Encoder (SITA-VAE) and a conditional Aligned Shape Latent Diffusion Model (ASLDM).
arXiv Detail & Related papers (2023-06-29T17:17:57Z)
- IC3D: Image-Conditioned 3D Diffusion for Shape Generation [4.470499157873342]
Denoising Diffusion Probabilistic Models (DDPMs) have demonstrated exceptional performance in various 2D generative tasks.
We introduce CISP (Contrastive Image-Shape Pre-training), which learns a well-structured joint embedding space for images and shapes.
We then introduce IC3D, a DDPM that harnesses CISP's guidance for 3D shape generation from single-view images.
arXiv Detail & Related papers (2022-11-20T04:21:42Z)
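CISP is described as contrastive image-shape pre-training, i.e. a CLIP-style joint embedding. Below is a minimal sketch of the symmetric InfoNCE objective such pre-training typically uses, with the encoders left abstract; the embedding width and temperature are placeholders.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(img_emb, shape_emb, temperature: float = 0.07):
    """Symmetric InfoNCE over a batch of paired image/shape embeddings.
    Matched pairs sit on the diagonal of the similarity matrix."""
    img_emb = F.normalize(img_emb, dim=-1)
    shape_emb = F.normalize(shape_emb, dim=-1)
    logits = img_emb @ shape_emb.t() / temperature   # (B, B) similarities
    targets = torch.arange(logits.shape[0])          # diagonal is correct
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

loss = clip_style_loss(torch.randn(16, 256), torch.randn(16, 256))
```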
- XDGAN: Multi-Modal 3D Shape Generation in 2D Space [60.46777591995821]
We propose a novel method to convert 3D shapes into compact 1-channel geometry images and leverage StyleGAN3 and image-to-image translation networks to generate 3D objects in 2D space.
The generated geometry images are quick to convert to 3D meshes, enabling real-time 3D object synthesis, visualization and interactive editing.
We show, both quantitatively and qualitatively, that our method is highly effective at tasks such as 3D shape generation, single-view reconstruction, and shape manipulation, while being significantly faster and more flexible than recent 3D generative models.
arXiv Detail & Related papers (2022-10-06T15:54:01Z)
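A classic geometry image stores one 3D point per pixel, so mesh conversion is just a triangulation of the pixel grid, which is why real-time synthesis is plausible. The sketch below uses the classic 3-channel layout; XDGAN's compact 1-channel variant differs, but the grid-triangulation idea is the same.

```python
import numpy as np

def geometry_image_to_mesh(geom: np.ndarray):
    """Triangulate an (H, W, 3) geometry image: each pixel becomes a
    vertex, and each grid cell is split into two triangles."""
    h, w, _ = geom.shape
    vertices = geom.reshape(-1, 3)
    idx = np.arange(h * w).reshape(h, w)
    a, b = idx[:-1, :-1].ravel(), idx[:-1, 1:].ravel()   # top-left, top-right
    c, d = idx[1:, :-1].ravel(), idx[1:, 1:].ravel()     # bottom-left, bottom-right
    faces = np.concatenate([np.stack([a, b, c], 1), np.stack([b, d, c], 1)])
    return vertices, faces

v, f = geometry_image_to_mesh(np.random.rand(64, 64, 3))
print(v.shape, f.shape)  # (4096, 3) (7938, 3)
```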
- Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis [90.26556260531707]
DMTet is a conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels.
Unlike deep 3D generative models that directly generate explicit representations such as meshes, our model can synthesize shapes with arbitrary topology.
arXiv Detail & Related papers (2021-11-08T05:29:35Z)
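DMTet's hybrid representation stores signed distances at the vertices of a deformable tetrahedral grid, and the surface is extracted where the sign changes along a tetrahedron edge. The core interpolation is the standard marching-tetrahedra zero crossing, sketched here (not the paper's code):

```python
import numpy as np

def edge_zero_crossing(p0, p1, s0, s1):
    """Locate the surface vertex on a tetrahedron edge whose endpoint
    SDF values s0, s1 have opposite signs: linearly interpolate to the
    zero crossing, as in marching tetrahedra."""
    t = s0 / (s0 - s1)          # fraction along the edge where sdf == 0
    return p0 + t * (p1 - p0)

v = edge_zero_crossing(np.array([0., 0., 0.]), np.array([1., 0., 0.]),
                       s0=-0.2, s1=0.3)
# v ~ [0.4, 0., 0.]: the crossing lies 40% of the way along the edge.
```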
- Learning Canonical 3D Object Representation for Fine-Grained Recognition [77.33501114409036]
We propose a novel framework for fine-grained object recognition that learns to recover object variation in 3D space from a single image.
We represent an object as a composition of 3D shape and its appearance, while eliminating the effect of camera viewpoint.
By incorporating 3D shape and appearance jointly in a deep representation, our method learns the discriminative representation of the object.
arXiv Detail & Related papers (2021-08-10T12:19:34Z)