OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion
- URL: http://arxiv.org/abs/2507.06165v1
- Date: Tue, 08 Jul 2025 16:46:15 GMT
- Title: OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion
- Authors: Yunhan Yang, Yufan Zhou, Yuan-Chen Guo, Zi-Xin Zou, Yukun Huang, Ying-Tian Liu, Hao Xu, Ding Liang, Yan-Pei Cao, Xihui Liu
- Abstract summary: We introduce OmniPart, a novel framework for part-aware 3D object generation. Our approach supports user-defined part granularity and precise localization, and enables diverse downstream applications.
- Score: 31.767548415448957
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The creation of 3D assets with explicit, editable part structures is crucial for advancing interactive applications, yet most generative methods produce only monolithic shapes, limiting their utility. We introduce OmniPart, a novel framework for part-aware 3D object generation designed to achieve high semantic decoupling among components while maintaining robust structural cohesion. OmniPart uniquely decouples this complex task into two synergistic stages: (1) an autoregressive structure planning module generates a controllable, variable-length sequence of 3D part bounding boxes, critically guided by flexible 2D part masks that allow for intuitive control over part decomposition without requiring direct correspondences or semantic labels; and (2) a spatially-conditioned rectified flow model, efficiently adapted from a pre-trained holistic 3D generator, synthesizes all 3D parts simultaneously and consistently within the planned layout. Our approach supports user-defined part granularity and precise localization, and enables diverse downstream applications. Extensive experiments demonstrate that OmniPart achieves state-of-the-art performance, paving the way for more interpretable, editable, and versatile 3D content.
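The two-stage pipeline described in the abstract can be illustrated with a minimal toy sketch. Everything here is an assumption for illustration only: the function names, the random box "planner", and the toy velocity field are not the authors' implementation; only the overall shape (plan part bounding boxes, then integrate a rectified flow within that layout) follows the abstract.

```python
import numpy as np

# Hypothetical sketch of OmniPart's two-stage pipeline (illustrative names,
# not the authors' API). Stage 1 plans a variable-length sequence of part
# bounding boxes; Stage 2 runs a rectified-flow sampler conditioned on them.

def plan_part_boxes(num_parts, rng):
    """Stand-in for the autoregressive structure planner: emit axis-aligned
    3D boxes as (min_corner, max_corner) pairs inside the unit cube."""
    boxes = []
    for _ in range(num_parts):
        lo = rng.uniform(0.0, 0.5, size=3)
        hi = np.minimum(lo + rng.uniform(0.1, 0.5, size=3), 1.0)
        boxes.append((lo, hi))
    return boxes

def rectified_flow_sample(velocity, x0, steps=50):
    """Euler integration of dx/dt = v(x, t) from t=0 (noise) to t=1 (data),
    the core update rule of a rectified-flow sampler."""
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + dt * velocity(x, t)
    return x

rng = np.random.default_rng(0)
boxes = plan_part_boxes(num_parts=3, rng=rng)

# Toy velocity field: drift each "part latent" toward its box center,
# mimicking spatial conditioning on the planned layout. All parts are
# integrated simultaneously, as in the second stage of the abstract.
centers = np.stack([(lo + hi) / 2 for lo, hi in boxes])
latents = rng.standard_normal(centers.shape)
velocity = lambda x, t: centers - x
parts = rectified_flow_sample(velocity, latents)
```

The sketch only conveys the interface between the two stages: the planner's boxes become the spatial condition of the sampler, so every part is denoised inside its planned slot of the layout.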
Related papers
- From One to More: Contextual Part Latents for 3D Generation [33.43336981984443]
CoPart is a part-aware diffusion framework that decomposes 3D objects into contextual part latents for coherent multi-part generation. We construct a novel 3D part dataset derived from articulated mesh segmentation and human-verified annotations. Experiments demonstrate CoPart's superior capabilities in part-level editing, object generation, and scene composition with unprecedented controllability.
arXiv Detail & Related papers (2025-07-11T17:33:18Z)
- Efficient Part-level 3D Object Generation via Dual Volume Packing [51.712083185732894]
We propose a new end-to-end framework for part-level 3D object generation. Our method generates high-quality 3D objects with an arbitrary number of complete and semantically meaningful parts.
arXiv Detail & Related papers (2025-06-11T17:55:03Z)
- PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers [29.52313100024294]
We introduce PartCrafter, the first structured 3D generative model that jointly synthesizes multiple semantically meaningful and geometrically distinct 3D meshes from a single RGB image. PartCrafter simultaneously denoises multiple 3D parts, enabling end-to-end part-aware generation of both individual objects and complex multi-object scenes. Experiments show that PartCrafter outperforms existing approaches in generating decomposable 3D meshes.
arXiv Detail & Related papers (2025-06-05T20:30:28Z)
- HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation [50.206100327643284]
HiScene is a novel hierarchical framework that bridges the gap between 2D image generation and 3D object generation. We generate 3D content that aligns with 2D representations while maintaining compositional structure.
arXiv Detail & Related papers (2025-04-17T16:33:39Z)
- Structurally Disentangled Feature Fields Distillation for 3D Understanding and Editing [14.7298711927857]
We propose to capture 3D features using multiple disentangled feature fields. Each element can be controlled in isolation, enabling semantic and structural understanding and editing capabilities.
arXiv Detail & Related papers (2025-02-20T18:09:27Z)
- PartSDF: Part-Based Implicit Neural Representation for Composite 3D Shape Parametrization and Optimization [38.822156749041206]
PartSDF is a supervised implicit representation framework that explicitly models composite shapes with independent, controllable parts. PartSDF outperforms both supervised and unsupervised baselines in reconstruction and generation tasks.
arXiv Detail & Related papers (2025-02-18T16:08:47Z)
- Chirpy3D: Creative Fine-grained 3D Object Fabrication via Part Sampling [128.23917788822948]
Chirpy3D is a novel approach for fine-grained 3D object generation in a zero-shot setting. The model must infer plausible 3D structures, capture fine-grained details, and generalize to novel objects. Our experiments demonstrate that Chirpy3D surpasses existing methods in generating creative 3D objects with higher quality and fine-grained details.
arXiv Detail & Related papers (2025-01-07T21:14:11Z)
- PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models [63.1432721793683]
We introduce PartGen, a novel approach that generates 3D objects composed of meaningful parts starting from text, an image, or an unstructured 3D object. We evaluate our method on generated and real 3D assets and show that it outperforms segmentation and part-extraction baselines by a large margin.
arXiv Detail & Related papers (2024-12-24T18:59:43Z)
- 3D Part Segmentation via Geometric Aggregation of 2D Visual Features [57.20161517451834]
Supervised 3D part segmentation models are tailored for a fixed set of objects and parts, limiting their transferability to open-set, real-world scenarios. Recent works have explored vision-language models (VLMs) as a promising alternative, using multi-view rendering and textual prompting to identify object parts. To address these limitations, we propose COPS, a COmprehensive model for Parts that blends semantics extracted from visual concepts and 3D geometry to effectively identify object parts.
arXiv Detail & Related papers (2024-12-05T15:27:58Z)
- Part123: Part-aware 3D Reconstruction from a Single-view Image [54.589723979757515]
Part123 is a novel framework for part-aware 3D reconstruction from a single-view image.
We introduce contrastive learning into a neural rendering framework to learn a part-aware feature space.
A clustering-based algorithm is also developed to automatically derive 3D part segmentation results from the reconstructed models.
arXiv Detail & Related papers (2024-05-27T07:10:21Z)
- 3DLatNav: Navigating Generative Latent Spaces for Semantic-Aware 3D Object Manipulation [2.8661021832561757]
3D generative models have been recently successful in generating realistic 3D objects in the form of point clouds.
Most models do not offer controllability to manipulate the shape semantics of component object parts without extensive semantic labels or other reference point clouds.
We propose 3DLatNav, a novel approach to navigating pretrained generative latent spaces to enable controlled part-level semantic manipulation of 3D objects.
arXiv Detail & Related papers (2022-11-17T18:47:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.