CRAG: Can 3D Generative Models Help 3D Assembly?
- URL: http://arxiv.org/abs/2602.22629v1
- Date: Thu, 26 Feb 2026 05:04:42 GMT
- Title: CRAG: Can 3D Generative Models Help 3D Assembly?
- Authors: Zeyu Jiang, Sihang Li, Siqi Tan, Chenyang Xu, Juexiao Zhang, Julia Galway-Witham, Xue Wang, Scott A. Williams, Radu Iovita, Chen Feng, Jing Zhang
- Abstract summary: We reformulate 3D assembly as a joint problem of assembly and generation. We propose CRAG, which simultaneously generates plausible complete shapes and predicts poses for input parts.
- Score: 17.435797014749067
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing 3D assembly methods treat the problem as pure pose estimation, rearranging observed parts via rigid transformations. In contrast, human assembly naturally couples structural reasoning with holistic shape inference. Inspired by this intuition, we reformulate 3D assembly as a joint problem of assembly and generation. We show that these two processes are mutually reinforcing: assembly provides part-level structural priors for generation, while generation injects holistic shape context that resolves ambiguities in assembly. Unlike prior methods that cannot synthesize missing geometry, we propose CRAG, which simultaneously generates plausible complete shapes and predicts poses for input parts. Extensive experiments demonstrate state-of-the-art performance across in-the-wild objects with diverse geometries, varying part counts, and missing pieces. Our code and models will be released.
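The pose-estimation view of assembly that the abstract contrasts against can be made concrete with a minimal toy sketch: each observed part is a point cloud, and "assembly" reduces to applying a predicted rigid SE(3) transform (rotation plus translation) to each part. This is an illustrative assumption, not CRAG's actual pipeline; the function names `apply_pose` and `assemble` and the toy two-part data are invented for the example.

```python
import numpy as np

def apply_pose(points: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Rigidly transform an (N, 3) part point cloud by rotation R and translation t."""
    return points @ R.T + t

def assemble(parts, poses):
    """Place each part with its predicted SE(3) pose and stack into one object cloud."""
    return np.vstack([apply_pose(p, R, t) for p, (R, t) in zip(parts, poses)])

# Toy example: two tiny "parts", the second shifted two units along x.
part_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
part_b = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
identity = np.eye(3)
obj = assemble(
    [part_a, part_b],
    [(identity, np.zeros(3)), (identity, np.array([2.0, 0.0, 0.0]))],
)
# obj now holds all four points in a shared object frame.
```

Pure pose estimation can only rearrange the observed points this way; it cannot synthesize geometry for a missing part, which is the gap the joint assembly-and-generation formulation targets.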
Related papers
- RnG: A Unified Transformer for Complete 3D Modeling from Partial Observations [70.83499963694238]
RnG (Reconstruction and Generation) is a novel feed-forward Transformer that unifies reconstruction and generation. It reconstructs visible geometry and generates plausible, coherent unseen geometry and appearance. Our method achieves state-of-the-art performance in both generalizable 3D reconstruction and novel view generation.
arXiv Detail & Related papers (2026-03-01T17:25:32Z) - Joint Geometry-Appearance Human Reconstruction in a Unified Latent Space via Bridge Diffusion [57.09673862519791]
This paper introduces JGA-LBD, a novel framework that unifies the modeling of geometry and appearance into a joint latent representation. Experiments demonstrate that JGA-LBD outperforms current state-of-the-art approaches in terms of both geometry fidelity and appearance quality.
arXiv Detail & Related papers (2026-01-01T12:48:56Z) - Assembler: Scalable 3D Part Assembly via Anchor Point Diffusion [39.08891847512135]
We present Assembler, a scalable and generalizable framework for 3D part assembly. It handles diverse, in-the-wild objects with varying part counts, geometries, and structures. It achieves state-of-the-art performance on PartNet and is the first to demonstrate high-quality assembly for complex, real-world objects.
arXiv Detail & Related papers (2025-06-20T15:25:20Z) - PRISM: Probabilistic Representation for Integrated Shape Modeling and Generation [79.46526296655776]
PRISM is a novel approach for 3D shape generation that integrates categorical diffusion models with Statistical Shape Models (SSM) and Gaussian Mixture Models (GMM). Our method employs compositional SSMs to capture part-level geometric variations and uses GMM to represent part semantics in a continuous space. Our approach significantly outperforms previous methods in both quality and controllability of part-level operations.
arXiv Detail & Related papers (2025-04-06T11:48:08Z) - Detection Based Part-level Articulated Object Reconstruction from Single RGBD Image [52.11275397911693]
We propose an end-to-end trainable, cross-category method for reconstructing multiple man-made articulated objects from a single RGBD image. We depart from previous works that rely on learning an instance-level latent space and focus on man-made articulated objects with predefined part counts. Our method successfully reconstructs multiple instances with varied structures that previous works cannot handle, and outperforms prior works in shape reconstruction and kinematics estimation.
arXiv Detail & Related papers (2025-04-04T05:08:04Z) - Jigsaw++: Imagining Complete Shape Priors for Object Reassembly [45.724655922415934]
Jigsaw++ is a novel generative method designed to tackle the multifaceted challenges of reconstructing complete shapes for the reassembly problem. It employs a proposed "retargeting" strategy that effectively leverages the output of any existing assembly method to generate complete shape reconstructions.
arXiv Detail & Related papers (2024-10-15T17:45:37Z) - NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation [52.772319840580074]
3D shape generation aims to produce innovative 3D content adhering to specific conditions and constraints.
Existing methods often decompose 3D shapes into a sequence of localized components, treating each element in isolation.
We introduce a novel spatial-aware 3D shape generation framework that leverages 2D plane representations for enhanced 3D shape modeling.
arXiv Detail & Related papers (2024-03-27T04:09:34Z) - Leveraging SE(3) Equivariance for Learning 3D Geometric Shape Assembly [7.4109730384078025]
Shape assembly aims to reassemble parts (or fragments) into a complete object.
Shape pose disentanglement of part representations is beneficial to geometric shape assembly.
We propose to leverage SE(3) equivariance for such shape pose disentanglement.
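The shape-pose disentanglement idea above can be illustrated with the simplest possible SE(3)-equivariant feature: centroid-subtracted coordinates, which are invariant to translation and equivariant to rotation, while the centroid itself carries the translation component of the pose. This is a hand-rolled toy check, not the cited paper's learned network; `centered` and `random_rotation` are illustrative helper names.

```python
import numpy as np

def centered(points: np.ndarray) -> np.ndarray:
    """Translation-invariant, rotation-equivariant feature: centroid-subtracted coordinates."""
    return points - points.mean(axis=0)

def random_rotation(rng: np.random.Generator) -> np.ndarray:
    """Sample a random 3x3 rotation via QR decomposition, fixing the determinant to +1."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 3))   # a toy "part" point cloud
R = random_rotation(rng)
t = rng.normal(size=3)

# Equivariance check: transforming the input rotates the feature
# and drops the translation entirely.
lhs = centered(x @ R.T + t)
rhs = centered(x) @ R.T
assert np.allclose(lhs, rhs)
```

Because the feature transforms predictably under any rigid motion, a downstream assembler can reason about a part's intrinsic shape separately from its pose, which is the benefit the entry above describes.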
arXiv Detail & Related papers (2023-09-13T09:00:45Z) - Discovering 3D Parts from Image Collections [98.16987919686709]
We tackle the problem of 3D part discovery from only 2D image collections.
Instead of relying on manually annotated parts for supervision, we propose a self-supervised approach.
Our key insight is to learn a novel part shape prior that allows each part to fit an object shape faithfully while constrained to have simple geometry.
arXiv Detail & Related papers (2021-07-28T20:29:16Z) - Generative 3D Part Assembly via Dynamic Graph Learning [34.108515032411695]
Part assembly is a challenging yet crucial task in 3D computer vision and robotics.
We propose an assembly-oriented dynamic graph learning framework that leverages an iterative graph neural network as a backbone.
arXiv Detail & Related papers (2020-06-14T04:26:42Z) - Combinatorial 3D Shape Generation via Sequential Assembly [40.2815083025929]
Sequential assembly with geometric primitives has drawn attention in robotics and 3D vision since it yields a practical blueprint to construct a target shape.
We propose a 3D shape generation framework that alleviates the combinatorial explosion induced by the huge number of feasible primitive combinations.
Experimental results demonstrate that our method successfully generates 3D shapes and simulates more realistic generation processes.
arXiv Detail & Related papers (2020-04-16T01:23:14Z) - Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image [102.44347847154867]
We propose a novel formulation that allows us to jointly recover the geometry of a 3D object as a set of primitives.
Our model recovers the higher level structural decomposition of various objects in the form of a binary tree of primitives.
Our experiments on the ShapeNet and D-FAUST datasets demonstrate that considering the organization of parts indeed facilitates reasoning about 3D geometry.
arXiv Detail & Related papers (2020-04-02T17:58:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.