Generative 3D Part Assembly via Dynamic Graph Learning
- URL: http://arxiv.org/abs/2006.07793v3
- Date: Wed, 23 Dec 2020 05:50:35 GMT
- Title: Generative 3D Part Assembly via Dynamic Graph Learning
- Authors: Jialei Huang, Guanqi Zhan, Qingnan Fan, Kaichun Mo, Lin Shao, Baoquan
Chen, Leonidas Guibas, Hao Dong
- Abstract summary: Part assembly is a challenging yet crucial task in 3D computer vision and robotics.
We propose an assembly-oriented dynamic graph learning framework that leverages an iterative graph neural network as a backbone.
- Score: 34.108515032411695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous part assembly is a challenging yet crucial task in 3D computer
vision and robotics. Analogous to buying a piece of IKEA furniture, given a set
of 3D parts that can be assembled into a single shape, an intelligent agent needs to perceive
the 3D part geometry, reason to propose pose estimations for the input parts,
and finally call robotic planning and control routines for actuation. In this
paper, we focus on the pose estimation subproblem from the vision side
involving geometric and relational reasoning over the input part geometry.
Essentially, the task of generative 3D part assembly is to predict a 6-DoF part
pose, including a rigid rotation and translation, for each input part that
assembles a single 3D shape as the final output. To tackle this problem, we
propose an assembly-oriented dynamic graph learning framework that leverages an
iterative graph neural network as a backbone. It explicitly conducts sequential
part assembly refinements in a coarse-to-fine manner, and exploits a pair of
modules, a part relation reasoning module and a part aggregation module, to
dynamically adjust both part features and their relations in the part graph. We conduct extensive
experiments and quantitative comparisons to three strong baseline methods,
demonstrating the effectiveness of the proposed approach.
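As a concrete illustration of the pose prediction and dynamic relation reasoning described in the abstract, below is a minimal PyTorch-style sketch of one refinement iteration over the part graph. The module structure, feature dimension, and pose parameterization (a 3-D translation plus a unit quaternion) are illustrative assumptions for exposition, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class DynamicGraphIteration(nn.Module):
    """One refinement step over the part graph: relation reasoning -> aggregation -> pose regression."""

    def __init__(self, feat_dim=256):
        super().__init__()
        # Relation reasoning: score each ordered pair of parts from their features.
        self.relation = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(), nn.Linear(feat_dim, 1)
        )
        # Message function applied to the sender part's features.
        self.message = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU())
        # Node update: fuse a part's own features with its aggregated message.
        self.update = nn.Sequential(nn.Linear(2 * feat_dim, feat_dim), nn.ReLU())
        # Pose head: 3-D translation plus a 4-D (quaternion) rotation per part.
        self.pose_head = nn.Linear(feat_dim, 7)

    def forward(self, part_feats):
        # part_feats: (N, feat_dim) features for the N input parts.
        n = part_feats.size(0)
        send = part_feats.unsqueeze(1).expand(n, n, -1)  # sender features
        recv = part_feats.unsqueeze(0).expand(n, n, -1)  # receiver features
        # Dynamically re-estimated relation weights between every pair of parts.
        rel = torch.softmax(
            self.relation(torch.cat([recv, send], dim=-1)).squeeze(-1), dim=-1
        )
        # Aggregate messages from all parts, weighted by the current relations.
        agg = rel @ self.message(part_feats)             # (N, feat_dim)
        new_feats = self.update(torch.cat([part_feats, agg], dim=-1))
        pose = self.pose_head(new_feats)                 # (N, 7)
        trans = pose[:, :3]
        quat = nn.functional.normalize(pose[:, 3:], dim=-1)
        return new_feats, trans, quat
```

Stacking several such iterations, and re-encoding the parts after applying the predicted poses between steps, mirrors the coarse-to-fine sequential refinement the framework describes.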
Related papers
- Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models [20.277479473218513]
We introduce a new task: Zero-Shot 3D Reasoning for part searching and localization within objects.
We design a simple baseline method, Reasoning3D, with the capability to understand and execute complex commands.
We show that Reasoning3D can effectively localize and highlight parts of 3D objects based on implicit textual queries.
arXiv Detail & Related papers (2024-05-29T17:56:07Z)
- Neural Assembler: Learning to Generate Fine-Grained Robotic Assembly Instructions from Multi-View Images [24.10809783713574]
This paper introduces a novel task: translating multi-view images of a structural 3D model into a detailed sequence of assembly instructions.
We propose an end-to-end model known as the Neural Assembler.
arXiv Detail & Related papers (2024-04-25T08:53:23Z)
- SUGAR: Pre-training 3D Visual Representations for Robotics [85.55534363501131]
We introduce a novel 3D pre-training framework for robotics named SUGAR.
SUGAR captures semantic, geometric and affordance properties of objects through 3D point clouds.
We show that SUGAR's 3D representation outperforms state-of-the-art 2D and 3D representations.
arXiv Detail & Related papers (2024-04-01T21:23:03Z)
- Leveraging SE(3) Equivariance for Learning 3D Geometric Shape Assembly [7.4109730384078025]
Shape assembly aims to reassemble parts (or fragments) into a complete object.
Disentangling shape and pose in part representations is beneficial for geometric shape assembly.
We propose to leverage SE(3) equivariance for such shape pose disentanglement.
arXiv Detail & Related papers (2023-09-13T09:00:45Z)
- Attention-based Part Assembly for 3D Volumetric Shape Modeling [0.0]
We propose a VoxAttention network architecture for attention-based part assembly.
Experimental results show that our method outperforms most state-of-the-art methods for the part relation-aware 3D shape modeling task.
arXiv Detail & Related papers (2023-04-17T16:53:27Z)
- Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images [79.70127290464514]
We decompose the task into two stages, i.e., person localization and pose estimation.
We propose three task-specific graph neural networks for effective message passing.
Our approach achieves state-of-the-art performance on CMU Panoptic and Shelf datasets.
arXiv Detail & Related papers (2021-09-13T11:44:07Z)
- Discovering 3D Parts from Image Collections [98.16987919686709]
We tackle the problem of 3D part discovery from only 2D image collections.
Instead of relying on manually annotated parts for supervision, we propose a self-supervised approach.
Our key insight is to learn a novel part shape prior that allows each part to fit an object shape faithfully while constrained to have simple geometry.
arXiv Detail & Related papers (2021-07-28T20:29:16Z)
- Adjoint Rigid Transform Network: Task-conditioned Alignment of 3D Shapes [86.2129580231191]
Adjoint Rigid Transform (ART) Network is a neural module which can be integrated with a variety of 3D networks.
ART learns to rotate input shapes to a learned canonical orientation, which is crucial for many tasks.
We will release our code and pre-trained models for further research.
arXiv Detail & Related papers (2021-02-01T20:58:45Z)
- Interactive Annotation of 3D Object Geometry using 2D Scribbles [84.51514043814066]
In this paper, we propose an interactive framework for annotating 3D object geometry from point cloud data and RGB imagery.
Our framework targets naive users without artistic or graphics expertise.
arXiv Detail & Related papers (2020-08-24T21:51:29Z)
- Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image [102.44347847154867]
We propose a novel formulation that allows us to jointly recover the geometry of a 3D object as a set of primitives.
Our model recovers the higher level structural decomposition of various objects in the form of a binary tree of primitives.
Our experiments on the ShapeNet and D-FAUST datasets demonstrate that considering the organization of parts indeed facilitates reasoning about 3D geometry.
arXiv Detail & Related papers (2020-04-02T17:58:05Z)
- Learning 3D Part Assembly from a Single Image [20.175502864488493]
We introduce a novel problem, single-image-guided 3D part assembly, along with a learning-based solution.
We study this problem in the setting of furniture assembly from a given complete set of parts and a single image depicting the entire assembled object.
arXiv Detail & Related papers (2020-03-21T21:19:28Z)