Related papers: CAGE: Controllable Articulation GEneration

CAGE: Controllable Articulation GEneration

URL: http://arxiv.org/abs/2312.09570v2
Date: Wed, 20 Mar 2024 04:05:37 GMT
Title: CAGE: Controllable Articulation GEneration
Authors: Jiayi Liu, Hou In Ivan Tam, Ali Mahdavi-Amiri, Manolis Savva,
Abstract summary: We leverage the interplay between part shape, connectivity, and motion using a denoising diffusion-based method. Our method takes an object category label and a part connectivity graph as input and generates an object's geometry and motion parameters. Our experiments show that our method outperforms the state-of-the-art in articulated object generation.
Score: 14.002289666443529
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We address the challenge of generating 3D articulated objects in a controllable fashion. Currently, modeling articulated 3D objects is either achieved through laborious manual authoring, or using methods from prior work that are hard to scale and control directly. We leverage the interplay between part shape, connectivity, and motion using a denoising diffusion-based method with attention modules designed to extract correlations between part attributes. Our method takes an object category label and a part connectivity graph as input and generates an object's geometry and motion parameters. The generated objects conform to user-specified constraints on the object category, part shape, and part articulation. Our experiments show that our method outperforms the state-of-the-art in articulated object generation, producing more realistic objects while conforming better to user constraints. Video Summary at: http://youtu.be/cH_rbKbyTpE

Related papers

Articulated 3D Scene Graphs for Open-World Mobile Manipulation [55.97942733699124]
We present MoMa-SG, a framework for building semantic-kinematic 3D scene graphs of articulated scenes.<n>We estimate articulation models using a novel unified twist estimation formulation.<n>We also introduce the novel Arti4D-Semantic dataset.
arXiv Detail & Related papers (2026-02-18T10:40:35Z)
IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments [56.85804719947]
We present IAAO, a framework that builds an explicit 3D model for intelligent agents to gain understanding of articulated objects in their environment through interaction. We first build hierarchical features and label fields for each object state using 3D Gaussian Splatting (3DGS) by distilling mask features and view-consistent labels from multi-view images. We then perform object- and part-level queries on the 3D Gaussian primitives to identify static and articulated elements, estimating global transformations and local articulation parameters along with affordances.
arXiv Detail & Related papers (2025-04-09T12:36:48Z)
C-Drag: Chain-of-Thought Driven Motion Controller for Video Generation [81.4106601222722]
Trajectory-based motion control has emerged as an intuitive and efficient approach for controllable video generation. We propose a Chain-of-Thought-based motion controller for controllable video generation, named C-Drag. Our method includes an object perception module and a Chain-of-Thought-based motion reasoning module.
arXiv Detail & Related papers (2025-02-27T08:21:03Z)
Articulate That Object Part (ATOP): 3D Part Articulation via Text and Motion Personalization [9.231848716070257]
ATOP (Articulate That Object Part) is a novel few-shot method based on motion personalization to articulate a static 3D object. We show that our method is capable of generating realistic motion videos and predicting 3D motion parameters in a more accurate and generalizable way.
arXiv Detail & Related papers (2025-02-11T05:47:16Z)
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects [20.978091381109294]
We propose a method to generate articulated objects from a single image. Our method generates an articulated object that is visually consistent with the input image. Our experiments show that our method outperforms the state-of-the-art in articulated object creation.
arXiv Detail & Related papers (2024-10-21T20:41:32Z)
Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking [59.87033229815062]
Articulated object manipulation requires precise object interaction, where the object's axis must be carefully considered. Previous research employed interactive perception for manipulating articulated objects, but typically, open-loop approaches often suffer from overlooking the interaction dynamics. We present a closed-loop pipeline integrating interactive perception with online axis estimation from segmented 3D point clouds.
arXiv Detail & Related papers (2024-09-24T17:59:56Z)
Controllable Human-Object Interaction Synthesis [77.56877961681462]
We propose Controllable Human-Object Interaction Synthesis (CHOIS) to generate synchronized object motion and human motion in 3D scenes. Here, language descriptions inform style and intent, and waypoints, which can be effectively extracted from high-level planning, ground the motion in the scene. Our module seamlessly integrates with a path planning module, enabling the generation of long-term interactions in 3D environments.
arXiv Detail & Related papers (2023-12-06T21:14:20Z)
GAMMA: Generalizable Articulation Modeling and Manipulation for Articulated Objects [53.965581080954905]
We propose a novel framework of Generalizable Articulation Modeling and Manipulating for Articulated Objects (GAMMA) GAMMA learns both articulation modeling and grasp pose affordance from diverse articulated objects with different categories. Results show that GAMMA significantly outperforms SOTA articulation modeling and manipulation algorithms in unseen and cross-category articulated objects.
arXiv Detail & Related papers (2023-09-28T08:57:14Z)
ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object. We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object. We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
arXiv Detail & Related papers (2023-08-24T17:59:51Z)
Learn the Force We Can: Enabling Sparse Motion Control in Multi-Object Video Generation [26.292052071093945]
We propose an unsupervised method to generate videos from a single frame and a sparse motion input. Our trained model can generate unseen realistic object-to-object interactions. We show that YODA is on par with or better than state of the art video generation prior work in terms of both controllability and video quality.
arXiv Detail & Related papers (2023-06-06T19:50:02Z)
Unsupervised Kinematic Motion Detection for Part-segmented 3D Shape Collections [14.899075941080541]
We present an unsupervised approach for discovering articulated motions in a part-segmented 3D shape collection. Our approach is based on a concept we call category closure: any valid articulation of an object's parts should keep the object in the same semantic category. We evaluate our approach by using it to re-discover part motions from the PartNet-Mobility dataset.
arXiv Detail & Related papers (2022-06-17T00:50:36Z)
DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole. Our method first partitions the motion field by minimizing the mutual information between segments. It uses the segments to learn object models that can be used for detection in a static image.
arXiv Detail & Related papers (2020-08-16T22:05:13Z)
RELATE: Physically Plausible Multi-Object Scene Synthesis Using Structured Latent Spaces [77.07767833443256]
We present RELATE, a model that learns to generate physically plausible scenes and videos of multiple interacting objects. In contrast to state-of-the-art methods in object-centric generative modeling, RELATE also extends naturally to dynamic scenes and generates videos of high visual fidelity.
arXiv Detail & Related papers (2020-07-02T17:27:27Z)
A Deep Learning Approach to Object Affordance Segmentation [31.221897360610114]
We design an autoencoder that infers pixel-wise affordance labels in both videos and static images. Our model surpasses the need for object labels and bounding boxes by using a soft-attention mechanism. We show that our model achieves competitive results compared to strongly supervised methods on SOR3D-AFF.
arXiv Detail & Related papers (2020-04-18T15:34:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.