AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning
- URL: http://arxiv.org/abs/2502.11124v1
- Date: Sun, 16 Feb 2025 13:45:10 GMT
- Title: AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning
- Authors: Yuanfei Wang, Xiaojie Zhang, Ruihai Wu, Yu Li, Yan Shen, Mingdong Wu, Zhaofeng He, Yizhou Wang, Hao Dong
- Abstract summary: Articulated object manipulation is a critical capability for robots to perform various tasks in real-world scenarios.
Previous datasets and simulation environments for articulated objects have primarily focused on simple manipulation mechanisms.
We build a novel articulated object manipulation environment and equip it with 9 categories of objects.
Based on the environment and objects, we propose an adaptive demonstration collection and 3D visual diffusion-based imitation learning pipeline.
- Score: 25.331956706253614
- Abstract: Articulated object manipulation is a critical capability for robots to perform various tasks in real-world scenarios. Composed of multiple parts connected by joints, articulated objects are endowed with diverse functional mechanisms through complex relative motions. For example, a safe consists of a door, a handle, and a lock, where the door can only be opened when the latch is unlocked. The internal structure, such as the state of a lock or joint angle constraints, cannot be directly observed from visual observation. Consequently, successful manipulation of these objects requires adaptive adjustment based on trial and error rather than a one-time visual inference. However, previous datasets and simulation environments for articulated objects have primarily focused on simple manipulation mechanisms where the complete manipulation process can be inferred from the object's appearance. To enhance the diversity and complexity of adaptive manipulation mechanisms, we build a novel articulated object manipulation environment and equip it with 9 categories of objects. Based on the environment and objects, we further propose an adaptive demonstration collection and 3D visual diffusion-based imitation learning pipeline that learns the adaptive manipulation policy. The effectiveness of our designs and proposed method is validated through both simulation and real-world experiments. Our project page is available at: https://adamanip.github.io
Related papers
- ManipGPT: Is Affordance Segmentation by Large Vision Models Enough for Articulated Object Manipulation? [17.356760351203715]
This paper introduces ManipGPT, a framework designed to predict optimal interaction areas for articulated objects.
We created a dataset of 9.9k simulated and real images to bridge the sim-to-real gap.
We significantly improved part-level affordance segmentation, adapting the model's in-context segmentation capabilities to robot manipulation scenarios.
arXiv Detail & Related papers (2024-09-24T17:59:56Z)
- Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking [59.87033229815062]
Articulated object manipulation requires precise object interaction, where the object's axis must be carefully considered.
Previous research has used interactive perception to manipulate articulated objects, but these approaches are typically open-loop and often overlook the interaction dynamics.
We present a closed-loop pipeline integrating interactive perception with online axis estimation from segmented 3D point clouds.
arXiv Detail & Related papers (2024-09-24T17:59:56Z)
- RPMArt: Towards Robust Perception and Manipulation for Articulated Objects [56.73978941406907]
We propose a framework towards Robust Perception and Manipulation for Articulated Objects (RPMArt).
RPMArt learns to estimate the articulation parameters and manipulate the articulation part from the noisy point cloud.
We introduce an articulation-aware classification scheme to enhance its ability for sim-to-real transfer.
arXiv Detail & Related papers (2024-03-24T05:55:39Z)
- Learning Extrinsic Dexterity with Parameterized Manipulation Primitives [8.7221770019454]
We learn a sequence of actions that utilize the environment to change the object's pose.
Our approach can control the object's state through exploiting interactions between the object, the gripper, and the environment.
We evaluate our approach on picking box-shaped objects of various weight, shape, and friction properties from a constrained table-top workspace.
arXiv Detail & Related papers (2023-10-26T21:28:23Z)
- GAMMA: Generalizable Articulation Modeling and Manipulation for Articulated Objects [53.965581080954905]
We propose a novel framework of Generalizable Articulation Modeling and Manipulating for Articulated Objects (GAMMA).
GAMMA learns both articulation modeling and grasp pose affordance from diverse articulated objects with different categories.
Results show that GAMMA significantly outperforms SOTA articulation modeling and manipulation algorithms in unseen and cross-category articulated objects.
arXiv Detail & Related papers (2023-09-28T08:57:14Z)
- One-shot Imitation Learning via Interaction Warping [32.5466340846254]
We propose a new method, Interaction Warping, for learning SE(3) robotic manipulation policies from a single demonstration.
We infer the 3D mesh of each object in the environment using shape warping, a technique for aligning point clouds across object instances.
We show successful one-shot imitation learning on three simulated and real-world object re-arrangement tasks.
arXiv Detail & Related papers (2023-06-21T17:26:11Z)
- Learning to Transfer In-Hand Manipulations Using a Greedy Shape Curriculum [79.6027464700869]
We show that natural and robust in-hand manipulation of simple objects in a dynamic simulation can be learned from a high quality motion capture example.
We propose a simple greedy curriculum search algorithm that can be successfully applied to a range of objects such as a teapot, bunny, bottle, train, and elephant.
arXiv Detail & Related papers (2023-03-14T17:08:19Z)
- H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions [62.510951695174604]
"Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR) is a probabilistic generative framework that generates hypotheses about how objects articulate given input observations.
We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework.
We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.
arXiv Detail & Related papers (2022-10-22T18:39:33Z)
- A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation [63.1610540170754]
We focus on the problem of visual non-prehensile planar manipulation.
We propose a novel architecture that combines video decoding neural models with priors from contact mechanics.
We find that our modular and fully differentiable architecture performs better than learning-only methods on unseen objects and motions.
arXiv Detail & Related papers (2021-11-09T18:39:45Z)
- Understanding Object Dynamics for Interactive Image-to-Video Synthesis [8.17925295907622]
We present an approach that learns natural-looking global articulations caused by a local manipulation at the pixel level.
Our generative model learns to infer natural object dynamics as a response to user interaction.
In contrast to existing work on video prediction, we do not synthesize arbitrary realistic videos.
arXiv Detail & Related papers (2021-06-21T17:57:39Z)