Graph-Transporter: A Graph-based Learning Method for Goal-Conditioned
Deformable Object Rearranging Task
- URL: http://arxiv.org/abs/2302.10445v1
- Date: Tue, 21 Feb 2023 05:21:04 GMT
- Title: Graph-Transporter: A Graph-based Learning Method for Goal-Conditioned
Deformable Object Rearranging Task
- Authors: Yuhong Deng, Chongkun Xia, Xueqian Wang and Lipeng Chen
- Abstract summary: We present a novel framework, Graph-Transporter, for goal-conditioned deformable object rearranging tasks.
Our framework adopts an architecture based on Fully Convolutional Network (FCN) to output pixel-wise pick-and-place actions from only visual input.
- Score: 1.807492010338763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rearranging deformable objects is a long-standing challenge in robotic
manipulation because of the high dimensionality of the configuration space and the
complex dynamics of deformable objects. We present a novel framework,
Graph-Transporter, for goal-conditioned deformable object rearranging tasks. To
tackle the challenge of complex configuration space and dynamics, we represent
the configuration space of a deformable object with a graph structure and the
graph features are encoded by a graph convolution network. Our framework adopts
an architecture based on Fully Convolutional Network (FCN) to output pixel-wise
pick-and-place actions from only visual input. Extensive experiments have been
conducted to validate the effectiveness of the graph representation of
deformable object configuration. The experimental results also demonstrate that
our framework is effective and general in handling goal-conditioned deformable
object rearranging tasks.
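The two ideas in the abstract (graph features encoded by a graph convolution, and pixel-wise actions read off a dense output map) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the layer form, so the symmetric-normalized graph convolution below is an assumption, and the keypoints, edges, weights, and heatmaps are all hypothetical stand-ins.

```python
import math
import random


def normalized_adjacency(edges, num_nodes):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph (assumed normalization)."""
    adj = [[1.0 if i == j else 0.0 for j in range(num_nodes)] for i in range(num_nodes)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1.0
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in adj]
    return [
        [adj[i][j] * inv_sqrt_deg[i] * inv_sqrt_deg[j] for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(a_hat, features, weights):
    """One graph convolution: ReLU(A_hat @ X @ W)."""
    hidden = matmul(matmul(a_hat, features), weights)
    return [[max(v, 0.0) for v in row] for row in hidden]


def argmax_pixel(heatmap):
    """Return (row, col) of the highest-scoring pixel in a dense score map."""
    height, width = len(heatmap), len(heatmap[0])
    flat_index = max(range(height * width), key=lambda k: heatmap[k // width][k % width])
    return divmod(flat_index, width)


# Hypothetical rope described by 4 keypoints (x, y) connected in a chain.
keypoints = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.5], [3.0, 1.0]]
edges = [(0, 1), (1, 2), (2, 3)]
a_hat = normalized_adjacency(edges, len(keypoints))

# Random weights stand in for learned parameters (2 input dims, 3 output dims).
rng = random.Random(0)
weights = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(2)]
node_embeddings = gcn_layer(a_hat, keypoints, weights)

# Stand-in 8x8 score maps; a trained FCN would produce these from images.
pick_heatmap = [[0.0] * 8 for _ in range(8)]
place_heatmap = [[0.0] * 8 for _ in range(8)]
pick_heatmap[2][5] = 1.0
place_heatmap[6][1] = 1.0

pick_pixel = argmax_pixel(pick_heatmap)
place_pixel = argmax_pixel(place_heatmap)
print("pick at", pick_pixel, "place at", place_pixel)
```

Each keypoint's embedding mixes its own coordinates with those of its neighbors, which is how a graph layer can carry the object's connectivity into the features. How those features are combined with the image branch to produce the pick and place maps is not described in the abstract and is left out here.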
Related papers
- SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects [20.978091381109294]
We propose a method to generate articulated objects from a single image.
Our method generates an articulated object that is visually consistent with the input image.
Our experiments show that our method outperforms the state-of-the-art in articulated object creation.
arXiv Detail & Related papers (2024-10-21T20:41:32Z)
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- Graphical Object-Centric Actor-Critic [55.2480439325792]
We propose a novel object-centric reinforcement learning algorithm combining actor-critic and model-based approaches.
We use a transformer encoder to extract object representations and graph neural networks to approximate the dynamics of an environment.
Our algorithm performs better in a visually complex 3D robotic environment and a 2D environment with compositional structure than the state-of-the-art model-free actor-critic algorithm.
arXiv Detail & Related papers (2023-10-26T06:05:12Z)
- Learning visual-based deformable object rearrangement with local graph
neural networks [4.333220038316982]
We propose a novel representation strategy that can efficiently model the deformable object states with a set of keypoints and their interactions.
We also propose a lightweight local GNN that learns to jointly model the deformable rearrangement dynamics and infer the optimal manipulation actions.
Our method reaches much higher success rates on a variety of deformable rearrangement tasks (96.3% on average) than the state-of-the-art method in simulation experiments.
arXiv Detail & Related papers (2023-10-16T11:42:54Z)
- Deep Reinforcement Learning Based on Local GNN for Goal-conditioned
Deformable Object Rearranging [1.807492010338763]
Object rearranging is one of the most common deformable manipulation tasks, where the robot needs to rearrange a deformable object into a goal configuration.
Previous studies focus on designing an expert system for each specific task by model-based or data-driven approaches.
We design a local GNN (Graph Neural Network) based learning method, which utilizes two representation graphs to encode keypoints detected from images.
Our framework is effective in multiple 1-D (rope, rope ring) and 2-D (cloth) rearranging tasks in simulation and can be easily transferred to a real robot by fine-tuning a keypoint detector.
arXiv Detail & Related papers (2023-02-21T05:21:26Z)
- Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that its unsupervised object discovery performance is competitive with a SlotAttention model on two datasets, and that it manages to disentangle objects in a third dataset where SlotAttention fails, all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z)
- Learning to Rearrange Deformable Cables, Fabrics, and Bags with
Goal-Conditioned Transporter Networks [36.90218756798642]
Rearranging and manipulating deformable objects such as cables, fabrics, and bags is a long-standing challenge in robotic manipulation.
We develop a suite of simulated benchmarks with 1D, 2D, and 3D deformable structures.
We propose embedding goal-conditioning into Transporter Networks, a recently proposed model architecture for learning robotic manipulation.
arXiv Detail & Related papers (2020-12-06T22:21:54Z)
- Category Level Object Pose Estimation via Neural Analysis-by-Synthesis [64.14028598360741]
In this paper we combine a gradient-based fitting procedure with a parametric neural image synthesis module.
The image synthesis network is designed to efficiently span the pose configuration space.
We experimentally show that the method can recover orientation of objects with high accuracy from 2D images alone.
arXiv Detail & Related papers (2020-08-18T20:30:47Z)
- Latent Space Roadmap for Visual Action Planning of Deformable and Rigid
Object Manipulation [74.88956115580388]
Planning is performed in a low-dimensional latent state space that embeds images.
Our framework consists of two main components: a Visual Foresight Module (VFM) that generates a visual plan as a sequence of images, and an Action Proposal Network (APN) that predicts the actions between them.
arXiv Detail & Related papers (2020-03-19T18:43:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.