Learning to Rearrange Deformable Cables, Fabrics, and Bags with
Goal-Conditioned Transporter Networks
- URL: http://arxiv.org/abs/2012.03385v4
- Date: Sun, 18 Jun 2023 18:54:13 GMT
- Title: Learning to Rearrange Deformable Cables, Fabrics, and Bags with
Goal-Conditioned Transporter Networks
- Authors: Daniel Seita, Pete Florence, Jonathan Tompson, Erwin Coumans, Vikas
Sindhwani, Ken Goldberg, Andy Zeng
- Abstract summary: Rearranging and manipulating deformable objects such as cables, fabrics, and bags is a long-standing challenge in robotic manipulation.
We develop a suite of simulated benchmarks with 1D, 2D, and 3D deformable structures.
We propose embedding goal-conditioning into Transporter Networks, a recently proposed model architecture for learning robotic manipulation.
- Score: 36.90218756798642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rearranging and manipulating deformable objects such as cables, fabrics, and
bags is a long-standing challenge in robotic manipulation. The complex dynamics
and high-dimensional configuration spaces of deformables, compared to rigid
objects, make manipulation difficult not only for multi-step planning, but even
for goal specification. Goals cannot be as easily specified as rigid object
poses, and may involve complex relative spatial relations such as "place the
item inside the bag". In this work, we develop a suite of simulated benchmarks
with 1D, 2D, and 3D deformable structures, including tasks that involve
image-based goal-conditioning and multi-step deformable manipulation. We
propose embedding goal-conditioning into Transporter Networks, a recently
proposed model architecture for learning robotic manipulation that rearranges
deep features to infer displacements that can represent pick and place actions.
In simulation and in physical experiments, we demonstrate that goal-conditioned
Transporter Networks enable agents to manipulate deformable structures into
flexibly specified configurations without test-time visual anchors for target
locations. We also significantly extend prior results using Transporter
Networks for manipulating deformable objects by testing on tasks with 2D and 3D
deformables. Supplementary material is available at
https://berkeleyautomation.github.io/bags/.
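
For readers who want a concrete picture of the transport operation described in the abstract, below is a minimal NumPy sketch of goal-conditioned pick-and-place inference. It is an illustration only, not the authors' trained model: the feature extractor is a random projection, the crop size and the channel-wise stacking of the observation with the goal image are assumptions, the pick crop is collapsed into a single query vector rather than cross-correlated as a convolutional kernel over rotations, and all function and parameter names (featurize, goal_conditioned_pick_place, CROP, D) are invented for illustration.

```python
# Minimal NumPy sketch of goal-conditioned pick-and-place inference in the
# spirit of goal-conditioned Transporter Networks. The feature extractors
# are stand-ins (fixed random projections), not the paper's trained FCNs.
import numpy as np

H, W, C, D = 64, 64, 3, 8   # image size, channels, feature depth (assumed)
CROP = 9                    # crop size around the pick pixel (assumed)

def featurize(img, rng):
    """Placeholder per-pixel feature map: a fixed random 1x1 'convolution'."""
    Wf = rng.standard_normal((img.shape[-1], D)) * 0.1
    return img @ Wf                                         # (H, W, D)

def goal_conditioned_pick_place(obs, goal, rng=np.random.default_rng(0)):
    # Goal conditioning by channel-wise stacking of observation and goal image.
    stacked = np.concatenate([obs, goal], axis=-1)           # (H, W, 2C)

    # Attention stream: dense per-pixel pick scores -> argmax pick pixel.
    pick_scores = featurize(stacked, rng).sum(axis=-1)       # (H, W)
    pick = np.unravel_index(pick_scores.argmax(), pick_scores.shape)

    # Transport stream: crop "query" features around the pick, then correlate
    # them against "key" features of the scene+goal to score every candidate
    # place location.
    key = featurize(stacked, rng)                             # (H, W, D)
    r0, c0 = max(pick[0] - CROP // 2, 0), max(pick[1] - CROP // 2, 0)
    query = key[r0:r0 + CROP, c0:c0 + CROP].mean(axis=(0, 1))  # (D,)
    place_scores = key @ query                                 # (H, W)
    place = np.unravel_index(place_scores.argmax(), place_scores.shape)
    return pick, place

obs = np.random.rand(H, W, C)    # current top-down image
goal = np.random.rand(H, W, C)   # goal image specifying the desired configuration
print(goal_conditioned_pick_place(obs, goal))
```

In the full architecture, the attention and transport streams are deep fully convolutional networks trained from demonstrations, and placements are scored by cross-correlating the crop features over the scene features across discretized rotations as well as translations.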
Related papers
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Object-centric voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- DeformerNet: Learning Bimanual Manipulation of 3D Deformable Objects [13.138509669247508]
Analytic models of elastic, 3D deformable objects require numerous parameters to describe the potentially infinite degrees of freedom present in determining the object's shape.
Previous attempts at performing 3D shape control rely on hand-crafted features to represent the object shape and require training of object-specific control models.
We overcome these issues through the use of our novel DeformerNet neural network architecture, which operates on a partial-view point cloud of the manipulated object and a point cloud of the goal shape.
This shape embedding enables the robot to learn a visual servo controller that computes the desired robot end-effector action to deform the object toward the goal shape (a toy sketch of this two-branch shape encoding appears after this list).
arXiv Detail & Related papers (2023-05-08T04:08:06Z)
- Deep Reinforcement Learning Based on Local GNN for Goal-conditioned Deformable Object Rearranging [1.807492010338763]
Object rearranging is one of the most common deformable manipulation tasks, where the robot needs to rearrange a deformable object into a goal configuration.
Previous studies focus on designing an expert system for each specific task by model-based or data-driven approaches.
We design a local GNN (Graph Neural Network) based learning method, which utilizes two representation graphs to encode keypoints detected from images.
Our framework is effective in multiple 1-D (rope, rope ring) and 2-D (cloth) rearranging tasks in simulation and can be easily transferred to a real robot by fine-tuning a keypoint detector.
arXiv Detail & Related papers (2023-02-21T05:21:26Z)
- Graph-Transporter: A Graph-based Learning Method for Goal-Conditioned Deformable Object Rearranging Task [1.807492010338763]
We present a novel framework, Graph-Transporter, for goal-conditioned deformable object rearranging tasks.
Our framework adopts an architecture based on a Fully Convolutional Network (FCN) to output pixel-wise pick-and-place actions from only visual input.
arXiv Detail & Related papers (2023-02-21T05:21:04Z)
- Planning with Spatial-Temporal Abstraction from Point Clouds for Deformable Object Manipulation [64.00292856805865]
We propose PlAnning with Spatial-Temporal Abstraction (PASTA), which incorporates both spatial abstraction and temporal abstraction.
Our framework maps high-dimensional 3D observations into a set of latent vectors and plans over skill sequences on top of the latent set representation.
We show that our method can effectively perform challenging deformable object manipulation tasks in the real world.
arXiv Detail & Related papers (2022-10-27T19:57:04Z)
- Learning Visual Shape Control of Novel 3D Deformable Objects from Partial-View Point Clouds [7.1659268120093635]
Analytic models of elastic, 3D deformable objects require numerous parameters to describe the potentially infinite degrees of freedom present in determining the object's shape.
Previous attempts at performing 3D shape control rely on hand-crafted features to represent the object shape and require training of object-specific control models.
We overcome these issues through the use of our novel DeformerNet neural network architecture, which operates on a partial-view point cloud of the object being manipulated and a point cloud of the goal shape to learn a low-dimensional representation of the object shape.
arXiv Detail & Related papers (2021-10-10T02:34:57Z)
- Object Wake-up: 3-D Object Reconstruction, Animation, and in-situ Rendering from a Single Image [58.69732754597448]
Given a picture of a chair, could we extract the 3-D shape of the chair, animate its plausible articulations and motions, and render it in-situ in its original image space?
We devise an automated approach to extract and manipulate articulated objects in single images.
arXiv Detail & Related papers (2021-08-05T16:20:12Z)
- Where2Act: From Pixels to Actions for Articulated 3D Objects [54.19638599501286]
We extract highly localized actionable information related to elementary actions such as pushing or pulling for articulated objects with movable parts.
We propose a learning-from-interaction framework with an online data sampling strategy that allows us to train the network in simulation.
Our learned models even transfer to real-world data.
arXiv Detail & Related papers (2021-01-07T18:56:38Z)
- A Long Horizon Planning Framework for Manipulating Rigid Pointcloud Objects [25.428781562909606]
We present a framework for solving long-horizon planning problems involving manipulation of rigid objects.
Our method plans in the space of object subgoals and frees the planner from reasoning about robot-object interaction dynamics.
arXiv Detail & Related papers (2020-11-16T18:59:33Z)
- Latent Space Roadmap for Visual Action Planning of Deformable and Rigid Object Manipulation [74.88956115580388]
Planning is performed in a low-dimensional latent state space that embeds images.
Our framework consists of two main components: a Visual Foresight Module (VFM) that generates a visual plan as a sequence of images, and an Action Proposal Network (APN) that predicts the actions between them.
arXiv Detail & Related papers (2020-03-19T18:43:26Z)
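
The DeformerNet and visual shape control entries above share a simple pattern: encode the observed point cloud and the goal-shape point cloud separately, then regress an end-effector motion from the pair of embeddings. Below is a toy NumPy sketch of that pattern under placeholder assumptions (a PointNet-style random-weight encoder and a linear action head); the names encode, shape_servo_action, D_EMB, and D_ACT are invented for illustration, and this is not the published architecture or trained weights.

```python
# Toy sketch: two-branch point-cloud shape embedding followed by a linear
# action head, in the spirit of shape-servoing controllers such as DeformerNet.
import numpy as np

D_EMB, D_ACT = 32, 3                       # embedding size, action dim (assumed)
rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 64)) * 0.1    # shared per-point MLP, layer 1
W2 = rng.standard_normal((64, D_EMB)) * 0.1
W_act = rng.standard_normal((2 * D_EMB, D_ACT)) * 0.1   # action head

def encode(points):
    """Per-point MLP followed by max-pooling: a permutation-invariant
    shape embedding (PointNet-style stand-in with random weights)."""
    h = np.maximum(points @ W1, 0.0)       # ReLU
    h = np.maximum(h @ W2, 0.0)
    return h.max(axis=0)                   # (D_EMB,)

def shape_servo_action(observed_pc, goal_pc):
    """Map (current shape, goal shape) to a desired end-effector displacement."""
    z = np.concatenate([encode(observed_pc), encode(goal_pc)])
    return z @ W_act                       # (D_ACT,), e.g. a 3D translation

observed_pc = rng.random((512, 3))   # partial-view point cloud of the object
goal_pc = rng.random((512, 3))       # point cloud of the goal shape
print(shape_servo_action(observed_pc, goal_pc))
```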