RoboCraft: Learning to See, Simulate, and Shape Elasto-Plastic Objects
with Graph Networks
- URL: http://arxiv.org/abs/2205.02909v1
- Date: Thu, 5 May 2022 20:28:15 GMT
- Title: RoboCraft: Learning to See, Simulate, and Shape Elasto-Plastic Objects
with Graph Networks
- Authors: Haochen Shi, Huazhe Xu, Zhiao Huang, Yunzhu Li, Jiajun Wu
- Abstract summary: We present a model-based planning framework for modeling and manipulating elasto-plastic objects.
Our system, RoboCraft, learns a particle-based dynamics model using graph neural networks (GNNs) to capture the structure of the underlying system.
We show through experiments that with just 10 minutes of real-world robotic interaction data, our robot can learn a dynamics model that can be used to synthesize control signals to deform elasto-plastic objects into various target shapes.
- Score: 32.00371492516123
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modeling and manipulating elasto-plastic objects are essential capabilities
for robots to perform complex industrial and household interaction tasks (e.g.,
stuffing dumplings, rolling sushi, and making pottery). However, due to the
high degree of freedom of elasto-plastic objects, significant challenges exist
in virtually every aspect of the robotic manipulation pipeline, e.g.,
representing the states, modeling the dynamics, and synthesizing the control
signals. We propose to tackle these challenges by employing a particle-based
representation for elasto-plastic objects in a model-based planning framework.
Our system, RoboCraft, only assumes access to raw RGBD visual observations. It
transforms the sensing data into particles and learns a particle-based dynamics
model using graph neural networks (GNNs) to capture the structure of the
underlying system. The learned model can then be coupled with model-predictive
control (MPC) algorithms to plan the robot's behavior. We show through
experiments that with just 10 minutes of real-world robotic interaction data,
our robot can learn a dynamics model that can be used to synthesize control
signals to deform elasto-plastic objects into various target shapes, including
shapes that the robot has never encountered before. We perform systematic
evaluations in both simulation and the real world to demonstrate the robot's
manipulation capabilities and ability to generalize to a more complex action
space, different tool shapes, and a mixture of motion modes. We also conduct
comparisons between RoboCraft and untrained human subjects controlling the
gripper to manipulate deformable objects in both simulation and the real world.
Our learned model-based planning framework is comparable to and sometimes
better than human subjects on the tested tasks.
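The planning loop described in the abstract (roll a particle dynamics model forward, score the result against a target shape, keep the best actions) can be illustrated with a minimal random-shooting sketch. It is not the authors' implementation: `toy_dynamics` is a hypothetical hand-written stand-in for the learned GNN, the 6-number action layout and the Chamfer-distance cost are assumptions made for illustration, and all function names are invented here.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # pairwise (N, M)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def toy_dynamics(particles, action):
    """Hypothetical stand-in for a learned particle dynamics model.

    action = (x, y, z, dx, dy, dz): assumed tool position and displacement.
    Particles close to the tool are dragged along with it.
    """
    tool_pos, push = action[:3], action[3:]
    dist = np.linalg.norm(particles - tool_pos, axis=1, keepdims=True)
    weight = np.exp(-(dist / 0.1) ** 2)  # influence decays with distance
    return particles + weight * push

def plan_random_shooting(particles, target, dynamics,
                         horizon=3, n_samples=256, rng=None):
    """Sample action sequences, roll each out, and keep the cheapest one."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Baseline candidate: do nothing, so the plan is never worse than idling.
    best_seq = np.zeros((horizon, 6))
    best_cost = chamfer_distance(particles, target)
    for _ in range(n_samples):
        seq = rng.uniform(-0.2, 0.2, size=(horizon, 6))
        state = particles
        for action in seq:
            state = dynamics(state, action)  # predicted next particle state
        cost = chamfer_distance(state, target)
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq, best_cost
```

In a receding-horizon (MPC) setting, only the first action of `best_seq` would be executed before re-observing the object and planning again; a learned model would replace `toy_dynamics` without changing the loop.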
Related papers
- Differentiable Robot Rendering [45.23538293501457]
We introduce differentiable robot rendering, a method allowing the visual appearance of a robot body to be directly differentiable with respect to its control parameters.
We demonstrate its capability and usage in applications including reconstruction of robot poses from images and controlling robots through vision language models.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
- Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models [53.22792173053473]
We introduce an interactive robotic manipulation framework called Polaris.
Polaris integrates perception and interaction by utilizing GPT-4 alongside grounded vision models.
We propose a novel Synthetic-to-Real (Syn2Real) pose estimation pipeline.
arXiv Detail & Related papers (2024-08-15T06:40:38Z)
- RoboPack: Learning Tactile-Informed Dynamics Models for Dense Packing [38.97168020979433]
We introduce an approach that combines visual and tactile sensing for robotic manipulation by learning a neural, tactile-informed dynamics model.
Our proposed framework, RoboPack, employs a recurrent graph neural network to estimate object states.
We demonstrate our approach on a real robot equipped with a compliant Soft-Bubble tactile sensor on non-prehensile manipulation and dense packing tasks.
arXiv Detail & Related papers (2024-07-01T16:08:37Z)
- ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots [24.035706461949715]
There is a pressing need to develop a model that enables general-purpose robots to undertake a broad spectrum of manipulation tasks.
Our work introduces a comprehensive framework to develop a foundation model for general robotic manipulation.
Our model achieves average success rates of around 90%.
arXiv Detail & Related papers (2024-05-11T09:18:37Z)
- DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models [102.13968267347553]
We present DiffuseBot, a physics-augmented diffusion model that generates soft robot morphologies capable of excelling in a wide spectrum of tasks.
We showcase a range of simulated and fabricated robots along with their capabilities.
arXiv Detail & Related papers (2023-11-28T18:58:48Z)
- Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
- RT-1: Robotics Transformer for Real-World Control at Scale [98.09428483862165]
We present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties.
We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks.
arXiv Detail & Related papers (2022-12-13T18:55:15Z)
- Factored World Models for Zero-Shot Generalization in Robotic Manipulation [7.258229016768018]
We learn to generalize over robotic pick-and-place tasks using object-factored world models.
We use a residual stack of graph neural networks that receive action information at multiple levels in both their node and edge neural networks.
We show that an ensemble of our models can be used to plan for tasks involving up to 12 pick and place actions using search.
arXiv Detail & Related papers (2022-02-10T21:26:11Z)
- Full-Body Visual Self-Modeling of Robot Morphologies [29.76701883250049]
Internal computational models of physical bodies are fundamental to the ability of robots and animals alike to plan and control their actions.
Recent progress in fully data-driven self-modeling has enabled machines to learn their own forward kinematics directly from task-agnostic interaction data.
Here, we propose that instead of directly modeling forward-kinematics, a more useful form of self-modeling is one that could answer space occupancy queries.
arXiv Detail & Related papers (2021-11-11T18:58:07Z)
- V-MAO: Generative Modeling for Multi-Arm Manipulation of Articulated Objects [51.79035249464852]
We present a framework for learning multi-arm manipulation of articulated objects.
Our framework includes a variational generative model that learns contact point distribution over object rigid parts for each robot arm.
arXiv Detail & Related papers (2021-11-07T02:31:09Z)
- Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.