Related papers: RoboCraft: Learning to See, Simulate, and Shape Elasto-Plastic Objects with Graph Networks

URL: http://arxiv.org/abs/2205.02909v1
Date: Thu, 5 May 2022 20:28:15 GMT
Title: RoboCraft: Learning to See, Simulate, and Shape Elasto-Plastic Objects with Graph Networks
Authors: Haochen Shi, Huazhe Xu, Zhiao Huang, Yunzhu Li, Jiajun Wu
Abstract summary: We present a model-based planning framework for modeling and manipulating elasto-plastic objects. Our system, RoboCraft, learns a particle-based dynamics model using graph neural networks (GNNs) to capture the structure of the underlying system. We show through experiments that with just 10 minutes of real-world robotic interaction data, our robot can learn a dynamics model that can be used to synthesize control signals to deform elasto-plastic objects into various target shapes.
Score: 32.00371492516123
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Modeling and manipulating elasto-plastic objects are essential capabilities for robots to perform complex industrial and household interaction tasks (e.g., stuffing dumplings, rolling sushi, and making pottery). However, due to the high degree of freedom of elasto-plastic objects, significant challenges exist in virtually every aspect of the robotic manipulation pipeline, e.g., representing the states, modeling the dynamics, and synthesizing the control signals. We propose to tackle these challenges by employing a particle-based representation for elasto-plastic objects in a model-based planning framework. Our system, RoboCraft, only assumes access to raw RGBD visual observations. It transforms the sensing data into particles and learns a particle-based dynamics model using graph neural networks (GNNs) to capture the structure of the underlying system. The learned model can then be coupled with model-predictive control (MPC) algorithms to plan the robot's behavior. We show through experiments that with just 10 minutes of real-world robotic interaction data, our robot can learn a dynamics model that can be used to synthesize control signals to deform elasto-plastic objects into various target shapes, including shapes that the robot has never encountered before. We perform systematic evaluations in both simulation and the real world to demonstrate the robot's manipulation capabilities and ability to generalize to a more complex action space, different tool shapes, and a mixture of motion modes. We also conduct comparisons between RoboCraft and untrained human subjects controlling the gripper to manipulate deformable objects in both simulation and the real world. Our learned model-based planning framework is comparable to and sometimes better than human subjects on the tested tasks.
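
The planning loop described in the abstract (predict particle motion with a dynamics model, then choose actions with MPC) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the authors' implementation: `toy_dynamics` is a hypothetical stand-in for the learned GNN model that simply translates all particles, the planner is plain random shooting, and Chamfer distance is assumed as the shape cost. The names `plan_random_shooting`, `horizon`, `samples`, and `max_step` are likewise hypothetical.

```python
import math
import random


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty 3D point sets."""
    def one_way(src, dst):
        # Mean distance from each source point to its nearest destination point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)


def toy_dynamics(particles, action):
    """Hypothetical stand-in for a learned particle dynamics model.

    It shifts every particle by the action vector. A learned GNN would
    instead predict per-particle motion from particle and tool interactions.
    """
    dx, dy, dz = action
    return [(x + dx, y + dy, z + dz) for x, y, z in particles]


def plan_random_shooting(particles, target, dynamics,
                         horizon=3, samples=200, max_step=0.1, seed=0):
    """Sample random action sequences, roll each out through the dynamics
    model, and return the sequence whose final state is closest to the target."""
    rng = random.Random(seed)
    best_cost, best_actions = float("inf"), []
    for _ in range(samples):
        actions = [
            tuple(rng.uniform(-max_step, max_step) for _ in range(3))
            for _ in range(horizon)
        ]
        state = particles
        for action in actions:
            state = dynamics(state, action)
        cost = chamfer_distance(state, target)
        if cost < best_cost:
            best_cost, best_actions = cost, actions
    return best_actions, best_cost


# Usage: move a three-particle "object" toward a copy shifted along x.
start = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0)]
target = [(x + 0.2, y, z) for x, y, z in start]
actions, cost = plan_random_shooting(start, target, toy_dynamics)
```

In a receding-horizon setting, only the first action of the returned sequence would be executed before re-observing the scene and planning again. Random shooting is used here only because it is the simplest sampling-based planner to write down.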

Related papers

RoboPearls: Editable Video Simulation for Robot Manipulation [81.18434338506621]
RoboPearls is an editable video simulation framework for robotic manipulation. Built on 3D Gaussian Splatting (3DGS), RoboPearls enables the construction of photo-realistic, view-consistent simulations. We conduct extensive experiments on multiple datasets and scenes, including RLBench, COLOSSEUM, Ego4D, Open X-Embodiment, and a real-world robot.
arXiv Detail & Related papers (2025-06-28T05:03:31Z)
Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination [25.62602420895531]
DreMa is a new approach for constructing digital twins using learned explicit representations of the real world and its dynamics. We show that DreMa can successfully learn novel physical tasks from just a single example per task variation.
arXiv Detail & Related papers (2024-12-19T15:38:15Z)
Differentiable Robot Rendering [45.23538293501457]
We introduce differentiable robot rendering, a method allowing the visual appearance of a robot body to be directly differentiable with respect to its control parameters. We demonstrate its capability and usage in applications including reconstruction of robot poses from images and controlling robots through vision language models.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models [53.22792173053473]
We introduce an interactive robotic manipulation framework called Polaris. Polaris integrates perception and interaction by utilizing GPT-4 alongside grounded vision models. We propose a novel Synthetic-to-Real (Syn2Real) pose estimation pipeline.
arXiv Detail & Related papers (2024-08-15T06:40:38Z)
RoboPack: Learning Tactile-Informed Dynamics Models for Dense Packing [38.97168020979433]
We introduce an approach that combines visual and tactile sensing for robotic manipulation by learning a neural, tactile-informed dynamics model. Our proposed framework, RoboPack, employs a recurrent graph neural network to estimate object states. We demonstrate our approach on a real robot equipped with a compliant Soft-Bubble tactile sensor on non-prehensile manipulation and dense packing tasks.
arXiv Detail & Related papers (2024-07-01T16:08:37Z)
ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots [24.035706461949715]
There is a pressing need to develop a model that enables general-purpose robots to undertake a broad spectrum of manipulation tasks. Our work introduces a comprehensive framework to develop a foundation model for general robotic manipulation. Our model achieves average success rates of around 90%.
arXiv Detail & Related papers (2024-05-11T09:18:37Z)
DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models [102.13968267347553]
We present DiffuseBot, a physics-augmented diffusion model that generates soft robot morphologies capable of excelling in a wide spectrum of tasks. We showcase a range of simulated and fabricated robots along with their capabilities.
arXiv Detail & Related papers (2023-11-28T18:58:48Z)
Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics. Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens. We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
RT-1: Robotics Transformer for Real-World Control at Scale [98.09428483862165]
We present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks.
arXiv Detail & Related papers (2022-12-13T18:55:15Z)
Factored World Models for Zero-Shot Generalization in Robotic Manipulation [7.258229016768018]
We learn to generalize over robotic pick-and-place tasks using object-factored world models. We use a residual stack of graph neural networks that receive action information at multiple levels in both their node and edge neural networks. We show that an ensemble of our models can be used to plan for tasks involving up to 12 pick and place actions using search.
arXiv Detail & Related papers (2022-02-10T21:26:11Z)
Full-Body Visual Self-Modeling of Robot Morphologies [29.76701883250049]
Internal computational models of physical bodies are fundamental to the ability of robots and animals alike to plan and control their actions. Recent progress in fully data-driven self-modeling has enabled machines to learn their own forward kinematics directly from task-agnostic interaction data. Here, we propose that instead of directly modeling forward-kinematics, a more useful form of self-modeling is one that could answer space occupancy queries.
arXiv Detail & Related papers (2021-11-11T18:58:07Z)
V-MAO: Generative Modeling for Multi-Arm Manipulation of Articulated Objects [51.79035249464852]
We present a framework for learning multi-arm manipulation of articulated objects. Our framework includes a variational generative model that learns contact point distribution over object rigid parts for each robot arm.
arXiv Detail & Related papers (2021-11-07T02:31:09Z)
Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works. However, learning a model that captures the dynamics of complex skills represents a major challenge. We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.