End-to-end Reinforcement Learning of Robotic Manipulation with Robust
Keypoints Representation
- URL: http://arxiv.org/abs/2202.06027v1
- Date: Sat, 12 Feb 2022 09:58:09 GMT
- Title: End-to-end Reinforcement Learning of Robotic Manipulation with Robust
Keypoints Representation
- Authors: Tianying Wang, En Yen Puang, Marcus Lee, Yan Wu, Wei Jing
- Abstract summary: We present an end-to-end Reinforcement Learning framework for robotic manipulation tasks, using a robust and efficient keypoints representation.
The proposed method learns keypoints from camera images as the state representation, through a self-supervised autoencoder architecture.
We demonstrate the effectiveness of the proposed method on robotic manipulation tasks including grasping and pushing, in different scenarios.
- Score: 7.374994747693731
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present an end-to-end Reinforcement Learning (RL) framework for robotic
manipulation tasks, using a robust and efficient keypoints representation. The
proposed method learns keypoints from camera images as the state
representation, through a self-supervised autoencoder architecture. The
keypoints encode the geometric information, as well as the relationship of the
tool and target in a compact representation to ensure efficient and robust
learning. After keypoints learning, the RL step then learns the robot motion
from the extracted keypoints state representation. The keypoints and RL
learning processes are entirely done in the simulated environment. We
demonstrate the effectiveness of the proposed method on robotic manipulation
tasks including grasping and pushing, in different scenarios. We also
investigate the generalization capability of the trained model. In addition to
the robust keypoints representation, we further apply domain randomization and
adversarial training examples to achieve zero-shot sim-to-real transfer in
real-world robotic manipulation tasks.
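The abstract describes a two-stage pipeline: a self-supervised encoder that compresses camera images into a small set of keypoints, followed by an RL policy trained on that keypoint state. As an illustration only, the sketch below shows one common way such a keypoint bottleneck is built (a spatial-softmax head that returns expected (x, y) coordinates per heatmap); the paper's actual autoencoder architecture, keypoint count, and training losses are not specified here and may differ.

```python
# Hypothetical sketch of a keypoint-bottleneck encoder (spatial-softmax style).
# This is NOT the paper's exact architecture; it only illustrates how an image
# can be reduced to a compact keypoint state for a downstream RL policy.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KeypointEncoder(nn.Module):
    """Maps an RGB image to K (x, y) keypoints via a spatial softmax."""
    def __init__(self, num_keypoints: int = 8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, num_keypoints, 5, padding=2),  # one heatmap per keypoint
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        heatmaps = self.features(img)                      # (B, K, H, W)
        b, k, h, w = heatmaps.shape
        probs = F.softmax(heatmaps.view(b, k, -1), dim=-1).view(b, k, h, w)
        # Expected pixel coordinates under each heatmap, normalised to [-1, 1].
        ys = torch.linspace(-1.0, 1.0, h, device=img.device)
        xs = torch.linspace(-1.0, 1.0, w, device=img.device)
        kp_y = (probs.sum(dim=3) * ys).sum(dim=2)          # (B, K)
        kp_x = (probs.sum(dim=2) * xs).sum(dim=2)          # (B, K)
        return torch.stack([kp_x, kp_y], dim=-1)           # (B, K, 2)

# Toy usage: the flattened keypoint tensor serves as the low-dimensional
# state vector consumed by the RL policy instead of raw pixels.
if __name__ == "__main__":
    enc = KeypointEncoder(num_keypoints=8)
    state = enc(torch.rand(1, 3, 128, 128)).flatten(1)     # shape (1, 16)
    print(state.shape)
```

In this kind of setup, domain randomization (varying textures, lighting, and camera pose in simulation) and adversarial perturbations of the training images are applied at the encoder's input so the learned keypoints stay stable across the sim-to-real gap; the exact randomization scheme used in the paper is not detailed in this listing.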
Related papers
- Keypoint Abstraction using Large Models for Object-Relative Imitation Learning [78.92043196054071]
Generalization to novel object configurations and instances across diverse tasks and environments is a critical challenge in robotics.
Keypoint-based representations have proven effective as a succinct representation for capturing essential object features.
We propose KALM, a framework that leverages large pre-trained vision-language models to automatically generate task-relevant and cross-instance consistent keypoints.
arXiv Detail & Related papers (2024-10-30T17:37:31Z) - Affordance-Guided Reinforcement Learning via Visual Prompting [51.361977466993345]
Keypoint-based Affordance Guidance for Improvements (KAGI) is a method leveraging rewards shaped by vision-language models (VLMs) for autonomous RL.
On real-world manipulation tasks specified by natural language descriptions, KAGI improves the sample efficiency of autonomous RL and enables successful task completion in 20K online fine-tuning steps.
arXiv Detail & Related papers (2024-07-14T21:41:29Z) - Learning Manipulation by Predicting Interaction [85.57297574510507]
We propose MPI, a general pre-training pipeline that learns manipulation by predicting the interaction.
The experimental results demonstrate that MPI achieves remarkable improvements of 10% to 64% over the previous state-of-the-art on real-world robot platforms.
arXiv Detail & Related papers (2024-06-01T13:28:31Z) - Human-oriented Representation Learning for Robotic Manipulation [64.59499047836637]
Humans inherently possess generalizable visual representations that empower them to efficiently explore and interact with the environments in manipulation tasks.
We formalize this idea through the lens of human-oriented multi-task fine-tuning on top of pre-trained visual encoders.
Our Task Fusion Decoder consistently improves the representation of three state-of-the-art visual encoders for downstream manipulation policy-learning.
arXiv Detail & Related papers (2023-10-04T17:59:38Z) - Active Exploration for Robotic Manipulation [40.39182660794481]
This paper proposes a model-based active exploration approach that enables efficient learning in sparse-reward robotic manipulation tasks.
We evaluate our proposed algorithm in simulation and on a real robot, trained from scratch with our method.
arXiv Detail & Related papers (2022-10-23T18:07:51Z) - Masked World Models for Visual Control [90.13638482124567]
We introduce a visual model-based RL framework that decouples visual representation learning and dynamics learning.
We demonstrate that our approach achieves state-of-the-art performance on a variety of visual robotic tasks.
arXiv Detail & Related papers (2022-06-28T18:42:27Z) - Self-Supervised Learning of Multi-Object Keypoints for Robotic
Manipulation [8.939008609565368]
In this paper, we demonstrate the efficacy of learning image keypoints via the Dense Correspondence pretext task for downstream policy learning.
We evaluate our approach on diverse robot manipulation tasks, compare it to other visual representation learning approaches, and demonstrate its flexibility and effectiveness for sample-efficient policy learning.
arXiv Detail & Related papers (2022-05-17T13:15:07Z) - A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z) - Pose Estimation for Robot Manipulators via Keypoint Optimization and
Sim-to-Real Transfer [10.369766652751169]
Keypoint detection is an essential building block for many robotic applications.
Deep learning methods have the ability to detect user-defined keypoints in a marker-less manner.
We propose a new and autonomous way to define the keypoint locations that overcomes these challenges.
arXiv Detail & Related papers (2020-10-15T22:38:37Z)