Efficient Robotic Manipulation Through Offline-to-Online Reinforcement
Learning and Goal-Aware State Information
- URL: http://arxiv.org/abs/2110.10905v1
- Date: Thu, 21 Oct 2021 05:34:25 GMT
- Title: Efficient Robotic Manipulation Through Offline-to-Online Reinforcement
Learning and Goal-Aware State Information
- Authors: Jin Li, Xianyuan Zhan, Zixu Xiao, Guyue Zhou
- Abstract summary: We propose a unified offline-to-online RL framework that resolves the transition performance drop issue.
We introduce goal-aware state information to the RL agent, which can greatly reduce task complexity and accelerate policy learning.
Our framework achieves strong training efficiency and performance compared with state-of-the-art methods on multiple robotic manipulation tasks.
- Score: 5.604859261995801
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: End-to-end learning of robotic manipulation with high data efficiency is one of
the key challenges in robotics. Recent methods that utilize human
demonstration data and unsupervised representation learning have proven to be a
promising direction for improving RL learning efficiency. The use of demonstration
data also allows "warming up" the RL policies with offline data, via imitation
learning or the recently emerged offline reinforcement learning algorithms.
However, existing works often treat offline policy learning and online
exploration as two separate processes, and the handoff between them is often
accompanied by a severe performance drop during the offline-to-online
transition. Furthermore, many robotic manipulation tasks involve complex
sub-task structures, which are very challenging to solve with RL under sparse
rewards. In this work, we propose a
unified offline-to-online RL framework that resolves the transition performance
drop issue. Additionally, we introduce goal-aware state information to the RL
agent, which can greatly reduce task complexity and accelerate policy learning.
Combined with an advanced unsupervised representation learning module, our
framework achieves substantially better training efficiency and performance
than state-of-the-art methods on multiple robotic manipulation tasks.
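To make the two headline ideas concrete, here is a minimal sketch, assuming hypothetical `goal_aware_state` and `MixedReplayBuffer` helpers: goal-relative features appended to the observation, and a replay buffer that blends offline demonstrations with online experience so the agent never makes a hard offline-to-online switch. Neither reflects the paper's exact implementation, and the 50/50 mixing ratio is an illustrative assumption.

```python
# Minimal sketch (not the paper's implementation) of:
# (1) goal-aware state information appended to the raw observation, and
# (2) a replay buffer mixing offline demonstrations with online experience
#     to soften the offline-to-online transition.
import random
import numpy as np

def goal_aware_state(obs, gripper_pos, object_pos, goal_pos):
    """Append goal-relative offsets so the policy need not infer them."""
    return np.concatenate([
        obs,
        object_pos - gripper_pos,  # cue for the "reach/grasp" sub-task
        goal_pos - object_pos,     # cue for the "move to goal" sub-task
    ])

class MixedReplayBuffer:
    """Draw a fixed fraction of each batch from offline demonstrations."""
    def __init__(self, offline_data, offline_fraction=0.5):
        self.offline = list(offline_data)
        self.online = []
        self.offline_fraction = offline_fraction

    def add(self, transition):
        self.online.append(transition)

    def sample(self, batch_size):
        n_offline = int(batch_size * self.offline_fraction)
        batch = random.sample(self.offline, min(n_offline, len(self.offline)))
        n_online = batch_size - len(batch)
        if self.online:
            batch += random.sample(self.online, min(n_online, len(self.online)))
        return batch
```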
Related papers
- Finetuning Offline World Models in the Real World [13.46766121896684]
Reinforcement Learning (RL) is notoriously data-inefficient, which makes training on a real robot difficult.
Offline RL has been proposed as a framework for training RL policies on pre-existing datasets without any online interaction.
In this work, we consider the problem of pretraining a world model with offline data collected on a real robot, and then finetuning the model on online data collected by planning with the learned model.
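A hedged sketch of that recipe follows, with a toy linear dynamics model and a random-shooting planner standing in for the learned world model and planner; both are simplifying assumptions, not the paper's architecture.

```python
# Toy offline-to-online world-model loop: pretrain a dynamics model on a
# fixed dataset, then keep updating it on transitions gathered by planning
# with the model itself.
import numpy as np

class ToyWorldModel:
    """Linear dynamics s' ~= W @ [s; a], trained by SGD on squared error."""
    def __init__(self, s_dim, a_dim, lr=1e-2):
        self.W = np.zeros((s_dim, s_dim + a_dim))
        self.lr = lr

    def predict(self, s, a):
        return self.W @ np.concatenate([s, a])

    def update(self, s, a, s_next):
        x = np.concatenate([s, a])
        err = self.predict(s, a) - s_next
        self.W -= self.lr * np.outer(err, x)  # one SGD step

def plan_action(model, s, goal, a_dim, n_candidates=64):
    """Random shooting: pick the sampled action whose predicted next
    state lands closest to the goal under the current model."""
    candidates = np.random.uniform(-1.0, 1.0, size=(n_candidates, a_dim))
    dists = [np.linalg.norm(model.predict(s, a) - goal) for a in candidates]
    return candidates[int(np.argmin(dists))]

# Offline pretraining: sweep the fixed dataset.
#   for s, a, s_next in offline_dataset: model.update(s, a, s_next)
# Online finetuning: act by planning, then update on what actually happened.
#   a = plan_action(model, s, goal, a_dim); s_next = env.step(a)
#   model.update(s, a, s_next)
```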
arXiv Detail & Related papers (2023-10-24T17:46:12Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for
Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our key insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Benchmarking Offline Reinforcement Learning on Real-Robot Hardware [35.29390454207064]
Dexterous manipulation in particular remains an open problem in its general form.
We propose a benchmark that includes a large collection of data for offline learning, collected from a dexterous manipulation platform on two tasks.
We evaluate prominent open-sourced offline reinforcement learning algorithms on the datasets and provide a reproducible experimental setup for offline reinforcement learning on real systems.
arXiv Detail & Related papers (2023-07-28T17:29:49Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach over state-of-the-art methods on Meta-World and the DeepMind Control Suite.
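One way to read "world models as tools to measure task relevance" is to score offline transitions by how well a model predicts them; the sketch below assumes that reading, and its scoring rule is illustrative rather than the paper's actual metric.

```python
# Hedged sketch: score offline transitions by negative one-step prediction
# error under a pretrained world model, treating well-predicted data as
# more relevant for transfer to the new task.
import numpy as np

def task_relevance_scores(model_predict, transitions):
    """transitions: iterable of (s, a, s_next) arrays; returns one score
    per transition, higher meaning more relevant."""
    return np.array([
        -np.linalg.norm(model_predict(s, a) - s_next)
        for s, a, s_next in transitions
    ])
```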
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- Accelerating Robotic Reinforcement Learning via Parameterized Action
Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks still remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
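The interface change is easy to picture with a short sketch; the primitive set, argument shapes, and `env_step` signature below are all illustrative assumptions, not the RAPS code itself.

```python
# Hedged sketch of a parameterized-primitive action interface: the RL policy
# emits (primitive index, continuous arguments) once per primitive, and a
# wrapper unrolls that choice into a burst of low-level actions.
import numpy as np

def reach(args, n_steps=5):
    """Toy 'reach' primitive: interpolate toward an offset given by args."""
    return [args * (t + 1) / n_steps for t in range(n_steps)]

def push(args, n_steps=3):
    """Toy 'push' primitive: repeat a small displacement along args."""
    return [0.1 * args for _ in range(n_steps)]

PRIMITIVES = [reach, push]  # the manually specified primitive library

def execute_primitive(env_step, primitive_idx, args):
    """Unroll one primitive so the policy decides once per primitive
    rather than once per control timestep."""
    total_reward, obs, done = 0.0, None, False
    for low_level_action in PRIMITIVES[primitive_idx](np.asarray(args)):
        obs, reward, done = env_step(low_level_action)
        total_reward += reward
        if done:
            break
    return obs, total_reward, done
```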
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
- A Workflow for Offline Model-Free Robotic Reinforcement Learning [117.07743713715291]
Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction.
We develop a practical workflow for using offline RL, analogous to the relatively well-understood workflows for supervised learning problems.
We demonstrate the efficacy of this workflow in producing effective policies without any online tuning.
arXiv Detail & Related papers (2021-09-22T16:03:29Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
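The core of AWAC is an advantage-weighted actor update, where buffer actions are reweighted by the exponentiated advantage so the policy imitates high-advantage prior actions while staying close to the data. A minimal sketch follows; the clip value and default temperature are illustrative, not the paper's settings.

```python
# Hedged sketch of AWAC's advantage-weighted actor loss:
# maximize E[ exp(A(s, a) / lambda) * log pi(a|s) ] over buffer actions.
import numpy as np

def awac_actor_loss(log_probs, q_values, values, temperature=1.0):
    """log_probs: log pi(a|s) for buffer actions; q_values / values: critic
    estimates Q(s, a) and V(s). Returns the loss to minimize."""
    advantages = q_values - values
    weights = np.exp(advantages / temperature)
    weights = np.minimum(weights, 20.0)  # clip large weights for stability
    return -np.mean(weights * log_probs)
```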
arXiv Detail & Related papers (2020-06-16T17:54:41Z)