How to Spend Your Robot Time: Bridging Kickstarting and Offline
Reinforcement Learning for Vision-based Robotic Manipulation
- URL: http://arxiv.org/abs/2205.03353v1
- Date: Fri, 6 May 2022 16:38:59 GMT
- Title: How to Spend Your Robot Time: Bridging Kickstarting and Offline
Reinforcement Learning for Vision-based Robotic Manipulation
- Authors: Alex X. Lee, Coline Devin, Jost Tobias Springenberg, Yuxiang Zhou,
Thomas Lampe, Abbas Abdolmaleki, Konstantinos Bousmalis
- Abstract summary: Reinforcement learning (RL) has been shown to be effective at learning control from experience.
RL typically requires a large amount of online interaction with the environment.
We investigate ways to minimize online interactions in a target task, by reusing a suboptimal policy.
- Score: 17.562522787934178
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) has been shown to be effective at learning
control from experience. However, RL typically requires a large amount of
online interaction with the environment. This limits its applicability to
real-world settings, such as in robotics, where such interaction is expensive.
In this work we investigate ways to minimize online interactions in a target
task, by reusing a suboptimal policy we might have access to, for example from
training on related prior tasks, or in simulation. To this end, we develop two
RL algorithms that can speed up training by using not only the action
distributions of teacher policies, but also data collected by such policies on
the task at hand. We conduct a thorough experimental study of how to use
suboptimal teachers on a challenging robotic manipulation benchmark on
vision-based stacking with diverse objects. We compare our methods to offline,
online, offline-to-online, and kickstarting RL algorithms. By doing so, we find
that training on data from both the teacher and student enables the best
performance for limited data budgets. We examine how to best allocate a limited
data budget -- on the target task -- between the teacher and the student
policy, and report experiments using varying budgets, two teachers with
different degrees of suboptimality, and five stacking tasks that require a
diverse set of behaviors. Our analysis, both in simulation and in the real
world, shows that our approach is the best across data budgets, while standard
offline RL from teacher rollouts is surprisingly effective when enough data is
given.
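To make the two ingredients in the abstract concrete, here is a minimal, hypothetical sketch (PyTorch): a kickstarting-style KL term that pulls the student toward a teacher's action distribution, computed on a batch that mixes teacher- and student-collected transitions. This is not the paper's algorithms; the log-likelihood stand-in for the RL objective, the 50/50 batch split, and all names (GaussianPolicy, student_update, kl_weight) are illustrative assumptions.

```python
# Illustrative sketch only -- not the paper's exact algorithms.
import torch
import torch.nn as nn


class GaussianPolicy(nn.Module):
    """Diagonal Gaussian policy over continuous actions (hypothetical stand-in)."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * act_dim))

    def dist(self, obs):
        mean, log_std = self.net(obs).chunk(2, dim=-1)
        return torch.distributions.Normal(mean, log_std.clamp(-5.0, 2.0).exp())


def student_update(student, teacher, optimizer, teacher_batch, student_batch,
                   kl_weight=0.1):
    """One gradient step on a batch mixing teacher- and student-collected data."""
    obs = torch.cat([teacher_batch["obs"], student_batch["obs"]])
    act = torch.cat([teacher_batch["act"], student_batch["act"]])

    pi = student.dist(obs)
    # Stand-in for the RL/offline-RL objective: log-likelihood of logged actions.
    rl_stand_in = -pi.log_prob(act).sum(-1).mean()
    # Kickstarting-style term: keep the student close to the teacher's
    # action distribution on the same states.
    with torch.no_grad():
        teacher_pi = teacher.dist(obs)
    kl = torch.distributions.kl_divergence(teacher_pi, pi).sum(-1).mean()

    loss = rl_stand_in + kl_weight * kl
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)


# Toy usage with random tensors; a fixed per-step batch budget is split 50/50
# between teacher and student data.
obs_dim, act_dim = 10, 4
student, teacher = GaussianPolicy(obs_dim, act_dim), GaussianPolicy(obs_dim, act_dim)
opt = torch.optim.Adam(student.parameters(), lr=3e-4)
fake = lambda n: {"obs": torch.randn(n, obs_dim), "act": torch.randn(n, act_dim)}
print(student_update(student, teacher, opt, fake(64), fake(64)))
```

How the batch budget is split between teacher and student data, and how strongly the KL term is weighted, are exactly the kinds of allocation choices the abstract says the paper studies empirically.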
Related papers
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for
Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- Using Offline Data to Speed-up Reinforcement Learning in Procedurally Generated Environments [11.272582555795989]
We study whether agents can leverage offline data in the form of trajectories to improve sample efficiency in procedurally generated environments.
We consider two settings of using IL from offline data for RL: (1) pre-training a policy before online RL training and (2) concurrently training a policy with online RL and IL from offline data.
arXiv Detail & Related papers (2023-04-18T16:23:15Z)
- Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials [97.95400776235736]
We present a framework based on offline RL that attempts to effectively learn new tasks.
It combines pre-training on existing robotic datasets with rapid fine-tuning on a new task, with as few as 10 demonstrations.
To our knowledge, PTR (Pre-Training for Robots) is the first RL method that succeeds at learning new tasks in a new domain on a real WidowX robot.
arXiv Detail & Related papers (2022-10-11T06:30:53Z)
- Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information [5.604859261995801]
We propose a unified offline-to-online RL framework that resolves the transition performance drop issue.
We introduce goal-aware state information to the RL agent, which can greatly reduce task complexity and accelerate policy learning.
Our framework achieves great training efficiency and performance compared with the state-of-the-art methods in multiple robotic manipulation tasks.
arXiv Detail & Related papers (2021-10-21T05:34:25Z)
- A Workflow for Offline Model-Free Robotic Reinforcement Learning [117.07743713715291]
Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction.
We develop a practical workflow for using offline RL, analogous to the relatively well-understood workflows for supervised learning problems.
We demonstrate the efficacy of this workflow in producing effective policies without any online tuning.
arXiv Detail & Related papers (2021-09-22T16:03:29Z)
- DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning [6.9884912034790405]
We propose a framework, Data CUrriculum for Reinforcement learning (DCUR), which first trains teachers using online deep RL.
Then, students learn by running either offline RL or by using teacher data in combination with a small amount of self-generated data.
arXiv Detail & Related papers (2021-09-15T15:39:46Z)
- RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning [108.9599280270704]
We propose a benchmark called RL Unplugged to evaluate and compare offline RL methods.
RL Unplugged includes data from a diverse range of domains including games and simulated motor control problems.
We will release data for all our tasks and open-source all algorithms presented in this paper.
arXiv Detail & Related papers (2020-06-24T17:14:51Z)
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales (see the sketch after this list).
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
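As a companion to the AWAC entry above, the following is a minimal sketch of an advantage-weighted policy objective of the kind that entry names; the temperature beta, the weight clipping, and the function name are illustrative assumptions rather than the paper's exact formulation.

```python
# Hypothetical sketch of an advantage-weighted policy loss (AWAC-style).
import torch


def advantage_weighted_nll(log_prob, advantage, beta=1.0):
    """Weight the log-likelihood of logged actions by exp(advantage / beta)."""
    weights = torch.exp(advantage / beta).clamp(max=20.0).detach()  # clipped for stability
    return -(weights * log_prob).mean()


# Toy usage with random tensors standing in for per-transition quantities.
log_prob = torch.randn(64, requires_grad=True)  # log pi(a|s) under the current policy
advantage = torch.randn(64)                     # Q(s, a) - V(s) from a learned critic
loss = advantage_weighted_nll(log_prob, advantage)
loss.backward()
print(float(loss))
```

The weighting favors actions from the prior dataset that look better than the policy's average behavior, which is how such methods exploit demonstration data while still improving from online experience.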
This list is automatically generated from the titles and abstracts of the papers in this site.