A Workflow for Offline Model-Free Robotic Reinforcement Learning
- URL: http://arxiv.org/abs/2109.10813v2
- Date: Thu, 23 Sep 2021 17:34:23 GMT
- Title: A Workflow for Offline Model-Free Robotic Reinforcement Learning
- Authors: Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine
- Abstract summary: Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction.
We develop a practical workflow for using offline RL analogous to the relatively well-understood workflows for supervised learning problems.
We demonstrate the efficacy of this workflow in producing effective policies without any online tuning.
- Score: 117.07743713715291
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Offline reinforcement learning (RL) enables learning control policies by
utilizing only prior experience, without any online interaction. This can allow
robots to acquire generalizable skills from large and diverse datasets, without
any costly or unsafe online data collection. Despite recent algorithmic
advances in offline RL, applying these methods to real-world problems has
proven challenging. Although offline RL methods can learn from prior data,
there is no clear and well-understood process for making various design
choices, from model architecture to algorithm hyperparameters, without actually
evaluating the learned policies online. In this paper, our aim is to develop a
practical workflow for using offline RL analogous to the relatively
well-understood workflows for supervised learning problems. To this end, we
devise a set of metrics and conditions that can be tracked over the course of
offline training, and can inform the practitioner about how the algorithm and
model architecture should be adjusted to improve final performance. Our
workflow is derived from a conceptual understanding of the behavior of
conservative offline RL algorithms and cross-validation in supervised learning.
We demonstrate the efficacy of this workflow in producing effective policies
without any online tuning, both in several simulated robotic learning scenarios
and for three tasks on two distinct real robots, focusing on learning
manipulation skills with raw image observations and sparse binary rewards.
Explanatory video and additional results can be found at
sites.google.com/view/offline-rl-workflow
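The workflow described in the abstract revolves around metrics that can be tracked from the offline dataset alone, without online evaluation. As a rough illustration of that idea, the sketch below monitors Q-value statistics on a training split and a held-out split of the dataset and turns simple trends into practitioner-facing suggestions; the metric names, trend heuristics, and thresholds here are illustrative assumptions, not the paper's actual criteria.
```python
# Illustrative sketch only: turn per-epoch offline-RL metric curves into workflow
# suggestions. The heuristics and thresholds are assumptions, not the paper's protocol.
import numpy as np

def diagnose_offline_run(train_q, holdout_q, train_loss, patience=5):
    """Inspect per-epoch metric curves from a conservative offline-RL run.

    train_q    -- average Q-value on (s, a) pairs from the training split
    holdout_q  -- average Q-value on a held-out split of the same dataset
    train_loss -- critic training loss per epoch
    """
    train_q, holdout_q, train_loss = map(np.asarray, (train_q, holdout_q, train_loss))
    recent = np.arange(patience)

    # Overfitting heuristic: held-out Q-values trend down while training-split
    # Q-values hold steady or rise, i.e. the critic fits the training split too closely.
    holdout_slope = np.polyfit(recent, holdout_q[-patience:], 1)[0]
    train_slope = np.polyfit(recent, train_q[-patience:], 1)[0]
    if holdout_slope < 0 and train_slope >= 0:
        return "possible overfitting: adjust the algorithm or use an earlier checkpoint"

    # Underfitting heuristic: the critic loss has barely decreased from its initial value.
    if train_loss[-1] > 0.9 * train_loss[0]:
        return "possible underfitting: increase model capacity or train longer"

    return "no red flags from these metrics"

# Toy usage with made-up metric curves (50 epochs).
epochs = 50
suggestion = diagnose_offline_run(
    train_q=np.linspace(0.0, 5.0, epochs),
    holdout_q=np.concatenate([np.linspace(0.0, 3.0, 30), np.linspace(3.0, -2.0, 20)]),
    train_loss=np.linspace(1.0, 0.1, epochs),
)
print(suggestion)
```
The paper's workflow ties such diagnostics to concrete adjustments of the algorithm and model architecture; the sketch only shows the general shape of the bookkeeping.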
Related papers
- MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning [52.101643259906915]
We study the problem of offline pre-training and online fine-tuning for reinforcement learning from high-dimensional observations.
Existing model-based offline RL methods are not suitable for offline-to-online fine-tuning in high-dimensional domains.
We propose an on-policy model-based method that can efficiently reuse prior data through model-based value expansion and policy regularization.
arXiv Detail & Related papers (2024-01-06T21:04:31Z)
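The MOTO entry above mentions model-based value expansion. As a generic illustration of that technique (not MOTO's specific estimator), the sketch below rolls a learned dynamics model forward for a few steps from a given state and bootstraps with a critic to form a value target; `model`, `policy`, and `q_fn` are hypothetical placeholders.
```python
def mve_target(s, model, policy, q_fn, horizon=3, gamma=0.99):
    """H-step model-based value expansion: imagined rewards plus a bootstrapped tail."""
    value, discount = 0.0, 1.0
    for _ in range(horizon):
        a = policy(s)              # action from the current policy
        s_next, r = model(s, a)    # imagined transition from the learned dynamics model
        value += discount * r
        discount *= gamma
        s = s_next
    return value + discount * q_fn(s, policy(s))   # critic bootstrap beyond the rollout

# Toy 1-D check with stand-in model/policy/critic.
target = mve_target(
    s=1.0,
    model=lambda s, a: (s + 0.1 * a, -abs(s)),   # drift dynamics, reward -|s|
    policy=lambda s: -s,
    q_fn=lambda s, a: 0.0,
)
print(target)
```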
- Finetuning Offline World Models in the Real World [13.46766121896684]
Reinforcement Learning (RL) is notoriously data-inefficient, which makes training on a real robot difficult.
Offline RL has been proposed as a framework for training RL policies on pre-existing datasets without any online interaction.
In this work, we consider the problem of pretraining a world model with offline data collected on a real robot, and then finetuning the model on online data collected by planning with the learned model.
arXiv Detail & Related papers (2023-10-24T17:46:12Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
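The action-quantization entry above does not spell out its adaptive scheme. As a simplified stand-in, the sketch below clusters the continuous actions in an offline dataset with plain k-means so that a discrete-action offline RL method could operate over the resulting codes; the function names and the use of k-means are assumptions for illustration only.
```python
# Simplified stand-in for action quantization: cluster the continuous actions in an
# offline dataset and replace each action with its nearest cluster index. The paper
# proposes an adaptive, learned scheme; plain k-means is used here only to show the
# interface a discrete-action offline RL method would see.
import numpy as np

def build_action_codebook(actions, n_bins=16, iters=50, seed=0):
    """Run basic k-means over dataset actions; return (codebook, discrete indices)."""
    rng = np.random.default_rng(seed)
    codebook = actions[rng.choice(len(actions), n_bins, replace=False)]
    for _ in range(iters):
        # Assign each continuous action to its nearest code.
        dists = np.linalg.norm(actions[:, None, :] - codebook[None, :, :], axis=-1)
        assign = dists.argmin(axis=1)
        # Update each code to the mean of its assigned actions.
        for k in range(n_bins):
            if np.any(assign == k):
                codebook[k] = actions[assign == k].mean(axis=0)
    return codebook, assign

# Example: quantize random 7-DoF actions; a downstream offline RL method would then
# learn values over the 16 discrete codes instead of the continuous action space.
actions = np.random.default_rng(1).uniform(-1, 1, size=(1000, 7))
codebook, discrete_actions = build_action_codebook(actions)
print(codebook.shape, discrete_actions[:10])
```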
- Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies [6.303272140868826]
Reinforcement learning (RL) has shown great promise with algorithms learning in environments with large state and action spaces.
Current deep RL algorithms require a tremendous amount of environment interactions for learning.
Offline RL algorithms try to address this issue by bootstrapping the learning process from existing logged data.
arXiv Detail & Related papers (2022-12-15T20:36:10Z)
- Implicit Offline Reinforcement Learning via Supervised Learning [83.8241505499762]
Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels.
We show how implicit models can leverage return information and match or outperform explicit algorithms to acquire robotic skills from fixed datasets.
arXiv Detail & Related papers (2022-10-21T21:59:42Z)
- How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation [17.562522787934178]
Reinforcement learning (RL) has been shown to be effective at learning control from experience.
RL typically requires a large amount of online interaction with the environment.
We investigate ways to minimize online interactions in a target task, by reusing a suboptimal policy.
arXiv Detail & Related papers (2022-05-06T16:38:59Z)
- Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information [5.604859261995801]
We propose a unified offline-to-online RL framework that resolves the transition performance drop issue.
We introduce goal-aware state information to the RL agent, which can greatly reduce task complexity and accelerate policy learning.
Our framework achieves great training efficiency and performance compared with the state-of-the-art methods in multiple robotic manipulation tasks.
arXiv Detail & Related papers (2021-10-21T05:34:25Z)
- Offline Reinforcement Learning from Images with Latent Space Models [60.69745540036375]
Offline reinforcement learning (RL) refers to the problem of learning policies from a static dataset of environment interactions.
We build on recent advances in model-based algorithms for offline RL, and extend them to high-dimensional visual observation spaces.
Our approach is both tractable in practice and corresponds to maximizing a lower bound of the ELBO in the unknown POMDP.
arXiv Detail & Related papers (2020-12-21T18:28:17Z)
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
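The AWAC entry above centers on advantage-weighted learning from prior data. The sketch below shows a minimal version of an advantage-weighted policy objective of that flavor, where dataset actions are re-weighted by their exponentiated advantage; `q_fn`, `policy_sample`, `log_prob`, and the temperature `lam` are hypothetical placeholders, and the full method also trains the critic with TD updates alongside this step.
```python
# Minimal sketch of an advantage-weighted policy objective in the spirit of AWAC:
# a weighted behavior-cloning loss where dataset actions are up-weighted by
# exp(advantage / lam). All callables below are hypothetical placeholders.
import numpy as np

def awac_policy_loss(states, actions, q_fn, policy_sample, log_prob, lam=1.0, n_value_samples=4):
    # Estimate V(s) by averaging Q over actions sampled from the current policy.
    v = np.mean([q_fn(states, policy_sample(states)) for _ in range(n_value_samples)], axis=0)
    advantage = q_fn(states, actions) - v
    weights = np.exp(np.clip(advantage / lam, -10.0, 10.0))   # clip for numerical stability
    # Weighted log-likelihood of the dataset actions under the policy.
    return -np.mean(weights * log_prob(states, actions))

# Toy check with random placeholder functions (batch of 32, 3-D states, 2-D actions).
rng = np.random.default_rng(0)
s, a = rng.normal(size=(32, 3)), rng.normal(size=(32, 2))
loss = awac_policy_loss(
    s, a,
    q_fn=lambda s, a: -np.sum(a**2, axis=-1),
    policy_sample=lambda s: rng.normal(size=(len(s), 2)),
    log_prob=lambda s, a: -0.5 * np.sum(a**2, axis=-1),
)
print(loss)
```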
This list is automatically generated from the titles and abstracts of the papers on this site.