A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open
Problems
- URL: http://arxiv.org/abs/2203.01387v3
- Date: Wed, 19 Apr 2023 00:30:20 GMT
- Title: A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open
Problems
- Authors: Rafael Figueiredo Prudencio, Marcos R. O. A. Maximo, Esther Luna
Colombini
- Abstract summary: Reinforcement learning (RL) has experienced a dramatic increase in popularity.
There is still a wide range of domains inaccessible to RL due to the high cost and danger of interacting with the environment.
Offline RL is a paradigm that learns exclusively from static datasets of previously collected interactions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the widespread adoption of deep learning, reinforcement learning (RL)
has experienced a dramatic increase in popularity, scaling to previously
intractable problems, such as playing complex games from pixel observations,
sustaining conversations with humans, and controlling robotic agents. However,
there is still a wide range of domains inaccessible to RL due to the high cost
and danger of interacting with the environment. Offline RL is a paradigm that
learns exclusively from static datasets of previously collected interactions,
making it feasible to extract policies from large and diverse training
datasets. Effective offline RL algorithms have a much wider range of
applications than online RL, being particularly appealing for real-world
applications, such as education, healthcare, and robotics. In this work, we
contribute a unifying taxonomy to classify offline RL methods.
Furthermore, we provide a comprehensive review of the latest algorithmic
breakthroughs in the field using a unified notation as well as a review of
existing benchmarks' properties and shortcomings. Additionally, we provide a
figure that summarizes the performance of each method and class of methods on
different dataset properties, equipping researchers with the tools to decide
which type of algorithm is best suited for the problem at hand and identify
which classes of algorithms look the most promising. Finally, we provide our
perspective on open problems and propose future research directions for this
rapidly growing field.
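For readers new to the setting, below is a minimal sketch (not taken from the survey) of the simplest form of offline policy extraction: behavior cloning on a static dataset of logged transitions. All names here (behavior_cloning, the network layout) are illustrative assumptions; practical offline RL methods layer value estimation and conservatism on top of a loop like this, but the defining property is the same, no environment interaction during training.

```python
# Hedged sketch, not the survey's code: fit a policy to a static dataset of
# (state, action) pairs by behavior cloning, the simplest offline baseline.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def behavior_cloning(states, actions, epochs=10, lr=3e-4):
    """Fit a deterministic policy to logged (state, action) pairs."""
    state_dim, action_dim = states.shape[1], actions.shape[1]
    policy = nn.Sequential(
        nn.Linear(state_dim, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, action_dim),
    )
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loader = DataLoader(TensorDataset(states, actions), batch_size=256, shuffle=True)
    for _ in range(epochs):
        for s, a in loader:
            loss = nn.functional.mse_loss(policy(s), a)  # imitate the logged actions
            opt.zero_grad()
            loss.backward()
            opt.step()
    return policy  # deployed without any further environment interaction
```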
Related papers
- D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861]
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
arXiv Detail & Related papers (2024-08-15T22:27:00Z)
- Efficient Online Reinforcement Learning with Offline Data [78.92501185886569]
We show that we can simply apply existing off-policy methods to leverage offline data when learning online.
We extensively ablate these design choices, demonstrating the key factors that most affect performance.
We see that correct application of these simple recommendations can provide a $\mathbf{2.5\times}$ improvement over existing approaches.
arXiv Detail & Related papers (2023-02-06T17:30:22Z)
- A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z)
- Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies [6.303272140868826]
Reinforcement learning (RL) has shown great promise with algorithms learning in environments with large state and action spaces.
Current deep RL algorithms require a tremendous amount of environment interactions for learning.
Offline RL algorithms try to address this issue by bootstrapping the learning process from existing logged data.
arXiv Detail & Related papers (2022-12-15T20:36:10Z)
- Offline Equilibrium Finding [40.08360411502593]
We aim to generalize offline RL to a multi-agent or multiplayer-game setting.
Very little research has been done in this area, as the progress is hindered by the lack of standardized datasets and meaningful benchmarks.
Our two model-based algorithms -- OEF-PSRO and OEF-CFR -- are adaptations of the widely-used equilibrium finding algorithms Deep CFR and PSRO in the context of offline learning.
arXiv Detail & Related papers (2022-07-12T03:41:06Z)
- Offline Reinforcement Learning from Images with Latent Space Models [60.69745540036375]
Offline reinforcement learning (RL) refers to the problem of learning policies from a static dataset of environment interactions.
We build on recent advances in model-based algorithms for offline RL, and extend them to high-dimensional visual observation spaces.
Our approach is both tractable in practice and corresponds to maximizing a lower bound of the ELBO in the unknown POMDP.
arXiv Detail & Related papers (2020-12-21T18:28:17Z)
- Critic Regularized Regression [70.8487887738354]
We propose a novel offline RL algorithm to learn policies from data using a form of critic-regularized regression (CRR); a minimal sketch of the weighted-regression idea appears after this list.
We find that CRR performs surprisingly well and scales to tasks with high-dimensional state and action spaces.
arXiv Detail & Related papers (2020-06-26T17:50:26Z)
- RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning [108.9599280270704]
We propose a benchmark called RL Unplugged to evaluate and compare offline RL methods.
RL Unplugged includes data from a diverse range of domains including games and simulated motor control problems.
We will release data for all our tasks and open-source all algorithms presented in this paper.
arXiv Detail & Related papers (2020-06-24T17:14:51Z)
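As referenced in the Critic Regularized Regression entry above, the following is a hedged sketch of the core CRR idea (the exponential-advantage variant), not the authors' code: the policy is trained by regression on dataset actions, weighted by a filter computed from a learned critic. The policy is assumed to return a torch distribution and the critic Q(s, a) a per-sample value; all names are illustrative.

```python
# Hedged sketch of critic-regularized regression (exp variant), illustrative only.
import torch

def crr_actor_loss(policy, q_net, states, actions, beta=1.0, n_action_samples=4):
    """Advantage-weighted log-likelihood of dataset actions."""
    dist = policy(states)                       # assumed to return a torch.distributions object
    # Estimate the state value by averaging Q over actions sampled from the policy.
    sampled = dist.sample((n_action_samples,))  # [n, batch, action_dim]
    q_samples = torch.stack([q_net(states, a) for a in sampled])
    value = q_samples.mean(dim=0)
    advantage = q_net(states, actions) - value
    weight = torch.clamp(torch.exp(advantage / beta), max=20.0)  # clipped exp filter
    return -(weight.detach() * dist.log_prob(actions)).mean()
```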