Interpretable performance analysis towards offline reinforcement
learning: A dataset perspective
- URL: http://arxiv.org/abs/2105.05473v1
- Date: Wed, 12 May 2021 07:17:06 GMT
- Title: Interpretable performance analysis towards offline reinforcement
learning: A dataset perspective
- Authors: Chenyang Xi, Bo Tang, Jiajun Shen, Xinfu Liu, Feiyu Xiong, Xueying Li
- Abstract summary: We propose a two-fold taxonomy for existing offline RL algorithms.
We explore the correlation between the performance of different types of algorithms and the distribution of actions under states.
We create a benchmark platform on the Atari domain, entitled RL easy go (RLEG), at an estimated cost of more than 0.3 million dollars.
- Score: 6.526790418943535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Offline reinforcement learning (RL) has increasingly become a focus of
artificial intelligence research due to its wide range of real-world applications where
the collection of data may be difficult, time-consuming, or costly. In this
paper, we first propose a two-fold taxonomy for existing offline RL algorithms
from the perspective of exploration and exploitation tendency. Secondly, we
derive the explicit expression of the upper bound of extrapolation error and
explore the correlation between the performance of different types of
algorithms and the distribution of actions under states. Specifically, we relax
the strict assumption on the sufficiently large amount of state-action tuples.
Accordingly, we provably explain why batch constrained Q-learning (BCQ)
performs better than other existing techniques. Thirdly, after identifying the
weakness of BCQ on datasets with low mean episode returns, we propose a modified
variant based on a top-return selection mechanism, which is proved to
achieve state-of-the-art performance on various datasets. Lastly, we create a
benchmark platform on the Atari domain, entitled RL easy go (RLEG), at an
estimated cost of more than 0.3 million dollars. We make it open-source for
fair and comprehensive competitions between offline RL algorithms with complete
datasets and checkpoints being provided.
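
The abstract does not spell out how the top return selection mechanism works, so the following is only a minimal sketch of one plausible reading: rank the logged episodes by their undiscounted return and keep the best fraction before training an offline learner such as BCQ. The transition layout, the function names, and the `top_fraction` parameter are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical transition record: (state, action, reward, next_state, done).
Transition = Tuple[object, int, float, object, bool]
Episode = List[Transition]


def episode_return(episode: Sequence[Transition]) -> float:
    """Undiscounted sum of the rewards in one logged episode."""
    return sum(transition[2] for transition in episode)


def select_top_return_episodes(
    episodes: Sequence[Episode], top_fraction: float = 0.2
) -> List[Episode]:
    """Keep the highest-return episodes of an offline dataset.

    top_fraction is an assumed hyperparameter in (0, 1]; at least one
    episode is kept whenever the dataset is non-empty.
    """
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must lie in (0, 1]")
    if not episodes:
        return []
    # Sort from best to worst return, then truncate.
    ranked = sorted(episodes, key=episode_return, reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    return ranked[:keep]
```

The filtered episodes would then replace the full dataset as the training buffer. Filtering at the episode level rather than the transition level keeps trajectories intact, which is a design choice of this sketch rather than something the abstract states.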
Related papers
- D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861]
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
arXiv Detail & Related papers (2024-08-15T22:27:00Z)
- Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage [32.578787778183546]
Offline reinforcement learning (RL) algorithms learn optimal policies using historical (offline) data.
One of the main challenges in offline RL is the distribution shift.
We propose two offline RL algorithms using the distributionally robust learning (DRL) framework.
arXiv Detail & Related papers (2023-10-27T19:19:30Z)
- Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets [53.8218145723718]
Offline policy learning aims to learn decision-making policies from existing datasets of trajectories without collecting additional data.
We argue that when a dataset is dominated by suboptimal trajectories, state-of-the-art offline RL algorithms do not substantially improve over the average return of trajectories in the dataset.
We present a realization of the sampling strategy and an algorithm that can be used as a plug-and-play module in standard offline RL algorithms.
arXiv Detail & Related papers (2023-10-06T17:58:14Z)
- Improving and Benchmarking Offline Reinforcement Learning Algorithms [87.67996706673674]
This work aims to bridge the gaps caused by low-level choices and datasets.
We empirically investigate 20 implementation choices using three representative algorithms.
We find that two variants, CRR+ and CQL+, achieve a new state of the art on D4RL.
arXiv Detail & Related papers (2023-06-01T17:58:46Z)
- Offline Equilibrium Finding [40.08360411502593]
We aim to generalize Offline RL to a multi-agent or multiplayer-game setting.
Very little research has been done in this area, as the progress is hindered by the lack of standardized datasets and meaningful benchmarks.
Our two model-based algorithms -- OEF-PSRO and OEF-CFR -- are adaptations of the widely-used equilibrium finding algorithms Deep CFR and PSRO in the context of offline learning.
arXiv Detail & Related papers (2022-07-12T03:41:06Z)
- When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? [86.43517734716606]
Offline reinforcement learning (RL) algorithms can acquire effective policies by utilizing previously collected experience, without any online interaction.
Behavioral cloning (BC) algorithms mimic a subset of the dataset via supervised learning.
We show that policies trained on sufficiently noisy suboptimal data can attain better performance than even BC algorithms with expert data.
arXiv Detail & Related papers (2022-04-12T08:25:34Z)
- A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems [0.0]
Reinforcement learning (RL) has experienced a dramatic increase in popularity.
There is still a wide range of domains inaccessible to RL due to the high cost and danger of interacting with the environment.
Offline RL is a paradigm that learns exclusively from static datasets of previously collected interactions.
arXiv Detail & Related papers (2022-03-02T20:05:11Z)
- Continuous Doubly Constrained Batch Reinforcement Learning [93.23842221189658]
We propose an algorithm for batch RL, where effective policies are learned using only a fixed offline dataset instead of online interactions with the environment.
The limited data in batch RL produces inherent uncertainty in value estimates of states/actions that were insufficiently represented in the training data.
We propose to mitigate this issue via two straightforward penalties: a policy-constraint to reduce this divergence and a value-constraint that discourages overly optimistic estimates.
arXiv Detail & Related papers (2021-02-18T08:54:14Z)
- RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning [108.9599280270704]
We propose a benchmark called RL Unplugged to evaluate and compare offline RL methods.
RL Unplugged includes data from a diverse range of domains including games and simulated motor control problems.
We will release data for all our tasks and open-source all algorithms presented in this paper.
arXiv Detail & Related papers (2020-06-24T17:14:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.