ShinRL: A Library for Evaluating RL Algorithms from Theoretical and
Practical Perspectives
- URL: http://arxiv.org/abs/2112.04123v1
- Date: Wed, 8 Dec 2021 05:34:46 GMT
- Title: ShinRL: A Library for Evaluating RL Algorithms from Theoretical and
Practical Perspectives
- Authors: Toshinori Kitamura, Ryo Yonetani
- Abstract summary: We present ShinRL, an open-source library for evaluation of reinforcement learning (RL) algorithms.
ShinRL provides an RL environment interface that can compute metrics for delving into the behaviors of RL algorithms.
We show how combining these two features of ShinRL makes it easier to analyze the behavior of deep Q learning.
- Score: 11.675763847424786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present ShinRL, an open-source library specialized for the evaluation of
reinforcement learning (RL) algorithms from both theoretical and practical
perspectives. Existing RL libraries typically allow users to evaluate practical
performances of deep RL algorithms through returns. Nevertheless, these
libraries are not necessarily useful for analyzing if the algorithms perform as
theoretically expected, such as if Q learning really achieves the optimal Q
function. In contrast, ShinRL provides an RL environment interface that can
compute metrics for delving into the behaviors of RL algorithms, such as the
gap between the learned and optimal Q values and state visitation frequencies.
In addition, we introduce a flexible solver interface for evaluating both
theoretically justified algorithms (e.g., dynamic programming and tabular RL)
and practically effective ones (i.e., deep RL, typically with some additional
extensions and regularizations) in a consistent fashion. As a case study, we
show how combining these two features of ShinRL makes it easier to analyze
the behavior of deep Q learning. Furthermore, we demonstrate that ShinRL can be
used to empirically validate recent theoretical findings such as the effect of
KL regularization for value iteration and for deep Q learning, and the
robustness of entropy-regularized policies to adversarial rewards. The source
code for ShinRL is available on GitHub: https://github.com/omron-sinicx/ShinRL.
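The two metrics named in the abstract (the gap between learned and optimal Q values, and state visitation frequencies) can be illustrated on a small tabular problem. The following is a minimal NumPy sketch under stated assumptions, not ShinRL's actual API: the function names `value_iteration` and `q_learning`, the random 5-state MDP, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def value_iteration(P, R, gamma, tol=1e-10):
    """Dynamic programming: return the optimal Q table of shape (S, A).

    P has shape (S, A, S) with transition probabilities, R has shape (S, A).
    """
    Q = np.zeros_like(R)
    while True:
        V = Q.max(axis=1)              # greedy state values, shape (S,)
        Q_new = R + gamma * (P @ V)    # Bellman optimality backup, shape (S, A)
        if np.abs(Q_new - Q).max() < tol:
            return Q_new
        Q = Q_new


def q_learning(P, R, gamma, steps, lr=0.1, eps=0.2, seed=0):
    """Tabular epsilon-greedy Q learning on sampled transitions.

    Returns the learned Q table and the empirical state visitation frequencies.
    """
    rng = np.random.default_rng(seed)
    S, A = R.shape
    Q = np.zeros((S, A))
    visits = np.zeros(S)
    s = 0
    for _ in range(steps):
        visits[s] += 1
        # Explore with probability eps, otherwise act greedily.
        a = int(rng.integers(A)) if rng.random() < eps else int(Q[s].argmax())
        s_next = int(rng.choice(S, p=P[s, a]))
        td_target = R[s, a] + gamma * Q[s_next].max()
        Q[s, a] += lr * (td_target - Q[s, a])
        s = s_next
    return Q, visits / visits.sum()


# Hypothetical random MDP: 5 states, 2 actions (illustrative only).
rng = np.random.default_rng(42)
S, A, gamma = 5, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))  # each P[s, a] sums to 1
R = rng.uniform(0.0, 1.0, size=(S, A))

Q_star = value_iteration(P, R, gamma)
Q_hat, freq = q_learning(P, R, gamma, steps=20000)

# Metric 1: worst-case gap between learned and optimal Q values.
gap = np.abs(Q_hat - Q_star).max()
print("max |Q_hat - Q*|:", gap)
# Metric 2: how often each state was visited during learning.
print("state visitation frequencies:", freq)
```

Because the transition matrix and rewards are known here, the optimal Q table can be computed exactly and used as a reference, which is the kind of comparison that return curves alone cannot provide.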
Related papers
- Is Value Learning Really the Main Bottleneck in Offline RL? [70.54708989409409]
We show that the choice of a policy extraction algorithm significantly affects the performance and scalability of offline RL.
We propose two simple test-time policy improvement methods and show that these methods lead to better performance.
arXiv Detail & Related papers (2024-06-13T17:07:49Z)
- How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback.
Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities.
We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
- Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning [41.971465819626005]
We present Open RL Benchmark, a set of fully tracked RL experiments.
Open RL Benchmark is community-driven: anyone can download, use, and contribute to the data.
Special care is taken to ensure that each experiment is precisely reproducible.
arXiv Detail & Related papers (2024-02-05T14:32:00Z)
- The Effective Horizon Explains Deep RL Performance in Stochastic Environments [21.148001945560075]
Reinforcement learning (RL) theory has largely focused on proving minimax sample complexity bounds.
We introduce a new RL algorithm, SQIRL, that iteratively learns a near-optimal policy by exploring randomly to collect rollouts.
We leverage SQIRL to derive instance-dependent sample complexity bounds for RL that are exponential only in an "effective horizon" look-ahead and on the complexity of the class used for approximation.
arXiv Detail & Related papers (2023-12-13T18:58:56Z)
- SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores [13.948640763797776]
We present a novel abstraction on the dataflows of RL training, which unifies diverse RL training applications into a general framework.
We develop a scalable, efficient, and distributed RL system called ReaLly Scalable RL (SRL), which allows efficient and massively parallelized training.
SRL is the first in the academic community to perform RL experiments at a large scale with over 15k CPU cores.
arXiv Detail & Related papers (2023-06-29T05:16:25Z)
- RL$^3$: Boosting Meta Reinforcement Learning via RL inside RL$^2$ [12.111848705677142]
We propose RL$^3$, a hybrid approach that incorporates action-values, learned per task through traditional RL, in the inputs to meta-RL.
We show that RL$^3$ earns greater cumulative reward in the long term, compared to RL$^2$, while maintaining data-efficiency in the short term, and generalizes better to out-of-distribution tasks.
arXiv Detail & Related papers (2023-06-28T04:16:16Z)
- LCRL: Certified Policy Synthesis via Logically-Constrained Reinforcement Learning [78.2286146954051]
LCRL implements model-free Reinforcement Learning (RL) algorithms over unknown Markov Decision Processes (MDPs).
We present case studies to demonstrate the applicability, ease of use, scalability, and performance of LCRL.
arXiv Detail & Related papers (2022-09-21T13:21:00Z)
- Contrastive Learning as Goal-Conditioned Reinforcement Learning [147.28638631734486]
In reinforcement learning (RL), it is easier to solve a task if given a good representation.
While deep RL should automatically acquire such good representations, prior work often finds that learning representations in an end-to-end fashion is unstable.
We show (contrastive) representation learning methods can be cast as RL algorithms in their own right.
arXiv Detail & Related papers (2022-06-15T14:34:15Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- RL-DARTS: Differentiable Architecture Search for Reinforcement Learning [62.95469460505922]
We introduce RL-DARTS, one of the first applications of Differentiable Architecture Search (DARTS) in reinforcement learning (RL).
By replacing the image encoder with a DARTS supernet, our search method is sample-efficient, requires minimal extra compute resources, and is also compatible with off-policy and on-policy RL algorithms, needing only minor changes in preexisting code.
We show that the supernet gradually learns better cells, leading to alternative architectures which can be highly competitive against manually designed policies, but also verify previous design choices for RL policies.
arXiv Detail & Related papers (2021-06-04T03:08:43Z)