CARL: A Benchmark for Contextual and Adaptive Reinforcement Learning
- URL: http://arxiv.org/abs/2110.02102v1
- Date: Tue, 5 Oct 2021 15:04:01 GMT
- Title: CARL: A Benchmark for Contextual and Adaptive Reinforcement Learning
- Authors: Carolin Benjamins, Theresa Eimer, Frederik Schubert, André Biedenkapp, Bodo Rosenhahn, Frank Hutter, Marius Lindauer
- Abstract summary: We present CARL, a collection of well-known RL environments extended to contextual RL problems.
We provide first evidence that disentangling representation learning of the states from the policy learning with the context facilitates better generalization.
- Score: 45.52724876199729
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While Reinforcement Learning has made great strides towards solving ever more
complicated tasks, many algorithms are still brittle to even slight changes in
their environment. This is a limiting factor for real-world applications of RL.
Although the research community continuously aims at improving both robustness
and generalization of RL algorithms, unfortunately it still lacks an
open-source set of well-defined benchmark problems based on a consistent
theoretical framework, which allows comparing different approaches in a fair,
reliable and reproducible way. To fill this gap, we propose CARL, a collection
of well-known RL environments extended to contextual RL problems to study
generalization. We show the urgent need of such benchmarks by demonstrating
that even simple toy environments become challenging for commonly used
approaches if different contextual instances of this task have to be
considered. Furthermore, CARL allows us to provide first evidence that
disentangling representation learning of the states from the policy learning
with the context facilitates better generalization. By providing variations of
diverse benchmarks from classic control, physical simulations, games and a
real-world application of RNA design, CARL will allow the community to derive
many more such insights on a solid empirical foundation.
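The idea of extending an environment to a contextual RL problem can be illustrated with a minimal sketch. The code below is not CARL's API; the names `Context`, `ContextualPointMass`, `hide_context`, and `rollout` are hypothetical, and the toy dynamics are an assumption made only for illustration. It shows two things the abstract describes: a context (here gravity and mass) that is fixed per episode and changes the dynamics, and the option to either expose that context to the agent or hide it.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical context: physical parameters held fixed for one episode."""
    gravity: float
    mass: float


class ContextualPointMass:
    """Toy 1-D environment whose dynamics depend on a context.

    The agent pushes a point mass upward against gravity and is rewarded
    for staying near height 1.0. Illustrative stand-in, not a CARL environment.
    """

    def __init__(self, contexts, hide_context=False, seed=0, dt=0.05, horizon=50):
        self.contexts = list(contexts)
        self.hide_context = hide_context
        self.rng = random.Random(seed)
        self.dt = dt
        self.horizon = horizon
        self.context = self.contexts[0]
        self.height, self.velocity, self.t = 0.0, 0.0, 0

    def _observe(self):
        state = [self.height, self.velocity]
        if self.hide_context:
            return state
        # Context-visible variant: append the context features to the state.
        return state + [self.context.gravity, self.context.mass]

    def reset(self):
        # A context instance is drawn at the start of every episode.
        self.context = self.rng.choice(self.contexts)
        self.height, self.velocity, self.t = 0.0, 0.0, 0
        return self._observe()

    def step(self, force):
        # The same action has a different effect under a different context.
        acceleration = force / self.context.mass - self.context.gravity
        self.velocity += acceleration * self.dt
        self.height = max(0.0, self.height + self.velocity * self.dt)
        if self.height == 0.0:
            # Resting on the floor: the mass cannot keep falling.
            self.velocity = max(0.0, self.velocity)
        self.t += 1
        reward = -abs(self.height - 1.0)
        done = self.t >= self.horizon
        return self._observe(), reward, done


def rollout(env, policy):
    """Run one episode and return the undiscounted return."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total


if __name__ == "__main__":
    earth = Context(gravity=9.8, mass=1.0)
    moon = Context(gravity=1.6, mass=1.0)
    constant_push = lambda obs: 10.0  # a fixed policy that ignores the context
    print("earth:", rollout(ContextualPointMass([earth]), constant_push))
    print("moon: ", rollout(ContextualPointMass([moon]), constant_push))
```

Running the same fixed policy under the two contexts yields different returns, which is the kind of brittleness to context changes that the abstract refers to. Training on one set of contexts and evaluating on a held-out set would be one way to probe generalization in such a setup.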
Related papers
- Zero-Sum Positional Differential Games as a Framework for Robust Reinforcement Learning: Deep Q-Learning Approach [2.3020018305241337]
This paper is the first to propose considering the RRL problems within the positional differential game theory.
Namely, we prove that under Isaacs's condition, the same Q-function can be utilized as an approximate solution of both minimax and maximin Bellman equations.
We present the Isaacs Deep Q-Network algorithms and demonstrate their superiority compared to other baseline RRL and Multi-Agent RL algorithms in various environments.
arXiv Detail & Related papers (2024-05-03T12:21:43Z)
- How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback.
Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities.
We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
- Towards an Information Theoretic Framework of Context-Based Offline Meta-Reinforcement Learning [50.976910714839065]
Context-based OMRL (COMRL), a popular paradigm, aims to learn a universal policy conditioned on effective task representations.
We show that COMRL algorithms are essentially optimizing the same mutual information objective between the task variable $\boldsymbol{M}$ and its latent representation $\boldsymbol{Z}$ by implementing various approximate bounds.
Based on the theoretical insight and the information bottleneck principle, we arrive at a novel algorithm dubbed UNICORN, which exhibits remarkable generalization across a broad spectrum of RL benchmarks.
arXiv Detail & Related papers (2024-02-04T09:58:42Z)
- Blending Imitation and Reinforcement Learning for Robust Policy Improvement [16.588397203235296]
Imitation learning (IL) utilizes oracles to improve sample efficiency.
RPI draws on the strengths of IL, using oracle queries to facilitate exploration.
RPI is capable of learning from and improving upon a diverse set of black-box oracles.
arXiv Detail & Related papers (2023-10-03T01:55:54Z)
- ContainerGym: A Real-World Reinforcement Learning Benchmark for Resource Allocation [1.6058099298620425]
ContainerGym is a benchmark for reinforcement learning inspired by a real-world industrial resource allocation task.
The proposed benchmark encodes challenges commonly encountered in real-world sequential decision making problems.
It can be configured to instantiate problems of varying degrees of difficulty.
arXiv Detail & Related papers (2023-07-06T13:44:29Z)
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
- Contextualize Me -- The Case for Context in Reinforcement Learning [49.794253971446416]
Contextual Reinforcement Learning (cRL) provides a framework to model changes in the environment in a principled manner.
We show how cRL contributes to improving zero-shot generalization in RL through meaningful benchmarks and structured reasoning about generalization tasks.
arXiv Detail & Related papers (2022-02-09T15:01:59Z)
- Autonomous Reinforcement Learning: Formalism and Benchmarking [106.25788536376007]
Real-world embodied learning, such as that performed by humans and animals, is situated in a continual, non-episodic world.
Common benchmark tasks in RL are episodic, with the environment resetting between trials to provide the agent with multiple attempts.
This discrepancy presents a major challenge when attempting to take RL algorithms developed for episodic simulated environments and run them on real-world platforms.
arXiv Detail & Related papers (2021-12-17T16:28:06Z)
- Incorporating Relational Background Knowledge into Reinforcement Learning via Differentiable Inductive Logic Programming [8.122270502556374]
We propose a novel deep Relational Reinforcement Learning (RRL) framework based on differentiable Inductive Logic Programming (ILP).
We show the efficacy of this novel RRL framework using environments such as BoxWorld, GridWorld as well as relational reasoning for the Sort-of-CLEVR dataset.
arXiv Detail & Related papers (2020-03-23T16:56:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.