Towards Continual Reinforcement Learning: A Review and Perspectives
- URL: http://arxiv.org/abs/2012.13490v1
- Date: Fri, 25 Dec 2020 02:35:27 GMT
- Title: Towards Continual Reinforcement Learning: A Review and Perspectives
- Authors: Khimya Khetarpal, Matthew Riemer, Irina Rish, Doina Precup
- Abstract summary: We aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL).
While still in its early days, the study of continual RL has the promise to develop better incremental reinforcement learners.
Target applications include healthcare, education, logistics, and robotics.
- Score: 69.48324517535549
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this article, we aim to provide a literature review of different
formulations and approaches to continual reinforcement learning (RL), also
known as lifelong or non-stationary RL. We begin by discussing our perspective
on why RL is a natural fit for studying continual learning. We then provide a
taxonomy of different continual RL formulations and mathematically characterize
the non-stationary dynamics of each setting. We go on to discuss evaluation of
continual RL agents, providing an overview of benchmarks used in the literature
and important metrics for understanding agent performance. Finally, we
highlight open problems and challenges in bridging the gap between the current
state of continual RL and findings in neuroscience. While still in its early
days, the study of continual RL has the promise to develop better incremental
reinforcement learners that can function in increasingly realistic applications
where non-stationarity plays a vital role. These include applications such as
those in the fields of healthcare, education, logistics, and robotics.
Related papers
- Towards Sample-Efficiency and Generalization of Transfer and Inverse Reinforcement Learning: A Comprehensive Literature Review [50.67937325077047]
This paper is devoted to a comprehensive review of realizing the sample efficiency and generalization of RL algorithms through transfer and inverse reinforcement learning (T-IRL).
Our findings denote that a majority of recent research works have dealt with the aforementioned challenges by utilizing human-in-the-loop and sim-to-real strategies.
Under the IRL structure, training schemes that require a low number of experience transitions and extension of such frameworks to multi-agent and multi-intention problems have been the priority of researchers in recent years.
arXiv Detail & Related papers (2024-11-15T15:18:57Z)
- Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination [7.162274565861427]
Offline reinforcement learning in dynamic treatment regimes presents a mix of unprecedented opportunities and challenges.
We argue for a reassessment of applying RL in dynamic treatment regimes citing concerns such as inconsistent and potentially inconclusive evaluation metrics.
We demonstrate that the performance of RL algorithms can significantly vary with changes in evaluation metrics and Markov Decision Process (MDP) formulations.
arXiv Detail & Related papers (2024-05-28T20:03:18Z)
- Using Think-Aloud Data to Understand Relations between Self-Regulation Cycle Characteristics and Student Performance in Intelligent Tutoring Systems [15.239133633467672]
The present study investigates SRL behaviors in relationship to learners' moment-by-moment performance.
We demonstrate the feasibility of labeling SRL behaviors based on AI-generated think-aloud transcripts.
Students' actions during earlier, process-heavy stages of SRL cycles exhibited lower moment-by-moment correctness during problem-solving than later SRL cycle stages.
arXiv Detail & Related papers (2023-12-09T20:36:58Z)
- A Survey on Causal Reinforcement Learning [41.645270300009436]
We offer a review of Causal Reinforcement Learning (CRL) works and methods, and investigate what causality can contribute to RL.
In particular, we divide existing CRL approaches into two categories according to whether their causality-based information is given in advance or not.
We analyze each category in terms of the formalization of different models, including the Markov Decision Process (MDP), Partially Observed Markov Decision Process (POMDP), Multi-Arm Bandits (MAB), and Dynamic Treatment Regime (DTR).
arXiv Detail & Related papers (2023-02-10T12:25:08Z)
- A Comprehensive Survey of Continual Learning: Theory, Method and Application [64.23253420555989]
We present a comprehensive survey of continual learning, seeking to bridge the basic settings, theoretical foundations, representative methods, and practical applications.
We summarize the general objectives of continual learning as ensuring a proper stability-plasticity trade-off and an adequate intra/inter-task generalizability in the context of resource efficiency.
arXiv Detail & Related papers (2023-01-31T11:34:56Z)
- Survey on Fair Reinforcement Learning: Theory and Practice [9.783469272270896]
We provide an extensive overview of fairness approaches that have been implemented via a reinforcement learning (RL) framework.
We discuss various practical applications in which RL methods have been applied to achieve a fair solution with high accuracy.
We highlight a few major issues to explore in order to advance the field of fair-RL.
arXiv Detail & Related papers (2022-05-20T09:07:28Z)
- Autonomous Reinforcement Learning: Formalism and Benchmarking [106.25788536376007]
Real-world embodied learning, such as that performed by humans and animals, is situated in a continual, non-episodic world.
Common benchmark tasks in RL are episodic, with the environment resetting between trials to provide the agent with multiple attempts.
This discrepancy presents a major challenge when attempting to take RL algorithms developed for episodic simulated environments and run them on real-world platforms.
arXiv Detail & Related papers (2021-12-17T16:28:06Z)
- Continual World: A Robotic Benchmark For Continual Reinforcement Learning [17.77261981963946]
We argue that understanding the right trade-off is conceptually and computationally challenging.
We propose a benchmark consisting of realistic and meaningfully diverse robotic tasks built on top of Meta-World as a testbed.
arXiv Detail & Related papers (2021-05-23T11:33:04Z)
- RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning [108.9599280270704]
We propose a benchmark called RL Unplugged to evaluate and compare offline RL methods.
RL Unplugged includes data from a diverse range of domains including games and simulated motor control problems.
We will release data for all our tasks and open-source all algorithms presented in this paper.
arXiv Detail & Related papers (2020-06-24T17:14:51Z)
- Transient Non-Stationarity and Generalisation in Deep Reinforcement Learning [67.34810824996887]
Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments.
We propose Iterated Relearning (ITER) to improve generalisation of deep RL agents.
arXiv Detail & Related papers (2020-06-10T13:26:31Z)