A Review of Off-Policy Evaluation in Reinforcement Learning
- URL: http://arxiv.org/abs/2212.06355v1
- Date: Tue, 13 Dec 2022 03:38:57 GMT
- Title: A Review of Off-Policy Evaluation in Reinforcement Learning
- Authors: Masatoshi Uehara, Chengchun Shi, Nathan Kallus
- Abstract summary: We primarily focus on off-policy evaluation (OPE), one of the most fundamental topics in reinforcement learning.
We provide a discussion on the efficiency bound of OPE, some of the existing state-of-the-art OPE methods, their statistical properties and some other related research directions.
- Score: 72.82459524257446
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) is one of the most vibrant research frontiers in
machine learning and has been recently applied to solve a number of challenging
problems. In this paper, we primarily focus on off-policy evaluation (OPE), one
of the most fundamental topics in RL. In recent years, a number of OPE methods
have been developed in the statistics and computer science literature. We
provide a discussion on the efficiency bound of OPE, some of the existing
state-of-the-art OPE methods, their statistical properties and some other
related research directions that are currently actively explored.
Related papers
- Towards Sample-Efficiency and Generalization of Transfer and Inverse Reinforcement Learning: A Comprehensive Literature Review [50.67937325077047]
This paper is devoted to a comprehensive review of realizing the sample efficiency and generalization of RL algorithms through transfer and inverse reinforcement learning (T-IRL).
Our findings indicate that a majority of recent research works have dealt with the aforementioned challenges by utilizing human-in-the-loop and sim-to-real strategies.
Under the IRL structure, training schemes that require a low number of experience transitions and extension of such frameworks to multi-agent and multi-intention problems have been the priority of researchers in recent years.
arXiv Detail & Related papers (2024-11-15T15:18:57Z) - Causal Deepsets for Off-policy Evaluation under Spatial or Spatio-temporal Interferences [24.361550505778155]
Off-policy evaluation (OPE) is widely applied in sectors such as pharmaceuticals and e-commerce.
This paper introduces a causal deepset framework that relaxes several key structural assumptions.
We present novel algorithms that incorporate the permutation invariance (PI) assumption into OPE and thoroughly examine their theoretical foundations.
arXiv Detail & Related papers (2024-07-25T10:02:11Z) - A Survey on Few-Shot Class-Incremental Learning [11.68962265057818]
Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks to learn new tasks.
This paper provides a comprehensive survey on FSCIL.
FSCIL has achieved impressive results in various fields of computer vision.
arXiv Detail & Related papers (2023-04-17T10:15:08Z) - Knowledge-enhanced Neural Machine Reasoning: A Review [67.51157900655207]
We introduce a novel taxonomy that categorizes existing knowledge-enhanced methods into two primary categories and four subcategories.
We elucidate the current application domains and provide insight into promising prospects for future research.
arXiv Detail & Related papers (2023-02-04T04:54:30Z) - An Investigation of Replay-based Approaches for Continual Learning [79.0660895390689]
Continual learning (CL) is a major challenge of machine learning (ML) and describes the ability to learn several tasks sequentially without catastrophic forgetting (CF).
Several solution classes have been proposed, of which so-called replay-based approaches seem very promising due to their simplicity and robustness.
We empirically investigate replay-based approaches of continual learning and assess their potential for applications.
arXiv Detail & Related papers (2021-08-15T15:05:02Z) - Distributed Deep Reinforcement Learning: An Overview [0.0]
In this article, we provide a survey of the role of the distributed approaches in DRL.
We overview the state of the field, by studying the key research works that have a significant impact on how we can use distributed methods in DRL.
Also, we evaluate these methods on different tasks and compare their performance with each other and with single actor and learner agents.
arXiv Detail & Related papers (2020-11-22T13:24:35Z) - What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study [50.79125250286453]
On-policy reinforcement learning (RL) has been successfully applied to many different continuous control tasks.
But state-of-the-art implementations take numerous low- and high-level design decisions that strongly affect the performance of the resulting agents.
These choices are usually not extensively discussed in the literature, leading to discrepancy between published descriptions of algorithms and their implementations.
We implement >50 such "choices" in a unified on-policy RL framework, allowing us to investigate their impact in a large-scale empirical study.
arXiv Detail & Related papers (2020-06-10T17:59:03Z) - Reinforcement Learning via Fenchel-Rockafellar Duality [97.86417365464068]
We review basic concepts of convex duality, focusing on the very general and supremely useful Fenchel-Rockafellar duality.
We summarize how this duality may be applied to a variety of reinforcement learning settings, including policy evaluation or optimization, online or offline learning, and discounted or undiscounted rewards.
arXiv Detail & Related papers (2020-01-07T02:59:59Z)