A Survey on Causal Reinforcement Learning
- URL: http://arxiv.org/abs/2302.05209v3
- Date: Thu, 1 Jun 2023 13:43:50 GMT
- Title: A Survey on Causal Reinforcement Learning
- Authors: Yan Zeng, Ruichu Cai, Fuchun Sun, Libo Huang, Zhifeng Hao
- Abstract summary: We collate Causal Reinforcement Learning (CRL) works, offer a review of CRL methods, and investigate the potential functionality from causality toward RL.
In particular, we divide existing CRL approaches into two categories according to whether their causality-based information is given in advance or not.
We analyze each category in terms of the formalization of different models, ranging over the Markov Decision Process (MDP), the Partially Observed Markov Decision Process (POMDP), Multi-Armed Bandits (MAB), and the Dynamic Treatment Regime (DTR).
- Score: 41.645270300009436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While Reinforcement Learning (RL) achieves tremendous success in sequential
decision-making problems of many domains, it still faces key challenges of data
inefficiency and the lack of interpretability. Interestingly, many researchers
have recently leveraged insights from the causality literature, bringing forth
a flourishing body of work that unifies the merits of causality with RL and
addresses these challenges well. As such, it is of great necessity and
significance to collate these Causal Reinforcement Learning (CRL) works, offer
a review of CRL methods, and investigate the potential functionality from
causality toward RL.
In particular, we divide existing CRL approaches into two categories according
to whether their causality-based information is given in advance or not. We
further analyze each category in terms of the formalization of different
models, ranging over the Markov Decision Process (MDP), the Partially Observed
Markov Decision Process (POMDP), Multi-Armed Bandits (MAB), and the Dynamic
Treatment Regime (DTR). Moreover, we summarize the evaluation metrics and
open-source resources, and discuss emerging applications along with promising
prospects for the future development of CRL.
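As a rough illustration of the MDP formalism that the surveyed models build on, the sketch below runs value iteration on a finite MDP given as a tuple (S, A, P, R, gamma). The code and the two-state toy chain are illustrative assumptions of this summary, not material from the survey.

```python
from dataclasses import dataclass

@dataclass
class MDP:
    """A finite MDP (S, A, P, R, gamma); names here are illustrative."""
    n_states: int
    n_actions: int
    P: list      # P[s][a] -> list of (probability, next_state) pairs
    R: list      # R[s][a] -> immediate reward
    gamma: float # discount factor in [0, 1)

def value_iteration(mdp: MDP, tol: float = 1e-8) -> list:
    """Compute optimal state values by repeated Bellman backups."""
    V = [0.0] * mdp.n_states
    while True:
        delta = 0.0
        for s in range(mdp.n_states):
            best = max(
                mdp.R[s][a] + mdp.gamma * sum(p * V[s2] for p, s2 in mdp.P[s][a])
                for a in range(mdp.n_actions)
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

# Two-state toy chain: action 0 stays put, action 1 moves to the other state.
# Reward 1 for staying in state 1 or moving from state 0 to state 1.
toy = MDP(
    n_states=2, n_actions=2,
    P=[[[(1.0, 0)], [(1.0, 1)]], [[(1.0, 1)], [(1.0, 0)]]],
    R=[[0.0, 1.0], [1.0, 0.0]],
    gamma=0.9,
)
print([round(v, 2) for v in value_iteration(toy)])  # -> [10.0, 10.0]
```

The same interface extends naturally to the other models the survey covers: a MAB is an MDP with one state, and a POMDP replaces direct state access with an observation distribution.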
Related papers
- Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination [7.162274565861427]
Offline reinforcement learning in dynamic treatment regimes presents a mix of unprecedented opportunities and challenges.
We argue for a reassessment of applying RL in dynamic treatment regimes citing concerns such as inconsistent and potentially inconclusive evaluation metrics.
We demonstrate that the performance of RL algorithms can significantly vary with changes in evaluation metrics and Markov Decision Process (MDP) formulations.
arXiv Detail & Related papers (2024-05-28T20:03:18Z) - Evolutionary Reinforcement Learning: A Survey [31.112066295496003]
Reinforcement learning (RL) is a machine learning approach that trains agents to maximize cumulative rewards through interactions with environments.
This article presents a comprehensive survey of state-of-the-art methods for integrating evolutionary computation (EC) into RL, referred to as evolutionary reinforcement learning (EvoRL).
arXiv Detail & Related papers (2023-03-07T01:38:42Z) - A Survey on Causal Representation Learning and Future Work for Medical Image Analysis [0.0]
Causal Representation Learning has recently been a promising direction to address the causal relationship problem in vision understanding.
This survey presents recent advances in CRL in vision.
arXiv Detail & Related papers (2022-10-28T10:15:36Z) - Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes [93.61202366677526]
We study the offline reinforcement learning (RL) in the face of unmeasured confounders.
We propose various policy learning methods with finite-sample suboptimality guarantees for finding the optimal in-class policy.
arXiv Detail & Related papers (2022-09-18T22:03:55Z) - Pessimistic Model Selection for Offline Deep Reinforcement Learning [56.282483586473816]
Deep Reinforcement Learning (DRL) has demonstrated great potential in solving sequential decision-making problems in many applications.
One main barrier is the over-fitting issue that leads to poor generalizability of the policy learned by DRL.
We propose a pessimistic model selection (PMS) approach for offline DRL with a theoretical guarantee.
arXiv Detail & Related papers (2021-11-29T06:29:49Z) - Causal Inference Q-Network: Toward Resilient Reinforcement Learning [57.96312207429202]
We consider a resilient DRL framework with observational interferences.
Under this framework, we propose a causal inference based DRL algorithm called the causal inference Q-network (CIQ).
Our experimental results show that the proposed CIQ method achieves higher performance and greater resilience against observational interferences.
arXiv Detail & Related papers (2021-02-18T23:50:20Z) - Towards Continual Reinforcement Learning: A Review and Perspectives [69.48324517535549]
We aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL).
While still in its early days, the study of continual RL has the promise to develop better incremental reinforcement learners.
Applications include those in the fields of healthcare, education, logistics, and robotics.
arXiv Detail & Related papers (2020-12-25T02:35:27Z) - What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study [50.79125250286453]
On-policy reinforcement learning (RL) has been successfully applied to many different continuous control tasks.
But state-of-the-art implementations take numerous low- and high-level design decisions that strongly affect the performance of the resulting agents.
These choices are usually not extensively discussed in the literature, leading to discrepancy between published descriptions of algorithms and their implementations.
We implement over 50 such "choices" in a unified on-policy RL framework, allowing us to investigate their impact in a large-scale empirical study.
arXiv Detail & Related papers (2020-06-10T17:59:03Z) - Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics [9.080472817672264]
DRL is characterized by scalability with the potential to be applied to high-dimensional problems in conjunction with noisy and nonlinear patterns of economic data.
The architecture of DRL applied to economic applications is investigated in order to highlight the complexity, robustness, accuracy, performance, computational tasks, risk constraints, and profitability.
arXiv Detail & Related papers (2020-03-21T14:07:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.