Why Online Reinforcement Learning is Causal
- URL: http://arxiv.org/abs/2403.04221v2
- Date: Wed, 10 Jul 2024 23:51:52 GMT
- Title: Why Online Reinforcement Learning is Causal
- Authors: Oliver Schulte, Pascal Poupart
- Abstract summary: Reinforcement learning (RL) and causal modelling naturally complement each other.
This paper examines which reinforcement learning settings we can expect to benefit from causal modelling.
- Score: 31.59766909722592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) and causal modelling naturally complement each other. The goal of causal modelling is to predict the effects of interventions in an environment, while the goal of reinforcement learning is to select interventions that maximize the rewards the agent receives from the environment. Reinforcement learning includes the two most powerful sources of information for estimating causal relationships: temporal ordering and the ability to act on an environment. This paper examines which reinforcement learning settings we can expect to benefit from causal modelling, and how. In online learning, the agent has the ability to interact directly with their environment, and learn from exploring it. Our main argument is that in online learning, conditional probabilities are causal, and therefore offline RL is the setting where causal learning has the most potential to make a difference. Essentially, the reason is that when an agent learns from their own experience, there are no unobserved confounders that influence both the agent's own exploratory actions and the rewards they receive. Our paper formalizes this argument. For offline RL, where an agent may and typically does learn from the experience of others, we describe previous and new methods for leveraging a causal model, including support for counterfactual queries.
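The abstract's central claim can be illustrated with a toy calculation. The sketch below is not the paper's formalization; it assumes a hypothetical one-step environment with a hidden binary variable U, a binary action A, and a binary reward R, and every probability in it is made up for illustration. It compares the conditional probability P(R=1 | A=a) with the interventional probability P(R=1 | do(A=a)) under two policies: an online agent whose exploration cannot depend on U, and a hypothetical offline demonstrator whose actions do depend on U.

```python
from fractions import Fraction as F

# Hypothetical toy model (all numbers are illustrative assumptions).
P_U = {0: F(1, 2), 1: F(1, 2)}  # P(U=u), U is never observed by the learner
P_R = {                          # P(R=1 | A=a, U=u)
    (0, 0): F(1, 10), (0, 1): F(6, 10),
    (1, 0): F(3, 10), (1, 1): F(8, 10),
}

def interventional(a):
    """P(R=1 | do(A=a)): average over U with its marginal distribution."""
    return sum(P_U[u] * P_R[(a, u)] for u in P_U)

def conditional(a, policy):
    """P(R=1 | A=a) when actions are drawn with policy[u] = P(A=1 | U=u)."""
    def p_a_given_u(u):
        return policy[u] if a == 1 else 1 - policy[u]
    joint = {u: P_U[u] * p_a_given_u(u) for u in P_U}  # P(U=u, A=a)
    p_a = sum(joint.values())                          # P(A=a)
    return sum(joint[u] / p_a * P_R[(a, u)] for u in P_U)

# Online agent: its own exploration is independent of the hidden U.
online_policy = {0: F(1, 2), 1: F(1, 2)}
# Offline demonstrator: sees U and acts on it, so U confounds A and R.
offline_policy = {0: F(1, 10), 1: F(9, 10)}

for a in (0, 1):
    print(a, interventional(a),
          conditional(a, online_policy),
          conditional(a, offline_policy))
```

In this toy setup the online conditional matches the interventional quantity for both actions, while the offline conditional overstates the value of A=1 and understates the value of A=0, because the demonstrator tends to choose A=1 exactly when U already favours a high reward.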
Related papers
- Semifactual Explanations for Reinforcement Learning [1.5320737596132754]
Reinforcement Learning (RL) is a learning paradigm in which the agent learns from its environment through trial and error.
Deep reinforcement learning (DRL) algorithms represent the agent's policies using neural networks, making their decisions difficult to interpret.
Explaining the behaviour of DRL agents is necessary to advance user trust, increase engagement, and facilitate integration with real-life tasks.
arXiv Detail & Related papers (2024-09-09T08:37:47Z)
- Can Active Sampling Reduce Causal Confusion in Offline Reinforcement Learning? [58.942118128503104]
Causal confusion is a phenomenon where an agent learns a policy that reflects imperfect spurious correlations in the data.
This phenomenon is particularly pronounced in domains such as robotics.
In this paper, we study causal confusion in offline reinforcement learning.
arXiv Detail & Related papers (2023-12-28T17:54:56Z)
- Interpretable Imitation Learning with Dynamic Causal Relations [65.18456572421702]
We propose to expose captured knowledge in the form of a directed acyclic causal graph.
We also design this causal discovery process to be state-dependent, enabling it to model the dynamics in latent causal graphs.
The proposed framework is composed of three parts: a dynamic causal discovery module, a causality encoding module, and a prediction module, and is trained in an end-to-end manner.
arXiv Detail & Related papers (2023-09-30T20:59:42Z)
- Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z)
- Modeling Bounded Rationality in Multi-Agent Simulations Using Rationally Inattentive Reinforcement Learning [85.86440477005523]
We study more human-like RL agents that incorporate an established model of human irrationality, the Rational Inattention (RI) model.
RIRL models the cost of cognitive information processing using mutual information.
We show that using RIRL yields a rich spectrum of new equilibrium behaviors that differ from those found under rational assumptions.
arXiv Detail & Related papers (2022-01-18T20:54:00Z)
- RvS: What is Essential for Offline RL via Supervised Learning? [77.91045677562802]
Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.
In every environment suite we consider, simply maximizing likelihood with a two-layer feedforward MLP is competitive.
They also probe the limits of existing RvS methods, which are comparatively weak on random data.
arXiv Detail & Related papers (2021-12-20T18:55:16Z)
- Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning [76.00395335702572]
A central goal for AI and causality is the joint discovery of abstract representations and causal structure.
Existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs.
In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them.
arXiv Detail & Related papers (2021-07-02T05:44:56Z)
- Causal Reinforcement Learning using Observational and Interventional Data [14.856472820492364]
Efficiently learning a causal model of the environment is a key challenge for model-based RL agents operating in POMDPs.
We consider a scenario where the learning agent has the ability to collect online experiences through direct interactions with the environment.
We then ask the following question: can the online and offline experiences be safely combined for learning a causal model?
arXiv Detail & Related papers (2021-06-28T06:58:20Z)
- Causality in Neural Networks -- An Extended Abstract [0.0]
Causal reasoning is the main learning and explanation tool used by humans.
Introducing the ideas of causality to machine learning helps in providing better learning and explainable models.
arXiv Detail & Related papers (2021-06-03T09:52:36Z)
- To do or not to do: finding causal relations in smart homes [2.064612766965483]
This paper introduces a new way to learn causal models from a mixture of experiments on the environment and observational data.
The core of our method is the use of selected interventions; in particular, our learning takes into account the variables on which it is impossible to intervene.
We use our method on a smart home simulation, a use case where knowing causal relations paves the way towards explainable systems.
arXiv Detail & Related papers (2021-05-20T22:36:04Z)
- Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning [24.163616087447874]
We introduce causal curiosity, a novel intrinsic reward.
We show that it allows our agents to learn optimal sequences of actions.
We also show that the knowledge of causal factor representations aids zero-shot learning for more complex tasks.
arXiv Detail & Related papers (2020-10-07T02:07:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.