Learning "What-if" Explanations for Sequential Decision-Making
- URL: http://arxiv.org/abs/2007.13531v3
- Date: Tue, 30 Mar 2021 17:32:17 GMT
- Title: Learning "What-if" Explanations for Sequential Decision-Making
- Authors: Ioana Bica, Daniel Jarrett, Alihan Hüyük, Mihaela van der Schaar
- Abstract summary: Building interpretable parameterizations of real-world decision-making on the basis of demonstrated behavior is essential for introspecting and auditing policies.
We propose learning explanations of expert decisions by modeling their reward function in terms of preferences with respect to "what if" outcomes.
We highlight the effectiveness of our batch, counterfactual inverse reinforcement learning approach in recovering accurate and interpretable descriptions of behavior.
- Score: 92.8311073739295
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building interpretable parameterizations of real-world decision-making on the
basis of demonstrated behavior -- i.e. trajectories of observations and actions
made by an expert maximizing some unknown reward function -- is essential for
introspecting and auditing policies in different institutions. In this paper,
we propose learning explanations of expert decisions by modeling their reward
function in terms of preferences with respect to "what if" outcomes: Given the
current history of observations, what would happen if we took a particular
action? To learn these cost-benefit tradeoffs associated with the expert's
actions, we integrate counterfactual reasoning into batch inverse reinforcement
learning. This offers a principled way of defining reward functions and
explaining expert behavior, and also satisfies the constraints of real-world
decision-making -- where active experimentation is often impossible (e.g. in
healthcare). Additionally, by estimating the effects of different actions,
counterfactuals readily tackle the off-policy nature of policy evaluation in
the batch setting, and can naturally accommodate settings where the expert
policies depend on histories of observations rather than just current states.
Through illustrative experiments in both real and simulated medical
environments, we highlight the effectiveness of our batch, counterfactual
inverse reinforcement learning approach in recovering accurate and
interpretable descriptions of behavior.
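To make the core idea concrete, below is a minimal, hypothetical sketch (not the authors' implementation): treat the predicted "what if" outcomes of each candidate action as that action's features, model the expert as softmax-rational in a reward that weights those outcomes, and recover the preference weights from batch demonstrations by maximum likelihood. The counterfactual features, the weight vector theta, and all other names here are illustrative; in practice the features would come from a counterfactual outcome model conditioned on the observation history.

```python
import numpy as np

# Hypothetical, simplified illustration: in the paper, phi(h, a) would be the
# predicted "what if" outcomes of taking action a given the observation history h,
# produced by a counterfactual model fit on batch data. Here we simply simulate
# such features.
rng = np.random.default_rng(0)
n_steps, n_actions, n_outcomes = 500, 3, 2

# phi[t, a] = predicted counterfactual outcomes of action a at decision point t.
phi = rng.normal(size=(n_steps, n_actions, n_outcomes))

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

# Simulate an expert whose (unknown) preference weights over outcomes are theta_true,
# acting softmax-rationally with respect to r(h, a) = theta_true . phi(h, a).
theta_true = np.array([1.5, -1.0])
expert_actions = np.array([rng.choice(n_actions, p=p) for p in softmax(phi @ theta_true)])

# Batch IRL over counterfactual features: recover the preference weights by
# maximizing the likelihood of the expert's actions under the softmax-rational model.
theta = np.zeros(n_outcomes)
lr = 0.5
for _ in range(300):
    probs = softmax(phi @ theta)                         # (n_steps, n_actions)
    observed = phi[np.arange(n_steps), expert_actions]   # features of chosen actions
    expected = (probs[..., None] * phi).sum(axis=1)      # expected features under model
    theta += lr * (observed - expected).mean(axis=0)     # log-likelihood gradient ascent

print("recovered preference weights over outcomes:", np.round(theta, 2))
```

The recovered weights play the role of the interpretable cost-benefit tradeoffs described above; this sketch omits history encoding and the counterfactual estimation step itself, which are central to the full method.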
Related papers
- Learning Causally Invariant Reward Functions from Diverse Demonstrations [6.351909403078771]
Inverse reinforcement learning methods aim to retrieve the reward function of a Markov decision process based on a dataset of expert demonstrations.
Policies trained on the recovered reward function often overfit to the expert dataset and perform poorly under distribution shift of the environment dynamics.
In this work, we explore a novel regularization approach for inverse reinforcement learning methods based on the causal invariance principle with the goal of improved reward function generalization.
arXiv Detail & Related papers (2024-09-12T12:56:24Z)
- RILe: Reinforced Imitation Learning
RILe is a novel trainer-student system that learns a dynamic reward function based on the student's performance and alignment with expert demonstrations.
RILe enables better performance in complex settings where traditional methods falter, outperforming existing methods by 2x in complex simulated robot-locomotion tasks.
arXiv Detail & Related papers (2024-06-12T17:56:31Z)
- Causal Imitation Learning with Unobserved Confounders [82.22545916247269]
We study imitation learning when sensory inputs of the learner and the expert differ.
We show that imitation could still be feasible by exploiting quantitative knowledge of the expert trajectories.
arXiv Detail & Related papers (2022-08-12T13:29:53Z)
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse of an online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- Counterfactual Credit Assignment in Model-Free Reinforcement Learning [47.79277857377155]
Credit assignment in reinforcement learning is the problem of measuring an action's influence on future rewards.
We adapt the notion of counterfactuals from causality theory to a model-free RL setup.
We formulate a family of policy algorithms that use future-conditional value functions as baselines or critics, and show that they are provably low variance.
arXiv Detail & Related papers (2020-11-18T18:41:44Z)
- Fighting Copycat Agents in Behavioral Cloning from Observation Histories [85.404120663644]
Imitation learning trains policies to map from input observations to the actions that an expert would choose.
We propose an adversarial approach to learn a feature representation that removes excess information about the previous expert action, a nuisance correlate.
arXiv Detail & Related papers (2020-10-28T10:52:10Z)
- Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders [62.54431888432302]
We study an off-policy evaluation (OPE) problem in an infinite-horizon, ergodic Markov decision process with unobserved confounders.
We show how, given only a latent variable model for states and actions, policy value can be identified from off-policy data.
arXiv Detail & Related papers (2020-07-27T22:19:01Z)