Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning
- URL: http://arxiv.org/abs/2310.19831v1
- Date: Sat, 28 Oct 2023 13:06:14 GMT
- Title: Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning
- Authors: Alihan Hüyük, Daniel Jarrett, Mihaela van der Schaar
- Abstract summary: Understanding human behavior from observed data is critical for transparency and accountability in decision-making.
Consider real-world settings such as healthcare, in which modeling a decision-maker's policy is challenging.
We propose a data-driven representation of decision-making behavior that inheres transparency by design, accommodates partial observability, and operates completely offline.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding human behavior from observed data is critical for transparency
and accountability in decision-making. Consider real-world settings such as
healthcare, in which modeling a decision-maker's policy is challenging -- with
no access to underlying states, no knowledge of environment dynamics, and no
allowance for live experimentation. We desire learning a data-driven
representation of decision-making behavior that (1) inheres transparency by
design, (2) accommodates partial observability, and (3) operates completely
offline. To satisfy these key criteria, we propose a novel model-based Bayesian
method for interpretable policy learning ("Interpole") that jointly estimates
an agent's (possibly biased) belief-update process together with their
(possibly suboptimal) belief-action mapping. Through experiments on both
simulated and real-world data for the problem of Alzheimer's disease diagnosis,
we illustrate the potential of our approach as an investigative device for
auditing, quantifying, and understanding human decision-making behavior.
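To make the two estimated components concrete, here is a minimal sketch of the generic model family the abstract describes: a Bayesian belief update over hidden states followed by a belief-action mapping. This is not Interpole's actual algorithm; the tabular POMDP parameterization, the softmax policy, and all array shapes are illustrative assumptions.

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """One Bayesian belief update in a tabular POMDP (illustrative).
    b: (S,) current belief; T: (A, S, S) with T[a][s, s'] = P(s'|s, a);
    O: (A, S, Obs) with O[a][s', o] = P(o|s', a)."""
    pred = T[a].T @ b            # predict: P(s') = sum_s T(s'|s,a) b(s)
    post = O[a][:, o] * pred     # correct with observation likelihood
    return post / post.sum()     # renormalize to a distribution

def softmax_policy(b, Q, temp=1.0):
    """Belief-action mapping (illustrative): softmax over per-action
    scores computed as Q @ b, with Q of shape (A, S)."""
    logits = (Q @ b) / temp
    e = np.exp(logits - logits.max())
    return e / e.sum()
```

Under this kind of parameterization, a biased belief-update process corresponds to a subjective `T`/`O` differing from the true dynamics, and a suboptimal belief-action mapping to a `Q` that deviates from the optimal values.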
Related papers
- Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach [61.04606493712002]
Susceptibility to misinformation, the degree of belief in unverifiable claims, is not directly observable.
Existing susceptibility studies heavily rely on self-reported beliefs.
We propose a computational approach to model users' latent susceptibility levels.
arXiv Detail & Related papers (2023-11-16T07:22:56Z) - Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z) - Counterfactual Prediction Under Selective Confounding [3.6860485638625673]
This research addresses the challenge of conducting causal inference between a binary treatment and its resulting outcome when not all confounders are known.
We relax the requirement of knowing all confounders of the desired treatment, a setting we refer to as Selective Confounding.
We provide both theoretical error bounds and empirical evidence of the effectiveness of our proposed scheme using synthetic and real-world child placement data.
arXiv Detail & Related papers (2023-10-21T16:54:59Z) - Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By modeling the decision-making processes underlying a set of observed trajectories, we cast policy inference as the inverse of an online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z) - Inverse Contextual Bandits: Learning How Behavior Evolves over Time [89.59391124399927]
We seek an approach to policy learning that provides interpretable representations of decision-making.
First, we model the behavior of learning agents in terms of contextual bandits and formalize the problem of inverse contextual bandits (ICB).
Second, we propose two algorithms to tackle ICB, each making varying degrees of assumptions regarding the agent's learning strategy.
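As background for what ICB inverts, here is a minimal forward model of the kind of learning agent the summary describes: a contextual bandit whose per-arm reward estimates are updated online. The class name, the ridge-regression update, and the arm-selection rule are illustrative assumptions, not the paper's algorithms.

```python
import numpy as np

class LinearBanditAgent:
    """Illustrative forward model: the agent keeps per-arm linear
    reward estimates, updated online via ridge regression. ICB-style
    inference would observe only (context, arm) trajectories and try
    to recover how these internal estimates evolved."""

    def __init__(self, n_arms, dim, lam=1.0):
        # Per-arm sufficient statistics for ridge regression.
        self.A = [lam * np.eye(dim) for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]

    def choose(self, x):
        # Greedy arm selection from current reward estimates.
        est = [np.linalg.solve(A, b) @ x for A, b in zip(self.A, self.b)]
        return int(np.argmax(est))

    def update(self, arm, x, r):
        # Online update of the chosen arm's statistics.
        self.A[arm] += np.outer(x, x)
        self.b[arm] += r * x
```

The two algorithms mentioned above would differ in how strong an assumption they place on this kind of internal learning rule.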
arXiv Detail & Related papers (2021-07-13T18:24:18Z) - Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty [66.17147341354577]
We argue for considering a complementary form of transparency by estimating and communicating the uncertainty associated with model predictions.
We describe how uncertainty can be used to mitigate model unfairness, augment decision-making, and build trustworthy systems.
This work constitutes an interdisciplinary review drawn from literature spanning machine learning, visualization/HCI, design, decision-making, and fairness.
arXiv Detail & Related papers (2020-11-15T17:26:14Z) - Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding [33.58862183373374]
We assess robustness of OPE methods under unobserved confounding.
We show that even small amounts of per-decision confounding can heavily bias OPE methods.
We propose an efficient loss-minimization-based procedure for computing worst-case bounds.
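To illustrate the flavor of such worst-case bounds (not the paper's procedure), here is a sketch for plain importance-sampling OPE under a multiplicative sensitivity budget: each nominal importance weight is allowed to vary within a factor of `gamma`, and the value estimate is minimized or maximized over that set. The function name and the budget parameterization are assumptions for illustration.

```python
import numpy as np

def ips_bounds(rewards, weights, gamma):
    """Worst/best-case IPS value estimates when each importance weight
    w_i may lie anywhere in [w_i / gamma, w_i * gamma] (gamma >= 1).
    The extremes of sum_i w_i * r_i are attained coordinate-wise:
    shrink the weight on positive rewards (grow it on negative ones)
    to minimize, and vice versa to maximize."""
    r = np.asarray(rewards, dtype=float)
    w = np.asarray(weights, dtype=float)
    lo = np.mean(np.where(r >= 0, w / gamma, w * gamma) * r)
    hi = np.mean(np.where(r >= 0, w * gamma, w / gamma) * r)
    return lo, hi
```

With `gamma = 1` the interval collapses to the nominal IPS estimate; even modest `gamma > 1` can widen it substantially, which mirrors the summary's point that small per-decision confounding can heavily bias OPE.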
arXiv Detail & Related papers (2020-03-12T05:20:37Z)