Online Decision Mediation
- URL: http://arxiv.org/abs/2310.18601v1
- Date: Sat, 28 Oct 2023 05:59:43 GMT
- Title: Online Decision Mediation
- Authors: Daniel Jarrett, Alihan Hüyük, Mihaela van der Schaar
- Abstract summary: Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully autonomous machine behavior is often beyond ethical affordances.
- Score: 72.80902932543474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Consider learning a decision support assistant to serve as an intermediary
between (oracle) expert behavior and (imperfect) human behavior: At each time,
the algorithm observes an action chosen by a fallible agent, and decides
whether to *accept* that agent's decision, *intervene* with an alternative, or
*request* the expert's opinion. For instance, in clinical diagnosis,
fully autonomous machine behavior is often beyond ethical affordances; thus
real-world decision support is often limited to monitoring and forecasting.
Instead, such an intermediary would strike a prudent balance between the former
(purely prescriptive) and the latter (purely descriptive) approaches, while
providing an efficient interface between human mistakes and expert feedback. In
this work, we first formalize the sequential problem of *online decision
mediation* -- that is, of simultaneously learning and evaluating mediator
policies from scratch with *abstentive feedback*: In each round, deferring to
the oracle obviates the risk of error, but incurs an upfront penalty, and
reveals the otherwise hidden expert action as a new training data point.
Second, we motivate and propose a solution that seeks to trade off (immediate)
loss terms against (future) improvements in generalization error; in doing so,
we identify why conventional bandit algorithms may fail. Finally, through
experiments and sensitivities on a variety of datasets, we illustrate
consistent gains over applicable benchmarks on performance measures with
respect to the mediator policy, the learned model, and the decision-making
system as a whole.
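To make the accept/intervene/request protocol concrete, here is a minimal Python sketch of the mediator loop. The `query_expert` oracle call is hypothetical and the fixed confidence threshold is a placeholder decision rule, not the trade-off algorithm the paper proposes.

```python
import numpy as np

def mediate(stream, model, query_expert, defer_cost=0.5, threshold=0.8):
    """Minimal online decision mediation loop (illustrative only).

    stream:       iterable of (features, human_action) pairs from the agent
    model:        a scikit-learn-style classifier (fit / predict_proba)
    query_expert: hypothetical oracle returning the expert's action for x
    """
    X, y = [], []
    for x, human_action in stream:
        if X:  # the model has been fit at least once
            proba = model.predict_proba([x])[0]
            machine_action = model.classes_[int(np.argmax(proba))]
            confidence = float(proba.max())
        else:  # cold start: no expert labels revealed yet
            machine_action, confidence = human_action, 0.0

        if confidence < threshold:
            # *Request*: pay the upfront deferral penalty; the otherwise
            # hidden expert action is revealed as a new training point,
            # buying future reductions in generalization error.
            expert_action = query_expert(x)
            X.append(x)
            y.append(expert_action)
            model.fit(X, y)
            yield expert_action, defer_cost
        elif machine_action == human_action:
            yield human_action, 0.0   # *accept* the fallible agent's choice
        else:
            yield machine_action, 0.0  # *intervene* with an alternative
```

A fixed threshold like this is greedy: it values a deferral only for avoiding immediate error, not for the training label it reveals, which is exactly the immediate-versus-future trade-off the paper formalizes and where it argues conventional bandit algorithms fall short.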
Related papers
- Towards Objective and Unbiased Decision Assessments with LLM-Enhanced Hierarchical Attention Networks [6.520709313101523]
This work investigates cognitive bias identification in high-stakes decision-making processes by human experts.
We propose a bias-aware, AI-augmented workflow that surpasses human judgment.
In our experiments, both the proposed model and the agentic workflow significantly improve on human judgment and alternative models.
arXiv Detail & Related papers (2024-11-13T10:42:11Z)
- Early stopping by correlating online indicators in neural networks [0.24578723416255746]
We propose a novel technique to identify overfitting phenomena when training the learner.
Our proposal exploits the correlation over time in a collection of online indicators.
In contrast to previous approaches that focus on a single criterion, we take advantage of the complementarity between independent assessments. (A minimal illustrative sketch follows this item.)
arXiv Detail & Related papers (2024-02-04T14:57:20Z)
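As a rough illustration of the idea, the sketch below stops training only when several online indicators both rise and correlate over a recent window; the specific indicators, window, and threshold are assumptions, not the paper's criterion.

```python
import numpy as np

def should_stop(history, window=10, corr_threshold=0.9):
    """Stop when independent overfitting indicators agree over a recent
    window (generic illustration, not the paper's exact criterion).

    history: dict mapping indicator name -> list of per-epoch values,
    oriented so that *rising* values suggest overfitting (e.g. validation
    loss, train/validation gap). Indicators are assumed non-constant.
    """
    series = [np.asarray(v[-window:], dtype=float) for v in history.values()]
    if len(series) < 2 or any(len(s) < window for s in series):
        return False  # need several indicators and enough epochs
    # Every indicator must be trending upward over the window...
    rising = all(np.polyfit(np.arange(window), s, 1)[0] > 0 for s in series)
    # ...and the indicators must correlate with one another over time.
    corrs = [np.corrcoef(a, b)[0, 1]
             for i, a in enumerate(series) for b in series[i + 1:]]
    return rising and min(corrs) > corr_threshold
```

Called once per epoch with, say, `{'val_loss': [...], 'gap': [...]}`, this rule fires only when the assessments corroborate one another, rather than on any single noisy criterion.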
- Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning [72.80902932543474]
Understanding human behavior from observed data is critical for transparency and accountability in decision-making.
Consider real-world settings such as healthcare, in which modeling a decision-maker's policy is challenging.
We propose a data-driven representation of decision-making behavior that inheres transparency by design, accommodates partial observability, and operates completely offline.
arXiv Detail & Related papers (2023-10-28T13:06:14Z)
- Setting the Right Expectations: Algorithmic Recourse Over Time [16.930905275894183]
We propose an agent-based simulation framework for studying the effects of a continuously changing environment on algorithmic recourse.
Our findings highlight that only a small set of specific parameterizations results in algorithmic recourse that is reliable for agents over time.
arXiv Detail & Related papers (2023-09-13T14:04:15Z)
- Pure Exploration under Mediators' Feedback [63.56002444692792]
Multi-armed bandits are a sequential decision-making framework in which, at each interaction step, the learner selects an arm and observes a reward.
We consider the scenario in which the learner has access to a set of mediators, each of which selects the arms on the agent's behalf according to its own, possibly unknown, policy.
We propose a sequential decision-making strategy for discovering the best arm under the assumption that the mediators' policies are known to the learner. (A generic illustrative sketch follows this item.)
arXiv Detail & Related papers (2023-08-29T18:18:21Z)
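For intuition, the sketch below performs best-arm identification when every pull is routed through a mediator with a known arm distribution; the least-sampled-arm targeting heuristic is illustrative only, not the strategy proposed in the paper.

```python
import numpy as np

def best_arm_via_mediators(policies, pull, n_rounds=1000, rng=None):
    """Pure exploration when arms are pulled only through mediators.

    policies: (M, K) array; row m is mediator m's known distribution
    over the K arms (each row sums to 1). pull(arm) returns a
    stochastic reward for the chosen arm. Illustrative sketch only.
    """
    rng = rng or np.random.default_rng()
    M, K = policies.shape
    counts, sums = np.zeros(K), np.zeros(K)
    for _ in range(n_rounds):
        target = int(np.argmin(counts))          # least-sampled arm so far
        m = int(np.argmax(policies[:, target]))  # mediator most likely to pull it
        arm = rng.choice(K, p=policies[m])       # mediator acts on our behalf
        counts[arm] += 1
        sums[arm] += pull(arm)
    return int(np.argmax(sums / np.maximum(counts, 1)))  # best-arm guess
```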
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse of an online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- Dealing with Expert Bias in Collective Decision-Making [4.588028371034406]
We propose a new algorithmic approach based on contextual multi-armed bandits (CMAB) to identify and counteract biased expertise.
Our CMAB-inspired approach achieves higher final performance and converges more rapidly than previous adaptive algorithms. (A generic illustrative sketch follows this item.)
arXiv Detail & Related papers (2021-06-25T10:17:37Z)
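To make the bandit framing concrete, here is a textbook UCB-over-experts sketch with a discretized context; it illustrates how optimism can sideline a systematically biased expert, and is not the algorithm proposed in the paper.

```python
import math
from collections import defaultdict

class ExpertCMAB:
    """Textbook UCB over experts, kept per discrete context (illustrative)."""

    def __init__(self, n_experts, c=2.0):
        self.n_experts, self.c = n_experts, c
        # stats[context][expert] = [times chosen, cumulative reward]
        self.stats = defaultdict(lambda: [[0, 0.0] for _ in range(n_experts)])

    def select(self, context):
        arms = self.stats[context]
        total = sum(n for n, _ in arms) + 1

        def ucb(arm):
            n, r = arm
            if n == 0:
                return float("inf")  # try every expert at least once
            # Optimism in the face of uncertainty: a systematically biased
            # expert's empirical mean sinks, and it stops being selected.
            return r / n + self.c * math.sqrt(math.log(total) / n)

        return max(range(self.n_experts), key=lambda e: ucb(arms[e]))

    def update(self, context, expert, reward):
        self.stats[context][expert][0] += 1
        self.stats[context][expert][1] += reward
```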
- End-to-End Learning and Intervention in Games [60.41921763076017]
We provide a unified framework for learning and intervention in games.
We propose two approaches, respectively based on explicit and implicit differentiation.
The analytical results are validated using several real-world problems.
arXiv Detail & Related papers (2020-10-26T18:39:32Z)
- A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores [85.12096045419686]
We study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions.
We first show that humans do alter their behavior when the tool is deployed.
We show that humans are less likely to adhere to the machine's recommendation when the score displayed is an incorrect estimate of risk.
arXiv Detail & Related papers (2020-02-19T07:27:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.