Bayesian Decision Making around Experts
- URL: http://arxiv.org/abs/2510.08113v1
- Date: Thu, 09 Oct 2025 11:53:19 GMT
- Title: Bayesian Decision Making around Experts
- Authors: Daniel Jarne Ornia, Joel Dyer, Nicholas Bishop, Anisoara Calinescu, Michael Wooldridge
- Abstract summary: We formalize how expert data influences the learner's posterior, and prove that pretraining on expert outcomes tightens information-theoretic regret bounds. By quantifying the value of expert data, our framework provides practical, information-theoretic algorithms for agents to intelligently decide when to learn from others.
- Score: 3.1764800782234297
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Complex learning agents are increasingly deployed alongside existing experts, such as human operators or previously trained agents. However, it remains unclear how learners should optimally incorporate certain forms of expert data, which may differ in structure from the learner's own action-outcome experiences. We study this problem in the context of Bayesian multi-armed bandits, considering: (i) offline settings, where the learner receives a dataset of outcomes from the expert's optimal policy before interaction, and (ii) simultaneous settings, where the learner must choose at each step whether to update its beliefs based on its own experience, or based on the outcome simultaneously achieved by an expert. We formalize how expert data influences the learner's posterior, and prove that pretraining on expert outcomes tightens information-theoretic regret bounds by the mutual information between the expert data and the optimal action. For the simultaneous setting, we propose an information-directed rule where the learner processes the data source that maximizes its one-step information gain about the optimal action. Finally, we propose strategies for how the learner can infer when to trust the expert and when not to, safeguarding the learner in cases where the expert is ineffective or compromised. By quantifying the value of expert data, our framework provides practical, information-theoretic algorithms for agents to intelligently decide when to learn from others.
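The information-directed rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a Bernoulli bandit whose parameter vector lies in a small finite set of hypotheses (so mutual information can be computed exactly), and it assumes the learner observes only the expert's binary outcome, with the expert always playing the best arm under the true hypothesis. All function names (`entropy`, `optimal_action_dist`, `info_gain`, `choose_source`) are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def optimal_action_dist(post, thetas):
    """Distribution of the optimal arm A* implied by a posterior over hypotheses."""
    best = thetas.argmax(axis=1)  # best arm under each hypothesis
    return np.bincount(best, weights=post, minlength=thetas.shape[1])

def info_gain(post, thetas, success_prob):
    """One-step mutual information I(A*; Y) for a binary outcome Y,
    where success_prob[h] = P(Y = 1 | hypothesis h)."""
    prior_h = entropy(optimal_action_dist(post, thetas))
    expected_post_h = 0.0
    for lik in (success_prob, 1.0 - success_prob):  # Y = 1, then Y = 0
        p_y = float(post @ lik)
        if p_y > 0:
            updated = post * lik / p_y  # Bayes update on this outcome
            expected_post_h += p_y * entropy(optimal_action_dist(updated, thetas))
    return prior_h - expected_post_h

def choose_source(post, thetas):
    """Pick the data source with the larger one-step information gain.
    Ties go to the learner's own experience (an arbitrary choice here)."""
    own = [info_gain(post, thetas, thetas[:, a]) for a in range(thetas.shape[1])]
    # Assumption: the expert's outcome succeeds with the best arm's probability.
    expert = info_gain(post, thetas, thetas.max(axis=1))
    best_arm = int(np.argmax(own))
    if expert > own[best_arm]:
        return "expert", None, expert
    return "own", best_arm, own[best_arm]

# Two hypotheses over two arms, uniform prior (illustrative numbers only).
thetas = np.array([[0.9, 0.1],
                   [0.1, 0.9]])
post = np.array([0.5, 0.5])
print(choose_source(post, thetas))
```

In this toy instance the expert succeeds with probability 0.9 under both hypotheses, so its outcome carries no information about which arm is optimal, and the rule falls back on the learner's own pull. With richer hypothesis sets the comparison can go either way.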
Related papers
- What makes an Expert? Comparing Problem-solving Practices in Data Science Notebooks [0.6308539010172308]
Development of data science expertise requires tacit, process-oriented skills that are difficult to teach directly. This study addresses the resulting challenge of empirically understanding how the problem-solving processes of experts and novices differ.
arXiv Detail & Related papers (2026-02-17T08:45:23Z)
- Imitation Learning for Combinatorial Optimisation under Uncertainty [1.0781866671930855]
This paper introduces a systematic taxonomy of experts for IL optimisation under uncertainty. Experts are classified along three dimensions: (i) their treatment of uncertainty, including myopic, deterministic, full-information, two-stage, and multi-stage formulations; (ii) their level of optimality, distinguishing task-optimal and approximate experts; and (iii) their interaction mode with the learner, ranging from one-shot supervision to iterative, interactive schemes.
arXiv Detail & Related papers (2026-01-08T21:16:25Z)
- Learning to Defer for Causal Discovery with Imperfect Experts [59.071731337922664]
We propose L2D-CD, a method for gauging the correctness of expert recommendations and optimally combining them with data-driven causal discovery results. We evaluate L2D-CD on the canonical Tübingen pairs dataset and demonstrate its superior performance compared to both the causal discovery method and the expert used in isolation.
arXiv Detail & Related papers (2025-02-18T18:55:53Z)
- Learning More Generalized Experts by Merging Experts in Mixture-of-Experts [0.5221459608786241]
We show that incorporating a shared layer in a mixture-of-experts can lead to performance degradation.
We merge the two most frequently selected experts and update the least frequently selected expert using the combination of experts.
Our algorithm enhances transfer learning and mitigates catastrophic forgetting when applied to multi-domain task incremental learning.
arXiv Detail & Related papers (2024-05-19T11:55:48Z)
- Defining Expertise: Applications to Treatment Effect Estimation [58.7977683502207]
We argue that expertise - particularly the type of expertise the decision-makers of a domain are likely to have - can be informative in designing and selecting methods for treatment effect estimation.
We define two types of expertise, predictive and prognostic, and demonstrate empirically that: (i) the prominent type of expertise in a domain significantly influences the performance of different methods in treatment effect estimation, and (ii) it is possible to predict the type of expertise present in a dataset.
arXiv Detail & Related papers (2024-03-01T17:30:49Z)
- Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z)
- Causal Discovery with Language Models as Imperfect Experts [119.22928856942292]
We consider how expert knowledge can be used to improve the data-driven identification of causal graphs.
We propose strategies for amending such expert knowledge based on consistency properties.
We report a case study, on real data, where a large language model is used as an imperfect expert.
arXiv Detail & Related papers (2023-07-05T16:01:38Z)
- A Machine Learning Framework Towards Transparency in Experts' Decision Quality [0.0]
In many important settings, transparency in experts' decision quality is rarely possible because ground truth data for evaluating the experts' decisions is costly and available only for a limited set of decisions.
We first formulate the problem of estimating experts' decision accuracy in this setting and then develop a machine-learning-based framework to address it.
Our method effectively leverages both abundant historical data on workers' past decisions, and scarce decision instances with ground truth information.
arXiv Detail & Related papers (2021-10-21T18:50:40Z)
- Online Learning with Uncertain Feedback Graphs [12.805267089186533]
The relationship among experts can be captured by a feedback graph, which can be used to assist the learner's decision making.
In practice, the nominal feedback graph often entails uncertainties, which renders it impossible to reveal the actual relationship among experts.
The present work studies various cases of potential uncertainties, and develops novel online learning algorithms to deal with them.
arXiv Detail & Related papers (2021-06-15T21:21:30Z)
- Learning without Knowing: Unobserved Context in Continuous Transfer Reinforcement Learning [16.814772057210366]
We consider a transfer Reinforcement Learning problem in continuous state and action spaces under unobserved contextual information.
Our goal is to use the context-aware expert data to learn an optimal context-unaware policy for the learner using only a few new data samples.
arXiv Detail & Related papers (2021-06-07T17:49:22Z)
- Decision Rule Elicitation for Domain Adaptation [93.02675868486932]
Human-in-the-loop machine learning is widely used in artificial intelligence (AI) to elicit labels from experts.
In this work, we allow experts to additionally produce decision rules describing their decision-making.
We show that decision rule elicitation improves domain adaptation of the algorithm and helps to propagate expert's knowledge to the AI model.
arXiv Detail & Related papers (2021-02-23T08:07:22Z)
- Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.