Bayesian Inference for Correlated Human Experts and Classifiers
- URL: http://arxiv.org/abs/2506.05636v1
- Date: Thu, 05 Jun 2025 23:39:41 GMT
- Title: Bayesian Inference for Correlated Human Experts and Classifiers
- Authors: Markelle Kelly, Alex Boyd, Sam Showalter, Mark Steyvers, Padhraic Smyth
- Abstract summary: We investigate the problem of querying experts for class label predictions using as few human queries as possible. We develop a general Bayesian framework for this problem, modeling expert correlation via a joint latent representation. We apply our approach to two real-world medical classification problems, as well as to CIFAR-10H and ImageNet-16H.
- Score: 15.98474253538141
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Applications of machine learning often involve making predictions based on both model outputs and the opinions of human experts. In this context, we investigate the problem of querying experts for class label predictions, using as few human queries as possible, and leveraging the class probability estimates of pre-trained classifiers. We develop a general Bayesian framework for this problem, modeling expert correlation via a joint latent representation, enabling simulation-based inference about the utility of additional expert queries, as well as inference of posterior distributions over unobserved expert labels. We apply our approach to two real-world medical classification problems, as well as to CIFAR-10H and ImageNet-16H, demonstrating substantial reductions relative to baselines in the cost of querying human experts while maintaining high prediction accuracy.
Related papers
- Validation of Conformal Prediction in Cervical Atypia Classification [1.8988964758950546]
Deep learning-based cervical cancer classification can potentially increase access to screening in low-resource regions. Deep learning models are often overconfident and do not reliably reflect diagnostic uncertainty. Conformal prediction is a model-agnostic framework for generating prediction sets that contain likely classes for trained deep-learning models.
arXiv Detail & Related papers (2025-05-13T14:37:58Z)
- In-Context Parametric Inference: Point or Distribution Estimators? [66.22308335324239]
Our experiments indicate that amortized point estimators generally outperform posterior inference, though the latter remains competitive in some low-dimensional problems.
arXiv Detail & Related papers (2025-02-17T10:00:24Z)
- Query Performance Prediction using Relevance Judgments Generated by Large Language Models [53.97064615557883]
We propose a new query performance prediction (QPP) framework using automatically generated relevance judgments (QPP-GenRE). QPP-GenRE decomposes QPP into independent subtasks of predicting the relevance of each item in a ranked list to a given query. We predict an item's relevance by using open-source large language models (LLMs) to ensure scientific relevance.
arXiv Detail & Related papers (2024-04-01T09:33:05Z)
- Bayesian Online Learning for Consensus Prediction [16.890828000688174]
We propose a family of methods that dynamically estimate expert consensus from partial feedback.
We demonstrate the efficacy of our framework against a variety of baselines on CIFAR-10H and ImageNet-16H.
arXiv Detail & Related papers (2023-12-12T19:18:04Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Incorporating Experts' Judgment into Machine Learning Models [2.5363839239628843]
In some cases, domain experts might have a judgment about the expected outcome that might conflict with the prediction of machine learning models.
We present a novel framework that aims at leveraging experts' judgment to mitigate the conflict.
arXiv Detail & Related papers (2023-04-24T07:32:49Z)
- On the Representation Collapse of Sparse Mixture of Experts [102.83396489230375]
Sparse mixture of experts provides larger model capacity while requiring a constant computational overhead.
It employs the routing mechanism to distribute input tokens to the best-matched experts according to their hidden representations.
However, learning such a routing mechanism encourages token clustering around expert centroids, implying a trend toward representation collapse.
arXiv Detail & Related papers (2022-04-20T01:40:19Z)
- Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
- Healing Products of Gaussian Processes [21.892542043785845]
We propose a new product-of-expert model that combines predictions of local experts by computing their Wasserstein barycenter.
arXiv Detail & Related papers (2021-02-14T08:53:43Z)
- Gaussian Experts Selection using Graphical Models [7.530615321587948]
Local approximations reduce time complexity by dividing the original dataset into subsets and training a local expert on each subset.
We leverage techniques from the literature on undirected graphical models, using sparse precision matrices that encode conditional dependencies between experts to select the most important experts.
arXiv Detail & Related papers (2021-02-02T14:12:11Z)
- Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z)
- Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model [71.9860741092209]
Clinical researchers often select among and evaluate risk prediction models.
Standard metrics calculated from retrospective data are only related to model utility under certain assumptions.
When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
arXiv Detail & Related papers (2020-06-02T16:26:49Z)