Related papers: Combining Human Predictions with Model Probabilities via Confusion Matrices and Calibration

Combining Human Predictions with Model Probabilities via Confusion Matrices and Calibration

URL: http://arxiv.org/abs/2109.14591v2
Date: Fri, 1 Oct 2021 17:19:52 GMT
Title: Combining Human Predictions with Model Probabilities via Confusion Matrices and Calibration
Authors: Gavin Kerrigan, Padhraic Smyth, Mark Steyvers
Abstract summary: We develop a set of algorithms that combine the probabilistic output of a model with the class-level output of a human. We show theoretically that the accuracy of our combination model is driven not only by the individual human and model accuracies, but also by the model's confidence.
Score: 11.75395256889808
License: http://creativecommons.org/licenses/by/4.0/
Abstract: An increasingly common use case for machine learning models is augmenting the abilities of human decision makers. For classification tasks where neither the human or model are perfectly accurate, a key step in obtaining high performance is combining their individual predictions in a manner that leverages their relative strengths. In this work, we develop a set of algorithms that combine the probabilistic output of a model with the class-level output of a human. We show theoretically that the accuracy of our combination model is driven not only by the individual human and model accuracies, but also by the model's confidence. Empirical results on image classification with CIFAR-10 and a subset of ImageNet demonstrate that such human-model combinations consistently have higher accuracies than the model or human alone, and that the parameters of the combination method can be estimated effectively with as few as ten labeled datapoints.

Related papers

Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM) which can be viewed as a gradient boosting algorithm combining score matching. We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy. Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z)
Embedding-based statistical inference on generative models [10.948308354932639]
We extend results related to embedding-based representations of generative models to classical statistical inference settings. We demonstrate that using the perspective space as the basis of a notion of "similar" is effective for multiple model-level inference tasks.
arXiv Detail & Related papers (2024-10-01T22:28:39Z)
Human Trajectory Forecasting with Explainable Behavioral Uncertainty [63.62824628085961]
Human trajectory forecasting helps to understand and predict human behaviors, enabling applications from social robots to self-driving cars. Model-free methods offer superior prediction accuracy but lack explainability, while model-based methods provide explainability but cannot predict well. We show that BNSP-SFM achieves up to a 50% improvement in prediction accuracy, compared with 11 state-of-the-art methods.
arXiv Detail & Related papers (2023-07-04T16:45:21Z)
Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models. This creates a barrier to fusing knowledge across individual models to yield a better single model. We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
Composing Ensembles of Pre-trained Models via Iterative Consensus [95.10641301155232]
We propose a unified framework for composing ensembles of different pre-trained models. We use pre-trained models as "generators" or "scorers" and compose them via closed-loop iterative consensus optimization. We demonstrate that consensus achieved by an ensemble of scorers outperforms the feedback of a single scorer.
arXiv Detail & Related papers (2022-10-20T18:46:31Z)
Stochastic Parameterizations: Better Modelling of Temporal Correlations using Probabilistic Machine Learning [1.5293427903448025]
We show that by using a physically-informed recurrent neural network within a probabilistic framework, our model for the 96 atmospheric simulation is competitive. This is due to a superior ability to model temporal correlations compared to standard first-order autoregressive schemes. We evaluate across a number of metrics from the literature, but also discuss how the probabilistic metric of likelihood may be a unifying choice for future climate models.
arXiv Detail & Related papers (2022-03-28T14:51:42Z)
A Relational Model for One-Shot Classification [80.77724423309184]
We show that a deep learning model with built-in inductive bias can bring benefits to sample-efficient learning, without relying on extensive data augmentation. The proposed one-shot classification model performs relational matching of a pair of inputs in the form of local and pairwise attention.
arXiv Detail & Related papers (2021-11-08T07:53:12Z)
Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose? [0.2836066255205732]
We contribute to micro-data model-based reinforcement learning (MBRL) by rigorously comparing popular generative models. We find that on an environment that requires multimodal posterior predictives, mixture density nets outperform all other models by a large margin. We also found that deterministic models are on par, in fact they consistently (although non-significantly) outperform their probabilistic counterparts.
arXiv Detail & Related papers (2021-07-24T11:38:25Z)
Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points. Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters. We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
Distilling Interpretable Models into Human-Readable Code [71.11328360614479]
Human-readability is an important and desirable standard for machine-learned model interpretability. We propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code. We describe a piecewise-linear curve-fitting algorithm that produces high-quality results efficiently and reliably across a broad range of use cases.
arXiv Detail & Related papers (2021-01-21T01:46:36Z)
PSD2 Explainable AI Model for Credit Scoring [0.0]
The aim of this project is to develop and test advanced analytical methods to improve the prediction accuracy of Credit Risk Models. The project focuses on applying an explainable machine learning model to bank-related databases.
arXiv Detail & Related papers (2020-11-20T12:12:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.