Quantifying the Prediction Uncertainty of Machine Learning Models for Individual Data
- URL: http://arxiv.org/abs/2412.07520v1
- Date: Tue, 10 Dec 2024 13:58:19 GMT
- Title: Quantifying the Prediction Uncertainty of Machine Learning Models for Individual Data
- Authors: Koby Bibas
- Abstract summary: This study investigates pNML's learnability for linear regression and neural networks.
It demonstrates that pNML can improve the performance and robustness of these models on various tasks.
- Score: 2.1248439796866228
- License:
- Abstract: Machine learning models have exhibited exceptional results in various domains. The most prevalent approach for learning is the empirical risk minimizer (ERM), which adapts the model's weights to reduce the loss on a training set and subsequently leverages these weights to predict the label for new test data. Nonetheless, ERM makes the assumption that the test distribution is similar to the training distribution, which may not always hold in real-world situations. In contrast, the predictive normalized maximum likelihood (pNML) was proposed as a min-max solution for the individual setting where no assumptions are made on the distribution of the tested input. This study investigates pNML's learnability for linear regression and neural networks, and demonstrates that pNML can improve the performance and robustness of these models on various tasks. Moreover, the pNML provides an accurate confidence measure for its output, showcasing state-of-the-art results for out-of-distribution detection, resistance to adversarial attacks, and active learning.
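As a rough illustration of the pNML idea for linear regression, the sketch below contrasts the ERM (least-squares) prediction with a leverage-based regret that grows when the test input is poorly covered by the training data. This is a minimal NumPy sketch under an assumed ridge stabilization; the function names and the exact regret expression are illustrative stand-ins for the paper's derivation, not its implementation.

```python
import numpy as np

def erm_predict(X_train, y_train, x_test, reg=1e-6):
    """Ordinary (ridge-stabilized) least-squares point prediction -- the ERM learner."""
    d = X_train.shape[1]
    w = np.linalg.solve(X_train.T @ X_train + reg * np.eye(d), X_train.T @ y_train)
    return float(x_test @ w)

def pnml_style_regret(X_train, x_test, reg=1e-6):
    """Leverage-based confidence in the spirit of the pNML regret (illustrative).

    Near zero for test points well covered by the training data, large for
    points the training set cannot constrain, flagging unreliable predictions."""
    d = X_train.shape[1]
    P = np.linalg.inv(X_train.T @ X_train + reg * np.eye(d))
    leverage = float(x_test @ P @ x_test)
    return np.log1p(leverage)

# Usage: an in-distribution point gets low regret, a far-away point gets high regret.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)
x_in, x_far = X[0], 10.0 * rng.normal(size=5)
print(erm_predict(X, y, x_in), pnml_style_regret(X, x_in))    # low regret
print(erm_predict(X, y, x_far), pnml_style_regret(X, x_far))  # high regret
```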
Related papers
- Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data.
One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is.
This paper proposes the least disagree metric (LDM), defined as the smallest probability of disagreement with the predicted label.
arXiv Detail & Related papers (2024-01-18T08:12:23Z)
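To make the selection strategy above concrete, here is a crude Monte Carlo stand-in for the least-disagree idea, assuming a linear classifier with weight vector `w`: samples whose predicted label flips most easily under small weight perturbations are queried first. The paper's actual LDM estimator is more careful; everything below is a hedged sketch.

```python
import numpy as np

def flip_rate(w, X_pool, n_draws=200, sigma=0.05, seed=0):
    """Fraction of small random weight perturbations that flip each sample's label.

    A high flip rate means the sample sits close to the decision boundary,
    i.e. only a small disagreement with the current hypothesis changes its label."""
    rng = np.random.default_rng(seed)
    base = np.sign(X_pool @ w)
    flips = np.zeros(len(X_pool))
    for _ in range(n_draws):
        w_pert = w + sigma * rng.normal(size=w.shape)
        flips += np.sign(X_pool @ w_pert) != base
    return flips / n_draws

def select_queries(w, X_pool, k=10):
    """Query the k unlabeled samples whose predicted labels are least stable."""
    return np.argsort(-flip_rate(w, X_pool))[:k]
```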
- Variance of ML-based software fault predictors: are we really improving fault prediction? [0.3222802562733786]
We experimentally analyze the variance of a state-of-the-art fault prediction approach.
We observed a maximum variance of 10.10% in terms of the per-class accuracy metric.
arXiv Detail & Related papers (2023-10-26T09:31:32Z)
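The kind of measurement reported above can be reproduced in outline by retraining the same model under different random seeds and looking at the spread of per-class accuracy. The classifier, metric, and helper name below are assumptions for illustration, not the study's exact protocol.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score

def per_class_accuracy_spread(X_tr, y_tr, X_te, y_te, n_runs=10):
    """Retrain with different seeds and return the max-min range of per-class accuracy."""
    per_class = []
    for seed in range(n_runs):
        clf = RandomForestClassifier(n_estimators=100, random_state=seed)
        clf.fit(X_tr, y_tr)
        # Per-class recall on a fixed test split equals per-class accuracy.
        per_class.append(recall_score(y_te, clf.predict(X_te), average=None))
    per_class = np.asarray(per_class)          # shape: (n_runs, n_classes)
    return per_class.max(axis=0) - per_class.min(axis=0)
```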
- Machine Learning Data Suitability and Performance Testing Using Fault Injection Testing Framework [0.0]
This paper presents the Fault Injection for Undesirable Learning in input Data (FIUL-Data) testing framework.
Data mutators probe the vulnerability of ML systems to the effects of different fault injections.
This paper evaluates the framework using data from analytical chemistry, comprising retention time measurements of anti-sense oligonucleotides.
arXiv Detail & Related papers (2023-09-20T12:58:35Z)
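In the same spirit, a data mutator can be as simple as corrupting a random fraction of feature values and measuring how much a trained model's score drops. The functions below are an illustrative sketch for tabular NumPy data, not the FIUL-Data framework's API.

```python
import numpy as np

def inject_gaussian_noise(X, fraction=0.1, scale=1.0, seed=0):
    """Data mutator: add noise to a random fraction of feature values."""
    rng = np.random.default_rng(seed)
    X_mut = X.astype(float).copy()
    mask = rng.random(X.shape) < fraction
    X_mut[mask] += rng.normal(scale=scale, size=int(mask.sum()))
    return X_mut

def robustness_drop(model, X_test, y_test, mutator, metric):
    """Score drop of an already-trained model on mutated vs. clean test data."""
    clean = metric(y_test, model.predict(X_test))
    faulty = metric(y_test, model.predict(mutator(X_test)))
    return clean - faulty
```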
- Tailoring Language Generation Models under Total Variation Distance [55.89964205594829]
The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimizing method.
We develop practical bounds to apply total variation distance (TVD) to language generation.
We introduce the TaiLr objective that balances the tradeoff of estimating TVD.
arXiv Detail & Related papers (2023-02-26T16:32:52Z)
- Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
arXiv Detail & Related papers (2022-02-08T02:59:12Z)
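For reference, the standard split-conformal construction under exchangeability is sketched below for a regression model; the cited paper's contribution is handling the feedback-induced distribution shift of the design setting, which this plain version does not address.

```python
import numpy as np

def split_conformal_interval(model, X_cal, y_cal, x_new, alpha=0.1):
    """Vanilla split-conformal prediction interval for a fitted regression model."""
    residuals = np.abs(y_cal - model.predict(X_cal))        # calibration scores
    n = len(residuals)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)    # finite-sample correction
    q = np.quantile(residuals, level)
    pred = float(model.predict(x_new.reshape(1, -1))[0])
    return pred - q, pred + q                               # ~(1 - alpha) coverage
```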
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
- Out-of-distribution detection for regression tasks: parameter versus predictor entropy [2.026281591452464]
It is crucial to detect when an instance lies too far from the training samples for the machine learning model to be trusted.
For neural networks, one approach to this task is to learn a diverse set of predictors that all explain the training data.
We propose a new way of estimating the entropy of a distribution on predictors based on nearest neighbors in function space.
arXiv Detail & Related papers (2020-10-24T21:41:21Z)
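One standard way to estimate such an entropy is the Kozachenko-Leonenko nearest-neighbor estimator, sketched below on the assumption that each predictor in an ensemble has been evaluated on a fixed batch of probe inputs, so every predictor becomes a point in function space. Constant conventions vary between formulations, and this is not necessarily the paper's exact construction.

```python
import numpy as np
from scipy.special import digamma, gammaln

def knn_entropy(points, k=1):
    """Kozachenko-Leonenko entropy estimate for a cloud of points.

    `points` has one row per predictor: its outputs on a fixed probe batch.
    Higher entropy means the training data constrains the predictors less."""
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                     # exclude self-distances
    eps = 2.0 * np.sort(dists, axis=1)[:, k - 1]        # twice the k-th NN distance
    log_ball = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1.0)  # unit-ball volume
    return digamma(n) - digamma(k) + log_ball + d * np.mean(np.log(eps + 1e-12))
```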
- On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer [40.63693071222628]
We study the minimum word error rate (MWER) training of the Hybrid Autoregressive Transducer (HAT).
From experiments with around 30,000 hours of training data, we show that MWER training can improve the accuracy of HAT models.
arXiv Detail & Related papers (2020-10-23T21:16:30Z)
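The objective itself can be written down compactly: renormalize the model's scores over an n-best list and take the expected, baseline-subtracted word-error count. The sketch below assumes hypothesis log-scores and error counts are already available and omits all HAT-specific details.

```python
import numpy as np

def mwer_loss(hyp_log_scores, hyp_word_errors):
    """Expected word-error objective over an n-best list (minimal sketch).

    The mean error count is subtracted as a baseline, a common variance-reduction
    choice; gradients through the renormalized scores would drive MWER training."""
    scores = np.asarray(hyp_log_scores, dtype=float)
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                       # posterior restricted to the n-best list
    errors = np.asarray(hyp_word_errors, dtype=float)
    return float(np.sum(probs * (errors - errors.mean())))
```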
- Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation [51.091890311312085]
We propose a new training scheme for auto-regressive sequence generative models that is effective and stable in the large sample spaces encountered in text generation.
Our method stably outperforms Maximum Likelihood Estimation and other state-of-the-art sequence generative models in terms of both quality and diversity.
arXiv Detail & Related papers (2020-07-12T15:31:24Z)
- A comprehensive study on the prediction reliability of graph neural networks for virtual screening [0.0]
We investigate the effects of model architectures, regularization methods, and loss functions on the prediction performance and reliability of classification results.
Our results highlight that the correct choice of regularization and inference methods is important for achieving a high success rate.
arXiv Detail & Related papers (2020-03-17T10:13:31Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
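A minimal sketch of one confidence-weighted transductive refinement step is given below, assuming class prototypes and unlabeled query embeddings as NumPy arrays. The paper meta-learns the per-query confidence; here a distance softmax stands in for it, so this only illustrates the shape of the update.

```python
import numpy as np

def refine_prototypes(protos, queries, temperature=1.0):
    """One transductive update: blend prototypes with confidence-weighted query means.

    protos:  (C, D) current class prototypes from the labeled support set.
    queries: (Q, D) embeddings of unlabeled query examples."""
    d2 = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)   # (Q, C)
    logits = -d2 / temperature
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)          # soft assignments as a confidence proxy
    # Weighted mean of (prototype + assigned queries) per class.
    return (protos + w.T @ queries) / (1.0 + w.sum(axis=0)[:, None])
```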
This list is automatically generated from the titles and abstracts of the papers on this site.