Measuring Stochastic Data Complexity with Boltzmann Influence Functions
- URL: http://arxiv.org/abs/2406.02745v2
- Date: Thu, 18 Jul 2024 18:16:59 GMT
- Title: Measuring Stochastic Data Complexity with Boltzmann Influence Functions
- Authors: Nathan Ng, Roger Grosse, Marzyeh Ghassemi
- Abstract summary: Estimating uncertainty of a model's prediction on a test point is a crucial part of ensuring reliability and calibration under distribution shifts.
We propose IF-COMP, a scalable and efficient approximation of the pNML distribution that linearizes the model with a temperature-scaled Boltzmann influence function.
We experimentally validate IF-COMP on uncertainty calibration, mislabel detection, and OOD detection tasks, where it consistently matches or beats strong baseline methods.
- Score: 12.501336941823627
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating the uncertainty of a model's prediction on a test point is a crucial part of ensuring reliability and calibration under distribution shifts. A minimum description length approach to this problem uses the predictive normalized maximum likelihood (pNML) distribution, which considers every possible label for a data point, and decreases confidence in a prediction if other labels are also consistent with the model and training data. In this work we propose IF-COMP, a scalable and efficient approximation of the pNML distribution that linearizes the model with a temperature-scaled Boltzmann influence function. IF-COMP can be used to produce well-calibrated predictions on test points as well as measure complexity in both labelled and unlabelled settings. We experimentally validate IF-COMP on uncertainty calibration, mislabel detection, and OOD detection tasks, where it consistently matches or beats strong baseline methods.
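The pNML construction described in the abstract can be made concrete on a toy model family where the per-label refit is exact. The sketch below is hypothetical and uses a categorical MLE as a stand-in for the neural network that IF-COMP linearizes: it considers every candidate label, refits the model with that label appended, scores the label under the refit model, and normalizes.

```python
import numpy as np

def pnml_categorical(train_labels, n_classes):
    """Exact pNML for the categorical model family (a toy stand-in for the
    deep-net case that IF-COMP approximates): for each candidate label y,
    refit the MLE with y appended to the training data, score y under that
    refit model, then normalize over all candidate labels."""
    counts = np.bincount(train_labels, minlength=n_classes)
    n = len(train_labels)
    scores = (counts + 1) / (n + 1)        # MLE probability of y after adding y
    pnml = scores / scores.sum()           # pNML predictive distribution
    log_complexity = np.log(scores.sum())  # larger when many labels fit well
    return pnml, log_complexity
```

The log of the normalizer is the stochastic complexity: it grows when several labels are each consistent with the model and training data, which is exactly the signal the pNML approach uses to lower confidence. IF-COMP's contribution is avoiding the per-label refit, which is intractable for deep networks, via a temperature-scaled Boltzmann influence function.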
Related papers
- On the good reliability of an interval-based metric to validate prediction uncertainty for machine learning regression tasks [0.0]
This study presents an opportunistic approach to more reliably validating the average calibration of prediction uncertainty.
Considering that variance-based calibration metrics are quite sensitive to the presence of heavy tails in the uncertainty and error distributions, a shift is proposed to an interval-based metric, the Prediction Interval Coverage Probability (PICP)
The resulting PICPs are more quickly and reliably tested than variance-based calibration metrics.
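The PICP itself is a one-line statistic; a minimal sketch (hypothetical helper name) compares the empirical coverage of predicted intervals against the nominal level:

```python
import numpy as np

def picp(y_true, lower, upper):
    """Prediction Interval Coverage Probability: the fraction of observed
    targets that fall inside their predicted intervals. For well-calibrated
    (1 - alpha) intervals, PICP should be close to 1 - alpha."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    return float(np.mean((y_true >= lower) & (y_true <= upper)))
```

Because PICP is a bounded, count-based statistic, it avoids the heavy-tail sensitivity of variance-based calibration metrics that the summary describes.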
arXiv Detail & Related papers (2024-08-23T14:16:10Z)
- Robust Conformal Prediction under Distribution Shift via Physics-Informed Structural Causal Model [24.58531056536442]
Conformal prediction (CP) handles uncertainty by predicting a set of candidate labels for a test input, with a guaranteed probability that the set covers the true label.
This coverage guarantee holds on test data even if the marginal distributions $P_X$ differ between calibration and test datasets.
We propose a physics-informed structural causal model (PI-SCM) to reduce the upper bound.
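For reference, the split-conformal recipe behind such coverage guarantees is short. This sketch shows the standard construction (not the paper's PI-SCM): compute the calibration quantile that yields marginal (1 - alpha) coverage, assuming exchangeable calibration and test data.

```python
import numpy as np

def conformal_quantile(cal_scores, alpha):
    """Split conformal prediction: return the ceil((n+1)(1-alpha))-th smallest
    calibration nonconformity score. Prediction sets built from this quantile
    cover the true label with probability at least 1 - alpha."""
    scores = np.sort(np.asarray(cal_scores))
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # conservative rank
    return scores[min(k, n) - 1]
```

For regression with absolute residuals as scores, the prediction interval for a new point is the model's prediction plus or minus this quantile.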
arXiv Detail & Related papers (2024-03-22T08:13:33Z)
- Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting [55.17761802332469]
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and test data by adapting a given model w.r.t. any test sample.
Prior methods perform backpropagation for each test sample, resulting in optimization costs that are prohibitive for many applications.
We propose an Efficient Anti-Forgetting Test-Time Adaptation (EATA) method which develops an active sample selection criterion to identify reliable and non-redundant samples.
arXiv Detail & Related papers (2024-03-18T05:49:45Z)
- A Targeted Accuracy Diagnostic for Variational Approximations [8.969208467611896]
Variational Inference (VI) is an attractive alternative to Markov Chain Monte Carlo (MCMC)
Existing methods characterize the quality of the whole variational distribution.
We propose the TArgeted Diagnostic for Distribution Approximation Accuracy (TADDAA)
arXiv Detail & Related papers (2023-02-24T02:50:18Z)
- Theoretical characterization of uncertainty in high-dimensional linear classification [24.073221004661427]
We show that uncertainty for learning from a limited number of samples of high-dimensional input data and labels can be obtained with the approximate message passing algorithm.
We discuss how over-confidence can be mitigated by appropriately regularising, and show that cross-validating with respect to the loss leads to better calibration than with the 0/1 error.
arXiv Detail & Related papers (2022-02-07T15:32:07Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold.
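A minimal sketch of the ATC idea (hypothetical variable names): choose the threshold so that the fraction of source confidences above it matches the source accuracy, then apply the same threshold to unlabeled target confidences.

```python
import numpy as np

def atc_estimate(src_conf, src_correct, tgt_conf):
    """Average Thresholded Confidence (sketch). The threshold is the
    (1 - source accuracy) quantile of source confidences, so the fraction of
    source confidences above it equals the source accuracy; target accuracy
    is then estimated as the fraction of target confidences above it."""
    src_conf = np.asarray(src_conf, dtype=float)
    acc = float(np.mean(src_correct))
    t = np.quantile(src_conf, 1.0 - acc)  # threshold learned on source only
    return float(np.mean(np.asarray(tgt_conf) >= t)), t
```

The estimate degrades if the confidence distribution shifts in a way that is uncorrelated with correctness, which is why ATC is a heuristic rather than a guaranteed bound.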
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature space of each layer.
With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Balance-Subsampled Stable Prediction [55.13512328954456]
We propose a novel balance-subsampled stable prediction (BSSP) algorithm based on the theory of fractional factorial design.
A design-theoretic analysis shows that the proposed method can reduce the confounding effects among predictors induced by the distribution shift.
Numerical experiments on both synthetic and real-world data sets demonstrate that our BSSP algorithm significantly outperforms the baseline methods for stable prediction across unknown test data.
arXiv Detail & Related papers (2020-06-08T07:01:38Z)
- Estimation of Accurate and Calibrated Uncertainties in Deterministic models [0.8702432681310401]
We devise a method to transform a deterministic prediction into a probabilistic one.
We show that in doing so, one has to trade off the accuracy of such a model against its reliability (calibration).
We show several examples both with synthetic data, where the underlying hidden noise can accurately be recovered, and with large real-world datasets.
arXiv Detail & Related papers (2020-03-11T04:02:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.