Confidence and Dispersity Speak: Characterising Prediction Matrix for
Unsupervised Accuracy Estimation
- URL: http://arxiv.org/abs/2302.01094v1
- Date: Thu, 2 Feb 2023 13:30:48 GMT
- Title: Confidence and Dispersity Speak: Characterising Prediction Matrix for
Unsupervised Accuracy Estimation
- Authors: Weijian Deng, Yumin Suh, Stephen Gould, Liang Zheng
- Abstract summary: This work aims to assess how well a model performs under distribution shifts without using labels.
We use the nuclear norm that has been shown to be effective in characterizing both properties.
We show that the nuclear norm is more accurate and robust in accuracy estimation than existing methods.
- Score: 51.809741427975105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work aims to assess how well a model performs under distribution shifts
without using labels. While recent methods study prediction confidence, this
work reports prediction dispersity is another informative cue. Confidence
reflects whether the individual prediction is certain; dispersity indicates how
the overall predictions are distributed across all categories. Our key insight
is that a well-performing model should give predictions with high confidence
and high dispersity. That is, we need to consider both properties so as to make
more accurate estimates. To this end, we use the nuclear norm that has been
shown to be effective in characterizing both properties. Extensive experiments
validate the effectiveness of nuclear norm for various models (e.g., ViT and
ConvNeXt), different datasets (e.g., ImageNet and CUB-200), and diverse types
of distribution shifts (e.g., style shift and reproduction shift). We show that
the nuclear norm is more accurate and robust in accuracy estimation than
existing methods. Furthermore, we validate the feasibility of other
measurements (e.g., mutual information maximization) for characterizing
dispersity and confidence. Lastly, we investigate the limitation of the nuclear
norm, study its improved variant under severe class imbalance, and discuss
potential directions.
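A minimal sketch of the core quantity described above, assuming NumPy and a model that produces logits for the unlabeled target set; this is not the authors' code, and the normalisation shown is one common convention rather than a detail stated in the abstract:

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax of an N x K logit matrix."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nuclear_norm_score(logits):
    """Nuclear norm (sum of singular values) of the N x K softmax prediction matrix.

    Confident (near one-hot) rows and predictions dispersed across all K classes
    both increase this value, so a larger score suggests higher accuracy on the
    unlabeled target set.
    """
    probs = softmax(logits)                   # N x K prediction matrix
    n, k = probs.shape
    score = np.linalg.norm(probs, ord='nuc')  # sum of singular values
    # Normalise by the upper bound sqrt(min(N, K) * N) so the score lies in (0, 1].
    return score / np.sqrt(min(n, k) * n)
```

In a typical unsupervised accuracy estimation setup, such a score is computed on the unlabeled target set and mapped to a numeric accuracy estimate by a simple regression fitted on labeled, shifted validation sets.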
Related papers
- Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning [53.42244686183879]
Conformal prediction provides model-agnostic and distribution-free uncertainty quantification.
Yet, conformal prediction is not reliable under poisoning attacks where adversaries manipulate both training and calibration data.
We propose reliable prediction sets (RPS): the first efficient method for constructing conformal prediction sets with provable reliability guarantees under poisoning.
arXiv Detail & Related papers (2024-10-13T15:37:11Z) - Zero-Shot Uncertainty Quantification using Diffusion Probabilistic Models [7.136205674624813]
We conduct a study to evaluate the effectiveness of ensemble methods for solving different regression problems with diffusion models.
We demonstrate that ensemble methods consistently improve model prediction accuracy across various regression tasks.
Our study provides a comprehensive view of the utility of diffusion ensembles, serving as a useful reference for practitioners employing diffusion models in regression problem-solving.
arXiv Detail & Related papers (2024-08-08T18:34:52Z) - Interpreting Predictive Probabilities: Model Confidence or Human Label
Variation? [27.226997687210044]
We identify two main perspectives that drive starkly different evaluation protocols.
We discuss their merits and limitations, and take the position that both are crucial for trustworthy and fair NLP systems.
We recommend tools and highlight exciting directions towards models with disentangled representations of uncertainty about predictions and uncertainty about human labels.
arXiv Detail & Related papers (2024-02-25T15:00:13Z) - Uncertainty Estimates of Predictions via a General Bias-Variance
Decomposition [7.811916700683125]
We introduce a bias-variance decomposition for proper scores, giving rise to the Bregman Information as the variance term.
We showcase the practical relevance of this decomposition on several downstream tasks, including model ensembles and confidence regions.
arXiv Detail & Related papers (2022-10-21T21:24:37Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (a minimal sketch follows this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative-model-based methods, and explain their pros and cons when using them in fully/semi/weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z) - Learning to Predict Trustworthiness with Steep Slope Loss [69.40817968905495]
We study the problem of predicting trustworthiness on real-world large-scale datasets.
We observe that trustworthiness predictors trained with prior-art loss functions are prone to view both correct and incorrect predictions as trustworthy.
We propose a novel steep slope loss to separate the features w.r.t. correct predictions from the ones w.r.t. incorrect predictions by two slide-like curves that oppose each other.
arXiv Detail & Related papers (2021-09-30T19:19:09Z) - Robust Validation: Confident Predictions Even When Distributions Shift [19.327409270934474]
We describe procedures for robust predictive inference, where a model provides uncertainty estimates on its predictions rather than point predictions.
We present a method that produces prediction sets (almost exactly) giving the right coverage level for any test distribution in an $f$-divergence ball around the training population.
An essential component of our methodology is to estimate the amount of expected future data shift and build robustness to it.
arXiv Detail & Related papers (2020-08-10T17:09:16Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - Estimation of Accurate and Calibrated Uncertainties in Deterministic
models [0.8702432681310401]
We devise a method to transform a deterministic prediction into a probabilistic one.
We show that doing so requires a trade-off between the accuracy and the reliability (calibration) of such a model.
We show several examples both with synthetic data, where the underlying hidden noise can accurately be recovered, and with large real-world datasets.
arXiv Detail & Related papers (2020-03-11T04:02:56Z)
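For comparison with the nuclear-norm score above, here is a minimal sketch of Average Thresholded Confidence (ATC) as summarised in the related-papers entry; it is not the authors' implementation, only the max-confidence variant is shown, and the function names are illustrative:

```python
import numpy as np

def learn_threshold(source_probs, source_labels):
    """Choose the confidence threshold so that the fraction of labeled source
    examples with confidence above it matches the source accuracy."""
    conf = source_probs.max(axis=1)
    acc = (source_probs.argmax(axis=1) == source_labels).mean()
    # The (1 - acc)-quantile leaves roughly `acc` of the confidence mass above it.
    return np.quantile(conf, 1.0 - acc)

def atc_estimate(target_probs, threshold):
    """Predicted target accuracy: fraction of unlabeled target examples whose
    maximum softmax confidence exceeds the learned threshold."""
    return float((target_probs.max(axis=1) > threshold).mean())
```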