How to Fix a Broken Confidence Estimator: Evaluating Post-hoc Methods for Selective Classification with Deep Neural Networks
- URL: http://arxiv.org/abs/2305.15508v4
- Date: Fri, 24 May 2024 17:48:11 GMT
- Title: How to Fix a Broken Confidence Estimator: Evaluating Post-hoc Methods for Selective Classification with Deep Neural Networks
- Authors: Luís Felipe P. Cattelan, Danilo Silva
- Abstract summary: We show that a simple $p$-norm normalization of the logits, followed by taking the maximum logit as the confidence estimator, can lead to considerable gains in selective classification performance.
Our results are shown to be consistent under distribution shift.
- Score: 1.4502611532302039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the problem of selective classification for deep neural networks, where a model is allowed to abstain from low-confidence predictions to avoid potential errors. We focus on so-called post-hoc methods, which replace the confidence estimator of a given classifier without modifying or retraining it, thus being practically appealing. Considering neural networks with softmax outputs, our goal is to identify the best confidence estimator that can be computed directly from the unnormalized logits. This problem is motivated by the intriguing observation in recent work that many classifiers appear to have a "broken" confidence estimator, in the sense that their selective classification performance is much worse than what could be expected by their corresponding accuracies. We perform an extensive experimental study of many existing and proposed confidence estimators applied to 84 pretrained ImageNet classifiers available from popular repositories. Our results show that a simple $p$-norm normalization of the logits, followed by taking the maximum logit as the confidence estimator, can lead to considerable gains in selective classification performance, completely fixing the pathological behavior observed in many classifiers. As a consequence, the selective classification performance of any classifier becomes almost entirely determined by its corresponding accuracy. Moreover, these results are shown to be consistent under distribution shift. Our code is available at https://github.com/lfpc/FixSelectiveClassification.
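For illustration, here is a minimal sketch of the confidence estimator described in the abstract, assuming PyTorch: each logit vector is normalized by its $p$-norm and the maximum entry is taken as the confidence score. The value of $p$ and the rejection threshold are hyperparameters to be tuned on held-out data; this is an illustrative reading of the abstract, not the authors' reference implementation (see the linked repository for that).
```python
import torch

def max_logit_p_norm(logits: torch.Tensor, p: float = 2.0, eps: float = 1e-12) -> torch.Tensor:
    """Confidence score: divide each logit vector by its p-norm, then take the maximum entry.

    logits: (batch, num_classes) raw, unnormalized outputs of the classifier.
    p:      norm order, treated here as a hyperparameter to tune on held-out data.
    Returns a (batch,) tensor of confidence scores (higher = more confident).
    """
    norms = logits.norm(p=p, dim=-1, keepdim=True).clamp_min(eps)
    return (logits / norms).max(dim=-1).values

# Selective classification: keep predictions above a threshold, abstain on the rest.
logits = torch.randn(8, 1000)                 # stand-in for a pretrained ImageNet model's logits
confidence = max_logit_p_norm(logits, p=2.0)  # p = 2 is only an example value
threshold = 0.5                               # hypothetical; chosen on validation data in practice
keep = confidence >= threshold                # boolean mask of accepted predictions
```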
Related papers
- Fixed Random Classifier Rearrangement for Continual Learning [0.5439020425819]
In visual classification scenarios, neural networks inevitably forget the knowledge of old tasks after learning new ones.
We propose a continual learning algorithm named Fixed Random Classifier Rearrangement (FRCR).
arXiv Detail & Related papers (2024-02-23T09:43:58Z)
- The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing [85.85160896547698]
Real-life applications of deep neural networks are hindered by their unsteady predictions when faced with noisy inputs and adversarial attacks.
We show how to design an efficient classifier with a certified radius by relying on noise injection into the inputs.
Our novel certification procedure allows us to use pre-trained models with randomized smoothing, effectively improving the current certification radius in a zero-shot manner.
arXiv Detail & Related papers (2023-09-28T22:41:47Z)
- Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the one implicitly assumed by a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
- Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification [86.32752788233913]
In classification problems, the Bayes error can be used as a criterion to evaluate classifiers with state-of-the-art performance.
We propose a simple and direct Bayes error estimator, where we just take the mean of the labels that show the uncertainty of the classes.
Our flexible approach enables us to perform Bayes error estimation even for weakly supervised data.
arXiv Detail & Related papers (2022-02-01T13:22:26Z)
- Predicting Classification Accuracy When Adding New Unobserved Classes [8.325327265120283]
We study how a classifier's performance can be used to extrapolate its expected accuracy on a larger, unobserved set of classes.
We formulate a robust neural-network-based algorithm, "CleaneX", which learns to estimate the accuracy of such classifiers on arbitrarily large sets of classes.
arXiv Detail & Related papers (2020-10-28T14:37:25Z)
- Detecting Misclassification Errors in Neural Networks with a Gaussian Process Model [20.948038514886377]
This paper presents a new framework that produces a quantitative metric for detecting misclassification errors.
The framework, RED, builds an error detector on top of the base classifier and estimates uncertainty of the detection scores using Gaussian Processes.
arXiv Detail & Related papers (2020-10-05T15:01:30Z)
- Certifying Confidence via Randomized Smoothing [151.67113334248464]
Randomized smoothing has been shown to provide good certified-robustness guarantees for high-dimensional classification problems.
Most smoothing methods do not give us any information about the confidence with which the underlying classifier makes a prediction.
We propose a method to generate certified radii for the prediction confidence of the smoothed classifier.
arXiv Detail & Related papers (2020-09-17T04:37:26Z)
- Revisiting One-vs-All Classifiers for Predictive Uncertainty and Out-of-Distribution Detection in Neural Networks [22.34227625637843]
We investigate how the parametrization of the probabilities in discriminative classifiers affects the uncertainty estimates.
We show that one-vs-all formulations can improve calibration on image classification tasks.
arXiv Detail & Related papers (2020-07-10T01:55:02Z) - Consistency Regularization for Certified Robustness of Smoothed
Classifiers [89.72878906950208]
A recent technique of randomized smoothing has shown that the worst-case $\ell_2$-robustness can be transformed into the average-case robustness (a generic sketch of a smoothed classifier is given after this list).
We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled by simply regularizing the prediction consistency over noise.
arXiv Detail & Related papers (2020-06-07T06:57:43Z) - Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
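Several of the entries above build on randomized smoothing (noise injection into the inputs, prediction consistency over noise, certified radii for a smoothed classifier). As background, the following is a generic Monte Carlo sketch of a Gaussian-smoothed classifier, not the specific method of any listed paper; `model`, `sigma`, and `n_samples` are illustrative placeholders.
```python
import torch

def smoothed_predict(model, x: torch.Tensor, sigma: float = 0.25,
                     n_samples: int = 100, num_classes: int = 10):
    """Majority-vote prediction of a Gaussian-smoothed classifier.

    x is a single input with a batch dimension, shape (1, ...). The function
    classifies n_samples noisy copies of x and aggregates the votes; the resulting
    class frequencies are what certification procedures typically convert into a
    certified radius. sigma and n_samples are illustrative values.
    """
    counts = torch.zeros(num_classes)
    with torch.no_grad():
        for _ in range(n_samples):
            noisy = x + sigma * torch.randn_like(x)    # noise injection into the input
            pred = model(noisy).argmax(dim=-1).item()  # base classifier's hard prediction
            counts[pred] += 1
    votes = counts / n_samples
    return int(votes.argmax()), votes                  # predicted class and vote fractions
```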
This list is automatically generated from the titles and abstracts of the papers on this site.