Revisiting Confidence Estimation: Towards Reliable Failure Prediction
- URL: http://arxiv.org/abs/2403.02886v1
- Date: Tue, 5 Mar 2024 11:44:14 GMT
- Title: Revisiting Confidence Estimation: Towards Reliable Failure Prediction
- Authors: Fei Zhu, Xu-Yao Zhang, Zhen Cheng, Cheng-Lin Liu
- Abstract summary: We find a general, widely existing yet largely neglected phenomenon: most confidence estimation methods are harmful for detecting misclassification errors.
We propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance.
- Score: 53.79160907725975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reliable confidence estimation is a challenging yet fundamental requirement
in many risk-sensitive applications. However, modern deep neural networks are
often overconfident for their incorrect predictions, i.e., misclassified
samples from known classes, and out-of-distribution (OOD) samples from unknown
classes. In recent years, many confidence calibration and OOD detection methods
have been developed. In this paper, we find a general, widely existing yet
largely neglected phenomenon: most confidence estimation methods are
harmful for detecting misclassification errors. We investigate this problem and
reveal that popular calibration and OOD detection methods often lead to worse
confidence separation between correctly classified and misclassified examples,
making it difficult to decide whether to trust a prediction or not. Finally, we
propose to enlarge the confidence gap by finding flat minima, which yields
state-of-the-art failure prediction performance under various settings
including balanced, long-tailed, and covariate-shift classification scenarios.
Our study not only provides a strong baseline for reliable confidence
estimation but also acts as a bridge between understanding calibration, OOD
detection, and failure prediction. The code is available at
\url{https://github.com/Impression2805/FMFP}.
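As a rough illustration of the failure-prediction task described above (a hedged sketch, not the authors' FMFP code; the function names and toy data are assumptions), one common protocol scores each prediction with its maximum softmax probability and measures how well that confidence separates correct from misclassified samples, e.g., via AUROC. A larger confidence gap between the two groups, which the paper pursues by finding flat minima, directly improves this score.

```python
# Hedged sketch: evaluating failure prediction (misclassification detection)
# with the maximum softmax probability (MSP) as the confidence score.
# Illustrative only; not the authors' FMFP implementation.
import numpy as np
from sklearn.metrics import roc_auc_score

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def failure_prediction_auroc(logits, labels):
    """AUROC of confidence for separating correct vs. misclassified samples."""
    probs = softmax(logits)
    confidence = probs.max(axis=1)                 # MSP confidence
    predictions = probs.argmax(axis=1)
    correct = (predictions == labels).astype(int)  # 1 = correct, 0 = failure
    # Mean confidence gap between correct and misclassified samples,
    # i.e., the quantity the paper aims to enlarge.
    gap = confidence[correct == 1].mean() - confidence[correct == 0].mean()
    return roc_auc_score(correct, confidence), gap

# Toy usage with random logits and labels (assumed data, for shapes only).
rng = np.random.default_rng(0)
auroc, gap = failure_prediction_auroc(
    rng.normal(size=(1000, 10)), rng.integers(0, 10, size=1000)
)
print(f"failure-prediction AUROC = {auroc:.3f}, confidence gap = {gap:.3f}")
```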
Related papers
- Selective Learning: Towards Robust Calibration with Dynamic Regularization [79.92633587914659]
Miscalibration in deep learning refers to a discrepancy between the predicted confidence and actual performance.
We introduce Dynamic Regularization (DReg), which aims to learn what should be learned during training, thereby circumventing the confidence-adjustment trade-off.
arXiv Detail & Related papers (2024-02-13T11:25:20Z)
- Calibrating Multimodal Learning [94.65232214643436]
We propose a novel regularization technique, Calibrating Multimodal Learning (CML) regularization, to calibrate the predictive confidence of previous methods.
This technique can be flexibly integrated into existing models and improves performance in terms of confidence calibration, classification accuracy, and model robustness.
arXiv Detail & Related papers (2023-06-02T04:29:57Z)
- Rethinking Confidence Calibration for Failure Prediction [37.43981354073841]
Modern deep neural networks are often overconfident for their incorrect predictions.
We find that most confidence calibration methods are useless or harmful for failure prediction.
We propose a simple hypothesis: flat minima are beneficial for failure prediction (see the sharpness-aware training sketch after this list).
arXiv Detail & Related papers (2023-03-06T08:54:18Z)
- Calibrating Deep Neural Networks using Explicit Regularisation and Dynamic Data Pruning [25.982037837953268]
Deep neural networks (DNNs) are prone to miscalibrated predictions, often exhibiting a mismatch between the predicted output and the associated confidence scores.
We propose a novel regularization technique that can be used with classification losses, leading to state-of-the-art calibrated predictions at test time.
arXiv Detail & Related papers (2022-12-20T05:34:58Z)
- Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval [51.83967175585896]
UAL aims to provide reliability-aware predictions by considering data uncertainty and model uncertainty simultaneously.
Data uncertainty captures the "noise" inherent in the sample, while model uncertainty depicts the model's confidence in the sample's prediction.
arXiv Detail & Related papers (2022-10-24T17:53:20Z)
- Uncertainty-Aware Reliable Text Classification [21.517852608625127]
Deep neural networks have contributed significantly to the predictive accuracy achieved on classification tasks.
However, they tend to make over-confident predictions in real-world settings where domain shift and out-of-distribution examples exist.
We propose an inexpensive framework that adopts both auxiliary outliers and pseudo off-manifold samples to train the model with prior knowledge of a certain class.
arXiv Detail & Related papers (2021-07-15T04:39:55Z)
- Provably Robust Detection of Out-of-distribution Data (almost) for free [124.14121487542613]
Deep neural networks are known to produce highly overconfident predictions on out-of-distribution (OOD) data.
In this paper we propose a novel method that, from first principles, combines a certifiable OOD detector with a standard classifier into an OOD-aware classifier.
In this way we achieve the best of two worlds: certifiably adversarially robust OOD detection, even for OOD samples close to the in-distribution, without loss in prediction accuracy, and close to state-of-the-art OOD detection performance for non-manipulated OOD data.
arXiv Detail & Related papers (2021-06-08T11:40:49Z)
- Harnessing Adversarial Distances to Discover High-Confidence Errors [0.0]
We investigate the problem of finding errors at rates greater than expected given model confidence.
We propose a novel, query-efficient search technique guided by adversarial perturbations.
arXiv Detail & Related papers (2020-06-29T13:44:16Z)
- Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions [121.10450359856242]
We develop a frequentist procedure that utilizes influence functions of a model's loss functional to construct a jackknife (or leave-one-out) estimator of predictive confidence intervals.
The discriminative jackknife (DJ) satisfies both desiderata, is applicable to a wide range of deep learning models, is easy to implement, and can be applied in a post-hoc fashion without interfering with model training or compromising its accuracy.
arXiv Detail & Related papers (2020-06-29T13:36:52Z)
- An Empirical Evaluation on Robustness and Uncertainty of Regularization Methods [43.25086015530892]
Deep neural networks (DNNs) behave fundamentally differently from humans: they can easily change predictions when small corruptions such as blur are applied to the input, and they produce confident predictions on out-of-distribution samples (an improper uncertainty measure).
arXiv Detail & Related papers (2020-03-09T01:15:22Z)
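Both the main paper and the "Rethinking Confidence Calibration for Failure Prediction" entry above attribute better failure prediction to flat minima. The sketch below shows one standard way to bias training toward flat minima, sharpness-aware minimization (SAM): each update first perturbs the weights along the gradient (ascent toward a sharper point) and then descends using gradients computed at that perturbed point. This is a hedged illustration under that assumption, not the FMFP implementation; all function and argument names are hypothetical.

```python
# Hedged sketch of sharpness-aware minimization (SAM), one common technique for
# finding flat minima; illustrative only, not the authors' FMFP code.
import torch

def sam_step(model, loss_fn, x, y, base_optimizer, rho=0.05):
    """One SAM update: ascend to a nearby perturbed point, then descend from it."""
    model.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()  # gradients at the current weights

    # Perturb weights by rho along the normalized gradient direction (ascent step).
    with torch.no_grad():
        grads = [p.grad for p in model.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]))
        perturbations = []
        for p in model.parameters():
            if p.grad is None:
                perturbations.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations.append(e)

    # Recompute gradients at the perturbed ("sharp") weights.
    model.zero_grad()
    loss_fn(model(x), y).backward()

    # Undo the perturbation, then step with the sharpness-aware gradients.
    with torch.no_grad():
        for p, e in zip(model.parameters(), perturbations):
            if e is not None:
                p.sub_(e)
    base_optimizer.step()
    model.zero_grad()
    return loss.item()
```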
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.