Learning Sample Difficulty from Pre-trained Models for Reliable
Prediction
- URL: http://arxiv.org/abs/2304.10127v2
- Date: Mon, 30 Oct 2023 13:49:36 GMT
- Title: Learning Sample Difficulty from Pre-trained Models for Reliable
Prediction
- Authors: Peng Cui, Dan Zhang, Zhijie Deng, Yinpeng Dong, Jun Zhu
- Abstract summary: We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization.
We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
- Score: 55.77136037458667
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale pre-trained models have achieved remarkable success in many
applications, but how to leverage them to improve the prediction reliability of
downstream models remains under-explored. Moreover, modern neural
networks have been found to be poorly calibrated and make overconfident
predictions regardless of inherent sample difficulty and data uncertainty. To
address this issue, we propose to utilize large-scale pre-trained models to
guide downstream model training with sample difficulty-aware entropy
regularization. Pre-trained models that have been exposed to large-scale
datasets and do not overfit the downstream training classes enable us to
measure each training sample's difficulty via feature-space Gaussian modeling
and relative Mahalanobis distance computation. Importantly, by adaptively
penalizing overconfident prediction based on the sample difficulty, we
simultaneously improve accuracy and uncertainty calibration across challenging
benchmarks (e.g., +0.55% ACC and -3.7% ECE on ImageNet1k using ResNet34),
consistently surpassing competitive baselines for reliable prediction. The
improved uncertainty estimate further improves selective classification
(abstaining from erroneous predictions) and out-of-distribution detection.
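The two ingredients named in the abstract, relative Mahalanobis distance in a pre-trained feature space and an entropy regularizer scaled by sample difficulty, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the features already come from a frozen pre-trained model, that the class-conditional Gaussians share one covariance matrix, that difficulty is min-max normalized within a batch, and that the loss is cross-entropy minus `alpha` times the weighted entropy. The function names, `alpha`, `eps`, and the toy data are all hypothetical.

```python
import numpy as np


def fit_gaussians(feats, labels, num_classes, eps=1e-6):
    """Fit class-conditional Gaussians (shared covariance) and one class-agnostic Gaussian."""
    dim = feats.shape[1]
    class_means = np.stack([feats[labels == k].mean(axis=0) for k in range(num_classes)])
    centered = feats - class_means[labels]
    # eps * I keeps the covariance invertible (an assumption of this sketch).
    shared_cov = centered.T @ centered / len(feats) + eps * np.eye(dim)
    global_mean = feats.mean(axis=0)
    centered_all = feats - global_mean
    global_cov = centered_all.T @ centered_all / len(feats) + eps * np.eye(dim)
    return class_means, np.linalg.inv(shared_cov), global_mean, np.linalg.inv(global_cov)


def mahalanobis_sq(x, mean, precision):
    """Squared Mahalanobis distance of each row of x to mean under the given precision."""
    diff = x - mean
    return np.einsum("nd,de,ne->n", diff, precision, diff)


def relative_mahalanobis(feats, labels, params):
    """Class-conditional distance minus class-agnostic distance; larger means harder."""
    class_means, shared_prec, global_mean, global_prec = params
    md_class = mahalanobis_sq(feats, class_means[labels], shared_prec)
    md_agnostic = mahalanobis_sq(feats, global_mean, global_prec)
    return md_class - md_agnostic


def difficulty_weighted_loss(logits, labels, difficulty, alpha=0.1):
    """Cross-entropy minus a difficulty-weighted entropy bonus (assumed form)."""
    span = difficulty.max() - difficulty.min()
    weight = (difficulty - difficulty.min()) / (span + 1e-12)  # min-max to [0, 1]
    shifted = logits - logits.max(axis=1, keepdims=True)  # stable log-softmax
    log_p = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    cross_entropy = -log_p[np.arange(len(labels)), labels]
    entropy = -(np.exp(log_p) * log_p).sum(axis=1)
    # Harder samples get a larger reward for keeping the prediction uncertain.
    return float((cross_entropy - alpha * weight * entropy).mean())


# Toy data: two well-separated classes plus one class-0 sample placed at the class-1 centre.
rng = np.random.default_rng(0)
feats = np.concatenate([
    rng.normal(0.0, 1.0, size=(50, 2)),
    rng.normal(5.0, 1.0, size=(50, 2)),
    np.array([[5.0, 5.0]]),
])
labels = np.concatenate([np.zeros(50, int), np.ones(50, int), np.array([0])])
params = fit_gaussians(feats, labels, num_classes=2)
rmd = relative_mahalanobis(feats, labels, params)
print("hardest sample index:", int(rmd.argmax()))
```

In real training the loss would be written in an automatic-differentiation framework so that gradients reach the downstream model. NumPy is used here only to keep the arithmetic visible.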
Related papers
- Selective Learning: Towards Robust Calibration with Dynamic Regularization [79.92633587914659]
Miscalibration in deep learning refers to a discrepancy between the predicted confidence and performance.
We introduce Dynamic Regularization (DReg), which aims to learn what should be learned during training, thereby circumventing the confidence-adjusting trade-off.
arXiv Detail & Related papers (2024-02-13T11:25:20Z)
- Multiclass Alignment of Confidence and Certainty for Network Calibration [10.15706847741555]
Recent studies reveal that deep neural networks (DNNs) are prone to making overconfident predictions.
We propose a new train-time calibration method, which features a simple, plug-and-play auxiliary loss known as multi-class alignment of predictive mean confidence and predictive certainty (MACC).
Our method achieves state-of-the-art calibration performance for both in-domain and out-domain predictions.
arXiv Detail & Related papers (2023-09-06T00:56:24Z)
- Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z)
- Calibrating Deep Neural Networks using Explicit Regularisation and Dynamic Data Pruning [25.982037837953268]
Deep neural networks (DNNs) are prone to miscalibrated predictions, often exhibiting a mismatch between the predicted output and the associated confidence scores.
We propose a novel regularization technique that can be used with classification losses, leading to state-of-the-art calibrated predictions at test time.
arXiv Detail & Related papers (2022-12-20T05:34:58Z)
- Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval [51.83967175585896]
UAL aims at providing reliability-aware predictions by considering data uncertainty and model uncertainty simultaneously.
Data uncertainty captures the "noise" inherent in the sample, while model uncertainty depicts the model's confidence in the sample's prediction.
arXiv Detail & Related papers (2022-10-24T17:53:20Z)
- Data Uncertainty without Prediction Models [0.8223798883838329]
We propose an uncertainty estimation method named Distance-weighted Class Impurity that makes no explicit use of prediction models.
We verified that the Distance-weighted Class Impurity works effectively regardless of prediction models.
arXiv Detail & Related papers (2022-04-25T13:26:06Z)
- Uncertainty-Aware Time-to-Event Prediction using Deep Kernel Accelerated Failure Time Models [11.171712535005357]
We propose Deep Kernel Accelerated Failure Time models for the time-to-event prediction task.
Our model shows better point estimate performance than recurrent neural network based baselines in experiments on two real-world datasets.
arXiv Detail & Related papers (2021-07-26T14:55:02Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Learning Prediction Intervals for Model Performance [1.433758865948252]
We propose a method to compute prediction intervals for model performance.
We evaluate our approach across a wide range of drift conditions and show substantial improvement over competitive baselines.
arXiv Detail & Related papers (2020-12-15T21:32:03Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.