Improving Calibration through the Relationship with Adversarial
Robustness
- URL: http://arxiv.org/abs/2006.16375v2
- Date: Tue, 14 Dec 2021 07:05:11 GMT
- Title: Improving Calibration through the Relationship with Adversarial
Robustness
- Authors: Yao Qin, Xuezhi Wang, Alex Beutel, Ed H. Chi
- Abstract summary: We study the connection between adversarial robustness and calibration.
We propose Adversarial Robustness based Adaptive Label Smoothing (AR-AdaLS).
We find that our method, which takes the adversarial robustness of the in-distribution data into consideration, leads to better calibration of the model even under distributional shifts.
- Score: 19.384119330332446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks lack adversarial robustness, i.e., they are vulnerable to
adversarial examples in which small perturbations to the inputs cause incorrect
predictions. Further, trust is undermined when models give miscalibrated
predictions, i.e., the predicted probability is not a good indicator of how
much we should trust our model. In this paper, we study the connection between
adversarial robustness and calibration and find that the inputs for which the
model is sensitive to small perturbations (are easily attacked) are more likely
to have poorly calibrated predictions. Based on this insight, we examine if
calibration can be improved by addressing those adversarially unrobust inputs.
To this end, we propose Adversarial Robustness based Adaptive Label Smoothing
(AR-AdaLS) that integrates the correlations of adversarial robustness and
calibration into training by adaptively softening labels for an example based
on how easily it can be attacked by an adversary. We find that our method,
taking the adversarial robustness of the in-distribution data into
consideration, leads to better calibration of the model even under
distributional shifts. In addition, AR-AdaLS can also be applied to an ensemble
model to further improve model calibration.
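To illustrate the mechanism described in the abstract, here is a minimal PyTorch-style sketch of adversarial-robustness-driven adaptive label smoothing. It is not the paper's implementation: the one-step FGSM "does the prediction flip?" robustness proxy, the two-point smoothing schedule, and all function names are assumptions made purely for illustration.

```python
# Sketch of AR-AdaLS-style training loss (illustrative, not the paper's code).
# Assumption: per-example robustness is proxied by whether a one-step FGSM
# perturbation flips the prediction; fragile examples get stronger smoothing.
import torch
import torch.nn.functional as F

def fgsm_flips(model, x, y, eps=2 / 255):
    """Return a 0/1 tensor marking examples whose prediction flips under one FGSM step."""
    x_req = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_req), y)
    grad = torch.autograd.grad(loss, x_req)[0]
    x_adv = (x_req + eps * grad.sign()).clamp(0, 1).detach()
    with torch.no_grad():
        clean_pred = model(x).argmax(dim=1)
        adv_pred = model(x_adv).argmax(dim=1)
    return (clean_pred != adv_pred).float()  # 1.0 = easily attacked (low robustness)

def ar_adals_style_loss(model, x, y, num_classes, alpha_robust=0.02, alpha_fragile=0.2):
    """Cross-entropy against per-example soft targets: fragile inputs get more smoothing."""
    fragile = fgsm_flips(model, x, y)                                   # shape (batch,)
    alpha = alpha_robust + (alpha_fragile - alpha_robust) * fragile     # per-example smoothing
    one_hot = F.one_hot(y, num_classes).float()
    soft_targets = (1 - alpha).unsqueeze(1) * one_hot + (alpha / num_classes).unsqueeze(1)
    log_probs = F.log_softmax(model(x), dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()
```

In the paper, training examples are grouped by their adversarial robustness and the smoothing strength of each group is adapted during training; the sketch keeps only the core coupling between attackability and label softness.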
Related papers
- Towards Certification of Uncertainty Calibration under Adversarial Attacks [96.48317453951418]
We show that attacks can significantly harm calibration, and thus propose certified calibration as worst-case bounds on calibration under adversarial perturbations.
We propose novel calibration attacks and demonstrate how they can improve model calibration through adversarial calibration training.
arXiv Detail & Related papers (2024-05-22T18:52:09Z) - Confidence-Aware Multi-Field Model Calibration [39.44356123378625]
Field-aware calibration can adjust model output on different feature field values to satisfy fine-grained advertising demands.
We propose a confidence-aware multi-field calibration method, which adaptively adjusts the calibration intensity based on confidence levels derived from sample statistics.
arXiv Detail & Related papers (2024-02-27T16:24:28Z) - Extreme Miscalibration and the Illusion of Adversarial Robustness [66.29268991629085]
Adversarial Training is often used to increase model robustness.
We show that this observed gain in robustness is an illusion of robustness (IOR).
We urge the NLP community to incorporate test-time temperature scaling into their robustness evaluations (see the sketch after this list).
arXiv Detail & Related papers (2024-02-27T13:49:12Z) - Learning Sample Difficulty from Pre-trained Models for Reliable
Prediction [55.77136037458667]
We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization.
We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
arXiv Detail & Related papers (2023-04-20T07:29:23Z) - Calibrated Selective Classification [34.08454890436067]
We develop a new approach to selective classification in which we propose a method for rejecting examples with "uncertain" uncertainties.
We present a framework for learning selectively calibrated models, where a separate selector network is trained to improve the selective calibration error of a given base model.
We demonstrate the empirical effectiveness of our approach on multiple image classification and lung cancer risk assessment tasks.
arXiv Detail & Related papers (2022-08-25T13:31:09Z) - Robustness and Accuracy Could Be Reconcilable by (Proper) Definition [109.62614226793833]
The trade-off between robustness and accuracy has been widely studied in the adversarial literature.
We find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance.
The proposed self-consistent robust error (SCORE) facilitates, by definition, the reconciliation between robustness and accuracy while still handling the worst-case uncertainty.
arXiv Detail & Related papers (2022-02-21T10:36:09Z) - On the (Un-)Avoidability of Adversarial Examples [4.822598110892847]
Adversarial examples in deep learning models have caused substantial concern over their reliability.
We provide a framework for determining whether a model's label change under small perturbation is justified.
We prove that our adaptive data-augmentation maintains consistency of 1-nearest neighbor classification under deterministic labels.
arXiv Detail & Related papers (2021-06-24T21:35:25Z) - Trust but Verify: Assigning Prediction Credibility by Counterfactual
Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
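The test-time temperature scaling recommended in the "Extreme Miscalibration and the Illusion of Adversarial Robustness" entry above is standard post-hoc recalibration (Guo et al., 2017). A minimal sketch, assuming held-out validation logits and labels are available as tensors; variable names are illustrative:

```python
# Sketch: single-parameter temperature scaling fitted on a held-out set.
# Assumes `val_logits` has shape (N, C) and `val_labels` has shape (N,).
import torch
import torch.nn.functional as F

def fit_temperature(val_logits, val_labels, lr=0.01, steps=200):
    """Fit a scalar T > 0 by minimising the NLL of softmax(logits / T) on held-out data."""
    log_t = torch.zeros(1, requires_grad=True)   # optimise log T so T stays positive
    optimizer = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        optimizer.step()
    return log_t.exp().item()

# Usage: rescale test-time logits before reading off confidences.
# T = fit_temperature(val_logits, val_labels)
# calibrated_probs = F.softmax(test_logits / T, dim=1)
```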
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.