Generalized Adversarial Distances to Efficiently Discover Classifier Errors
- URL: http://arxiv.org/abs/2102.12844v1
- Date: Thu, 25 Feb 2021 13:31:21 GMT
- Title: Generalized Adversarial Distances to Efficiently Discover Classifier Errors
- Authors: Walter Bennette, Sally Dufek, Karsten Maurer, Sean Sisti, Bunyod Tusmatov
- Abstract summary: High-confidence errors are rare events for which the model is highly confident in its prediction, but is wrong.
We propose a generalization to the Adversarial Distance search that leverages concepts from adversarial machine learning.
Experimental results show that the generalized method finds errors at rates greater than expected given the confidence of the sampled predictions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given a black-box classification model and an unlabeled evaluation dataset
from some application domain, efficient strategies need to be developed to
evaluate the model. Random sampling allows a user to estimate metrics like
accuracy, precision, and recall, but may not provide insight into high-confidence
errors. High-confidence errors are rare events for which the model is highly
confident in its prediction, but is wrong. Such errors can represent costly
mistakes and should be explicitly searched for. In this paper we propose a
generalization to the Adversarial Distance search that leverages concepts from
adversarial machine learning to identify predictions for which a classifier may
be overly confident. These predictions are useful instances to sample when
looking for high-confidence errors because they are prone to a higher rate of
error than expected. Our generalization allows Adversarial Distance to be
applied to any classifier or data domain. Experimental results show that the
generalized method finds errors at rates greater than expected given the
confidence of the sampled predictions, and outperforms competing methods.
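To make the idea concrete, here is a minimal sketch of how an adversarial-distance-style ranking could be used to pick evaluation samples. It assumes a scikit-learn-style black-box classifier exposing predict_proba over flat feature vectors, and it approximates "adversarial distance" with a crude random-noise flip test; the function names, perturbation scales, and the flip test itself are illustrative assumptions, not the search procedure proposed in the paper.

```python
import numpy as np

def estimated_adversarial_distance(model, x, scales=(0.01, 0.05, 0.1, 0.2, 0.5),
                                   trials=20, seed=0):
    """Smallest perturbation scale (from `scales`) at which random noise flips
    the model's predicted label for x; np.inf if no tested scale flips it.
    A generic black-box stand-in for an adversarial-distance measure."""
    rng = np.random.default_rng(seed)
    base_label = int(np.argmax(model.predict_proba(x[None])[0]))
    for scale in scales:  # probe the smallest perturbations first
        noise = rng.normal(0.0, scale, size=(trials,) + x.shape)
        flipped = np.argmax(model.predict_proba(x[None] + noise), axis=1) != base_label
        if flipped.any():
            return scale
    return np.inf

def rank_candidates_for_labeling(model, X_unlabeled, top_k=50):
    """Rank unlabeled points that are predicted with high confidence yet flip
    easily under small perturbations; these are candidate high-confidence
    errors worth sending to a human labeler first."""
    probs = model.predict_proba(X_unlabeled)
    confidence = probs.max(axis=1)
    distance = np.array([estimated_adversarial_distance(model, x) for x in X_unlabeled])
    # Primary key: small estimated distance; secondary key: high confidence.
    order = np.lexsort((-confidence, distance))
    return order[:top_k]
```

In use, the top-ranked instances would be labeled by a human and the observed error rate compared against the rate implied by the model's reported confidence, which is the kind of gap the paper's experiments measure.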
Related papers
- Revisiting Confidence Estimation: Towards Reliable Failure Prediction [53.79160907725975]
We find a general, widely existing but actually-neglected phenomenon that most confidence estimation methods are harmful for detecting misclassification errors.
We propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance.
arXiv Detail & Related papers (2024-03-05T11:44:14Z)
- Calibrating Deep Neural Networks using Explicit Regularisation and Dynamic Data Pruning [25.982037837953268]
Deep neural networks (DNNs) are prone to miscalibrated predictions, often exhibiting a mismatch between the predicted output and the associated confidence scores; a sketch of the standard confidence-accuracy gap metric appears after this list.
We propose a novel regularization technique that can be used with classification losses, leading to state-of-the-art calibrated predictions at test time.
arXiv Detail & Related papers (2022-12-20T05:34:58Z)
- A Statistical Model for Predicting Generalization in Few-Shot Classification [6.158812834002346]
We introduce a Gaussian model of the feature distribution to predict the generalization error.
We show that our approach outperforms alternatives such as the leave-one-out cross-validation strategy.
arXiv Detail & Related papers (2022-12-13T10:21:15Z)
- Calibrated Selective Classification [34.08454890436067]
We develop a new approach to selective classification that rejects examples with "uncertain" uncertainties.
We present a framework for learning selectively calibrated models, where a separate selector network is trained to improve the selective calibration error of a given base model.
We demonstrate the empirical effectiveness of our approach on multiple image classification and lung cancer risk assessment tasks.
arXiv Detail & Related papers (2022-08-25T13:31:09Z)
- Learning to Predict Trustworthiness with Steep Slope Loss [69.40817968905495]
We study the problem of predicting trustworthiness on real-world large-scale datasets.
We observe that trustworthiness predictors trained with prior-art loss functions are prone to viewing both correct and incorrect predictions as trustworthy.
We propose a novel steep slope loss to separate the features w.r.t. correct predictions from the ones w.r.t. incorrect predictions by two slide-like curves that oppose each other.
arXiv Detail & Related papers (2021-09-30T19:19:09Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
However, they are often overconfident, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
- Harnessing Adversarial Distances to Discover High-Confidence Errors [0.0]
We investigate the problem of finding errors at rates greater than expected given model confidence.
We propose a novel, query-efficient search technique guided by adversarial perturbations.
arXiv Detail & Related papers (2020-06-29T13:44:16Z)
- Individual Calibration with Randomized Forecasting [116.2086707626651]
We show that calibration for individual samples is possible in the regression setup if the predictions are randomized.
We design a training objective to enforce individual calibration and use it to train randomized regression functions.
arXiv Detail & Related papers (2020-06-18T05:53:10Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
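Several of the entries above revolve around the gap between a model's reported confidence and its observed accuracy. As a common reference point, here is a minimal sketch of the standard expected calibration error (ECE) metric; it is not the proposed method of any paper listed here, and the binning scheme and function name are illustrative choices.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Weighted average of |accuracy - confidence| over equal-width confidence
    bins. `confidences` are max predicted probabilities; `correct` is a
    boolean array marking whether each prediction was right."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples in the bin
    return ece
```

A well-calibrated model keeps this gap near zero; the high-confidence errors targeted by the main paper are exactly the samples that inflate it in the highest-confidence bins.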