Predicting Out-of-Distribution Error with the Projection Norm
- URL: http://arxiv.org/abs/2202.05834v1
- Date: Fri, 11 Feb 2022 18:58:21 GMT
- Title: Predicting Out-of-Distribution Error with the Projection Norm
- Authors: Yaodong Yu, Zitong Yang, Alexander Wei, Yi Ma, Jacob Steinhardt
- Abstract summary: Projection Norm predicts a model's performance on out-of-distribution data without access to ground truth labels.
We find that Projection Norm is the only approach that achieves non-trivial detection performance on adversarial examples.
- Score: 87.61489137914693
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a metric -- Projection Norm -- to predict a model's performance on
out-of-distribution (OOD) data without access to ground truth labels.
Projection Norm first uses model predictions to pseudo-label test samples and
then trains a new model on the pseudo-labels. The more the new model's
parameters differ from an in-distribution model, the greater the predicted OOD
error. Empirically, our approach outperforms existing methods on both image and
text classification tasks and across different network architectures.
Theoretically, we connect our approach to a bound on the test error for
overparameterized linear models. Furthermore, we find that Projection Norm is
the only approach that achieves non-trivial detection performance on
adversarial examples. Our code is available at
https://github.com/yaodongyu/ProjNorm.
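Read literally, the abstract gives a three-step recipe: pseudo-label the unlabeled test set with the model's own predictions, fine-tune a new model on those pseudo-labels, and report how far its parameters drift from an in-distribution reference model. The sketch below is a minimal PyTorch reconstruction of that recipe under stated assumptions: the `train_step` closure, the number of fine-tuning steps, initializing the new model from the reference model, and the plain Euclidean parameter distance are all illustrative choices, not details taken from the paper (see the linked repository for the authors' reference implementation).
```python
# Minimal sketch of the Projection Norm procedure described in the abstract (assumptions noted below).
import copy
import torch

def projection_norm(model, ref_model, test_batches, train_step, num_steps=500):
    """Predict OOD error from unlabeled test inputs.

    model:        the classifier whose OOD error we want to predict.
    ref_model:    a model trained on in-distribution data (the reference point).
    test_batches: an iterable of unlabeled input batches from the OOD test set.
    train_step:   user-supplied closure performing one fine-tuning step on
                  (inputs, pseudo_labels); its optimizer and loss are assumptions here.
    """
    # 1) Pseudo-label the test set with the model's own predictions.
    model.eval()
    pseudo = []
    with torch.no_grad():
        for inputs in test_batches:
            pseudo.append((inputs, model(inputs).argmax(dim=-1)))

    # 2) Fine-tune a fresh copy on the pseudo-labels.
    #    (Starting from the reference model is an assumption of this sketch.)
    new_model = copy.deepcopy(ref_model)
    new_model.train()
    for step in range(num_steps):
        inputs, labels = pseudo[step % len(pseudo)]
        train_step(new_model, inputs, labels)

    # 3) The further the fine-tuned parameters drift from the reference model's,
    #    the higher the predicted OOD error.
    with torch.no_grad():
        diffs = [(p_new - p_ref).flatten()
                 for p_new, p_ref in zip(new_model.parameters(), ref_model.parameters())]
        return torch.cat(diffs).norm().item()
```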
Related papers
- Deep Limit Model-free Prediction in Regression [0.0]
We provide a model-free approach based on deep neural networks (DNNs) to produce point predictions and prediction intervals under a general regression setting.
Our method is more stable and accurate than other DNN-based counterparts, especially for optimal point predictions.
arXiv Detail & Related papers (2024-08-18T16:37:53Z)
- Evaluating Model Bias Requires Characterizing its Mistakes [19.777130236160712]
We introduce SkewSize, a principled and flexible metric that captures bias from mistakes in a model's predictions.
It can be used in multi-class settings or generalised to the open vocabulary setting of generative models.
We demonstrate the utility of SkewSize in multiple settings including: standard vision models trained on synthetic data, vision models trained on ImageNet, and large scale vision-and-language models from the BLIP-2 family.
arXiv Detail & Related papers (2024-07-15T11:46:21Z)
- MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts [25.643876327918544]
Leveraging the models' outputs, specifically the logits, is a common approach to estimating the test accuracy of a pre-trained neural network on out-of-distribution samples.
Despite their ease of implementation and computational efficiency, current logit-based methods are vulnerable to overconfidence issues, leading to prediction bias.
We propose MaNo, which applies a data-dependent normalization to the logits to reduce prediction bias and takes the $L_p$ norm of the matrix of normalized logits as the estimation score.
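For a sense of what such a logit-based score can look like, here is a hypothetical sketch under stated assumptions: softmax stands in for the paper's data-dependent normalization and p=4 is an arbitrary default, so this should be read as an illustration of the idea rather than as MaNo itself.
```python
# Illustrative logit-based accuracy-estimation score (softmax is only a stand-in
# for a data-dependent normalization; p=4 is an arbitrary choice).
import torch

def logit_matrix_score(logits: torch.Tensor, p: int = 4) -> float:
    """logits: (n_samples, n_classes) outputs of a pre-trained model on unlabeled OOD data."""
    normalized = torch.softmax(logits, dim=-1)          # assumed normalization step
    n, k = normalized.shape
    # Entrywise L_p norm of the n x k matrix of normalized logits, averaged over entries.
    return (normalized.abs().pow(p).sum() / (n * k)).pow(1.0 / p).item()
```
In practice such a score would be calibrated against accuracy on held-out in-distribution data before being used to estimate test accuracy under shift.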
arXiv Detail & Related papers (2024-05-29T10:45:06Z)
- Out-of-Distribution Detection with a Single Unconditional Diffusion Model [54.15132801131365]
Out-of-distribution (OOD) detection is a critical task in machine learning that seeks to identify abnormal samples.
Traditionally, unsupervised methods utilize a deep generative model for OOD detection.
This paper explores whether a single model can perform OOD detection across diverse tasks.
arXiv Detail & Related papers (2024-05-20T08:54:03Z)
- A moment-matching metric for latent variable generative models [0.0]
As Goodhart's law states, when a metric becomes a target it ceases to be a good metric.
We propose a new metric for model comparison or regularization that relies on moments.
It is common to draw samples from the fitted distribution when evaluating latent variable models.
arXiv Detail & Related papers (2021-10-04T17:51:08Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- BENN: Bias Estimation Using Deep Neural Network [37.70583323420925]
We present BENN -- a novel bias estimation method that uses a pretrained unsupervised deep neural network.
Given a ML model and data samples, BENN provides a bias estimation for every feature based on the model's predictions.
We evaluated BENN using three benchmark datasets and one proprietary churn prediction model used by a European Telco.
arXiv Detail & Related papers (2020-12-23T08:25:35Z)
- Positive-Congruent Training: Towards Regression-Free Model Updates [87.25247195148187]
In image classification, sample-wise inconsistencies appear as "negative flips": a new model incorrectly predicts the output for a test sample that was correctly classified by the old (reference) model.
We propose a simple approach for positive-congruent (PC) training, Focal Distillation, which enforces congruence with the reference model.
arXiv Detail & Related papers (2020-11-18T09:00:44Z)
- Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction.
We put forward an alternative measure of anomaly score to replace the reconstruction-based metric.
Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality or accuracy of the information provided and is not responsible for any consequences arising from its use.