Evaluating Prediction-Time Batch Normalization for Robustness under
Covariate Shift
- URL: http://arxiv.org/abs/2006.10963v3
- Date: Thu, 14 Jan 2021 21:11:06 GMT
- Title: Evaluating Prediction-Time Batch Normalization for Robustness under
Covariate Shift
- Authors: Zachary Nado, Shreyas Padhy, D. Sculley, Alexander D'Amour, Balaji
Lakshminarayanan, Jasper Snoek
- Abstract summary: We present a simple method, which we call prediction-time batch normalization, that significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
- Score: 81.74795324629712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Covariate shift has been shown to sharply degrade both predictive accuracy
and the calibration of uncertainty estimates for deep learning models. This is
worrying, because covariate shift is prevalent in a wide range of real world
deployment settings. However, in this paper, we note that frequently there
exists the potential to access small unlabeled batches of the shifted data just
before prediction time. This interesting observation enables a simple but
surprisingly effective method which we call prediction-time batch
normalization, which significantly improves model accuracy and calibration
under covariate shift. Using this one-line code change, we achieve
state-of-the-art results on recent covariate shift benchmarks and an mCE of 60.28% on
the challenging ImageNet-C dataset; to our knowledge, this is the best result
for any model that does not incorporate additional data augmentation or
modification of the training pipeline. We show that prediction-time batch
normalization provides complementary benefits to existing state-of-the-art
approaches for improving robustness (e.g. deep ensembles) and combining the two
further improves performance. Our findings are supported by detailed
measurements of the effect of this strategy on model behavior across rigorous
ablations on various dataset modalities. However, the method has mixed results
when used alongside pre-training, and does not seem to perform as well under
more natural types of dataset shift, and is therefore worthy of additional
study. We include links to the data in our figures to improve reproducibility,
including a Python notebook that can be run to easily modify our analysis at
https://colab.research.google.com/drive/11N0wDZnMQQuLrRwRoumDCrhSaIhkqjof.
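The "one line code change" referenced above amounts to normalizing each layer with the statistics of the incoming prediction batch instead of the running averages accumulated during training. As a rough illustration, here is a minimal PyTorch sketch of that idea; it is not the authors' implementation (their notebook is linked above), and the function name is ours:

```python
import torch
import torch.nn as nn
import torchvision

def use_prediction_time_bn(model: nn.Module) -> nn.Module:
    """Make every BatchNorm layer normalize with the statistics of the
    batch it receives at prediction time, rather than the running
    mean/variance accumulated during training."""
    model.eval()  # keep dropout and friends in inference mode
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.train()                      # forward pass now uses batch statistics
            m.track_running_stats = False  # and leaves the running stats untouched
    return model

# Toy usage: predictions on a small unlabeled batch of (shifted) inputs.
model = use_prediction_time_bn(torchvision.models.resnet50())
batch = torch.randn(64, 3, 224, 224)  # stand-in for a batch of shifted data
with torch.no_grad():
    logits = model(batch)  # normalization statistics come from `batch` itself
```

Because the normalization statistics are estimated from the prediction batch itself, the quality of the correction depends on having a reasonably sized batch of shifted inputs available at once.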
Related papers
- Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data [39.40116554523575]
We present Drift-Resilient TabPFN, a fresh approach based on In-Context Learning with a Prior-Data Fitted Network.
It learns to approximate Bayesian inference on synthetic datasets drawn from a prior.
It improves accuracy from 0.688 to 0.744 and ROC AUC from 0.786 to 0.832 while maintaining stronger calibration.
arXiv Detail & Related papers (2024-11-15T23:49:23Z)
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A standard post-hoc approach to compensating for miscalibrated neural networks is temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy (a minimal sketch appears after this list).
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z)
- Towards Backwards-Compatible Data with Confounded Domain Adaptation [0.0]
We seek to achieve general-purpose backwards compatibility of data by modifying generalized label shift (GLS).
We present a novel framework for this problem, based on minimizing the expected divergence between the source and target conditional distributions.
We provide concrete implementations using the Gaussian reverse Kullback-Leibler divergence and the maximum mean discrepancy.
arXiv Detail & Related papers (2022-03-23T20:53:55Z)
- Using calibrator to improve robustness in Machine Reading Comprehension [18.844528744164876]
We propose a method to improve robustness by using a calibrator as a post-hoc reranker.
Experimental results on adversarial datasets show that our model improves performance by more than 10%.
arXiv Detail & Related papers (2022-02-24T02:16:42Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (see the sketch after this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework built on a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z)
- Efficient remedies for outlier detection with variational autoencoders [8.80692072928023]
Likelihoods computed by deep generative models are a candidate metric for outlier detection with unlabeled data.
We show that a theoretically grounded correction readily ameliorates a key bias in VAE likelihood estimates.
We also show that the variance of the likelihoods computed over an ensemble of VAEs enables robust outlier detection.
arXiv Detail & Related papers (2021-08-19T16:00:58Z)
- Backward-Compatible Prediction Updates: A Probabilistic Approach [12.049279991559091]
We formalize the Prediction Update Problem and present an efficient probabilistic approach to solving it.
In extensive experiments on standard classification benchmark data sets, we show that our method outperforms alternative strategies for backward-compatible prediction updates.
arXiv Detail & Related papers (2021-07-02T13:05:31Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood-based model selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
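Two of the entries above describe mechanisms concrete enough to sketch. First, for the sample-dependent adaptive temperature scaling paper: a minimal, hypothetical PyTorch sketch in which a small auxiliary network predicts a positive per-input temperature from the logits (the paper's actual parameterization and training recipe may differ):

```python
import torch
import torch.nn as nn

class AdaptiveTemperature(nn.Module):
    """Predict a positive temperature per input and rescale the logits
    before the softmax (hypothetical parameterization)."""
    def __init__(self, num_classes: int, hidden: int = 32):
        super().__init__()
        self.temp_net = nn.Sequential(
            nn.Linear(num_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Softplus(),  # keeps the temperature strictly positive
        )

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        t = self.temp_net(logits) + 1e-3  # guard against near-zero division
        return logits / t                 # per-sample rescaled logits

scaler = AdaptiveTemperature(num_classes=10)
calibrated = scaler(torch.randn(8, 10))  # e.g. a batch of CIFAR10-sized logits
```

The auxiliary network would typically be fit on held-out data by minimizing the negative log-likelihood with the classifier frozen, as in standard temperature scaling.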
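Second, for the Average Thresholded Confidence (ATC) entry: a NumPy sketch of the estimator as the summary describes it, with the threshold chosen on labeled source data so that the fraction of confidences above it matches source accuracy (variable names are ours; the paper also considers other confidence scores):

```python
import numpy as np

def atc_estimate(src_conf: np.ndarray, src_correct: np.ndarray,
                 tgt_conf: np.ndarray) -> float:
    """Estimate target-domain accuracy from unlabeled target confidences."""
    src_acc = src_correct.mean()
    # Threshold chosen so that the fraction of source confidences above it
    # equals the source accuracy.
    threshold = np.quantile(src_conf, 1.0 - src_acc)
    return float((tgt_conf > threshold).mean())

# Toy usage with synthetic confidences.
rng = np.random.default_rng(0)
src_conf = rng.uniform(size=1000)
src_correct = (src_conf > 0.3).astype(float)  # fake correctness labels
tgt_conf = rng.uniform(size=500)
print(atc_estimate(src_conf, src_correct, tgt_conf))
```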