Accuracy on the Line: On the Strong Correlation Between
Out-of-Distribution and In-Distribution Generalization
- URL: http://arxiv.org/abs/2107.04649v1
- Date: Fri, 9 Jul 2021 19:48:23 GMT
- Title: Accuracy on the Line: On the Strong Correlation Between
Out-of-Distribution and In-Distribution Generalization
- Authors: John Miller, Rohan Taori, Aditi Raghunathan, Shiori Sagawa, Pang Wei
Koh, Vaishaal Shankar, Percy Liang, Yair Carmon, Ludwig Schmidt
- Abstract summary: We show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts.
Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet.
We also investigate cases where the correlation is weaker, for instance some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS.
- Score: 89.73665256847858
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For machine learning systems to be reliable, we must understand their
performance in unseen, out-of-distribution environments. In this paper, we
empirically show that out-of-distribution performance is strongly correlated
with in-distribution performance for a wide range of models and distribution
shifts. Specifically, we demonstrate strong correlations between
in-distribution and out-of-distribution performance on variants of CIFAR-10 &
ImageNet, a synthetic pose estimation task derived from YCB objects, satellite
imagery classification in FMoW-WILDS, and wildlife classification in
iWildCam-WILDS. The strong correlations hold across model architectures,
hyperparameters, training set size, and training duration, and are more precise
than what is expected from existing domain adaptation theory. To complete the
picture, we also investigate cases where the correlation is weaker, for
instance some synthetic distribution shifts from CIFAR-10-C and the tissue
classification dataset Camelyon17-WILDS. Finally, we provide a candidate theory
based on a Gaussian data model that shows how changes in the data covariance
arising from distribution shift can affect the observed correlations.
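The linear ID/OOD relationship the paper reports holds after a probit transform of accuracies. A minimal sketch of measuring that correlation, using hypothetical accuracy values for a family of models (not results from the paper):

```python
import numpy as np
from scipy.stats import pearsonr
from scipy.special import ndtri  # inverse standard normal CDF (probit)

def probit_correlation(id_acc, ood_acc):
    """Correlate in-distribution and out-of-distribution accuracies
    after a probit transform, under which the paper reports the
    relationship is close to linear."""
    id_p = ndtri(np.clip(id_acc, 1e-6, 1 - 1e-6))    # clip to avoid +/-inf
    ood_p = ndtri(np.clip(ood_acc, 1e-6, 1 - 1e-6))
    r, _ = pearsonr(id_p, ood_p)
    slope, intercept = np.polyfit(id_p, ood_p, 1)
    return r, slope, intercept

# Hypothetical (ID accuracy, OOD accuracy) pairs for illustration only.
id_acc = np.array([0.70, 0.80, 0.88, 0.93, 0.96])
ood_acc = np.array([0.45, 0.58, 0.70, 0.79, 0.86])
r, slope, intercept = probit_correlation(id_acc, ood_acc)
```

The fitted slope and intercept summarize how OOD accuracy moves with ID accuracy across the model family; a correlation near 1 in probit space corresponds to the "accuracy on the line" phenomenon.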
Related papers
- DeCaf: A Causal Decoupling Framework for OOD Generalization on Node Classification [14.96980804513399]

Graph Neural Networks (GNNs) are susceptible to distribution shifts, creating vulnerability and security issues in critical domains.
Existing methods that target learning an invariant (feature, structure)-label mapping often depend on oversimplified assumptions about the data generation process.
We introduce a more realistic graph data generation model using Structural Causal Models (SCMs).
We propose a causal decoupling framework, DeCaf, that independently learns unbiased feature-label and structure-label mappings.
arXiv Detail & Related papers (2024-10-27T00:22:18Z)
- Graph Representation Learning via Causal Diffusion for Out-of-Distribution Recommendation [8.826417093212099]
Graph Neural Networks (GNNs)-based recommendation algorithms assume that training and testing data are drawn from independent and identically distributed spaces.
This assumption often fails in the presence of out-of-distribution (OOD) data, resulting in significant performance degradation.
We propose a novel approach, graph representation learning via causal diffusion (CausalDiffRec) for OOD recommendation.
arXiv Detail & Related papers (2024-08-01T11:51:52Z)
- Quantifying Distribution Shifts and Uncertainties for Enhanced Model Robustness in Machine Learning Applications [0.0]
This study explores model adaptation and generalization by utilizing synthetic data.
We employ quantitative measures such as Kullback-Leibler divergence, Jensen-Shannon distance, and Mahalanobis distance to assess data similarity.
Our findings suggest that utilizing statistical measures, such as the Mahalanobis distance, to determine whether model predictions fall within the low-error "interpolation regime" or the high-error "extrapolation regime" provides a complementary method for assessing distribution shift and model uncertainty.
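The three similarity measures named above are all available in SciPy. A minimal sketch with hypothetical histograms and synthetic training data (the threshold for "far from the training data" is an assumption, not taken from the paper):

```python
import numpy as np
from scipy.stats import entropy
from scipy.spatial.distance import jensenshannon, mahalanobis

# Two discrete distributions over the same bins (hypothetical histograms).
p = np.array([0.1, 0.4, 0.5])
q = np.array([0.3, 0.3, 0.4])

kl = entropy(p, q)        # Kullback-Leibler divergence KL(p || q)
js = jensenshannon(p, q)  # Jensen-Shannon distance (sqrt of JS divergence)

# Mahalanobis distance of a test point from the training data: a large
# value suggests the point lies outside the low-error interpolation regime.
train = np.random.default_rng(0).normal(size=(500, 3))
cov_inv = np.linalg.inv(np.cov(train, rowvar=False))
x = np.array([4.0, 4.0, 4.0])  # a point far from the training mean
d = mahalanobis(x, train.mean(axis=0), cov_inv)
```

In practice the Mahalanobis distance would be computed in a model's feature space rather than raw input space, with a threshold calibrated on held-out in-distribution data.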
arXiv Detail & Related papers (2024-05-03T10:05:31Z)
- Chasing Fairness Under Distribution Shift: A Model Weight Perturbation Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation.
We then analyze the sufficient conditions to guarantee fairness for the target dataset.
Motivated by these sufficient conditions, we propose robust fairness regularization (RFR).
arXiv Detail & Related papers (2023-03-06T17:19:23Z)
- Predicting with Confidence on Unseen Distributions [90.68414180153897]
We connect domain adaptation and predictive uncertainty literature to predict model accuracy on challenging unseen distributions.
We find that the difference of confidences (DoC) of a classifier's predictions successfully estimates the classifier's performance change over a variety of shifts.
We specifically investigate the distinction between synthetic and natural distribution shifts and observe that despite its simplicity DoC consistently outperforms other quantifications of distributional difference.
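The difference-of-confidences idea summarized above can be sketched as follows; the logits here are synthetic placeholders standing in for a real classifier's outputs, and the exact calibration used in the paper may differ:

```python
import numpy as np

def avg_confidence(logits):
    """Average maximum softmax probability over a batch of predictions."""
    z = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs.max(axis=1).mean()

def difference_of_confidences(logits_id, logits_ood):
    """DoC: the drop in average confidence from the in-distribution set to
    the shifted set, used as an estimate of the accuracy drop."""
    return avg_confidence(logits_id) - avg_confidence(logits_ood)

rng = np.random.default_rng(1)
# Hypothetical logits: sharper (more confident) in-distribution, flatter OOD.
logits_id = rng.normal(size=(200, 10)) * 4.0
logits_ood = rng.normal(size=(200, 10)) * 1.0
doc = difference_of_confidences(logits_id, logits_ood)
```

A positive DoC indicates the model is less confident on the shifted data, which the paper finds tracks the actual performance drop across many shifts.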
arXiv Detail & Related papers (2021-07-07T15:50:18Z)
- WILDS: A Benchmark of in-the-Wild Distribution Shifts [157.53410583509924]
Distribution shifts can substantially degrade the accuracy of machine learning systems deployed in the wild.
We present WILDS, a curated collection of 8 benchmark datasets that reflect a diverse range of distribution shifts.
We show that standard training results in substantially lower out-of-distribution performance than in-distribution performance.
arXiv Detail & Related papers (2020-12-14T11:14:56Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- Out-of-distribution Generalization via Partial Feature Decorrelation [72.96261704851683]
We present a novel Partial Feature Decorrelation Learning (PFDL) algorithm, which jointly optimizes a feature decomposition network and the target image classification model.
The experiments on real-world datasets demonstrate that our method can improve the backbone model's accuracy on OOD image classification datasets.
arXiv Detail & Related papers (2020-07-30T05:48:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.