Accuracy on the Curve: On the Nonlinear Correlation of ML Performance
Between Data Subpopulations
- URL: http://arxiv.org/abs/2305.02995v2
- Date: Wed, 31 May 2023 17:41:16 GMT
- Authors: Weixin Liang, Yining Mao, Yongchan Kwon, Xinyu Yang, James Zou
- Abstract summary: We show that correlation between in-distribution (ID) and out-of-distribution (OOD) accuracies is more nuanced under subpopulation shifts.
Our work highlights the importance of understanding the nonlinear effects of model improvement on performance in different subpopulations.
- Score: 24.579430688134185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the performance of machine learning (ML) models across diverse
data distributions is critically important for reliable applications. Despite
recent empirical studies positing a near-perfect linear correlation between
in-distribution (ID) and out-of-distribution (OOD) accuracies, we empirically
demonstrate that this correlation is more nuanced under subpopulation shifts.
Through rigorous experimentation and analysis across a variety of datasets,
models, and training epochs, we demonstrate that OOD performance often has a
nonlinear correlation with ID performance in subpopulation shifts. Our
findings, which contrast with previous studies that have posited a linear
correlation in model performance during distribution shifts, reveal a "moon
shape" correlation (parabolic uptrend curve) between the test performance on
the majority subpopulation and the minority subpopulation. This non-trivial
nonlinear correlation holds across model architectures, hyperparameters,
training durations, and the imbalance between subpopulations. Furthermore, we
found that the nonlinearity of this "moon shape" is causally influenced by the
degree of spurious correlations in the training data. Our controlled
experiments show that a stronger spurious correlation in the training data
creates a more nonlinear performance correlation. We provide complementary
experimental and theoretical analyses for this phenomenon, and discuss its
implications for ML reliability and fairness. Our work highlights the
importance of understanding the nonlinear effects of model improvement on
performance in different subpopulations, and has the potential to inform the
development of more equitable and responsible machine learning models.
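The central claim lends itself to a small simulation: train a model on data where a strong spurious feature agrees with the label for the majority subpopulation but not the minority, and track both groups' test accuracies across checkpoints. The sketch below is a minimal illustration under assumed choices (the feature model, `spurious_strength`, and the logistic learner are all made up for the demo), not a reproduction of the paper's experiments.

```python
# Toy simulation of majority/minority test accuracy under a subpopulation
# shift with a spurious feature. All names and constants are illustrative
# assumptions, not the paper's setup.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def make_data(n, spurious_strength):
    """A core feature always agrees with the label; a stronger spurious
    feature agrees only with probability `spurious_strength`. Examples
    where it agrees form the majority subpopulation."""
    y = rng.choice([-1, 1], size=n)
    agree = rng.random(n) < spurious_strength            # majority mask
    x_core = y * 1.0 + rng.normal(size=n)                # weak, reliable
    x_spur = np.where(agree, y, -y) * 2.0 + rng.normal(size=n)  # strong, spurious
    return np.column_stack([x_core, x_spur]), y, agree

X_tr, y_tr, _ = make_data(20_000, spurious_strength=0.95)
X_te, y_te, majority = make_data(20_000, spurious_strength=0.95)

clf = SGDClassifier(loss="log_loss", learning_rate="constant",
                    eta0=1e-3, alpha=1e-6, random_state=0)
for epoch in range(30):
    clf.partial_fit(X_tr, y_tr, classes=np.array([-1, 1]))
    pred = clf.predict(X_te)
    maj = (pred[majority] == y_te[majority]).mean()
    mnr = (pred[~majority] == y_te[~majority]).mean()
    if epoch % 5 == 0 or epoch == 29:
        print(f"epoch {epoch:2d}  majority acc {maj:.3f}  minority acc {mnr:.3f}")
# Scattering (majority acc, minority acc) pairs across many such
# checkpoints and model variants is what traces the "moon shape".
```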
Related papers
- Random Features Outperform Linear Models: Effect of Strong Input-Label Correlation in Spiked Covariance Data [0.8287206589886879]
We show that a high correlation between inputs and labels is a critical factor enabling the RFM to outperform linear models.
We show that the RFM performs equivalently to noisy models, where the degree depends on the strength of the correlation between inputs and labels.
arXiv Detail & Related papers (2024-09-30T12:40:45Z)
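A hedged sketch of the comparison this entry describes: fit a fixed random-feature model and a linear model on data whose inputs carry a strong correlation with the target along a single spiked direction. The spike construction, widths, and tanh link below are assumptions chosen to make the effect visible, not the cited paper's exact setting.

```python
# Generic random-feature-model (RFM) vs. linear-model comparison on data
# with a strong input-label correlation along a spiked direction.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
d, n_train, n_test, n_features = 50, 2_000, 2_000, 1_000

u = rng.normal(size=d)
u /= np.linalg.norm(u)                      # spike direction

def make_data(n, corr=2.0):
    z = rng.normal(size=(n, d))             # isotropic bulk
    s = rng.normal(size=n)                  # latent signal
    x = z + corr * np.outer(s, u)           # inputs spiked along u
    y = np.tanh(x @ u)                      # nonlinear target: linear model is misspecified
    return x, y

X_tr, y_tr = make_data(n_train)
X_te, y_te = make_data(n_test)

lin = Ridge(alpha=1.0).fit(X_tr, y_tr)      # linear baseline

W = rng.normal(size=(d, n_features)) / np.sqrt(d)
phi = lambda X: np.maximum(X @ W, 0.0)      # fixed random ReLU features
rfm = Ridge(alpha=1.0).fit(phi(X_tr), y_tr)

print("linear R^2:", round(r2_score(y_te, lin.predict(X_te)), 3))
print("RFM    R^2:", round(r2_score(y_te, rfm.predict(phi(X_te))), 3))
```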
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to reweight the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
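The reweighting idea in the entry above can be illustrated on a single binary feature: choose example weights so that the weighted joint distribution of (feature, label) factorizes, driving their correlation to zero. The paper optimizes weights over thousands of features simultaneously; this closed-form, one-feature version is only a simplified illustration.

```python
# Decorrelate one spurious lexical feature from the label by reweighting
# training examples (simplified, single-feature illustration).
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
y = rng.integers(0, 2, size=n)                       # binary label
# Spurious feature: present 80% of the time when y=1, 20% when y=0.
f = (rng.random(n) < np.where(y == 1, 0.8, 0.2)).astype(int)

def weighted_corr(a, b, w):
    w = w / w.sum()
    a_c = a - (w * a).sum()
    b_c = b - (w * b).sum()
    return (w * a_c * b_c).sum() / np.sqrt((w * a_c**2).sum() * (w * b_c**2).sum())

print("correlation before:", round(weighted_corr(f, y, np.ones(n)), 3))

# Weight each (feature, label) cell by p(f)p(y)/p(f,y) so the weighted
# joint distribution factorizes, zeroing the correlation.
w = np.ones(n)
for fv in (0, 1):
    for yv in (0, 1):
        cell = (f == fv) & (y == yv)
        w[cell] = (f == fv).mean() * (y == yv).mean() / cell.mean()

print("correlation after: ", round(weighted_corr(f, y, w), 3))
```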
- On the Importance of Feature Separability in Predicting Out-Of-Distribution Error [25.995311155942016]
We propose a dataset-level score based upon feature dispersion to estimate the test accuracy under distribution shift.
Our method is inspired by desirable properties of features in representation learning: high inter-class dispersion and high intra-class compactness.
arXiv Detail & Related papers (2023-03-27T09:52:59Z)
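One natural way to instantiate the separability idea above: score a feature set by the ratio of inter-class dispersion (spread of class centroids) to intra-class compactness (average within-class spread). The function below is a generic sketch of that ratio on synthetic features; the paper's actual dispersion score, and how it is applied to unlabeled shifted data, may differ.

```python
# Dataset-level separability score: inter-class dispersion over
# intra-class compactness, computed on model features.
import numpy as np

def separability_score(feats: np.ndarray, labels: np.ndarray) -> float:
    classes = np.unique(labels)
    centroids = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    global_mean = feats.mean(axis=0)
    inter = np.mean(np.sum((centroids - global_mean) ** 2, axis=1))
    intra = np.mean([
        np.sum((feats[labels == c] - centroids[i]) ** 2, axis=1).mean()
        for i, c in enumerate(classes)
    ])
    return inter / intra        # higher -> features more separable

# Toy check: two well-separated Gaussian blobs score high.
rng = np.random.default_rng(0)
feats = np.concatenate([rng.normal(0, 1, (500, 64)),
                        rng.normal(3, 1, (500, 64))])
labels = np.repeat([0, 1], 500)
print(round(separability_score(feats, labels), 3))
```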
- Causal Inference via Nonlinear Variable Decorrelation for Healthcare Applications [60.26261850082012]
We introduce a novel method with a variable decorrelation regularizer to handle both linear and nonlinear confounding.
We mine association rules from the original features and use them as new representations to increase model interpretability.
arXiv Detail & Related papers (2022-09-29T17:44:14Z)
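A generic way to penalize nonlinear dependence, sketched below: map each variable through random Fourier features, where linear cross-covariance captures nonlinear correlation of the originals, and use its magnitude as a regularizer. This is an assumed stand-in for "a variable decorrelation regularizer", not necessarily the paper's construction.

```python
# Nonlinear-dependence penalty via random Fourier features (RFF): linear
# cross-covariance of RFF maps detects nonlinear correlation. Illustrative
# assumption, not necessarily the cited paper's regularizer.
import numpy as np

rng = np.random.default_rng(0)

def rff(x, n_feat=32, scale=1.0):
    x = x.reshape(-1, 1)
    w = rng.normal(scale=scale, size=(1, n_feat))
    b = rng.uniform(0, 2 * np.pi, size=n_feat)
    return np.sqrt(2.0 / n_feat) * np.cos(x @ w + b)

def decorrelation_penalty(a, b):
    """Squared Frobenius norm of the cross-covariance of RFF maps."""
    fa, fb = rff(a), rff(b)
    fa -= fa.mean(axis=0)
    fb -= fb.mean(axis=0)
    cov = fa.T @ fb / len(a)
    return np.sum(cov ** 2)

x = rng.normal(size=5_000)
print("independent:  ", round(decorrelation_penalty(x, rng.normal(size=5_000)), 4))
print("nonlinear dep:", round(decorrelation_penalty(x, x ** 2), 4))  # larger
```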
- How robust are pre-trained models to distribution shift? [82.08946007821184]
We show how spurious correlations affect the performance of popular self-supervised learning (SSL) and auto-encoder (AE) based models.
We develop a novel evaluation scheme in which the linear head is trained on out-of-distribution (OOD) data, isolating the performance of the pre-trained models from any bias of the linear head used for evaluation.
arXiv Detail & Related papers (2022-06-17T16:18:28Z)
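The evaluation scheme above is easy to sketch: freeze the pre-trained encoder and fit the linear head on one half of the OOD data, scoring on the other half, so representation quality is measured without the head's ID bias. Everything below (the `encoder` placeholder, the 50/50 split) is an assumed protocol detail for illustration.

```python
# Linear-head evaluation on OOD data with a frozen encoder.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def linear_head_ood_score(encoder, X_ood, y_ood, seed=0):
    feats = encoder(X_ood)                          # frozen features
    f_tr, f_te, y_tr, y_te = train_test_split(
        feats, y_ood, test_size=0.5, random_state=seed, stratify=y_ood)
    head = LogisticRegression(max_iter=2_000).fit(f_tr, y_tr)
    return head.score(f_te, y_te)

# Toy check with a random-projection "encoder".
rng = np.random.default_rng(0)
W = rng.normal(size=(20, 64))
X = rng.normal(size=(2_000, 20))
y = (X[:, 0] > 0).astype(int)
print(round(linear_head_ood_score(lambda X: X @ W, X, y), 3))
```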
- Improving Prediction of Cognitive Performance using Deep Neural Networks in Sparse Data [2.867517731896504]
We used data from an observational cohort study, Midlife in the United States (MIDUS), to model executive function and episodic memory measures.
Deep neural network (DNN) models consistently ranked highest in all of the cognitive performance prediction tasks.
arXiv Detail & Related papers (2021-12-28T22:23:08Z)
- Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization [89.73665256847858]
We show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts.
Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet.
We also investigate cases where the correlation is weaker, for instance, some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS.
arXiv Detail & Related papers (2021-07-09T19:48:23Z)
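The "accuracy on the line" analysis regresses OOD accuracy on ID accuracy across many models after probit-scaling both axes (the paper reports linear trends on probit-scaled accuracies). A sketch with made-up accuracy values follows.

```python
# Probit-scaled linear fit of OOD vs. ID accuracy across models.
# The accuracy values below are hypothetical, purely to demo the fit.
import numpy as np
from scipy.stats import norm, pearsonr

id_acc  = np.array([0.72, 0.80, 0.85, 0.90, 0.93, 0.95])   # hypothetical models
ood_acc = np.array([0.45, 0.55, 0.62, 0.70, 0.75, 0.79])

p_id, p_ood = norm.ppf(id_acc), norm.ppf(ood_acc)           # probit scale
slope, intercept = np.polyfit(p_id, p_ood, 1)
r, _ = pearsonr(p_id, p_ood)
print(f"probit fit: slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")

# Predict a new model's OOD accuracy from its ID accuracy:
print("predicted OOD acc at ID=0.97:",
      round(norm.cdf(slope * norm.ppf(0.97) + intercept), 3))
```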
- On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z)
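One simple reading of the post-hoc correction with a small number of labels: fit a linear map from the correlated latents back to the ground-truth factors on a small labeled subset. The generative setup and the purely linear corrector below are illustrative assumptions, not the paper's procedure.

```python
# Post-hoc correction of correlated latents from a few factor labels.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n, n_labeled = 5_000, 100

# Two ground-truth factors, correlated in the training data.
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
factors = rng.multivariate_normal([0, 0], cov, size=n)

# An "encoder" that mixes the factors -> entangled, correlated latents.
mix = np.array([[0.9, 0.5], [-0.4, 1.1]])
latents = factors @ mix.T + 0.05 * rng.normal(size=(n, 2))

# Fit the corrector on a small labeled subset, apply everywhere.
idx = rng.choice(n, size=n_labeled, replace=False)
corrector = LinearRegression().fit(latents[idx], factors[idx])
recovered = corrector.predict(latents)

err = np.abs(recovered - factors).mean()
print("mean abs factor error after correction:", round(err, 3))
```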
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
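Feature averaging is straightforward to sketch: average the model's features over a set of transformed views of each input, which makes the averaged features exactly invariant when the transformations form a group. The toy model and the sign-flip "augmentations" below are stand-ins for illustration, not the cited paper's setup.

```python
# Feature averaging over augmentations: averaged features are exactly
# invariant under the augmentation group.
import numpy as np

rng = np.random.default_rng(0)

def averaged_features(model, x, augmentations):
    """Mean of the model's features over all augmented views of x."""
    return np.mean([model(aug(x)) for aug in augmentations], axis=0)

W = rng.normal(size=(3, 8))
model = lambda x: np.maximum(x @ W, 0.0)     # random ReLU features
augs = [lambda x: x, lambda x: -x]           # the sign-flip group

x = rng.normal(size=(4, 3))
h = averaged_features(model, x, augs)
# Invariance check: averaging yields identical features for x and -x.
print(np.allclose(h, averaged_features(model, -x, augs)))   # True
```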