An Investigation of Why Overparameterization Exacerbates Spurious
Correlations
- URL: http://arxiv.org/abs/2005.04345v3
- Date: Wed, 26 Aug 2020 19:32:58 GMT
- Title: An Investigation of Why Overparameterization Exacerbates Spurious
Correlations
- Authors: Shiori Sagawa, Aditi Raghunathan, Pang Wei Koh, Percy Liang
- Abstract summary: We identify two key properties of the training data that drive this behavior.
We show how the inductive bias of models towards "memorizing" fewer examples can cause overparameterization to hurt.
- Score: 98.3066727301239
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study why overparameterization -- increasing model size well beyond the
point of zero training error -- can hurt test error on minority groups despite
improving average test error when there are spurious correlations in the data.
Through simulations and experiments on two image datasets, we identify two key
properties of the training data that drive this behavior: the proportions of
majority versus minority groups, and the signal-to-noise ratio of the spurious
correlations. We then analyze a linear setting and theoretically show how the
inductive bias of models towards "memorizing" fewer examples can cause
overparameterization to hurt. Our analysis leads to a counterintuitive approach
of subsampling the majority group, which empirically achieves low minority
error in the overparameterized regime, even though the standard approach of
upweighting the minority fails. Overall, our results suggest a tension between
using overparameterized models versus using all the training data for achieving
low worst-group error.
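The subsampling approach described in the abstract balances the training data by shrinking the majority group down to the minority group's size, rather than upweighting the minority. A minimal sketch of such group-balanced subsampling (function and variable names are illustrative, not from the paper's code):

```python
import random

def subsample_majority(examples, group_of, seed=0):
    """Subsample every group down to the size of the smallest group.

    `examples` is a list of training examples; `group_of(x)` maps an
    example to its group label (e.g. the combination of class label and
    spurious attribute). Both names are illustrative assumptions.
    """
    rng = random.Random(seed)
    by_group = {}
    for x in examples:
        by_group.setdefault(group_of(x), []).append(x)
    # Size of the smallest group determines the per-group budget.
    n_min = min(len(members) for members in by_group.values())
    balanced = []
    for members in by_group.values():
        balanced.extend(rng.sample(members, n_min))
    rng.shuffle(balanced)
    return balanced
```

The resulting group-balanced dataset discards most majority-group examples, which is exactly the tension the abstract notes: overparameterized models can achieve low worst-group error this way, but only by not using all of the training data.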
Related papers
- Adversarial Reweighting Guided by Wasserstein Distance for Bias
Mitigation [24.160692009892088]
Under-representation of minorities in the data makes the disparate treatment of subpopulations difficult to deal with during learning.
We propose a novel adversarial reweighting method to address such representation bias.
arXiv Detail & Related papers (2023-11-21T15:46:11Z)
- How does overparametrization affect performance on minority groups? [39.54853544590893]
In a setting in which the regression functions for the majority and minority groups are different, we show that overparameterization always improves minority group performance.
arXiv Detail & Related papers (2022-06-07T18:00:52Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how combining recent results on equivariant representation learning over structured spaces with classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Parameters or Privacy: A Provable Tradeoff Between Overparameterization and Membership Inference [29.743945643424553]
Overparameterized models generalize well (small error on the test data) even when trained to memorize the training data (zero error on the training data). This has led to an arms race towards increasingly overparameterized models (cf. deep learning).
arXiv Detail & Related papers (2022-02-02T19:00:21Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is bias only in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
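The group DRO objective mentioned above targets the worst-performing group rather than the average: it minimizes the maximum of the per-group average losses. A minimal sketch of that objective (variable names are illustrative, not from any released implementation):

```python
import numpy as np

def worst_group_loss(losses, groups):
    """Group DRO objective: the maximum over groups of the average
    per-example loss within that group.

    `losses` holds per-example losses; `groups` holds the group label
    of each example. Both argument names are illustrative assumptions.
    """
    losses = np.asarray(losses, dtype=float)
    groups = np.asarray(groups)
    # Average the loss within each group, then take the worst group.
    group_means = [losses[groups == g].mean() for g in np.unique(groups)]
    return max(group_means)
```

The paper's caveat applies here: this objective can only protect against spurious correlations that the chosen group labels actually capture; if the groups do not account for a spurious feature, minimizing this quantity may still fail.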
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
- Memorizing without overfitting: Bias, variance, and interpolation in over-parameterized models [0.0]
The bias-variance trade-off is a central concept in supervised learning.
Modern deep learning methods flout this dogma, achieving state-of-the-art performance.
arXiv Detail & Related papers (2020-10-26T22:31:04Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Compressing Large Sample Data for Discriminant Analysis [78.12073412066698]
We consider the computational issues that arise from large sample sizes within the discriminant analysis framework.
We propose a new compression approach for reducing the number of training samples for linear and quadratic discriminant analysis.
arXiv Detail & Related papers (2020-05-08T05:09:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.