Data Augmentation in the Underparameterized and Overparameterized Regimes
- URL: http://arxiv.org/abs/2202.09134v3
- Date: Thu, 28 Sep 2023 17:44:51 GMT
- Title: Data Augmentation in the Underparameterized and Overparameterized Regimes
- Authors: Kevin Han Huang, Peter Orbanz, Morgane Austern
- Abstract summary: We quantify how data augmentation affects the variance and limiting distribution of estimates.
The results confirm some observations made in machine learning practice, but also lead to unexpected findings.
- Score: 7.326504492614808
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We provide results that exactly quantify how data augmentation affects the
variance and limiting distribution of estimates, and analyze several specific
models in detail. The results confirm some observations made in machine
learning practice, but also lead to unexpected findings: Data augmentation may
increase rather than decrease the uncertainty of estimates, such as the
empirical prediction risk. It can act as a regularizer, but fails to do so in
certain high-dimensional problems, and it may shift the double-descent peak of
an empirical risk. Overall, the analysis shows that several properties
attributed to data augmentation are neither simply true nor false, but rather
depend on a combination of factors -- notably the data distribution, the
properties of the estimator, and the interplay of sample size, number of
augmentations, and dimension. Our main theoretical tool is a limit theorem for
functions of randomly transformed, high-dimensional random vectors. The proof
draws on work in probability on noise stability of functions of many variables.
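As a rough illustration of the abstract's first finding, the following minimal numpy sketch (our own, not from the paper) evaluates a fixed linear predictor with and without Gaussian input-noise augmentation and compares the Monte Carlo variance of the two empirical-risk estimates; the sample size n, augmentation count k, and noise scale are arbitrary choices.

```python
# Minimal Monte Carlo sketch: augmenting an empirical risk estimate can
# raise, not lower, its variance when the number of augmentations is small.
import numpy as np

rng = np.random.default_rng(0)
n, d, k, sigma_aug, reps = 50, 10, 2, 0.5, 2000
beta = np.ones(d) / np.sqrt(d)          # fixed predictor being evaluated

plain, augmented = [], []
for _ in range(reps):
    X = rng.standard_normal((n, d))
    y = X @ beta + rng.standard_normal(n)
    plain.append(np.mean((y - X @ beta) ** 2))
    # k augmentations per sample: add isotropic Gaussian noise to the inputs
    Xa = X[None, :, :] + sigma_aug * rng.standard_normal((k, n, d))
    augmented.append(np.mean((y[None, :] - Xa @ beta) ** 2))

print("var(plain risk)    :", np.var(plain))
print("var(augmented risk):", np.var(augmented))
```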
Related papers
- Universality of High-Dimensional Logistic Regression and a Novel CGMT under Dependence with Applications to Data Augmentation [6.092792437962955]
We prove that Gaussian universality still holds for high-dimensional logistic regression under block dependence.
We establish the impact of data augmentation, a widespread practice in deep learning, on the risk.
arXiv Detail & Related papers (2025-02-10T18:04:53Z)
- Evidential time-to-event prediction model with well-calibrated uncertainty estimation [12.446406577462069]
We introduce an evidential regression model designed especially for time-to-event prediction tasks.
The most plausible event time is directly quantified by aggregated Gaussian random fuzzy numbers (GRFNs).
Our model achieves both accurate and reliable performance, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2024-11-12T15:06:04Z)
- The Breakdown of Gaussian Universality in Classification of High-dimensional Linear Factor Mixtures [6.863637695977277]
We provide a high-dimensional characterization of empirical risk minimization for classification under a general mixture data setting.
We specify conditions under which Gaussian universality holds, characterize the impact of its breakdown, and discuss the implications for the choice of loss function.
arXiv Detail & Related papers (2024-10-08T01:45:37Z)
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We provide exact results for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We demonstrate that in this setting, the generalized cross validation estimator (GCV) fails to correctly predict the out-of-sample risk.
We further extend our analysis to the case where the test point has nontrivial correlations with the training set, a setting often encountered in time series forecasting.
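A self-contained sketch of the effect described above, under assumed AR(1)-correlated noise and a fixed ridge penalty (both our choices, not necessarily the paper's setting): compute the standard GCV statistic and compare it with the risk on an independent test set.

```python
# Sketch: GCV vs. actual out-of-sample risk of ridge under correlated noise.
import numpy as np

rng = np.random.default_rng(1)
n, d, lam, rho = 200, 50, 1.0, 0.9
beta = rng.standard_normal(d) / np.sqrt(d)

# AR(1) noise covariance: corr(eps_i, eps_j) = rho^|i-j|
C = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
L = np.linalg.cholesky(C)

X = rng.standard_normal((n, d))
y = X @ beta + L @ rng.standard_normal(n)

S = X @ np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)   # ridge hat matrix
resid = y - S @ y
gcv = np.mean(resid ** 2) / (1 - np.trace(S) / n) ** 2

beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
Xt = rng.standard_normal((2000, d))                       # independent test set
yt = Xt @ beta + rng.standard_normal(2000)
print("GCV estimate       :", gcv)
print("out-of-sample risk :", np.mean((yt - Xt @ beta_hat) ** 2))
```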
arXiv Detail & Related papers (2024-08-08T17:27:29Z)
- High-dimensional analysis of ridge regression for non-identically distributed data with a variance profile [0.0]
We study the predictive risk of the ridge estimator for linear regression with a variance profile.
For a certain class of variance profiles, our work highlights the emergence of the well-known double descent phenomenon.
We also investigate the similarities and differences that exist with the standard setting of independent and identically distributed data.
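The double descent phenomenon mentioned here is easy to reproduce in a simplified i.i.d. setting (not the paper's variance-profile model): the test risk of the minimum-norm least-squares interpolator peaks near d = n.

```python
# Sketch: the double-descent peak of min-norm least squares near d = n.
import numpy as np

rng = np.random.default_rng(2)
n, n_test = 100, 2000
for d in [20, 50, 90, 100, 110, 200, 400]:
    risks = []
    for _ in range(20):
        beta = rng.standard_normal(d) / np.sqrt(d)
        X = rng.standard_normal((n, d))
        y = X @ beta + 0.5 * rng.standard_normal(n)
        b = np.linalg.pinv(X) @ y            # minimum-norm interpolator
        Xt = rng.standard_normal((n_test, d))
        yt = Xt @ beta + 0.5 * rng.standard_normal(n_test)
        risks.append(np.mean((yt - Xt @ b) ** 2))
    print(f"d={d:4d}  test risk ~ {np.mean(risks):.2f}")
```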
arXiv Detail & Related papers (2024-03-29T14:24:49Z)
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed procedure accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
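A hedged sketch of this kind of abstention rule, with a plain k-NN residual-variance estimate and a crude standard-error margin standing in for the paper's formal hypothesis test:

```python
# Abstain wherever the estimated conditional variance (plus a rough
# uncertainty margin for the variance estimate itself) exceeds a threshold.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(3)
n = 1000
x = rng.uniform(-2, 2, size=(n, 1))
y = np.sin(3 * x[:, 0]) + np.abs(x[:, 0]) * rng.standard_normal(n)  # heteroscedastic

mean_model = KNeighborsRegressor(n_neighbors=25).fit(x, y)
resid2 = (y - mean_model.predict(x)) ** 2
var_model = KNeighborsRegressor(n_neighbors=25).fit(x, resid2)

x_new = np.linspace(-2, 2, 9).reshape(-1, 1)
var_hat = var_model.predict(x_new)
margin = var_hat / np.sqrt(25)        # crude std-error heuristic, not the paper's test
threshold = 1.0
for xi, v, m in zip(x_new[:, 0], var_hat, margin):
    verdict = "abstain" if v + m > threshold else f"predict {mean_model.predict([[xi]])[0]:+.2f}"
    print(f"x={xi:+.2f}  var_hat={v:.2f}  -> {verdict}")
```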
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
- Toward Robust Uncertainty Estimation with Random Activation Functions [3.0586855806896045]
We propose a novel approach for uncertainty quantification via ensembles, called Random Activation Functions (RAFs) Ensemble.
RAFs Ensemble outperforms state-of-the-art ensemble uncertainty quantification methods on both synthetic and real-world datasets.
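A minimal version of the idea, assuming scikit-learn MLPs as ensemble members (the paper's architecture and training details may differ):

```python
# Random-activation-function ensemble: each member draws its activation at
# random, and disagreement across members serves as the uncertainty signal.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
x = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(x[:, 0]) + 0.2 * rng.standard_normal(400)

activations = ["relu", "tanh", "logistic", "identity"]
ensemble = [
    MLPRegressor(hidden_layer_sizes=(64,), activation=rng.choice(activations).item(),
                 max_iter=2000, random_state=i).fit(x, y)
    for i in range(8)
]

x_new = np.array([[-2.0], [0.0], [2.0], [5.0]])     # last point is out of range
preds = np.stack([m.predict(x_new) for m in ensemble])
print("mean prediction:", preds.mean(axis=0))
print("ensemble std   :", preds.std(axis=0))        # larger where members disagree
```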
arXiv Detail & Related papers (2023-02-28T13:17:56Z)
- Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes [52.92110730286403]
It is commonly believed that the marginal likelihood should be reminiscent of cross-validation metrics and that both should deteriorate with larger input dimensions.
We prove that by tuning hyperparameters, the performance, as measured by the marginal likelihood, improves monotonically with the input dimension.
We also prove that cross-validation metrics exhibit qualitatively different behavior that is characteristic of double descent.
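The two diagnostics are easy to track side by side; a small scikit-learn sketch (synthetic linear data, our choice) computes the tuned log marginal likelihood and a cross-validation score as the input dimension grows:

```python
# Track the (hyperparameter-tuned) GP log marginal likelihood and a CV score
# across input dimensions; the paper proves they can disagree qualitatively.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n = 80
for d in [1, 5, 20, 60]:
    X = rng.standard_normal((n, d))
    w = rng.standard_normal(d) / np.sqrt(d)
    y = X @ w + 0.3 * rng.standard_normal(n)
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(X, y)                              # fit also tunes kernel hyperparameters
    lml = gp.log_marginal_likelihood_value_ / n
    cv = cross_val_score(gp, X, y, cv=5, scoring="neg_mean_squared_error").mean()
    print(f"d={d:3d}  LML/n={lml:+.3f}  CV(-MSE)={cv:+.3f}")
```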
arXiv Detail & Related papers (2022-10-14T08:09:33Z)
- Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
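The core device admits a very small numeric illustration: for the mean functional, the finite-difference Gateaux derivative along an epsilon-contamination should recover the classical influence function x - mean.

```python
# Approximate the Gateaux derivative (influence function) of a statistical
# functional by finite differencing along an epsilon-contamination.
import numpy as np

def T(sample_values, weights):
    """Example functional: the mean under a weighted empirical distribution."""
    return np.average(sample_values, weights=weights)

rng = np.random.default_rng(6)
z = rng.standard_normal(500)
n, eps = len(z), 1e-4

for x in [-2.0, 0.0, 2.0]:
    # contaminated distribution (1 - eps) * P_n + eps * delta_x
    vals = np.append(z, x)
    w = np.append(np.full(n, (1 - eps) / n), eps)
    gateaux = (T(vals, w) - T(z, np.full(n, 1.0 / n))) / eps
    print(f"x={x:+.1f}  finite-diff IF ~ {gateaux:+.3f}  exact x - mean = {x - z.mean():+.3f}")
```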
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show that combining recent results on equivariant representation learning over structured spaces with a simple application of classical causal inference results provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
Estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
- Variance Minimization in the Wasserstein Space for Invariant Causal Prediction [72.13445677280792]
In this work, we show that the approach taken in ICP may be reformulated as a series of nonparametric tests that scales linearly in the number of predictors.
Each of these tests relies on the minimization of a novel loss function that is derived from tools in optimal transport theory.
We prove under mild assumptions that our method is able to recover the set of identifiable direct causes, and we demonstrate in our experiments that it is competitive with other benchmark causal discovery algorithms.
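A toy version of the testing recipe, with a Kolmogorov-Smirnov two-sample test standing in for the paper's Wasserstein-based loss (the environments, linear model, and 0.05 cutoff are illustrative assumptions):

```python
# ICP-style recipe: for each predictor subset, regress on the pooled data and
# test whether the residual distribution is invariant across environments.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(7)

def make_env(shift, n=500):
    x1 = rng.standard_normal(n) + shift     # intervened predictor
    y = 2.0 * x1 + rng.standard_normal(n)   # true cause: x1
    x2 = y + rng.standard_normal(n)         # effect of y, not a cause
    return np.column_stack([x1, x2]), y

(X0, y0), (X1, y1) = make_env(0.0), make_env(2.0)

for S in ([0], [1], [0, 1]):
    Xp, yp = np.vstack([X0, X1])[:, S], np.concatenate([y0, y1])
    coef, *_ = np.linalg.lstsq(Xp, yp, rcond=None)
    r0, r1 = y0 - X0[:, S] @ coef, y1 - X1[:, S] @ coef
    p = ks_2samp(r0, r1).pvalue
    print(f"S={S}: p={p:.3f}", "(invariant)" if p > 0.05 else "(rejected)")
```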
arXiv Detail & Related papers (2021-10-13T22:30:47Z)
- BayesIMP: Uncertainty Quantification for Causal Data Fusion [52.184885680729224]
We study the causal data fusion problem, where datasets pertaining to multiple causal graphs are combined to estimate the average treatment effect of a target variable.
We introduce a framework which combines ideas from probabilistic integration and kernel mean embeddings to represent interventional distributions in the reproducing kernel Hilbert space.
arXiv Detail & Related papers (2021-06-07T10:14:18Z)
- Aleatoric uncertainty for Errors-in-Variables models in deep regression [0.48733623015338234]
We show how the concept of Errors-in-Variables can be used in Bayesian deep regression.
We discuss the approach along various simulated and real examples.
arXiv Detail & Related papers (2021-05-19T12:37:02Z)
- Deconfounded Score Method: Scoring DAGs with Dense Unobserved Confounding [101.35070661471124]
We show that unobserved confounding leaves a characteristic footprint in the observed data distribution that allows for disentangling spurious and causal effects.
We propose an adjusted score-based causal discovery algorithm that may be implemented with general-purpose solvers and scales to high-dimensional problems.
arXiv Detail & Related papers (2021-03-28T11:07:59Z)
- Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
- Identification of Latent Variables From Graphical Model Residuals [0.0]
We present a novel method to control for the latent space when estimating a DAG by iteratively deriving proxies for the latent space from the residuals of the inferred model.
We show that any improvement of prediction of an outcome is intrinsically capped and cannot rise beyond a certain limit as compared to the confounded model.
arXiv Detail & Related papers (2021-01-07T02:28:49Z)
- Information Theory Measures via Multidimensional Gaussianization [7.788961560607993]
Information theory is an outstanding framework to measure uncertainty, dependence and relevance in data and systems.
It has several desirable properties for real world applications.
However, obtaining information from multidimensional data is a challenging problem due to the curse of dimensionality.
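A one-step sketch of the Gaussianization idea: map each marginal to a standard normal via ranks and read off a mutual-information estimate from the Gaussianized correlation, which is exact for Gaussian copulas (the paper's iterative method handles general distributions):

```python
# Marginal Gaussianization turns a Gaussian-copula MI problem into reading
# off -0.5 * log(1 - r^2) from the correlation of the Gaussianized data.
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(9)
n, rho = 5000, 0.8
z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)
x = np.column_stack([np.exp(z[:, 0]), z[:, 1] ** 3])   # nonlinear marginals

g = norm.ppf(rankdata(x, axis=0) / (n + 1))            # marginal Gaussianization
r = np.corrcoef(g.T)[0, 1]
print("MI estimate:", -0.5 * np.log(1 - r ** 2))
print("true MI    :", -0.5 * np.log(1 - rho ** 2))
```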
arXiv Detail & Related papers (2020-10-08T07:22:16Z)
- On Data Augmentation and Adversarial Risk: An Empirical Analysis [9.586672294115075]
We analyse the effect of different data augmentation techniques on the adversarial risk using three measures.
We disprove the hypothesis that an improvement in the classification performance induced by a data augmentation is always accompanied by an improvement in the risk under adversarial attack.
Our results reveal that the augmented data has more influence on the resulting models than the non-augmented data.
arXiv Detail & Related papers (2020-07-06T11:16:18Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
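The feature-averaging claim follows from Jensen's inequality, which a few lines of numpy can check numerically (random linear model and Gaussian jitter as the stand-in augmentation):

```python
# With a convex loss, averaging predictions over augmentations never does
# worse than averaging the losses of the individual augmentations (Jensen).
import numpy as np

rng = np.random.default_rng(10)
n, d, k = 200, 20, 16
w = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = np.sign(X @ w)

def logistic_loss(scores):
    return np.mean(np.log1p(np.exp(-y * scores)))

scores = np.stack([(X + 0.3 * rng.standard_normal((n, d))) @ w for _ in range(k)])
loss_of_avg = logistic_loss(scores.mean(axis=0))           # average predictions first
avg_of_loss = np.mean([logistic_loss(s) for s in scores])  # augmented-loss objective
print(f"loss(avg prediction) = {loss_of_avg:.4f} <= avg loss = {avg_of_loss:.4f}")
```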
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.