Studying Generalization Through Data Averaging
- URL: http://arxiv.org/abs/2206.13669v1
- Date: Tue, 28 Jun 2022 00:03:40 GMT
- Title: Studying Generalization Through Data Averaging
- Authors: Carlos A. Gomez-Uribe
- Abstract summary: We study train and test performance, as well as the generalization gap given by the mean of their difference over different data set samples.
We predict some aspects of how the generalization gap and model train and test performance vary as a function of SGD noise.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The generalization of machine learning models has a complex dependence on the
data, model and learning algorithm. We study train and test performance, as
well as the generalization gap given by the mean of their difference over
different data set samples to understand their "typical" behavior. We derive
an expression for the gap as a function of the covariance between the model
parameter distribution and the train loss, and another expression for the
average test performance, showing that test generalization depends only on the
data-averaged parameter distribution and the data-averaged loss. We show that
for a large class of model parameter distributions a modified generalization
gap is always non-negative. By specializing further to parameter distributions
produced by stochastic gradient descent (SGD), along with a few approximations
and modeling considerations, we are able to predict some aspects of how the
generalization gap and model train and test performance vary as a function of
SGD noise. We evaluate these predictions empirically on the Cifar10
classification task based on a ResNet architecture.
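A schematic reconstruction of the covariance expression described above (inferred from the abstract, not the paper's verbatim equation): write $p(\theta \mid D)$ for the parameter distribution produced by training on data set $D$ and $L(\theta, D)$ for the train loss. Averaging over data set draws, the test term factorizes into the data-averaged distribution times the data-averaged loss, while the train term retains their correlation, so the mean gap reduces to a covariance over data sets:
$$\mathbb{E}_D[\mathrm{gap}] = -\int \mathrm{Cov}_D\big(p(\theta \mid D),\, L(\theta, D)\big)\, d\theta .$$
If training concentrates $p(\theta \mid D)$ where the train loss on that particular $D$ is low, the covariance is negative and the gap is non-negative, consistent with the sign statement in the abstract.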
Related papers
- Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling.
Yet their widespread adoption poses challenges regarding data attribution and interpretability.
In this paper, we aim to help address such challenges by developing an influence functions framework.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
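For context, the standard influence-function approximation (as in Koh & Liang; the paper may use a different estimator): the influence of a training example $z$ on the loss at a test example $z'$ is
$$\mathcal{I}(z, z') \approx -\nabla_\theta L(z', \hat{\theta})^{\top} H_{\hat{\theta}}^{-1} \nabla_\theta L(z, \hat{\theta}),$$
where $\hat{\theta}$ are the trained parameters and $H_{\hat{\theta}}$ is the Hessian of the training loss; scaling this to diffusion models hinges on tractable approximations of $H_{\hat{\theta}}^{-1}$.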
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Aggregation Weighting of Federated Learning via Generalization Bound Estimation [65.8630966842025]
Federated Learning (FL) typically aggregates client model parameters using a weighting approach determined by sample proportions.
We replace the sample-proportion weighting with a strategy based on an estimated generalization bound for each local model (a minimal sketch follows this entry).
arXiv Detail & Related papers (2023-11-10T08:50:28Z)
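A minimal sketch of bound-weighted aggregation (illustrative only: the softmax-of-negative-bounds weighting and all names below are assumptions, not the paper's method):

```python
import numpy as np

def aggregate(client_params, bound_estimates):
    """Weighted average of client parameter vectors.

    client_params: list of 1-D numpy arrays, one per client.
    bound_estimates: estimated generalization bound per client
    (a tighter, i.e. smaller, bound earns a larger weight).
    """
    bounds = np.asarray(bound_estimates, dtype=float)
    # One simple choice (an assumption, not the paper's rule):
    # softmax over negative bounds, so tighter bounds get more mass.
    w = np.exp(-bounds)
    w /= w.sum()
    stacked = np.stack(client_params)           # (num_clients, dim)
    return (w[:, None] * stacked).sum(axis=0)   # weighted average

# Example: three clients; the second has the tightest bound and so
# dominates the aggregate.
params = [np.zeros(4), np.ones(4), 2 * np.ones(4)]
print(aggregate(params, bound_estimates=[0.9, 0.2, 0.7]))
```

Compared with sample-proportion weights $n_k / n$, this shifts aggregation mass toward clients whose local models are expected to generalize better.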
- On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data.
Invariance measures consistency of model predictions on transformations of the data.
From a dataset-centric view, we find that a given model's accuracy and invariance are linearly correlated across different test sets.
arXiv Detail & Related papers (2022-07-14T17:08:25Z)
- Linear Regression with Distributed Learning: A Generalization Error Perspective [0.0]
We investigate the performance of distributed learning for large-scale linear regression.
We focus on the generalization error, i.e., the performance on unseen data.
Our results show that the generalization error of the distributed solution can be substantially higher than that of the centralized solution.
arXiv Detail & Related papers (2021-01-22T08:43:28Z)
- Memorizing without overfitting: Bias, variance, and interpolation in over-parameterized models [0.0]
The bias-variance trade-off is a central concept in supervised learning.
Modern Deep Learning methods flout this dogma, achieving state-of-the-art performance.
arXiv Detail & Related papers (2020-10-26T22:31:04Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model (a toy sketch follows this entry).
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
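A toy illustration of that concentration phenomenon (an assumption-laden sketch in a noiseless linear interpolating regime, not the paper's methodology or data): sample many linear models that fit the training set exactly and inspect the spread of their test errors.

```python
import numpy as np

# Toy setup: noiseless linear regression with more parameters than
# training points, so infinitely many interpolators exist.
rng = np.random.default_rng(0)
n_train, n_test, dim = 20, 500, 50
w_true = rng.normal(size=dim)
X_tr = rng.normal(size=(n_train, dim)); y_tr = X_tr @ w_true
X_te = rng.normal(size=(n_test, dim));  y_te = X_te @ w_true

# Minimum-norm interpolator plus a basis for the null space of X_tr:
# any shift inside the null space leaves the train fit exact.
w_min = np.linalg.lstsq(X_tr, y_tr, rcond=None)[0]
null_basis = np.linalg.svd(X_tr)[2][n_train:]        # (dim - n_train, dim)

test_errors = []
for _ in range(2000):
    w = w_min + null_basis.T @ rng.normal(scale=0.5, size=dim - n_train)
    assert np.allclose(X_tr @ w, y_tr)               # still interpolates
    test_errors.append(np.mean((X_te @ w - y_te) ** 2))

print(f"typical (median) test error: {np.median(test_errors):.2f}")
print(f"worst sampled test error:    {np.max(test_errors):.2f}")
```

Most sampled interpolators land near the typical value, with a tail of noticeably worse ones, mirroring the gap between typical-case and worst-case analyses.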
This list is automatically generated from the titles and abstracts of the papers on this site.