Related papers: Linear Regression with Distributed Learning: A Generalization Error Perspective

Linear Regression with Distributed Learning: A Generalization Error Perspective

URL: http://arxiv.org/abs/2101.09001v1
Date: Fri, 22 Jan 2021 08:43:28 GMT
Title: Linear Regression with Distributed Learning: A Generalization Error Perspective
Authors: Martin Hellkvist and Ay\c{c}a \"Oz\c{c}elikkale and Anders Ahl\'en
Abstract summary: We investigate the performance of distributed learning for large-scale linear regression. We focus on the generalization error, i.e., the performance on unseen data. Our results show that the generalization error of the distributed solution can be substantially higher than that of the centralized solution.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Distributed learning provides an attractive framework for scaling the learning task by sharing the computational load over multiple nodes in a network. Here, we investigate the performance of distributed learning for large-scale linear regression where the model parameters, i.e., the unknowns, are distributed over the network. We adopt a statistical learning approach. In contrast to works that focus on the performance on the training data, we focus on the generalization error, i.e., the performance on unseen data. We provide high-probability bounds on the generalization error for both isotropic and correlated Gaussian data as well as sub-gaussian data. These results reveal the dependence of the generalization performance on the partitioning of the model over the network. In particular, our results show that the generalization error of the distributed solution can be substantially higher than that of the centralized solution even when the error on the training data is at the same level for both the centralized and distributed approaches. Our numerical results illustrate the performance with both real-world image data as well as synthetic data.

Related papers

Generalization Error Analysis for Attack-Free and Byzantine-Resilient Decentralized Learning with Data Heterogeneity [23.509076905112526]
We present fine-grained error analysis for both attack-free and Byzantine-resilient decentralized learning with heterogeneous data.<n>We also reveal that attacks performed by malicious agents largely affect the error.
arXiv Detail & Related papers (2025-06-11T06:44:34Z)
Heterogeneity Matters even More in Distributed Learning: Study from Generalization Perspective [14.480713752871523]
(K) clients have each (n) training samples generated independently according to a possibly different data distribution. We study the effect of discrepancy between the clients' data distributions on the generalization error of the aggregated model. It is shown that the bound gets smaller as the degree of data heterogeneity across clients gets higher.
arXiv Detail & Related papers (2025-03-03T14:33:38Z)
Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data. We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex. We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold. We propose to improve the optimization of GANs with score matching regularity (SMaRt)
arXiv Detail & Related papers (2023-11-30T03:05:14Z)
Graph Out-of-Distribution Generalization with Controllable Data Augmentation [51.17476258673232]
Graph Neural Network (GNN) has demonstrated extraordinary performance in classifying graph properties. Due to the selection bias of training and testing data, distribution deviation is widespread. We propose OOD calibration to measure the distribution deviation of virtual samples.
arXiv Detail & Related papers (2023-08-16T13:10:27Z)
Using Focal Loss to Fight Shallow Heuristics: An Empirical Analysis of Modulated Cross-Entropy in Natural Language Inference [0.0]
In some datasets, deep neural networks discover underlyings that allow them to take shortcuts in the learning process, resulting in poor generalization capability. Instead of using standard cross-entropy, we explore whether a modulated version of cross-entropy called focal loss can constrain the model so as not to use underlyings and improve generalization performance. Our experiments in natural language inference show that focal loss has a regularizing impact on the learning process, increasing accuracy on out-of-distribution data, but slightly decreasing performance on in-distribution data.
arXiv Detail & Related papers (2022-11-23T22:19:00Z)
Studying Generalization Through Data Averaging [0.0]
We study train and test performance, as well as the generalization gap given by the mean of their difference over different data set samples. We predict some aspects about how the generalization gap and model train and test performance vary as a function of SGD noise.
arXiv Detail & Related papers (2022-06-28T00:03:40Z)
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization [118.50301177912381]
We show that Adam can converge to different solutions of the objective with provably different errors, even with weight decay globalization. We show that if convex, and the weight decay regularization is employed, any optimization algorithms including Adam will converge to the same solution.
arXiv Detail & Related papers (2021-08-25T17:58:21Z)
Double Descent and Other Interpolation Phenomena in GANs [2.7007335372861974]
We study the generalization error as a function of latent space dimension in generative adversarial networks (GANs) We develop a novel pseudo-supervised learning approach for GANs where the training utilizes pairs of fabricated (noise) inputs in conjunction with real output samples. While our analysis focuses mostly on linear models, we also apply important insights for improving generalization of nonlinear, multilayer GANs.
arXiv Detail & Related papers (2021-06-07T23:07:57Z)
Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers. We find that test errors tend to concentrate around a small typical value $varepsilon*$, which deviates substantially from the test error of worst-case interpolating model. Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of risk and thereof gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation. We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
Generalization Error for Linear Regression under Distributed Learning [0.0]
We consider the setting where the unknowns are distributed over a network of nodes. We present an analytical characterization of the dependence of the generalization error on the partitioning of the unknowns over nodes.
arXiv Detail & Related papers (2020-04-30T08:49:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.