Understanding Generalization of Federated Learning via Stability:
Heterogeneity Matters
- URL: http://arxiv.org/abs/2306.03824v1
- Date: Tue, 6 Jun 2023 16:12:35 GMT
- Title: Understanding Generalization of Federated Learning via Stability:
Heterogeneity Matters
- Authors: Zhenyu Sun, Xiaochun Niu, Ermin Wei
- Abstract summary: Generalization performance is a key metric in evaluating machine learning models when applied to real-world applications.
- Score: 1.4502611532302039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalization performance is a key metric in evaluating machine
learning models when applied to real-world applications. Good generalization
indicates that a model can predict unseen data correctly even when trained on a
limited amount of data. Federated learning (FL), which has emerged as a popular
distributed learning framework, allows multiple devices or clients to train a
shared model without violating privacy requirements. While the existing
literature has extensively studied the generalization performance of
centralized machine learning algorithms, comparable analyses in the federated
setting are either absent or rely on very restrictive assumptions on the loss
functions. In this paper, we analyze the generalization performance of
federated learning by means of algorithmic stability, which measures how much
the output model of an algorithm changes when a single training data point is
perturbed. Three widely used algorithms are studied, namely FedAvg, SCAFFOLD,
and FedProx, under both convex and non-convex loss functions. Our analysis
shows that the generalization performance of models trained by these three
algorithms is closely related to the heterogeneity of the clients' datasets as
well as to the convergence behavior of the algorithms. In particular, in the
i.i.d. setting, our results recover the classical results for stochastic
gradient descent (SGD).
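As background for the stability notion used in the abstract, the following is a
minimal, illustrative Python sketch (not the authors' implementation; names such
as fedavg and local_sgd are hypothetical, and only numpy is assumed). It runs
FedAvg-style local SGD on two sets of client datasets that differ in a single
data point and reports how much the output model changes, which is the quantity
that stability-based generalization arguments control.

```python
# Minimal sketch: estimating the perturb-one-point stability of FedAvg on a toy
# least-squares problem. Hypothetical names; not the paper's implementation.
import numpy as np

def local_sgd(w, X, y, lr=0.01, local_steps=10, rng=None):
    """One client's local update: a few SGD steps on squared loss."""
    rng = rng or np.random.default_rng(0)
    w = w.copy()
    for _ in range(local_steps):
        i = rng.integers(len(y))
        grad = (X[i] @ w - y[i]) * X[i]      # gradient of 0.5*(x_i^T w - y_i)^2
        w -= lr * grad
    return w

def fedavg(clients, rounds=50, lr=0.01, local_steps=10, seed=0):
    """FedAvg: average the clients' locally updated models every round."""
    rng = np.random.default_rng(seed)
    d = clients[0][0].shape[1]
    w = np.zeros(d)
    for _ in range(rounds):
        local_models = [local_sgd(w, X, y, lr, local_steps, rng) for X, y in clients]
        w = np.mean(local_models, axis=0)    # server aggregation (uniform weights)
    return w

# Build client datasets S and a neighboring S' that differ in ONE data point.
rng = np.random.default_rng(42)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(4)]
clients_perturbed = [(X.copy(), y.copy()) for X, y in clients]
clients_perturbed[0][0][0] = rng.normal(size=5)   # replace one sample on client 0
clients_perturbed[0][1][0] = rng.normal()

w_S = fedavg(clients)
w_S_prime = fedavg(clients_perturbed)
# The smaller this gap, the more stable (and, by stability arguments,
# the better-generalizing) the algorithm on this problem instance.
print("model change from perturbing one point:", np.linalg.norm(w_S - w_S_prime))
```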
Related papers
- Understanding Generalization of Federated Learning: the Trade-off between Model Stability and Optimization [22.577751005038543]
Federated Learning (FL) is a distributed learning approach that trains neural networks across multiple devices.
FL often faces challenges due to data heterogeneity, leading to inconsistent local optima among clients.
We introduce the first generalization dynamics analysis framework in federated optimization.
arXiv Detail & Related papers (2024-11-25T11:43:22Z) - On the KL-Divergence-based Robust Satisficing Model [2.425685918104288]
The robust satisficing framework has attracted increasing attention from academia.
We present analytical interpretations, diverse performance guarantees, efficient and stable numerical methods, convergence analysis, and an extension tailored for hierarchical data structures.
We demonstrate the superior performance of our model compared to state-of-the-art benchmarks.
arXiv Detail & Related papers (2024-08-17T10:05:05Z) - A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error of overparameterized models that achieve effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme (one standard form of a PAC-Bayes bound is sketched after this list).
arXiv Detail & Related papers (2023-11-13T01:48:08Z) - Proof of Swarm Based Ensemble Learning for Federated Learning
Applications [3.2536767864585663]
In federated learning, it is not feasible to apply centralized ensemble learning directly due to privacy concerns.
Most distributed consensus algorithms, such as Byzantine fault tolerance (BFT), do not normally perform well in such applications.
We propose PoSw, a novel distributed consensus algorithm for ensemble learning in a federated setting.
arXiv Detail & Related papers (2022-12-28T13:53:34Z) - FedGen: Generalizable Federated Learning for Sequential Data [8.784435748969806]
In many real-world distributed settings, spurious correlations exist due to biases and data sampling issues.
We present a generalizable federated learning framework called FedGen, which allows clients to identify and distinguish between spurious and invariant features.
We show that FedGen results in models that achieve significantly better generalization and can outperform the accuracy of current federated learning approaches by over 24%.
arXiv Detail & Related papers (2022-11-03T15:48:14Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z) - DRFLM: Distributionally Robust Federated Learning with Inter-client
Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously (a minimal sketch of the local-mixup step appears after this list).
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded based on the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - Modeling Generalization in Machine Learning: A Methodological and
Computational Study [0.8057006406834467]
We use the concept of the convex hull of the training data in assessing machine learning generalization.
We observe unexpectedly weak associations between the generalization ability of machine learning models and all metrics related to dimensionality.
arXiv Detail & Related papers (2020-06-28T19:06:16Z) - On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z) - Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
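As background for the PAC-Bayesian entry above, the bound below is one standard
(McAllester-style) PAC-Bayes bound for a bounded loss; it is given only as an
illustration of the general form and is not the specific bound derived for the
interpolating regime in that paper.

```latex
% One standard McAllester-style PAC-Bayes bound (illustrative background only).
% For a loss taking values in [0,1], a prior P chosen before seeing the n
% training samples, and any \delta in (0,1), with probability at least
% 1 - \delta over the sample, simultaneously for all posteriors Q:
\[
  \mathbb{E}_{h \sim Q}\bigl[L(h)\bigr]
  \;\le\;
  \mathbb{E}_{h \sim Q}\bigl[\widehat{L}(h)\bigr]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},
\]
% where L is the population risk and \widehat{L} is the empirical risk.
```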
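The DRFLM entry above applies mixup locally at each client. The sketch below
shows standard mixup applied to one client's local batch, purely as an
illustration of the mechanism; the helper name local_mixup is hypothetical and
this is not the DRFLM implementation.

```python
# Illustrative sketch of mixup on one client's local batch (hypothetical helper,
# not the DRFLM implementation): convex-combine random pairs of examples.
import numpy as np

def local_mixup(X, y, alpha=0.2, rng=None):
    """Return a mixed batch: x = lam*x_i + (1-lam)*x_j, and likewise for labels."""
    rng = rng or np.random.default_rng(0)
    n = len(X)
    perm = rng.permutation(n)                   # partner index j for each example i
    lam = rng.beta(alpha, alpha, size=(n, 1))   # one mixing coefficient per example
    X_mix = lam * X + (1.0 - lam) * X[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]     # assumes y is one-hot or real-valued
    return X_mix, y_mix

# Example: mix a toy batch of 8 examples with 3-d features and one-hot labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(8, 3))
y = np.eye(4)[rng.integers(4, size=8)]
X_mix, y_mix = local_mixup(X, y, alpha=0.4, rng=rng)
print(X_mix.shape, y_mix.shape)  # (8, 3) (8, 4)
```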
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.