Generalized Federated Learning via Sharpness Aware Minimization
- URL: http://arxiv.org/abs/2206.02618v1
- Date: Mon, 6 Jun 2022 13:54:41 GMT
- Title: Generalized Federated Learning via Sharpness Aware Minimization
- Authors: Zhe Qu, Xingyu Li, Rui Duan, Yao Liu, Bo Tang, and Zhuo Lu
- Abstract summary: We propose a general, effective algorithm, FedSAM, based on a Sharpness Aware Minimization (SAM) local optimizer, and develop a momentum FL algorithm, MoFedSAM, to bridge local and global models.
Empirically, our proposed algorithms substantially outperform existing FL studies and significantly decrease the learning deviation.
- Score: 22.294290071999736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) is a promising framework for performing
privacy-preserving, distributed learning with a set of clients. However, the
data distributions among clients are often non-IID, i.e., they exhibit
distribution shift, which makes efficient optimization difficult. To tackle
this problem, many FL algorithms focus on mitigating the effects of data
heterogeneity across clients by improving the performance of the global model.
However, almost all of these algorithms adopt Empirical Risk Minimization (ERM)
as the local optimizer, which tends to drive the global model into a sharp
valley of the loss landscape and to increase the deviation of some local
clients from the global model. Therefore, in this paper, we revisit solutions
to the distribution shift problem in FL with a focus on the generality of local
learning. To this end, we propose a general, effective algorithm,
\texttt{FedSAM}, based on a Sharpness Aware Minimization (SAM) local optimizer,
and develop a momentum FL algorithm, \texttt{MoFedSAM}, to bridge local and
global models. Theoretically, we present convergence analyses of these two
algorithms and establish a generalization bound for \texttt{FedSAM}.
Empirically, our proposed algorithms substantially outperform existing FL
studies and significantly decrease the learning deviation.
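For illustration, the following is a minimal sketch of the SAM local step that replaces the plain ERM step on each client. The PyTorch framing, the function name, and the `rho`/`lr` hyperparameter names are our assumptions, not the authors' reference implementation.

```python
import torch

def sam_local_step(model, loss_fn, batch, rho=0.05, lr=0.01):
    # Sketch of one SAM step: ascend to a worst-case nearby point,
    # then descend using the gradient evaluated there.
    x, y = batch
    params = list(model.parameters())

    # First pass: gradient at the current weights w.
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, params)
    grad_norm = torch.sqrt(sum((g ** 2).sum() for g in grads))

    # Perturb w -> w + rho * g / ||g|| (the sharpness probe).
    eps = [rho * g / (grad_norm + 1e-12) for g in grads]
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.add_(e)

    # Second pass: gradient at the perturbed weights.
    sam_loss = loss_fn(model(x), y)
    sam_grads = torch.autograd.grad(sam_loss, params)

    # Undo the perturbation and apply the SAM gradient with plain SGD.
    with torch.no_grad():
        for p, e, g in zip(params, eps, sam_grads):
            p.sub_(e)
            p.sub_(lr * g)
    return float(sam_loss)
```

In MoFedSAM, the server additionally keeps a momentum of the averaged client updates to bridge local and global models; that aggregation detail is omitted above.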
Related papers
- Understanding Generalization of Federated Learning: the Trade-off between Model Stability and Optimization [22.577751005038543]
Federated Learning (FL) is a distributed learning approach that trains neural networks across multiple devices.
FL often faces challenges due to data heterogeneity, leading to inconsistent local optima among clients.
We introduce the first generalization dynamics analysis framework in federated optimization.
arXiv Detail & Related papers (2024-11-25T11:43:22Z)
- Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization [81.32266996009575]
In federated learning (FL), the multi-step update and data heterogeneity among clients often lead to a loss landscape with sharper minima.
We propose FedLESAM, a novel algorithm that locally estimates the direction of the global perturbation on the client side.
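One plausible reading of the idea, as a sketch: reuse the difference between the two most recently received global models as the shared perturbation direction, so clients avoid the extra forward-backward pass that a locally computed SAM perturbation costs. The exact estimator in the paper may differ.

```python
import torch

def estimated_global_perturbation(global_prev, global_curr, rho=0.05):
    # Sketch: estimate the global SAM perturbation from two successive
    # global models instead of from the local gradient (the paper's
    # exact estimator may differ).
    diff = [p0 - p1 for p0, p1 in zip(global_prev, global_curr)]
    norm = torch.sqrt(sum((d ** 2).sum() for d in diff))
    return [rho * d / (norm + 1e-12) for d in diff]
```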
arXiv Detail & Related papers (2024-05-29T08:46:21Z)
- Rethinking Client Drift in Federated Learning: A Logit Perspective [125.35844582366441]
Federated Learning (FL) enables multiple clients to collaboratively learn in a distributed way, allowing for privacy protection.
We find that the difference in logits between the local and global models increases as the model is continuously updated.
We propose a new algorithm, named FedCSD, which performs class prototype similarity distillation in a federated framework to align the local and global models.
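A generic logit-alignment term of this kind can be sketched as follows; FedCSD additionally weights the distillation by class prototype similarity, which we omit here.

```python
import torch.nn.functional as F

def logit_alignment_loss(local_logits, global_logits, tau=2.0):
    # Penalize divergence between the local model's logits and the
    # (frozen) global model's logits. The class-prototype weighting
    # used by FedCSD is omitted in this sketch.
    p_global = F.softmax(global_logits.detach() / tau, dim=1)
    log_p_local = F.log_softmax(local_logits / tau, dim=1)
    return F.kl_div(log_p_local, p_global, reduction="batchmean") * tau ** 2
```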
arXiv Detail & Related papers (2023-08-20T04:41:01Z)
- Learner Referral for Cost-Effective Federated Learning Over Hierarchical IoT Networks [21.76836812021954]
This paper proposes learner-referral-aided federated client selection (LRef-FedCS), communications resource scheduling, and local model accuracy optimization (LMAO) methods.
Our proposed LRef-FedCS approach achieves a good balance between high global accuracy and cost reduction.
arXiv Detail & Related papers (2023-07-19T13:33:43Z)
- Understanding How Consistency Works in Federated Learning via Stage-wise Relaxed Initialization [84.42306265220274]
Federated learning (FL) is a distributed paradigm that coordinates massive local clients to collaboratively train a global model.
Previous works have implicitly studied that FL suffers from the "client-drift" problem, which is caused by the inconsistent optima across local clients.
To alleviate the negative impact of "client drift" and explore its substance in FL, we first design an efficient FL algorithm, FedInit.
arXiv Detail & Related papers (2023-06-09T06:55:15Z)
- Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape [59.841889495864386]
In federated learning (FL), a cluster of local clients is coordinated by a global server.
Clients are prone to overfit to their own optima, which can deviate greatly from the global objective.
FedSMOO adopts a dynamic regularizer to pull the local optima toward the global objective.
Our theoretical analysis indicates that FedSMOO achieves a fast $\mathcal{O}(1/T)$ convergence rate with a low generalization bound.
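The dynamic regularizer is in the spirit of FedDyn: a linear dual term plus a proximal pull toward the global model, added to each client's loss. A minimal sketch under that assumption (the coupling with a globally consistent SAM perturbation is not shown):

```python
import torch

def dynamic_regularizer(local_params, global_params, dual, alpha=0.1):
    # FedDyn-style dynamic regularization term added to the client loss:
    # a linear dual term plus a proximal pull toward the global model.
    linear = sum((d * p).sum() for d, p in zip(dual, local_params))
    prox = sum(((p - g) ** 2).sum() for p, g in zip(local_params, global_params))
    return -linear + 0.5 * alpha * prox
```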
arXiv Detail & Related papers (2023-05-19T10:47:44Z)
- Beyond ADMM: A Unified Client-variance-reduced Adaptive Federated Learning Framework [82.36466358313025]
We propose a primal-dual FL algorithm, termed FedVRA, that allows one to adaptively control the variance-reduction level and bias of the global model.
Experiments based on (semi-supervised) image classification tasks demonstrate the superiority of FedVRA over existing schemes.
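FedVRA itself is primal-dual; for context, the client-variance-reduction idea it generalizes can be illustrated with a SCAFFOLD-style control-variate step (our illustration, not FedVRA's actual update):

```python
import torch

def variance_reduced_step(params, grads, c_local, c_global, lr=0.01):
    # SCAFFOLD-style step: correct the local gradient by the gap between
    # the global and local control variates before stepping.
    with torch.no_grad():
        for p, g, ci, c in zip(params, grads, c_local, c_global):
            p.sub_(lr * (g - ci + c))
```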
arXiv Detail & Related papers (2022-12-03T03:27:51Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- Federated Multi-Task Learning under a Mixture of Distributions [10.00087964926414]
Federated Learning (FL) is a framework for on-device collaborative training of machine learning models.
First efforts in FL focused on learning a single global model with good average performance across clients, but the global model may be arbitrarily bad for a given client.
We study federated MTL under the flexible assumption that each local data distribution is a mixture of unknown underlying distributions.
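Under this assumption, an EM-style update fits naturally: each client weights its samples' contributions to each shared component model by responsibilities computed from per-component losses. A minimal sketch of the E-step (component priors omitted):

```python
import torch

def responsibilities(per_component_losses):
    # per_component_losses: tensor of shape (n_samples, n_components);
    # a lower loss under a component yields a higher responsibility.
    logits = -per_component_losses
    logits = logits - logits.max(dim=1, keepdim=True).values
    weights = torch.exp(logits)
    return weights / weights.sum(dim=1, keepdim=True)
```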
arXiv Detail & Related papers (2021-08-23T15:47:53Z)