Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape
- URL: http://arxiv.org/abs/2305.11584v2
- Date: Mon, 1 Apr 2024 04:21:28 GMT
- Title: Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape
- Authors: Yan Sun, Li Shen, Shixiang Chen, Liang Ding, Dacheng Tao
- Abstract summary: In federated learning (FL), a cluster of local clients is trained under the coordination of a global server.
Clients are prone to overfit to their own optima, which deviate sharply from the global objective.
FedSMOO adopts a dynamic regularizer to steer the local optima towards the global objective.
Our theoretical analysis indicates that FedSMOO achieves a fast $\mathcal{O}(1/T)$ convergence rate with a low generalization bound.
- Score: 59.841889495864386
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In federated learning (FL), a cluster of local clients is coordinated by a global server to cooperatively train one model while preserving privacy. Due to the multiple local updates and the isolated non-iid datasets, clients are prone to overfit to their own optima, which deviate sharply from the global objective and significantly undermine performance. Most previous works focus only on enhancing the consistency between the local and global objectives to alleviate this harmful client drift from the optimization perspective, and their performance deteriorates markedly under high heterogeneity. In this work, we propose a novel and general algorithm {\ttfamily FedSMOO} that jointly considers the optimization and generalization targets to efficiently improve performance in FL. Concretely, {\ttfamily FedSMOO} adopts a dynamic regularizer to steer the local optima towards the global objective, which is meanwhile revised by the global Sharpness Aware Minimization (SAM) optimizer to search for consistent flat minima. Our theoretical analysis indicates that {\ttfamily FedSMOO} achieves a fast $\mathcal{O}(1/T)$ convergence rate with a low generalization bound. Extensive numerical studies are conducted on real-world datasets to verify its efficiency and excellent generality.
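The abstract describes the method only at a high level, so here is a minimal, hedged sketch of what one FedSMOO-style local update could look like: a SAM perturbation step combined with a FedDyn-style dynamic regularizer that keeps the local solution consistent with the global objective. The function names, the exact regularizer form, and the dual update below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def local_step_sketch(w, w_global, lam, grad_fn, rho=0.05, alpha=0.1, lr=0.01):
    """One illustrative sharpness-aware local step with a dynamic regularizer.

    w        : current local parameters (np.ndarray)
    w_global : global parameters received at the start of the round
    lam      : per-client dual/correction variable (dynamic-regularizer state)
    grad_fn  : callable returning the gradient of the local loss at given parameters
    rho      : SAM perturbation radius
    alpha    : strength of the dynamic regularizer
    lr       : local learning rate
    """
    # SAM ascent step: perturb the weights toward higher local loss.
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)

    # Sharpness-aware gradient, evaluated at the perturbed point.
    g_sam = grad_fn(w + eps)

    # Dynamic regularizer (FedDyn-style form, assumed here): a linear correction
    # plus a proximal pull toward the global model, so the local optimum stays
    # consistent with the global objective across rounds.
    g_reg = -lam + alpha * (w - w_global)

    # Descent step on the combined objective.
    return w - lr * (g_sam + g_reg)


def dual_update_sketch(lam, w_local_end, w_global, alpha=0.1):
    """Illustrative end-of-round update of the dual/correction variable."""
    return lam - alpha * (w_local_end - w_global)
```

Note that FedSMOO additionally revises each client's perturbation with a global SAM perturbation so that all clients search around a shared flat region; that coupling is omitted from this sketch for brevity.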
Related papers
- Neighborhood and Global Perturbations Supported SAM in Federated Learning: From Local Tweaks To Global Awareness [29.679323144520037]
Federated Learning (FL) can be coordinated under the orchestration of a central server to build a privacy-preserving model.
We propose a novel FL algorithm, FedTOGA, designed to consider generalization objectives while maintaining minimal uplink communication overhead.
arXiv Detail & Related papers (2024-08-26T09:42:18Z)
- Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization [81.32266996009575]
In federated learning (FL), the multi-step update and data heterogeneity among clients often lead to a loss landscape with sharper minima.
We propose FedLESAM, a novel algorithm that locally estimates the direction of the global perturbation on the client side (see the sketch after this list).
arXiv Detail & Related papers (2024-05-29T08:46:21Z)
- Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup [90.26270347459915]
We propose a novel momentum-based algorithm that utilizes the global descent direction with a locally adaptive amended optimizer.
LADA greatly reduces the number of communication rounds and achieves higher accuracy than several baselines.
arXiv Detail & Related papers (2023-07-30T14:53:21Z)
- G-TRACER: Expected Sharpness Optimization [1.2183405753834562]
G-TRACER promotes generalization by seeking flat minima, and has a sound theoretical basis as an approximation to a natural-gradient descent based optimization of a generalized Bayes objective.
We show that the method converges to a neighborhood of a local minimum of the unregularized objective, and demonstrate competitive performance on a number of benchmark computer vision and NLP datasets.
arXiv Detail & Related papers (2023-06-24T09:28:49Z)
- Federated Learning for Semantic Parsing: Task Formulation, Evaluation Setup, New Algorithms [29.636944156801327]
Multiple clients collaboratively train one global model without sharing their semantic parsing data.
Lorar adjusts each client's contribution to the global model update based on its training loss reduction during each round.
Clients with smaller datasets enjoy larger performance gains.
arXiv Detail & Related papers (2023-05-26T19:25:49Z)
- Disentangled Federated Learning for Tackling Attributes Skew via Invariant Aggregation and Diversity Transferring [104.19414150171472]
Attribute skew diverts current federated learning (FL) frameworks from consistent optimization directions among the clients.
We propose disentangled federated learning (DFL) to disentangle the domain-specific and cross-invariant attributes into two complementary branches.
Experiments verify that DFL facilitates FL with higher performance, better interpretability, and faster convergence rate, compared with SOTA FL methods.
arXiv Detail & Related papers (2022-06-14T13:12:12Z)
- Generalized Federated Learning via Sharpness Aware Minimization [22.294290071999736]
We propose a general, effective algorithm, FedSAM, based on a Sharpness Aware Minimization (SAM) local optimizer, and develop a momentum FL algorithm to bridge local and global models.
Empirically, our proposed algorithms substantially outperform existing FL studies and significantly decrease the learning deviation.
arXiv Detail & Related papers (2022-06-06T13:54:41Z)
- Fair and Consistent Federated Learning [48.19977689926562]
Federated learning (FL) has gained growing interest for its capability of learning from distributed data sources collectively.
We propose an FL framework to jointly consider performance consistency and algorithmic fairness across different local clients.
arXiv Detail & Related papers (2021-08-19T01:56:08Z)
- Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
FedGLOMO is the first (first-order) FL algorithm that is provably optimal even with compressed communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z)
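Several of the related papers above (FedSAM, FedLESAM, FedTOGA) differ mainly in how the SAM perturbation is computed. As a point of comparison with the earlier FedSMOO sketch, the following is a minimal, hedged sketch of the locally estimated global perturbation idea mentioned in the FedLESAM entry: rather than running an extra ascent pass, the client reuses the change of the global model between consecutive rounds as the perturbation direction. The estimator, its sign convention, and all names here are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def lesam_style_local_step(w, w_global_prev, w_global_curr, grad_fn, rho=0.05, lr=0.01):
    """Illustrative sharpness-aware local step using a locally estimated
    global perturbation (assumed form; not the published algorithm).

    w             : current local parameters (np.ndarray)
    w_global_prev : global model received in the previous round
    w_global_curr : global model received in the current round
    grad_fn       : callable returning the gradient of the local loss
    rho           : perturbation radius
    lr            : local learning rate
    """
    # Estimate the global ascent (perturbation) direction from the change of
    # the global model across rounds; no extra backward pass is required.
    d = w_global_prev - w_global_curr
    eps = rho * d / (np.linalg.norm(d) + 1e-12)

    # Sharpness-aware gradient at the (estimated) globally perturbed point.
    g_sam = grad_fn(w + eps)

    # Standard descent step with the sharpness-aware gradient.
    return w - lr * g_sam
```

The design intent illustrated here is that all clients perturb along (an estimate of) the same global direction, which is meant to keep their flat-minima searches consistent while saving the extra gradient computation of plain SAM.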
This list is automatically generated from the titles and abstracts of the papers in this site.