Understanding and Improving Model Averaging in Federated Learning on Heterogeneous Data
- URL: http://arxiv.org/abs/2305.07845v4
- Date: Fri, 31 May 2024 02:57:28 GMT
- Title: Understanding and Improving Model Averaging in Federated Learning on Heterogeneous Data
- Authors: Tailin Zhou, Zehong Lin, Jun Zhang, Danny H. K. Tsang,
- Abstract summary: We study the loss landscape of model averaging in federated learning (FL)
We decompose the expected loss of the global model into five factors related to the client models.
We propose utilizing IMA on the global model at the late training phase to reduce its deviation from the expected speed.
- Score: 9.792805355704203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model averaging is a widely adopted technique in federated learning (FL) that aggregates multiple client models to obtain a global model. Remarkably, model averaging in FL yields a superior global model, even when client models are trained with non-convex objective functions and on heterogeneous local datasets. However, the rationale behind its success remains poorly understood. To shed light on this issue, we first visualize the loss landscape of FL over client and global models to illustrate their geometric properties. The visualization shows that the client models encompass the global model within a common basin, and interestingly, the global model may deviate from the basin's center while still outperforming the client models. To gain further insights into model averaging in FL, we decompose the expected loss of the global model into five factors related to the client models. Specifically, our analysis reveals that the global model loss after early training mainly arises from \textit{i)} the client model's loss on non-overlapping data between client datasets and the global dataset and \textit{ii)} the maximum distance between the global and client models. Based on the findings from our loss landscape visualization and loss decomposition, we propose utilizing iterative moving averaging (IMA) on the global model at the late training phase to reduce its deviation from the expected minimum, while constraining client exploration to limit the maximum distance between the global and client models. Our experiments demonstrate that incorporating IMA into existing FL methods significantly improves their accuracy and training speed on various heterogeneous data setups of benchmark datasets. Code is available at \url{https://github.com/TailinZhou/FedIMA}.
Related papers
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z) - Mitigating System Bias in Resource Constrained Asynchronous Federated
Learning Systems [2.8790600498444032]
We propose a dynamic global model aggregation method within Asynchronous Federated Learning (AFL) deployments.
Our method scores and adjusts the weighting of client model updates based on their upload frequency to accommodate differences in device capabilities.
arXiv Detail & Related papers (2024-01-24T10:51:15Z) - FedSoup: Improving Generalization and Personalization in Federated
Learning via Selective Model Interpolation [32.36334319329364]
Cross-silo federated learning (FL) enables the development of machine learning models on datasets distributed across data centers.
Recent research has found that current FL algorithms face a trade-off between local and global performance when confronted with distribution shifts.
We propose a novel federated model soup method to optimize the trade-off between local and global performance.
arXiv Detail & Related papers (2023-07-20T00:07:29Z) - Federated Learning for Semantic Parsing: Task Formulation, Evaluation
Setup, New Algorithms [29.636944156801327]
Multiple clients collaboratively train one global model without sharing their semantic parsing data.
Lorar adjusts each client's contribution to the global model update based on its training loss reduction during each round.
Clients with smaller datasets enjoy larger performance gains.
arXiv Detail & Related papers (2023-05-26T19:25:49Z) - DYNAFED: Tackling Client Data Heterogeneity with Global Dynamics [60.60173139258481]
Local training on non-iid distributed data results in deflected local optimum.
A natural solution is to gather all client data onto the server, such that the server has a global view of the entire data distribution.
In this paper, we put forth an idea to collect and leverage global knowledge on the server without hindering data privacy.
arXiv Detail & Related papers (2022-11-20T06:13:06Z) - Closing the Gap between Client and Global Model Performance in
Heterogeneous Federated Learning [2.1044900734651626]
We show how the chosen approach for training custom client models has an impact on the global model.
We propose a new approach that combines KD and Learning without Forgetting (LwoF) to produce improved personalised models.
arXiv Detail & Related papers (2022-11-07T11:12:57Z) - Fine-tuning Global Model via Data-Free Knowledge Distillation for
Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraint.
We propose a data-free knowledge distillation method to fine-tune the global model in the server (FedFTG)
Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
arXiv Detail & Related papers (2022-03-17T11:18:17Z) - No One Left Behind: Inclusive Federated Learning over Heterogeneous
Devices [79.16481453598266]
We propose InclusiveFL, a client-inclusive federated learning method to handle this problem.
The core idea of InclusiveFL is to assign models of different sizes to clients with different computing capabilities.
We also propose an effective method to share the knowledge among multiple local models with different sizes.
arXiv Detail & Related papers (2022-02-16T13:03:27Z) - A Bayesian Federated Learning Framework with Online Laplace
Approximation [144.7345013348257]
Federated learning allows multiple clients to collaboratively learn a globally shared model.
We propose a novel FL framework that uses online Laplace approximation to approximate posteriors on both the client and server side.
We achieve state-of-the-art results on several benchmarks, clearly demonstrating the advantages of the proposed method.
arXiv Detail & Related papers (2021-02-03T08:36:58Z) - FedProf: Optimizing Federated Learning with Dynamic Data Profiling [9.74942069718191]
Federated Learning (FL) has shown great potential as a privacy-preserving solution to learning from decentralized data.
A large proportion of the clients are probably in possession of only low-quality data that are biased, noisy or even irrelevant.
We propose a novel approach to optimizing FL under such circumstances without breaching data privacy.
arXiv Detail & Related papers (2021-02-02T20:10:14Z) - Federated Learning With Quantized Global Model Updates [84.55126371346452]
We study federated learning, which enables mobile devices to utilize their local datasets to train a global model.
We introduce a lossy FL (LFL) algorithm, in which both the global model and the local model updates are quantized before being transmitted.
arXiv Detail & Related papers (2020-06-18T16:55:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.