Related papers: FedMerge: Federated Personalization via Model Merging

URL: http://arxiv.org/abs/2504.06768v2
Date: Thu, 24 Apr 2025 11:12:13 GMT
Title: FedMerge: Federated Personalization via Model Merging
Authors: Shutong Chen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
Abstract summary: One global model might not be sufficient to serve many clients with non-IID tasks and distributions. We propose a novel "FedMerge" approach that can create a personalized model per client by simply merging multiple global models. We evaluate FedMerge on three different non-IID settings applied to different domains with diverse tasks and data types.
Score: 51.12769696559237
License: http://creativecommons.org/licenses/by/4.0/
Abstract: One global model in federated learning (FL) might not be sufficient to serve many clients with non-IID tasks and distributions. While there have been advances in FL to train multiple global models for better personalization, they only provide limited choices to clients, so local finetuning is still indispensable. In this paper, we propose a novel "FedMerge" approach that can create a personalized model per client by simply merging multiple global models with automatically optimized and customized weights. In FedMerge, a few global models can serve many non-IID clients, even without further local finetuning. We formulate this problem as a joint optimization of global models and the merging weights for each client. Unlike existing FL approaches where the server broadcasts one or multiple global models to all clients, the server only needs to send a customized, merged model to each client. Moreover, instead of periodically interrupting the local training and re-initializing it to a global model, the merged model aligns better with each client's task and data distribution, smoothing the local-global gap between consecutive rounds caused by client drift. We evaluate FedMerge on three different non-IID settings applied to different domains with diverse tasks and data types, in which FedMerge consistently outperforms existing FL approaches, including clustering-based and mixture-of-experts (MoE) based methods.
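
Illustrative sketch (not from the paper): the abstract describes the server building one merged model per client from a few global models and client-specific merging weights. A minimal way to picture that step is a weighted sum in parameter space, as below. The function name merge_global_models, the dict-of-arrays model representation, and the example weights are all hypothetical; the abstract does not say whether the weights are normalized or how they are optimized, so neither is assumed here.

```python
import numpy as np


def merge_global_models(global_models, client_weights):
    """Return a weighted sum of parameter dicts (hypothetical helper).

    global_models: list of dicts mapping parameter name -> np.ndarray,
                   all with identical keys and shapes (an assumption).
    client_weights: one scalar merging weight per global model.
    """
    if len(global_models) != len(client_weights):
        raise ValueError("need exactly one weight per global model")
    merged = {}
    for name in global_models[0]:
        # Weighted sum of the same parameter across all global models.
        merged[name] = sum(
            w * model[name] for w, model in zip(client_weights, global_models)
        )
    return merged


# Two toy global models, each holding a single parameter tensor "w".
global_models = [
    {"w": np.array([1.0, 2.0])},
    {"w": np.array([3.0, 4.0])},
]
# Hypothetical merging weights for one client.
client_weights = [0.25, 0.75]

merged = merge_global_models(global_models, client_weights)
print(merged["w"])
```

In this picture the server would send only the merged dict to the client, rather than every global model; how the weights and global models are jointly trained is left to the paper itself.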

Related papers

Multi-Level Additive Modeling for Structured Non-IID Federated Learning [54.53672323071204]
We train models organized in a multi-level structure, called "Multi-level Additive Models (MAM)", for better knowledge-sharing across heterogeneous clients. In federated MAM (FeMAM), each client is assigned to at most one model per level and its personalized prediction sums up the outputs of models assigned to it across all levels. Experiments show that FeMAM surpasses existing clustered FL and personalized FL methods in various non-IID settings.
arXiv Detail & Related papers (2024-05-26T07:54:53Z)
FAM: fast adaptive federated meta-learning [10.980548731600116]
We propose a fast adaptive federated meta-learning (FAM) framework for collaboratively learning a single global model. A skeleton network is grown on each client to train a personalized model by learning additional client-specific parameters from local data. The personalized client models outperformed the locally trained models, demonstrating the efficacy of the FAM mechanism.
arXiv Detail & Related papers (2023-08-26T22:54:45Z)
Rethinking Client Drift in Federated Learning: A Logit Perspective [125.35844582366441]
Federated Learning (FL) enables multiple clients to collaboratively learn in a distributed way, allowing for privacy protection. We find that the difference in logits between the local and global models increases as the model is continuously updated. We propose a new algorithm, named FedCSD, a Class prototype Similarity Distillation in a federated framework to align the local and global models.
arXiv Detail & Related papers (2023-08-20T04:41:01Z)
Personalized Federated Learning with Multi-branch Architecture [0.0]
Federated learning (FL) enables multiple clients to collaboratively train models without requiring clients to reveal their raw data to each other. We propose a new PFL method (pFedMB) using multi-branch architecture, which achieves personalization by splitting each layer of a neural network into multiple branches and assigning client-specific weights to each branch. We experimentally show that pFedMB performs better than the state-of-the-art PFL methods using the CIFAR10 and CIFAR100 datasets.
arXiv Detail & Related papers (2022-11-15T06:30:57Z)
A Bayesian Federated Learning Framework with Online Laplace Approximation [144.7345013348257]
Federated learning allows multiple clients to collaboratively learn a globally shared model. We propose a novel FL framework that uses online Laplace approximation to approximate posteriors on both the client and server side. We achieve state-of-the-art results on several benchmarks, clearly demonstrating the advantages of the proposed method.
arXiv Detail & Related papers (2021-02-03T08:36:58Z)
Personalized Federated Learning with First Order Model Optimization [76.81546598985159]
We propose an alternative to federated learning, where each client federates with other relevant clients to obtain a stronger model per client-specific objectives. We do not assume knowledge of underlying data distributions or client similarities, and allow each client to optimize for arbitrary target distributions of interest. Our method outperforms existing alternatives, while also enabling new features for personalized FL such as transfer outside of local data distributions.
arXiv Detail & Related papers (2020-12-15T19:30:29Z)
Federated Mutual Learning [65.46254760557073]
Federated Mutual Learning (FML) allows clients to train a generalized model collaboratively and a personalized model independently. The experiments show that FML can achieve better performance than alternatives in typical federated learning settings.
arXiv Detail & Related papers (2020-06-27T09:35:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
