Federated learning with hierarchical clustering of local updates to
improve training on non-IID data
- URL: http://arxiv.org/abs/2004.11791v2
- Date: Wed, 6 May 2020 16:28:10 GMT
- Title: Federated learning with hierarchical clustering of local updates to
improve training on non-IID data
- Authors: Christopher Briggs, Zhong Fan, Peter Andras
- Abstract summary: We show that learning a single joint model is often not optimal in the presence of certain types of non-iid data.
We present a modification to FL by introducing a hierarchical clustering step (FL+HC)
We show how FL+HC allows model training to converge in fewer communication rounds compared to FL without clustering.
- Score: 3.3517146652431378
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is a well-established method for performing machine
learning tasks over massively distributed data. However, in settings where data
is distributed in a non-iid (not independent and identically distributed)
fashion -- as is typical in real-world situations -- the joint model produced
by FL suffers in terms of test set accuracy and/or communication costs compared
to training on iid data. We show that learning a single joint model is often
not optimal in the presence of certain types of non-iid data. In this work we
present a modification to FL by introducing a hierarchical clustering step
(FL+HC) to separate clusters of clients by the similarity of their local
updates to the global joint model. Once separated, the clusters are trained
independently and in parallel on specialised models. We present a robust
empirical analysis of the hyperparameters for FL+HC for several iid and non-iid
settings. We show how FL+HC allows model training to converge in fewer
communication rounds (significantly so under some non-iid settings) compared to
FL without clustering. Additionally, FL+HC allows for a greater percentage of
clients to reach a target accuracy compared to standard FL. Finally we make
suggestions for good default hyperparameters to promote superior-performing
specialised models without modifying the underlying federated learning
communication protocol.
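As a concrete illustration of the FL+HC procedure described in the abstract, the following is a minimal sketch (not the authors' implementation): after several FedAvg rounds, each client's local update relative to the global model is flattened into a vector, the clients are grouped by agglomerative hierarchical clustering on those vectors, and each resulting cluster then continues federated training on its own specialised model. The `client_update` and `num_samples` helpers, and the linkage method, distance metric and distance threshold, are hypothetical placeholders; the paper treats the clustering hyperparameters as tunable.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def fedavg(vectors, weights):
    """Weighted average of flattened client model vectors (standard FedAvg step)."""
    return np.average(np.stack(vectors), axis=0, weights=np.asarray(weights, dtype=float))


def cluster_clients(local_deltas, method="ward", metric="euclidean", threshold=3.0):
    """Group clients by the similarity of their local updates to the global model.

    local_deltas: list of flattened update vectors (local model minus global model).
    Returns a list of clusters, each a list of client indices.
    """
    link = linkage(np.stack(local_deltas), method=method, metric=metric)
    labels = fcluster(link, t=threshold, criterion="distance")
    return [np.where(labels == c)[0].tolist() for c in np.unique(labels)]


def fl_plus_hc(global_model, clients, pre_rounds, post_rounds, client_update, num_samples):
    """FL+HC sketch: FedAvg, one hierarchical clustering step, then per-cluster FedAvg."""
    # Phase 1: ordinary federated averaging over all clients.
    for _ in range(pre_rounds):
        locals_ = [client_update(global_model, c) for c in clients]
        global_model = fedavg(locals_, [num_samples(c) for c in clients])

    # Clustering step: split clients by the similarity of their local updates.
    deltas = [client_update(global_model, c) - global_model for c in clients]
    clusters = cluster_clients(deltas)

    # Phase 2: each cluster trains its own specialised model independently
    # (in parallel in a real deployment; sequential here for simplicity).
    cluster_models = {}
    for k, members in enumerate(clusters):
        model = global_model.copy()
        for _ in range(post_rounds):
            locals_ = [client_update(model, clients[i]) for i in members]
            model = fedavg(locals_, [num_samples(clients[i]) for i in members])
        cluster_models[k] = model
    return cluster_models
```

Note that Ward linkage in SciPy requires the Euclidean metric; other metric choices (e.g. cosine or Manhattan) would need a compatible linkage method.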
Related papers
- Can We Theoretically Quantify the Impacts of Local Updates on the Generalization Performance of Federated Learning? [50.03434441234569]
Federated Learning (FL) has gained significant popularity due to its effectiveness in training machine learning models across diverse sites without requiring direct data sharing.
While various algorithms have shown that FL with local updates is a communication-efficient distributed learning framework, the generalization performance of FL with local updates has received comparatively less attention.
arXiv Detail & Related papers (2024-09-05T19:00:18Z) - FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering [26.478852701376294]
Federated learning (FL) is an emerging distributed machine learning paradigm.
One of the major challenges in FL is the presence of uneven data distributions across client devices.
We propose FedClust, a novel approach for clustered federated learning (CFL) that leverages the correlation between local model weights and the data distribution of clients.
arXiv Detail & Related papers (2024-07-09T02:47:16Z) - FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data.
Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena.
We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
arXiv Detail & Related papers (2024-05-29T11:28:06Z) - Multi-level Personalized Federated Learning on Heterogeneous and Long-Tailed Data [10.64629029156029]
We introduce an innovative personalized Federated Learning framework, Multi-level Personalized Federated Learning (MuPFL)
MuPFL integrates three pivotal modules: Biased Activation Value Dropout (BAVD), Adaptive Cluster-based Model Update (ACMU) and Prior Knowledge-assisted Fine-tuning (PKCF)
Experiments on diverse real-world datasets show that MuPFL consistently outperforms state-of-the-art baselines, even under extreme non-i.i.d. and long-tail conditions.
arXiv Detail & Related papers (2024-05-10T11:52:53Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling this data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - Contrastive encoder pre-training-based clustered federated learning for
heterogeneous data [17.580390632874046]
Federated learning (FL) enables distributed clients to collaboratively train a global model while preserving their data privacy.
We propose contrastive pre-training-based clustered federated learning (CP-CFL) to improve the model convergence and overall performance of FL systems.
arXiv Detail & Related papers (2023-11-28T05:44:26Z) - Towards More Suitable Personalization in Federated Learning via
Decentralized Partial Model Training [67.67045085186797]
Almost all existing systems depend on a central FL server, and so face large communication burdens and the risk of disruption if that server fails.
It personalizes the "right" components in deep models by alternately updating the shared and personal parameters.
To further promote the shared-parameter aggregation process, we propose DFed, integrating local Sharpness Minimization.
arXiv Detail & Related papers (2023-05-24T13:52:18Z) - Adaptive Personlization in Federated Learning for Highly Non-i.i.d. Data [37.667379000751325]
Federated learning (FL) is a distributed learning method that offers medical institutes the prospect of collaboration in a global model.
In this work, we investigate an adaptive hierarchical clustering method for FL to produce intermediate semi-global models.
Our experiments demonstrate significant gains in classification accuracy under heterogeneous data distributions compared to standard FL methods.
arXiv Detail & Related papers (2022-07-07T17:25:04Z) - Acceleration of Federated Learning with Alleviated Forgetting in Local
Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z) - Multi-Center Federated Learning [62.32725938999433]
Federated learning (FL) can protect data privacy in distributed learning.
It merely collects local gradients from users without access to their data.
We propose a novel multi-center aggregation mechanism.
arXiv Detail & Related papers (2021-08-19T12:20:31Z) - Ensemble Distillation for Robust Model Fusion in Federated Learning [72.61259487233214]
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model.
In most current training schemes, the central model is refined by averaging the parameters of the server model with the updated parameters from the client side.
We propose ensemble distillation for model fusion, i.e., training the central classifier on unlabeled data using the outputs of the client models.
arXiv Detail & Related papers (2020-06-12T14:49:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.