Efficient Distribution Similarity Identification in Clustered Federated Learning via Principal Angles Between Client Data Subspaces
- URL: http://arxiv.org/abs/2209.10526v1
- Date: Wed, 21 Sep 2022 17:37:54 GMT
- Title: Efficient Distribution Similarity Identification in Clustered Federated Learning via Principal Angles Between Client Data Subspaces
- Authors: Saeed Vahidian, Mahdi Morafah, Weijia Wang, Vyacheslav Kungurtsev,
Chen Chen, Mubarak Shah, and Bill Lin
- Abstract summary: Clustered federated learning (FL) has been shown to produce promising results by grouping clients into clusters.
Existing clustered FL algorithms essentially try to group together clients with similar data distributions.
However, prior algorithms attempt to learn these distribution similarities indirectly during training, which can be time-consuming.
- Score: 59.33965805898736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clustered federated learning (FL) has been shown to produce promising results
by grouping clients into clusters. This is especially effective in scenarios
where separate groups of clients have significant differences in the
distributions of their local data. Existing clustered FL algorithms are
essentially trying to group together clients with similar distributions so that
clients in the same cluster can leverage each other's data to better perform
federated learning. However, prior clustered FL algorithms attempt to learn
these distribution similarities indirectly during training, which can be quite
time consuming as many rounds of federated learning may be required until the
formation of clusters is stabilized. In this paper, we propose a new approach
to federated learning that directly aims to efficiently identify distribution
similarities among clients by analyzing the principal angles between the client
data subspaces. Each client applies a truncated singular value decomposition
(SVD) step on its local data in a single-shot manner to derive a small set of
principal vectors, which provides a signature that succinctly captures the main
characteristics of the underlying distribution. This small set of principal
vectors is provided to the server so that the server can directly identify
distribution similarities among the clients to form clusters. This is achieved
by comparing the similarities of the principal angles between the client data
subspaces spanned by those principal vectors. The approach provides a simple,
yet effective clustered FL framework that addresses a broad range of data
heterogeneity issues beyond simpler forms of Non-IIDness like label skews. Our
clustered FL approach also enables convergence guarantees for non-convex
objectives. Our code is available at https://github.com/MMorafah/PACFL.
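The method described in the abstract reduces to three steps: a one-shot truncated SVD per client, pairwise principal-angle distances between client subspaces at the server, and a clustering pass over those distances. Below is a minimal illustrative sketch of that pipeline in Python, assuming NumPy and SciPy; the function names (`client_signature`, `subspace_distance`, `cluster_clients`) and the distance threshold are hypothetical choices of this note, not the API of the PACFL repository linked above.

```python
# Illustrative sketch of the principal-angle clustering idea; see
# https://github.com/MMorafah/PACFL for the authors' actual implementation.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def client_signature(local_data: np.ndarray, p: int = 5) -> np.ndarray:
    """Client side: one-shot truncated SVD on the local data matrix.

    local_data has shape (d, n), one column per sample. The top-p left
    singular vectors span a p-dimensional subspace that serves as a compact
    signature of the local distribution; only this (d, p) matrix is sent
    to the server, never the raw data.
    """
    U, _, _ = np.linalg.svd(local_data, full_matrices=False)
    return U[:, :p]  # orthonormal basis of the client's data subspace


def subspace_distance(U1: np.ndarray, U2: np.ndarray) -> float:
    """Server side: dissimilarity from principal angles between subspaces.

    The singular values sigma_i of U1^T U2 are the cosines of the principal
    angles theta_i = arccos(sigma_i), so summing the angles gives a distance
    that is 0 for identical subspaces and grows as they diverge.
    """
    sigma = np.linalg.svd(U1.T @ U2, compute_uv=False)
    return float(np.sum(np.arccos(np.clip(sigma, -1.0, 1.0))))


def cluster_clients(signatures: list[np.ndarray], threshold: float) -> np.ndarray:
    """Server side: group clients by hierarchical clustering on the
    pairwise principal-angle dissimilarity matrix."""
    m = len(signatures)
    D = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            D[i, j] = D[j, i] = subspace_distance(signatures[i], signatures[j])
    Z = linkage(squareform(D), method="average")  # condensed distances
    return fcluster(Z, t=threshold, criterion="distance")  # cluster labels
```

In use, each client would call `client_signature` once before any training rounds; the server forms clusters a single time from the pairwise distances and can then run standard federated averaging within each cluster. This is what makes the approach single-shot, rather than requiring many federated rounds before the cluster formation stabilizes.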
Related papers
- A Bayesian Framework for Clustered Federated Learning [14.426129993432193]
One of the main challenges of federated learning (FL) is handling non-independent and identically distributed (non-IID) client data.
We present a unified Bayesian framework for clustered FL that associates clients with clusters.
This work provides insights into client-cluster associations and enables client knowledge sharing in new ways.
arXiv Detail & Related papers (2024-10-20T19:11:24Z)
- FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering [26.478852701376294]
Federated learning (FL) is an emerging distributed machine learning paradigm.
One of the major challenges in FL is the presence of uneven data distributions across client devices.
We propose FedClust, a novel approach for CFL that leverages the correlation between local model weights and the data distribution of clients.
arXiv Detail & Related papers (2024-07-09T02:47:16Z)
- FedCCL: Federated Dual-Clustered Feature Contrast Under Domain Heterogeneity [43.71967577443732]
Federated learning (FL) facilitates a privacy-preserving neural network training paradigm through collaboration between edge clients and a central server.
Recent research is limited to simply using averaged signals as a form of regularization and only focusing on one aspect of these non-IID challenges.
We propose a dual-clustered feature contrast-based FL framework with dual focuses.
arXiv Detail & Related papers (2024-04-14T13:56:30Z)
- FedClust: Optimizing Federated Learning on Non-IID Data through Weight-Driven Client Clustering [28.057411252785176]
Federated learning (FL) is an emerging distributed machine learning paradigm enabling collaborative model training on decentralized devices without exposing their local data.
This paper proposes FedClust, a novel CFL approach leveraging correlations between local model weights and client data distributions.
arXiv Detail & Related papers (2024-03-07T01:50:36Z)
- Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating locally trained models.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z)
- Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem and can in fact be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
- On the Convergence of Clustered Federated Learning [57.934295064030636]
In a federated learning system, the clients, e.g. mobile devices and organization participants, usually have different personal preferences or behavior patterns.
This paper proposes a novel weighted client-based clustered FL algorithm that jointly leverages each client and its cluster in a unified optimization framework.
arXiv Detail & Related papers (2022-02-13T02:39:19Z)
- Federated Learning with Taskonomy for Non-IID Data [0.0]
We introduce federated learning with taskonomy.
In a one-off process, the server provides the clients with a pretrained (and fine-tunable) encoder, with which they compress their data into a latent representation and transmit the signature of their data back to the server.
The server then learns the task-relatedness among clients via manifold learning, and performs a generalization of federated averaging.
arXiv Detail & Related papers (2021-03-29T20:47:45Z)
- A Bayesian Federated Learning Framework with Online Laplace Approximation [144.7345013348257]
Federated learning allows multiple clients to collaboratively learn a globally shared model.
We propose a novel FL framework that uses online Laplace approximation to approximate posteriors on both the client and server side.
We achieve state-of-the-art results on several benchmarks, clearly demonstrating the advantages of the proposed method.
arXiv Detail & Related papers (2021-02-03T08:36:58Z)
- LSD-C: Linearly Separable Deep Clusters [145.89790963544314]
We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
Our method draws inspiration from recent semi-supervised learning practice and combines our clustering algorithm with self-supervised pretraining and strong data augmentation.
We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
arXiv Detail & Related papers (2020-06-17T17:58:10Z)