One-Shot Federated Clustering of Non-Independent Completely Distributed Data
- URL: http://arxiv.org/abs/2601.17512v1
- Date: Sat, 24 Jan 2026 16:18:58 GMT
- Title: One-Shot Federated Clustering of Non-Independent Completely Distributed Data
- Authors: Yiqun Zhang, Shenghong Cai, Zihua Yang, Sen Feng, Yuzhu Ji, Haijun Zhang,
- Abstract summary: Unsupervised Federated Clustering (FC) is becoming increasingly popular for exploring pattern knowledge from complex distributed data. In this paper, a trickier but overlooked phenomenon in Non-IID is revealed, which bottlenecks the clustering performance of the existing FC approaches. A new framework named GOLD (Global Oriented Local Distribution Learning) is proposed to tackle the above FC challenges.
- Score: 20.793249661397162
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) that extracts data knowledge while protecting the privacy of multiple clients has achieved remarkable results in distributed privacy-preserving IoT systems, including smart traffic flow monitoring, smart grid load balancing, and so on. Since most data collected from edge devices are unlabeled, unsupervised Federated Clustering (FC) is becoming increasingly popular for exploring pattern knowledge from complex distributed data. However, due to the lack of label guidance, the common Non-Independent and Identically Distributed (Non-IID) issue of clients has greatly challenged FC by posing the following problems: how to fuse pattern knowledge (i.e., cluster distribution) from Non-IID clients; how the cluster distributions among clients are related; and how this relationship connects with the global knowledge fusion. In this paper, a trickier but overlooked phenomenon in Non-IID is revealed, which bottlenecks the clustering performance of the existing FC approaches. That is, different clients could fragment a cluster, and accordingly, a more generalized Non-IID concept, i.e., Non-ICD (Non-Independent Completely Distributed), is derived. To tackle the above FC challenges, a new framework named GOLD (Global Oriented Local Distribution Learning) is proposed. GOLD first finely explores the potential incomplete local cluster distributions of clients, then uploads the distribution summarization to the server for global fusion, and finally performs local cluster enhancement under the guidance of the global distribution. Extensive experiments, including significance tests, ablation studies, scalability evaluations, qualitative results, etc., have been conducted to show the superiority of GOLD.
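The three-stage flow described in the abstract (local exploration, one upload of distribution summaries, global fusion followed by local guidance) can be illustrated with a minimal sketch. This is not the GOLD algorithm: it swaps in plain k-means for local exploration and weighted k-means over uploaded centroids for server fusion, purely to show the one-shot communication pattern and the fragmented-cluster (Non-ICD) situation. All function names (`kmeans`, `client_summary`, `server_fuse`, `assign_to_global`) and parameters are hypothetical and chosen for illustration.

```python
import numpy as np

def kmeans(points, k, weights=None, iters=50, seed=0):
    """Plain (optionally weighted) Lloyd's k-means; returns centroids and labels."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    weights = np.ones(len(points)) if weights is None else np.asarray(weights, dtype=float)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            mask = labels == j
            if mask.any():  # an empty cluster keeps its previous centroid
                centroids[j] = np.average(points[mask], axis=0, weights=weights[mask])
    # Final assignment against the final centroids.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)

def client_summary(local_data, local_k, seed=0):
    """Client side: summarize local data as (centroids, member counts)."""
    centroids, labels = kmeans(local_data, local_k, seed=seed)
    counts = np.bincount(labels, minlength=local_k)
    keep = counts > 0  # drop empty local clusters before uploading
    return centroids[keep], counts[keep]

def server_fuse(summaries, global_k, seed=0):
    """Server side: fuse all uploaded centroids, weighted by their counts."""
    centroids = np.vstack([c for c, _ in summaries])
    counts = np.concatenate([n for _, n in summaries])
    global_centroids, _ = kmeans(centroids, global_k, weights=counts, seed=seed)
    return global_centroids

def assign_to_global(local_data, global_centroids):
    """Client side: relabel local points by nearest global centroid."""
    local_data = np.asarray(local_data, dtype=float)
    dists = np.linalg.norm(local_data[:, None, :] - global_centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

# Toy Non-ICD setup (assumed data): cluster A is fragmented across two clients,
# cluster B lives entirely on client 1.
rng = np.random.default_rng(42)
blob_a = rng.normal(0.0, 0.5, size=(200, 2))
blob_b = rng.normal(10.0, 0.5, size=(200, 2))
client0 = blob_a[blob_a[:, 0] < 0]
client1 = np.vstack([blob_a[blob_a[:, 0] >= 0], blob_b])

# One communication round: each client uploads a summary exactly once.
summaries = [client_summary(client0, 3, seed=1), client_summary(client1, 3, seed=2)]
global_centroids = server_fuse(summaries, global_k=2, seed=3)
labels0 = assign_to_global(client0, global_centroids)
labels1 = assign_to_global(client1, global_centroids)
```

In this toy setup neither client sees all of cluster A, so a purely local clustering would describe it incompletely; only the summaries (centroids and counts) leave the clients, and the raw points stay local.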
Related papers
- One-Shot Hierarchical Federated Clustering [51.490181220883905]
This paper introduces an efficient one-shot hierarchical Federated Clustering framework. It performs client-end distribution exploration and server-end distribution aggregation. It turns out that the complex cluster distributions across clients can be efficiently explored.
arXiv Detail & Related papers (2026-01-10T02:58:33Z)
- Interaction-Aware Gaussian Weighting for Clustered Federated Learning [58.92159838586751]
Federated Learning (FL) emerged as a decentralized paradigm to train models while preserving privacy. We propose a novel clustered FL method, FedGWC (Federated Gaussian Weighting Clustering), which groups clients based on their data distribution. Our experiments on benchmark datasets show that FedGWC outperforms existing FL algorithms in cluster quality and classification accuracy.
arXiv Detail & Related papers (2025-02-05T16:33:36Z)
- Asynchronous Federated Clustering with Unknown Number of Clusters [35.35189341303029]
Federated Clustering (FC) is crucial to mining knowledge from unlabeled non-Independent Identically Distributed (non-IID) data. This paper proposes an Asynchronous Federated Cluster Learning (AFCL) method accordingly. It spreads the excessive number of seed points to the clients as a learning medium and coordinates them across the clients to form a consensus.
arXiv Detail & Related papers (2024-12-29T03:56:32Z)
- Federated Clustering: An Unsupervised Cluster-Wise Training for Decentralized Data Distributions [1.6385815610837167]
Federated Cluster-Wise Refinement (FedCRef) involves clients that collaboratively train models on clusters with similar data distributions.
In these groups, clients collaboratively train a shared model representing each data distribution, while continuously refining their local clusters to enhance data association accuracy.
This iterative process allows our system to identify all potential data distributions across the network and develop robust representation models for each.
arXiv Detail & Related papers (2024-08-20T09:05:44Z)
- Federated Deep Multi-View Clustering with Global Self-Supervision [51.639891178519136]
Federated multi-view clustering has the potential to learn a global clustering model from data distributed across multiple devices.
In this setting, label information is unknown and data privacy must be preserved.
We propose a novel federated deep multi-view clustering method that can mine complementary cluster structures from multiple clients.
arXiv Detail & Related papers (2023-09-24T17:07:01Z)
- Federated cINN Clustering for Accurate Clustered Federated Learning [33.72494731516968]
Federated Learning (FL) presents an innovative approach to privacy-preserving distributed machine learning.
We propose the Federated cINN Clustering Algorithm (FCCA) to robustly cluster clients into different groups.
arXiv Detail & Related papers (2023-09-04T10:47:52Z)
- Federated Generalized Category Discovery [68.35420359523329]
Generalized category discovery (GCD) aims at grouping unlabeled samples from known and unknown classes.
To meet the recent decentralization trend in the community, we introduce a practical yet challenging task, namely Federated GCD (Fed-GCD).
The goal of Fed-GCD is to train a generic GCD model by client collaboration under the privacy-protected constraint.
arXiv Detail & Related papers (2023-05-23T14:27:41Z)
- CADIS: Handling Cluster-skewed Non-IID Data in Federated Learning with Clustered Aggregation and Knowledge DIStilled Regularization [3.3711670942444014]
Federated learning enables edge devices to train a global model collaboratively without exposing their data.
We tackle a new type of Non-IID data, called cluster-skewed non-IID, discovered in actual data sets.
We propose an aggregation scheme that guarantees equality between clusters.
arXiv Detail & Related papers (2023-02-21T02:53:37Z)
- Efficient Distribution Similarity Identification in Clustered Federated Learning via Principal Angles Between Client Data Subspaces [59.33965805898736]
Clustered learning has been shown to produce promising results by grouping clients into clusters.
Existing FL algorithms are essentially trying to group clients together with similar distributions.
Prior FL algorithms attempt to identify these similarities only indirectly during training.
arXiv Detail & Related papers (2022-09-21T17:37:54Z)
- Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraint.
We propose a data-free knowledge distillation method to fine-tune the global model in the server (FedFTG).
Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
arXiv Detail & Related papers (2022-03-17T11:18:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.