CCFC++: Enhancing Federated Clustering through Feature Decorrelation
- URL: http://arxiv.org/abs/2402.12852v1
- Date: Tue, 20 Feb 2024 09:31:03 GMT
- Title: CCFC++: Enhancing Federated Clustering through Feature Decorrelation
- Authors: Jie Yan, Jing Liu, Yi-Zi Ning and Zhong-Yuan Zhang
- Abstract summary: In federated clustering, multiple data-holding clients collaboratively group data without exchanging raw data.
CCFC suffers from data heterogeneity across clients, leading to poor and non-robust performance.
To address this, we introduce a decorrelation regularizer to CCFC.
- Score: 8.822947930471429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In federated clustering, multiple data-holding clients collaboratively group
data without exchanging raw data. This field has seen notable advancements
through its marriage with contrastive learning, exemplified by
Cluster-Contrastive Federated Clustering (CCFC). However, CCFC suffers from
data heterogeneity across clients, leading to poor and non-robust performance.
Our study conducts both empirical and theoretical analyses to understand the
impact of heterogeneous data on CCFC. Findings indicate that increased data
heterogeneity exacerbates dimensional collapse in CCFC, evidenced by increased
correlations across multiple dimensions of the learned representations. To
address this, we introduce a decorrelation regularizer to CCFC. Benefiting from
the regularizer, the improved method effectively mitigates the detrimental
effects of data heterogeneity, and achieves superior performance, as evidenced
by a marked increase in NMI scores, with the gain reaching as high as 0.32 in
the most pronounced case.
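The dimensional collapse described above can be quantified, and countered, with an off-diagonal penalty on the correlation matrix of the learned representations: collapsed features have strongly correlated dimensions, so driving the off-diagonal entries toward zero decorrelates them. The sketch below is a generic illustration of this idea, not the exact regularizer used in CCFC++.

```python
import numpy as np

def decorrelation_penalty(z):
    """Sum of squared off-diagonal entries of the feature correlation matrix.

    z: (n_samples, dim) batch of representations. A collapsed batch
    (dimensions highly correlated) yields a large penalty; decorrelated
    features drive it toward zero.
    """
    # Standardize each dimension so the Gram matrix becomes a correlation matrix.
    z = (z - z.mean(axis=0)) / (z.std(axis=0) + 1e-8)
    corr = (z.T @ z) / z.shape[0]            # (dim, dim) correlation matrix
    off_diag = corr - np.diag(np.diag(corr)) # zero out the diagonal
    return float(np.sum(off_diag ** 2))
```

Added to the clustering objective with a small weight, such a term penalizes representations whose dimensions carry redundant information, which is the failure mode the abstract attributes to increased data heterogeneity.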
Related papers
- Dual-Segment Clustering Strategy for Federated Learning in Heterogeneous Environments [25.405210975577834]
Federated learning (FL) is a distributed machine learning paradigm with high efficiency and low communication load.
The non-independent and identically distributed (Non-IID) data characteristic has a negative impact on this paradigm.
This letter proposes a dual-segment clustering (DSC) strategy, which first clusters the clients according to the heterogeneous communication conditions.
arXiv Detail & Related papers (2024-05-15T11:46:47Z) - An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z) - CCFC: Bridging Federated Clustering and Contrastive Learning [9.91610928326645]
We propose a new federated clustering method named cluster-contrastive federated clustering (CCFC).
CCFC shows superior performance in handling device failures from a practical viewpoint.
arXiv Detail & Related papers (2024-01-12T15:26:44Z) - Generalizable Heterogeneous Federated Cross-Correlation and Instance Similarity Learning [60.058083574671834]
This paper presents a novel FCCL+, federated correlation and similarity learning with non-target distillation.
For the heterogeneity issue, we leverage irrelevant unlabeled public data for communication.
For catastrophic forgetting in the local updating stage, FCCL+ introduces Federated Non Target Distillation.
arXiv Detail & Related papers (2023-09-28T09:32:27Z) - Federated cINN Clustering for Accurate Clustered Federated Learning [33.72494731516968]
Federated Learning (FL) presents an innovative approach to privacy-preserving distributed machine learning.
We propose the Federated cINN Clustering Algorithm (FCCA) to robustly cluster clients into different groups.
arXiv Detail & Related papers (2023-09-04T10:47:52Z) - On Counterfactual Data Augmentation Under Confounding [30.76982059341284]
Counterfactual data augmentation has emerged as a method to mitigate confounding biases in the training data.
These biases arise due to various observed and unobserved confounding variables in the data generation process.
We show how our simple augmentation method helps existing state-of-the-art methods achieve good results.
arXiv Detail & Related papers (2023-05-29T16:20:23Z) - Cluster-guided Contrastive Graph Clustering Network [53.16233290797777]
We propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC).
We construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks.
To construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples.
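The negative-pair construction described above can be sketched as an InfoNCE-style loss in which each sample is pulled toward its own cluster center and pushed away from the centers of the other high-confidence clusters. This is an illustrative simplification, not the exact CCGC objective.

```python
import numpy as np

def center_contrastive_loss(z, centers, assign, tau=0.5):
    """InfoNCE-style loss with other clusters' centers as negatives.

    z: (n, d) embeddings; centers: (k, d) cluster centers;
    assign: (n,) cluster index per sample; tau: temperature.
    """
    # L2-normalize so dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    centers = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    sims = (z @ centers.T) / tau                       # (n, k) similarities
    logits = sims - sims.max(axis=1, keepdims=True)    # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    # Negative log-probability of each sample's own center (the positive).
    return float(-np.mean(np.log(probs[np.arange(len(z)), assign] + 1e-12)))
```

Using cluster centers rather than arbitrary other samples as negatives avoids the false-negative problem of treating same-cluster samples as negatives.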
arXiv Detail & Related papers (2023-01-03T13:42:38Z) - Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into a finite set of clusters under the orchestration of a server.
We propose a novel FedC algorithm with differential privacy, referred to as DP-Fed, which also accounts for partial client participation.
Various properties of the proposed DP-Fed are established through theoretical analyses of privacy protection, especially for the case of non-independent and identically distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z) - Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not only not necessarily a problem, but can in fact be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z) - Augmentation-induced Consistency Regularization for Classification [25.388324221293203]
We propose a consistency regularization framework based on data augmentation, called CR-Aug.
CR-Aug forces the output distributions of different sub-models generated by data augmentation to be consistent with each other.
We implement CR-Aug to image and audio classification tasks and conduct extensive experiments to verify its effectiveness.
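A common way to enforce the consistency described above is a symmetric KL divergence between the predicted distributions produced from two augmented views of the same input. The sketch below illustrates that generic mechanism; CR-Aug's exact formulation may differ.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(logits_a, logits_b):
    """Symmetric KL divergence between predictions from two augmented views.

    logits_a, logits_b: (n, num_classes) outputs of the model (or of
    two sub-models) on two augmentations of the same batch.
    """
    p, q = softmax(logits_a), softmax(logits_b)
    kl_pq = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    kl_qp = np.sum(q * (np.log(q + 1e-12) - np.log(p + 1e-12)), axis=-1)
    return float(np.mean(kl_pq + kl_qp) / 2)
```

The loss is zero when both views yield identical predictions and grows as the two distributions diverge, which is exactly the signal a consistency regularizer adds to the supervised objective.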
arXiv Detail & Related papers (2022-05-25T03:15:36Z) - Heterogeneous Federated Learning via Grouped Sequential-to-Parallel Training [60.892342868936865]
Federated learning (FL) is a rapidly growing privacy-preserving collaborative machine learning paradigm.
We propose a data heterogeneous-robust FL approach, FedGSP, to address this challenge.
We show that FedGSP improves the accuracy by 3.7% on average compared with seven state-of-the-art approaches.
arXiv Detail & Related papers (2022-01-31T03:15:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.