CCFC++: Enhancing Federated Clustering through Feature Decorrelation
- URL: http://arxiv.org/abs/2402.12852v1
- Date: Tue, 20 Feb 2024 09:31:03 GMT
- Title: CCFC++: Enhancing Federated Clustering through Feature Decorrelation
- Authors: Jie Yan, Jing Liu, Yi-Zi Ning and Zhong-Yuan Zhang
- Abstract summary: In federated clustering, multiple data-holding clients collaboratively group data without exchanging raw data.
CCFC suffers from heterogeneous data across clients, leading to poor and unrobust performance.
To address this, we introduce a decorrelation regularizer to CCFC.
- Score: 8.822947930471429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In federated clustering, multiple data-holding clients collaboratively group
data without exchanging raw data. This field has seen notable advancements
through its marriage with contrastive learning, exemplified by
Cluster-Contrastive Federated Clustering (CCFC). However, CCFC suffers from
heterogeneous data across clients, leading to poor and unrobust performance.
Our study conducts both empirical and theoretical analyses to understand the
impact of heterogeneous data on CCFC. Findings indicate that increased data
heterogeneity exacerbates dimensional collapse in CCFC, evidenced by increased
correlations across multiple dimensions of the learned representations. To
address this, we introduce a decorrelation regularizer to CCFC. Benefiting from
the regularizer, the improved method effectively mitigates the detrimental
effects of data heterogeneity, and achieves superior performance, as evidenced
by a marked increase in NMI scores, with the gain reaching as high as 0.32 in
the most pronounced case.
Related papers
- Dynamic Clustering for Personalized Federated Learning on Heterogeneous Edge Devices [10.51330114955586]
Federated Learning (FL) enables edge devices to collaboratively learn a global model.<n>We propose a dynamic clustering algorithm for personalized federated learning system (DC-PFL)<n>We show that DC-PFL significantly reduces total training time and improves model accuracy compared to baselines.
arXiv Detail & Related papers (2025-08-03T04:19:22Z) - dcFCI: Robust Causal Discovery Under Latent Confounding, Unfaithfulness, and Mixed Data [1.9797215742507548]
We introduce the first nonparametric score to assess a Partial Ancestral Graph's compatibility with observed data.<n>We then propose data-compatible Fast Causal Inference (dcFCI) to jointly address latent confounding, empirical unfaithfulness, and mixed data types.
arXiv Detail & Related papers (2025-05-10T07:05:19Z) - Interaction-Aware Gaussian Weighting for Clustered Federated Learning [58.92159838586751]
Federated Learning (FL) emerged as a decentralized paradigm to train models while preserving privacy.
We propose a novel clustered FL method, FedGWC (Federated Gaussian Weighting Clustering), which groups clients based on their data distribution.
Our experiments on benchmark datasets show that FedGWC outperforms existing FL algorithms in cluster quality and classification accuracy.
arXiv Detail & Related papers (2025-02-05T16:33:36Z) - Diversity Over Quantity: A Lesson From Few Shot Relation Classification [62.66895901654023]
We show that training on a diverse set of relations significantly enhances a model's ability to generalize to unseen relations.
We introduce REBEL-FS, a new FSRC benchmark that incorporates an order of magnitude more relation types than existing datasets.
arXiv Detail & Related papers (2024-12-06T21:41:01Z) - Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an acceleration Decentralized Federated Learning algorithm called DFedCata.
DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase.
Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions.
arXiv Detail & Related papers (2024-10-09T06:17:16Z) - Adaptive Self-supervised Robust Clustering for Unstructured Data with Unknown Cluster Number [12.926206811876174]
We introduce a novel self-supervised deep clustering approach tailored for unstructured data, termed Adaptive Self-supervised Robust Clustering (ASRC)
ASRC adaptively learns the graph structure and edge weights to capture both local and global structural information.
ASRC even outperforms methods that rely on prior knowledge of the number of clusters, highlighting its effectiveness in addressing the challenges of clustering unstructured data.
arXiv Detail & Related papers (2024-07-29T15:51:09Z) - An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z) - CCFC: Bridging Federated Clustering and Contrastive Learning [9.91610928326645]
We propose a new federated clustering method named cluster-contrastive federated clustering (CCFC)
CCFC shows superior performance in handling device failures from a practical viewpoint.
arXiv Detail & Related papers (2024-01-12T15:26:44Z) - Generalizable Heterogeneous Federated Cross-Correlation and Instance
Similarity Learning [60.058083574671834]
This paper presents a novel FCCL+, federated correlation and similarity learning with non-target distillation.
For heterogeneous issue, we leverage irrelevant unlabeled public data for communication.
For catastrophic forgetting in local updating stage, FCCL+ introduces Federated Non Target Distillation.
arXiv Detail & Related papers (2023-09-28T09:32:27Z) - Federated cINN Clustering for Accurate Clustered Federated Learning [33.72494731516968]
Federated Learning (FL) presents an innovative approach to privacy-preserving distributed machine learning.
We propose the Federated cINN Clustering Algorithm (FCCA) to robustly cluster clients into different groups.
arXiv Detail & Related papers (2023-09-04T10:47:52Z) - On Counterfactual Data Augmentation Under Confounding [30.76982059341284]
Counterfactual data augmentation has emerged as a method to mitigate confounding biases in the training data.
These biases arise due to various observed and unobserved confounding variables in the data generation process.
We show how our simple augmentation method helps existing state-of-the-art methods achieve good results.
arXiv Detail & Related papers (2023-05-29T16:20:23Z) - Cluster-guided Contrastive Graph Clustering Network [53.16233290797777]
We propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC)
We construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks.
To construct semantic meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples.
arXiv Detail & Related papers (2023-01-03T13:42:38Z) - Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
clustering clusters (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into finite clients under the orchestration of a server.
We propose a novel FedC algorithm using differential privacy convergence technique, referred to as DP-Fed, in which partial participation and multiple clients are also considered.
Various attributes of the proposed DP-Fed are obtained through theoretical analyses of privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z) - Rethinking Data Heterogeneity in Federated Learning: Introducing a New
Notion and Standard Benchmarks [65.34113135080105]
We show that not only the issue of data heterogeneity in current setups is not necessarily a problem but also in fact it can be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z) - Augmentation-induced Consistency Regularization for Classification [25.388324221293203]
We propose a consistency regularization framework based on data augmentation, called CR-Aug.
CR-Aug forces the output distributions of different sub models generated by data augmentation to be consistent with each other.
We implement CR-Aug to image and audio classification tasks and conduct extensive experiments to verify its effectiveness.
arXiv Detail & Related papers (2022-05-25T03:15:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.