FedFusion: Federated Learning with Diversity- and Cluster-Aware Encoders for Robust Adaptation under Label Scarcity
- URL: http://arxiv.org/abs/2509.19220v1
- Date: Tue, 23 Sep 2025 16:46:06 GMT
- Title: FedFusion: Federated Learning with Diversity- and Cluster-Aware Encoders for Robust Adaptation under Label Scarcity
- Authors: Ferdinand Kahenga, Antoine Bagula, Patrick Sello, Sajal K. Das
- Abstract summary: FedFusion is a federated transfer-learning framework that unifies domain adaptation and frugal labelling. Labelled teacher clients guide learner clients via confidence-filtered pseudo-labels and domain-adaptive transfer. FedFusion consistently outperforms state-of-the-art baselines in accuracy, robustness, and fairness.
- Score: 33.17279604575767
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Federated learning in practice must contend with heterogeneous feature spaces, severe non-IID data, and scarce labels across clients. We present FedFusion, a federated transfer-learning framework that unifies domain adaptation and frugal labelling with diversity-/cluster-aware encoders (DivEn, DivEn-mix, DivEn-c). Labelled teacher clients guide learner clients via confidence-filtered pseudo-labels and domain-adaptive transfer, while clients maintain personalised encoders tailored to local data. To preserve global coherence under heterogeneity, FedFusion employs similarity-weighted classifier coupling (with optional cluster-wise averaging), mitigating dominance by data-rich sites and improving minority-client performance. The frugal-labelling pipeline combines self-/semi-supervised pretext training with selective fine-tuning, reducing annotation demands without sharing raw data. Across tabular and imaging benchmarks under IID, non-IID, and label-scarce regimes, FedFusion consistently outperforms state-of-the-art baselines in accuracy, robustness, and fairness while maintaining comparable communication and computation budgets. These results show that harmonising personalisation, domain adaptation, and label efficiency is an effective recipe for robust federated learning under real-world constraints.
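The two mechanisms named in the abstract, confidence-filtered pseudo-labels and similarity-weighted classifier coupling, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.9 confidence threshold, the use of cosine similarity, and the clipping of negative similarities to zero are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confident_pseudo_labels(teacher_logits, threshold=0.9):
    """Keep (sample_index, label) pairs whose top class probability
    reaches the threshold. The threshold value is an assumption."""
    kept = []
    for i, logits in enumerate(teacher_logits):
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            kept.append((i, label))
    return kept


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_weighted_average(own, others):
    """Blend a client's classifier weights with its peers' weights.
    Each peer is weighted by its cosine similarity to the client,
    clipped at zero (an assumed choice); the client itself has weight 1."""
    weights = [1.0] + [max(cosine(own, o), 0.0) for o in others]
    vectors = [own] + list(others)
    total = sum(weights)
    return [
        sum(w * vec[j] for w, vec in zip(weights, vectors)) / total
        for j in range(len(own))
    ]


# Hypothetical teacher outputs for three unlabelled learner samples.
logits = [[5.0, 0.0], [0.1, 0.0], [0.0, 4.0]]
print(confident_pseudo_labels(logits))

# A dissimilar peer (orthogonal weights) contributes nothing to the blend.
print(similarity_weighted_average([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

In this sketch the second sample is dropped because its top probability is close to 0.5, and the orthogonal peer receives zero weight, which is one simple way a dissimilar or data-rich site could be kept from dominating a client's classifier.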
Related papers
- Overcoming label shift with target-aware federated learning [10.355835466049092]
Federated learning enables multiple actors to collaboratively train models without sharing private data. A common reason is label shift -- that the label distributions differ between clients and the target domain. We propose FedPALS, a principled and practical model aggregation scheme that adapts to label shifts to improve performance in the target domain.
arXiv Detail & Related papers (2024-11-06T09:52:45Z)
- Boosting Federated Learning with FedEntOpt: Mitigating Label Skew by Entropy-Based Client Selection [13.851391819710367]
Deep learning domains typically require an extensive amount of data for optimal performance. FedEntOpt is designed to mitigate performance issues caused by label distribution skew. It exhibits robust and superior performance in scenarios with low participation rates and client dropout.
arXiv Detail & Related papers (2024-11-02T13:31:36Z)
- Federated Contrastive Learning for Personalized Semantic Communication [55.46383524190467]
We design a federated contrastive learning framework aimed at supporting personalized semantic communication.
FedCL enables collaborative training of local semantic encoders across multiple clients and a global semantic decoder owned by the base station.
To tackle the semantic imbalance issue arising from heterogeneous datasets across distributed clients, we employ contrastive learning to train a semantic centroid generator.
arXiv Detail & Related papers (2024-06-13T14:45:35Z)
- FedCoSR: Personalized Federated Learning with Contrastive Shareable Representations for Label Heterogeneity in Non-IID Data [11.389706928654965]
This paper proposes a novel personalized learning algorithm, named Federated Contrastive Shareable Representations (FedCoSR). The parameters of local models' shallow layers and typical local representations are both considered as shareable information for the server. To address performance degradation caused by label distribution skew among clients, contrastive learning is adopted between local and global representations.
arXiv Detail & Related papers (2024-04-27T14:05:18Z)
- FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients [19.3885479917635]
Federated learning (FL) is a distributed learning paradigm that facilitates collaborative training of a shared global model across devices.
We propose FedAnchor, an innovative FSSL method that introduces a unique double-head structure, called anchor head, paired with the classification head trained exclusively on labeled anchor data on the server.
Our approach mitigates the confirmation bias and overfitting issues associated with pseudo-labeling techniques based on high-confidence model prediction samples.
arXiv Detail & Related papers (2024-02-15T18:48:21Z)
- JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
- FedFA: Federated Learning with Feature Anchors to Align Features and Classifiers for Heterogeneous Data [8.677832361022809]
Federated learning allows multiple clients to collaboratively train a model without exchanging their data.
Common solutions involve an auxiliary loss to regularize weight divergence or feature inconsistency during local training.
We propose a novel framework named Federated learning with Feature Anchors (FedFA).
arXiv Detail & Related papers (2022-11-17T02:27:44Z)
- FedFM: Anchor-based Feature Matching for Data Heterogeneity in Federated Learning [91.74206675452888]
We propose a novel method FedFM, which guides each client's features to match shared category-wise anchors.
To achieve higher efficiency and flexibility, we propose a FedFM variant, called FedFM-Lite, where clients communicate with the server with fewer synchronization rounds and lower communication bandwidth costs.
arXiv Detail & Related papers (2022-10-14T08:11:34Z)
- Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem and can in fact be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
- Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction [76.26710990597498]
We show that the class-imbalance of the grouped data from randomly selected clients can lead to significant performance degradation.
Based on our key observation, we design an efficient client sampling mechanism, i.e., Federated Class-balanced Sampling (Fed-CBS).
In particular, we propose a measure of class-imbalance and then employ homomorphic encryption to derive this measure in a privacy-preserving way.
arXiv Detail & Related papers (2022-09-30T05:42:56Z)
- ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-supervised Continual Learning [52.831894583501395]
Continual learning assumes the incoming data are fully labeled, which might not be applicable in real applications.
We propose deep Online Replay with Discriminator Consistency (ORDisCo) to interdependently learn a classifier with a conditional generative adversarial network (GAN).
We show ORDisCo achieves significant performance improvement on various semi-supervised learning benchmark datasets for SSCL.
arXiv Detail & Related papers (2021-01-02T09:04:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.