Empowering HWNs with Efficient Data Labeling: A Clustered Federated
Semi-Supervised Learning Approach
- URL: http://arxiv.org/abs/2401.10646v1
- Date: Fri, 19 Jan 2024 11:47:49 GMT
- Title: Empowering HWNs with Efficient Data Labeling: A Clustered Federated
Semi-Supervised Learning Approach
- Authors: Moqbel Hamood and Abdullatif Albaseer and Mohamed Abdallah and Ala
Al-Fuqaha
- Abstract summary: Clustered Federated Multitask Learning (CFL) has gained considerable attention as an effective strategy for overcoming statistical challenges.
We introduce a novel framework, Clustered Federated Semi-Supervised Learning (CFSL), designed for more realistic HWN scenarios.
Our results demonstrate that CFSL significantly improves upon key metrics such as testing accuracy, labeling accuracy, and labeling latency under varying proportions of labeled and unlabeled data.
- Score: 2.046985601687158
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clustered Federated Multitask Learning (CFL) has gained considerable
attention as an effective strategy for overcoming statistical challenges,
particularly when dealing with non independent and identically distributed (non
IID) data across multiple users. However, much of the existing research on CFL
operates under the unrealistic premise that devices have access to accurate
ground truth labels. This assumption becomes especially problematic in
hierarchical wireless networks (HWNs), where edge networks contain a large
amount of unlabeled data, resulting in slower convergence rates and increased
processing times, particularly when dealing with two layers of model
aggregation. To address these issues, we introduce a novel framework, Clustered
Federated Semi-Supervised Learning (CFSL), designed for more realistic HWN
scenarios. Our approach leverages a best-performing specialized model
algorithm, wherein each device is assigned a specialized model that is highly
adept at generating accurate pseudo-labels for unlabeled data, even when the
data stems from diverse environments. We validate the efficacy of CFSL through
extensive experiments, comparing it with existing methods highlighted in recent
literature. Our numerical results demonstrate that CFSL significantly improves
upon key metrics such as testing accuracy, labeling accuracy, and labeling
latency under varying proportions of labeled and unlabeled data while also
accommodating the non-IID nature of the data and the unique characteristics of
wireless edge networks.
Related papers
- Optimizing Federated Learning by Entropy-Based Client Selection [13.851391819710367]
Deep learning domains typically require an extensive amount of data for optimal performance.
FedOptEnt is designed to mitigate performance issues caused by label distribution skew.
The proposed method outperforms several state-of-the-art algorithms by up to 6% in classification accuracy.
arXiv Detail & Related papers (2024-11-02T13:31:36Z) - (FL)$^2$: Overcoming Few Labels in Federated Semi-Supervised Learning [4.803231218533992]
Federated Learning (FL) is a distributed machine learning framework that trains accurate global models while preserving clients' privacy-sensitive data.
Most FL approaches assume that clients possess labeled data, which is often not the case in practice.
We propose $(FL)2$, a robust training method for unlabeled clients using sharpness-aware consistency regularization.
arXiv Detail & Related papers (2024-10-30T17:15:02Z) - Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition [50.61991746981703]
Current state-of-the-art LTSSL approaches rely on high-quality pseudo-labels for large-scale unlabeled data.
This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning.
We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels.
arXiv Detail & Related papers (2024-10-08T15:06:10Z) - FedAnchor: Enhancing Federated Semi-Supervised Learning with Label
Contrastive Loss for Unlabeled Clients [19.3885479917635]
Federated learning (FL) is a distributed learning paradigm that facilitates collaborative training of a shared global model across devices.
We propose FedAnchor, an innovative FSSL method that introduces a unique double-head structure, called anchor head, paired with the classification head trained exclusively on labeled anchor data on the server.
Our approach mitigates the confirmation bias and overfitting issues associated with pseudo-labeling techniques based on high-confidence model prediction samples.
arXiv Detail & Related papers (2024-02-15T18:48:21Z) - Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z) - CADIS: Handling Cluster-skewed Non-IID Data in Federated Learning with
Clustered Aggregation and Knowledge DIStilled Regularization [3.3711670942444014]
Federated learning enables edge devices to train a global model collaboratively without exposing their data.
We tackle a new type of Non-IID data, called cluster-skewed non-IID, discovered in actual data sets.
We propose an aggregation scheme that guarantees equality between clusters.
arXiv Detail & Related papers (2023-02-21T02:53:37Z) - Rethinking Data Heterogeneity in Federated Learning: Introducing a New
Notion and Standard Benchmarks [65.34113135080105]
We show that not only the issue of data heterogeneity in current setups is not necessarily a problem but also in fact it can be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z) - Hybridization of Capsule and LSTM Networks for unsupervised anomaly
detection on multivariate data [0.0]
This paper introduces a novel NN architecture which hybridises the Long-Short-Term-Memory (LSTM) and Capsule Networks into a single network.
The proposed method uses an unsupervised learning technique to overcome the issues with finding large volumes of labelled training data.
arXiv Detail & Related papers (2022-02-11T10:33:53Z) - Self-Tuning for Data-Efficient Deep Learning [75.34320911480008]
Self-Tuning is a novel approach to enable data-efficient deep learning.
It unifies the exploration of labeled and unlabeled data and the transfer of a pre-trained model.
It outperforms its SSL and TL counterparts on five tasks by sharp margins.
arXiv Detail & Related papers (2021-02-25T14:56:19Z) - ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for
Semi-supervised Continual Learning [52.831894583501395]
Continual learning assumes the incoming data are fully labeled, which might not be applicable in real applications.
We propose deep Online Replay with Discriminator Consistency (ORDisCo) to interdependently learn a classifier with a conditional generative adversarial network (GAN)
We show ORDisCo achieves significant performance improvement on various semi-supervised learning benchmark datasets for SSCL.
arXiv Detail & Related papers (2021-01-02T09:04:14Z) - PseudoSeg: Designing Pseudo Labels for Semantic Segmentation [78.35515004654553]
We present a re-design of pseudo-labeling to generate structured pseudo labels for training with unlabeled or weakly-labeled data.
We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.
arXiv Detail & Related papers (2020-10-19T17:59:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.