Related papers: Empowering HWNs with Efficient Data Labeling: A Clustered Federated Semi-Supervised Learning Approach

Empowering HWNs with Efficient Data Labeling: A Clustered Federated Semi-Supervised Learning Approach

URL: http://arxiv.org/abs/2401.10646v1
Date: Fri, 19 Jan 2024 11:47:49 GMT
Title: Empowering HWNs with Efficient Data Labeling: A Clustered Federated Semi-Supervised Learning Approach
Authors: Moqbel Hamood and Abdullatif Albaseer and Mohamed Abdallah and Ala Al-Fuqaha
Abstract summary: Clustered Federated Multitask Learning (CFL) has gained considerable attention as an effective strategy for overcoming statistical challenges. We introduce a novel framework, Clustered Federated Semi-Supervised Learning (CFSL), designed for more realistic HWN scenarios. Our results demonstrate that CFSL significantly improves upon key metrics such as testing accuracy, labeling accuracy, and labeling latency under varying proportions of labeled and unlabeled data.
Score: 2.046985601687158
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Clustered Federated Multitask Learning (CFL) has gained considerable attention as an effective strategy for overcoming statistical challenges, particularly when dealing with non independent and identically distributed (non IID) data across multiple users. However, much of the existing research on CFL operates under the unrealistic premise that devices have access to accurate ground truth labels. This assumption becomes especially problematic in hierarchical wireless networks (HWNs), where edge networks contain a large amount of unlabeled data, resulting in slower convergence rates and increased processing times, particularly when dealing with two layers of model aggregation. To address these issues, we introduce a novel framework, Clustered Federated Semi-Supervised Learning (CFSL), designed for more realistic HWN scenarios. Our approach leverages a best-performing specialized model algorithm, wherein each device is assigned a specialized model that is highly adept at generating accurate pseudo-labels for unlabeled data, even when the data stems from diverse environments. We validate the efficacy of CFSL through extensive experiments, comparing it with existing methods highlighted in recent literature. Our numerical results demonstrate that CFSL significantly improves upon key metrics such as testing accuracy, labeling accuracy, and labeling latency under varying proportions of labeled and unlabeled data while also accommodating the non-IID nature of the data and the unique characteristics of wireless edge networks.

Related papers

Stratify: Rethinking Federated Learning for Non-IID Data through Balanced Sampling [9.774529150331297]
Stratify is a novel FL framework designed to systematically manage class and feature distributions throughout training.<n>Inspired by classical stratified sampling, our approach employs a Stratified Label Schedule (SLS) to ensure balanced exposure across labels.<n>To uphold privacy, we implement a secure client selection protocol leveraging homomorphic encryption.
arXiv Detail & Related papers (2025-04-18T04:44:41Z)
Improving the Efficiency of Self-Supervised Adversarial Training through Latent Clustering-Based Selection [2.7554677967598047]
adversarially robust learning is widely recognized to demand significantly more training examples. Recent works propose the use of self-supervised adversarial training with external or synthetically generated unlabeled data to enhance model robustness. We propose novel methods to strategically select a small subset of unlabeled data essential for SSAT and robustness improvement.
arXiv Detail & Related papers (2025-01-15T15:47:49Z)
Optimizing Federated Learning by Entropy-Based Client Selection [13.851391819710367]
Deep learning domains typically require an extensive amount of data for optimal performance. FedOptEnt is designed to mitigate performance issues caused by label distribution skew. The proposed method outperforms several state-of-the-art algorithms by up to 6% in classification accuracy.
arXiv Detail & Related papers (2024-11-02T13:31:36Z)
(FL)$^2$: Overcoming Few Labels in Federated Semi-Supervised Learning [4.803231218533992]
Federated Learning (FL) is a distributed machine learning framework that trains accurate global models while preserving clients' privacy-sensitive data. Most FL approaches assume that clients possess labeled data, which is often not the case in practice. We propose $(FL)2$, a robust training method for unlabeled clients using sharpness-aware consistency regularization.
arXiv Detail & Related papers (2024-10-30T17:15:02Z)
Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition [50.61991746981703]
Current state-of-the-art LTSSL approaches rely on high-quality pseudo-labels for large-scale unlabeled data. This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning. We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels.
arXiv Detail & Related papers (2024-10-08T15:06:10Z)
FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients [19.3885479917635]
Federated learning (FL) is a distributed learning paradigm that facilitates collaborative training of a shared global model across devices. We propose FedAnchor, an innovative FSSL method that introduces a unique double-head structure, called anchor head, paired with the classification head trained exclusively on labeled anchor data on the server. Our approach mitigates the confirmation bias and overfitting issues associated with pseudo-labeling techniques based on high-confidence model prediction samples.
arXiv Detail & Related papers (2024-02-15T18:48:21Z)
Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers. We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes. We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
CADIS: Handling Cluster-skewed Non-IID Data in Federated Learning with Clustered Aggregation and Knowledge DIStilled Regularization [3.3711670942444014]
Federated learning enables edge devices to train a global model collaboratively without exposing their data. We tackle a new type of Non-IID data, called cluster-skewed non-IID, discovered in actual data sets. We propose an aggregation scheme that guarantees equality between clusters.
arXiv Detail & Related papers (2023-02-21T02:53:37Z)
Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that not only the issue of data heterogeneity in current setups is not necessarily a problem but also in fact it can be beneficial for the FL participants. Our observations are intuitive. Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
Hybridization of Capsule and LSTM Networks for unsupervised anomaly detection on multivariate data [0.0]
This paper introduces a novel NN architecture which hybridises the Long-Short-Term-Memory (LSTM) and Capsule Networks into a single network. The proposed method uses an unsupervised learning technique to overcome the issues with finding large volumes of labelled training data.
arXiv Detail & Related papers (2022-02-11T10:33:53Z)
Self-Tuning for Data-Efficient Deep Learning [75.34320911480008]
Self-Tuning is a novel approach to enable data-efficient deep learning. It unifies the exploration of labeled and unlabeled data and the transfer of a pre-trained model. It outperforms its SSL and TL counterparts on five tasks by sharp margins.
arXiv Detail & Related papers (2021-02-25T14:56:19Z)
ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-supervised Continual Learning [52.831894583501395]
Continual learning assumes the incoming data are fully labeled, which might not be applicable in real applications. We propose deep Online Replay with Discriminator Consistency (ORDisCo) to interdependently learn a classifier with a conditional generative adversarial network (GAN) We show ORDisCo achieves significant performance improvement on various semi-supervised learning benchmark datasets for SSCL.
arXiv Detail & Related papers (2021-01-02T09:04:14Z)
PseudoSeg: Designing Pseudo Labels for Semantic Segmentation [78.35515004654553]
We present a re-design of pseudo-labeling to generate structured pseudo labels for training with unlabeled or weakly-labeled data. We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.
arXiv Detail & Related papers (2020-10-19T17:59:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.