Semi-Supervised Federated Learning with non-IID Data: Algorithm and
System Design
- URL: http://arxiv.org/abs/2110.13388v1
- Date: Tue, 26 Oct 2021 03:41:48 GMT
- Title: Semi-Supervised Federated Learning with non-IID Data: Algorithm and
System Design
- Authors: Zhe Zhang, Shiyao Ma, Jiangtian Nie, Yi Wu, Qiang Yan, Xiaoke Xu and
Dusit Niyato
- Abstract summary: Federated Learning (FL) allows edge devices (or clients) to keep data locally while simultaneously training a shared global model.
The distribution of the client's local training data is non-independent and identically distributed (non-IID).
We present a robust semi-supervised FL system design, where the system aims to solve the problem of data availability and non-IID in FL.
- Score: 42.63120623012093
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) allows edge devices (or clients) to keep data locally
while simultaneously training a shared high-quality global model. However,
current research is generally based on the assumption that the training data of
local clients have ground-truth labels. Furthermore, FL faces the challenge of
statistical heterogeneity, i.e., the distribution of the client's local
training data is non-independent identically distributed (non-IID). In this
paper, we present a robust semi-supervised FL system design, where the system
aims to solve the problem of data availability and non-IID in FL. In
particular, this paper focuses on studying the labels-at-server scenario where
there is only a limited amount of labeled data on the server and only unlabeled
data on the clients. In our system design, we propose a novel method to tackle
the problems, which we refer to as Federated Mixing (FedMix). FedMix improves
the naive combination of FL and semi-supervised learning methods and designs
parameter decomposition strategies for disjointed learning of labeled,
unlabeled data, and global models. To alleviate the non-IID problem, we propose
a novel aggregation rule based on the frequency of the client's participation
in training, namely the FedFreq aggregation algorithm, which can adjust the
weight of the corresponding local model according to this frequency. Extensive
evaluations conducted on the CIFAR-10 dataset show that the performance of our
proposed method is significantly better than that of the current baseline. It
is worth noting that our system is robust to different non-IID levels of client
data.
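The abstract describes FedFreq only as an aggregation rule that adjusts each local model's weight according to how often its client has taken part in training; it does not give the formula. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation: the names `frequency_weights` and `aggregate` are hypothetical, models are flat lists of floats, and the weight is assumed to be inversely proportional to the participation count (the direction of the adjustment is an assumption, as the abstract does not state it).

```python
from typing import Dict, List


def frequency_weights(counts: Dict[str, int], selected: List[str]) -> Dict[str, float]:
    """Hypothetical frequency-based weights for the selected clients.

    ASSUMPTION: weight is inversely proportional to the number of rounds a
    client has participated in, then normalized to sum to 1. Counts must be
    at least 1 (a selected client has participated in the current round).
    """
    raw = {cid: 1.0 / counts[cid] for cid in selected}
    total = sum(raw.values())
    return {cid: value / total for cid, value in raw.items()}


def aggregate(models: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Weighted average of flat parameter vectors, one vector per client."""
    dim = len(next(iter(models.values())))
    merged = [0.0] * dim
    for cid, params in models.items():
        for i, value in enumerate(params):
            merged[i] += weights[cid] * value
    return merged


# Toy round: client "a" has participated once, client "b" three times.
counts = {"a": 1, "b": 3}
local_models = {"a": [1.0, 0.0], "b": [0.0, 1.0]}

weights = frequency_weights(counts, list(local_models))
global_model = aggregate(local_models, weights)
print(weights, global_model)
```

In this toy round the less frequently selected client receives the larger weight; swapping the `1.0 / counts[cid]` term for another function of the count changes the rule without touching the aggregation step.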
Related papers
- A Framework for testing Federated Learning algorithms using an edge-like environment [0.0]
Federated Learning (FL) is a machine learning paradigm in which many clients cooperatively train a single centralized model while keeping their data private and decentralized.
It is non-trivial to accurately evaluate the contributions of local models in global centralized model aggregation.
This is an example of a major challenge in FL, commonly known as data imbalance or class imbalance.
In this work, a framework is proposed and implemented to assess FL algorithms in an easier and more scalable way.
arXiv Detail & Related papers (2024-07-17T19:52:53Z)
- FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering [26.478852701376294]
Federated learning (FL) is an emerging distributed machine learning paradigm.
One of the major challenges in FL is the presence of uneven data distributions across client devices.
We propose FedClust, a novel approach for clustered federated learning (CFL) that leverages the correlation between local model weights and the data distribution of clients.
arXiv Detail & Related papers (2024-07-09T02:47:16Z)
- FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data.
Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena.
We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
arXiv Detail & Related papers (2024-05-29T11:28:06Z)
- StatAvg: Mitigating Data Heterogeneity in Federated Learning for Intrusion Detection Systems [22.259297167311964]
Federated learning (FL) is a decentralized learning technique that enables devices to collaboratively build a shared Machine Learning (ML) or Deep Learning (DL) model without revealing their raw data to a third party.
Due to its privacy-preserving nature, FL has sparked widespread attention for building Intrusion Detection Systems (IDS) within the realm of cybersecurity.
We propose an effective method called Statistical Averaging (StatAvg) to alleviate non-independent and identically distributed (non-iid) features across local clients' data in FL.
arXiv Detail & Related papers (2024-05-20T14:41:59Z)
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
- Rethinking Client Drift in Federated Learning: A Logit Perspective [125.35844582366441]
Federated Learning (FL) enables multiple clients to collaboratively learn in a distributed way, allowing for privacy protection.
We find that the difference in logits between the local and global models increases as the model is continuously updated.
We propose a new algorithm, named FedCSD, a Class prototype Similarity Distillation method in a federated framework, to align the local and global models.
arXiv Detail & Related papers (2023-08-20T04:41:01Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Robust Convergence in Federated Learning through Label-wise Clustering [6.693651193181458]
Non-IID datasets and the heterogeneous environments of local clients are regarded as major issues in Federated Learning (FL).
We propose a novel Label-wise clustering algorithm that guarantees the trainability among geographically heterogeneous local clients.
Our paper shows that the proposed Label-wise clustering demonstrates prompt and robust convergence compared to other FL algorithms.
arXiv Detail & Related papers (2021-12-28T18:13:09Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- FedMix: Approximation of Mixup under Mean Augmented Federated Learning [60.503258658382]
Federated learning (FL) allows edge devices to collectively learn a model without directly sharing data within each device.
Current state-of-the-art algorithms suffer from performance degradation as the heterogeneity of local data across clients increases.
We propose a new augmentation algorithm, named FedMix, which is inspired by a phenomenal yet simple data augmentation method, Mixup.
arXiv Detail & Related papers (2021-07-01T06:14:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.