A Simple Data Augmentation for Feature Distribution Skewed Federated
Learning
- URL: http://arxiv.org/abs/2306.09363v1
- Date: Wed, 14 Jun 2023 05:46:52 GMT
- Title: A Simple Data Augmentation for Feature Distribution Skewed Federated
Learning
- Authors: Yunlu Yan, Lei Zhu
- Abstract summary: Federated learning (FL) facilitates collaborative learning among multiple clients in a distributed manner, while ensuring privacy protection.
In this paper, we focus on the feature distribution skewed FL scenario, which is widespread in real-world applications.
We propose FedRDN, a simple yet remarkably effective data augmentation method for feature distribution skewed FL.
- Score: 12.636154758643757
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) facilitates collaborative learning among multiple
clients in a distributed manner, while ensuring privacy protection. However,
its performance inevitably degrades under data heterogeneity, i.e.,
non-IID data. In this paper, we focus on the feature distribution skewed FL
scenario, which is widespread in real-world applications. The main challenge
lies in the feature shift caused by the different underlying distributions of
local datasets. While previous attempts have made progress, few studies pay
attention to the data itself, the root of this issue. Therefore, the primary
goal of this paper is to develop a general data augmentation technique at the
input level, to mitigate the feature shift. To achieve this goal, we propose
FedRDN, a simple yet remarkably effective data augmentation method for feature
distribution skewed FL, which randomly injects the statistics of the dataset
from the entire federation into the client's data. In this way, our method can
effectively improve the generalization of features, thereby mitigating the
feature shift. Moreover, FedRDN is a plug-and-play component, which can be
seamlessly integrated into the data augmentation flow with only a few lines of
code. Extensive experiments on several datasets show that the performance of
various representative FL works can be further improved by combining them with
FedRDN, which demonstrates the strong scalability and generalizability of
FedRDN. The source code will be released.
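The abstract describes the core idea only in one sentence, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes that the "statistics of the dataset" are per-channel mean and standard deviation, that each client shares only those statistics, and that "injecting" them means normalizing each training image with a randomly drawn client's statistics. The function names `channel_stats` and `random_stat_normalize` and the synthetic client data are hypothetical.

```python
import numpy as np


def channel_stats(images):
    """Per-channel mean and std of an (N, H, W, C) image array."""
    mean = images.mean(axis=(0, 1, 2))
    std = images.std(axis=(0, 1, 2))
    return mean, std


def random_stat_normalize(image, stats_pool, rng, eps=1e-6):
    """Normalize one (H, W, C) image with statistics drawn at random from the pool.

    Assumption: this is how federation-wide statistics are "injected".
    """
    mean, std = stats_pool[rng.integers(len(stats_pool))]
    return (image - mean) / (std + eps)


rng = np.random.default_rng(0)

# Three hypothetical clients whose inputs follow different feature distributions.
clients = [
    rng.normal(loc=m, scale=s, size=(16, 8, 8, 3))
    for m, s in [(0.2, 0.1), (0.5, 0.2), (0.8, 0.3)]
]

# Each client computes its statistics locally; only these are pooled, never raw data.
stats_pool = [channel_stats(x) for x in clients]

# During local training, client 0 augments each image with randomly chosen statistics.
augmented = np.stack(
    [random_stat_normalize(img, stats_pool, rng) for img in clients[0]]
)
```

In this sketch the augmentation is a single call inside the data pipeline, which is consistent with the abstract's claim that the method can be added to an existing augmentation flow with a few lines of code.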
Related papers
- Disentangling data distribution for Federated Learning [20.524108508314107]
Federated Learning (FL) facilitates collaborative training of a global model whose performance is boosted by private data owned by distributed clients.
Yet the wide applicability of FL is hindered by entanglement of data distributions across different clients.
This paper demonstrates for the first time that by disentangling data distributions FL can in principle achieve efficiencies comparable to those of distributed systems.
arXiv Detail & Related papers (2024-10-16T13:10:04Z)
- FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data.
Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena.
We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
arXiv Detail & Related papers (2024-05-29T11:28:06Z)
- StatAvg: Mitigating Data Heterogeneity in Federated Learning for Intrusion Detection Systems [22.259297167311964]
Federated learning (FL) is a decentralized learning technique that enables devices to collaboratively build a shared Machine Learning (ML) or Deep Learning (DL) model without revealing their raw data to a third party.
Due to its privacy-preserving nature, FL has sparked widespread attention for building Intrusion Detection Systems (IDS) within the realm of cybersecurity.
We propose an effective method called Statistical Averaging (StatAvg) to alleviate non-independent and identically distributed (non-iid) features across local clients' data in FL.
arXiv Detail & Related papers (2024-05-20T14:41:59Z)
- Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data [9.045647166114916]
Federated Learning (FL) is a promising paradigm for decentralized and collaborative model training.
FL struggles with a significant performance reduction and poor convergence when confronted with Non-Independent and Identically Distributed (Non-IID) data distributions.
We introduce Gen-FedSD, a novel approach that harnesses the powerful capability of state-of-the-art text-to-image foundation models.
arXiv Detail & Related papers (2024-05-13T16:57:48Z)
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
- FLASH: Federated Learning Across Simultaneous Heterogeneities [54.80435317208111]
FLASH (Federated Learning Across Simultaneous Heterogeneities) is a lightweight and flexible client selection algorithm.
It outperforms state-of-the-art FL frameworks under extensive sources of heterogeneity.
It achieves substantial and consistent improvements over state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-13T20:04:39Z)
- Federated Learning Empowered by Generative Content [55.576885852501775]
Federated learning (FL) enables leveraging distributed private data for model training in a privacy-preserving way.
We propose a novel FL framework termed FedGC, designed to mitigate data heterogeneity issues by diversifying private data with generative content.
We conduct a systematic empirical study on FedGC, covering diverse baselines, datasets, scenarios, and modalities.
arXiv Detail & Related papers (2023-12-10T07:38:56Z)
- FedFed: Feature Distillation against Data Heterogeneity in Federated Learning [88.36513907827552]
Federated learning (FL) typically faces data heterogeneity, i.e., distribution shifting among clients.
We propose a novel approach called Federated Feature distillation (FedFed).
FedFed partitions data into performance-sensitive features (i.e., greatly contributing to model performance) and performance-robust features (i.e., limitedly contributing to model performance).
Comprehensive experiments demonstrate the efficacy of FedFed in promoting model performance.
arXiv Detail & Related papers (2023-10-08T09:00:59Z)
- PS-FedGAN: An Efficient Federated Learning Framework Based on Partially Shared Generative Adversarial Networks For Data Privacy [56.347786940414935]
Federated Learning (FL) has emerged as an effective learning paradigm for distributed computation.
This work proposes a novel FL framework that requires only partial GAN model sharing.
Named PS-FedGAN, this new framework enhances the GAN releasing and training mechanism to address heterogeneous data distributions.
arXiv Detail & Related papers (2023-05-19T05:39:40Z)
- FedFA: Federated Feature Augmentation [25.130087374092383]
Federated learning allows multiple parties to collaboratively train deep models without exchanging raw data.
The primary goal of this paper is to develop a robust federated learning algorithm to address feature shift in clients' samples.
We propose FedFA to tackle federated learning from a distinct perspective of federated feature augmentation.
arXiv Detail & Related papers (2023-01-30T15:39:55Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)