Related papers: Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data

URL: http://arxiv.org/abs/2405.07925v1
Date: Mon, 13 May 2024 16:57:48 GMT
Title: Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data
Authors: Mahdi Morafah, Matthias Reisser, Bill Lin, Christos Louizos
Abstract summary: Federated Learning (FL) is a promising paradigm for decentralized and collaborative model training. FL struggles with a significant performance reduction and poor convergence when confronted with Non-Independent and Identically Distributed (Non-IID) data distributions. We introduce Gen-FedSD, a novel approach that harnesses the powerful capability of state-of-the-art text-to-image foundation models.
Score: 9.045647166114916
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The proliferation of edge devices has brought Federated Learning (FL) to the forefront as a promising paradigm for decentralized and collaborative model training while preserving the privacy of clients' data. However, FL struggles with a significant performance reduction and poor convergence when confronted with Non-Independent and Identically Distributed (Non-IID) data distributions among participating clients. While previous efforts, such as client drift mitigation and advanced server-side model fusion techniques, have shown some success in addressing this challenge, they often overlook the root cause of the performance reduction - the absence of identical data accurately mirroring the global data distribution among clients. In this paper, we introduce Gen-FedSD, a novel approach that harnesses the powerful capability of state-of-the-art text-to-image foundation models to bridge the significant Non-IID performance gaps in FL. In Gen-FedSD, each client constructs textual prompts for each class label and leverages an off-the-shelf state-of-the-art pre-trained Stable Diffusion model to synthesize high-quality data samples. The generated synthetic data is tailored to each client's unique local data gaps and distribution disparities, effectively making the final augmented local data IID. Through extensive experimentation, we demonstrate that Gen-FedSD achieves state-of-the-art performance and significant communication cost savings across various datasets and Non-IID settings.
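The per-client step described in the abstract (find the local class gaps, then build one text prompt per class to fill them) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the prompt template, and the choice to top every class up to the size of the client's largest local class are all assumptions, and the actual call to a Stable Diffusion model is deliberately left out.

```python
from collections import Counter


def synthetic_budget(local_labels, all_classes):
    """Return, per class, how many synthetic samples the client would need.

    Assumption: the balancing target is the size of the client's largest
    local class, so the augmented local data ends up class-balanced.
    """
    counts = Counter(local_labels)
    target = max(counts.values(), default=0)
    return {c: target - counts.get(c, 0) for c in all_classes}


def build_prompts(budget, template="a photo of a {label}"):
    """Build one text prompt per class that still needs samples.

    The template is a hypothetical placeholder, not the paper's prompt.
    """
    return {c: template.format(label=c) for c, n in budget.items() if n > 0}


# Example client: heavily skewed toward "cat", with no "ship" samples at all.
labels = ["cat"] * 50 + ["dog"] * 5
classes = ["cat", "dog", "ship"]

budget = synthetic_budget(labels, classes)
prompts = build_prompts(budget)

for name in classes:
    print(name, budget[name], prompts.get(name))
```

Each prompt and its count would then be passed to whichever text-to-image model the client uses; classes that are already at the target size get no prompt.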

Related papers

Adaptive Dual-Weighting Framework for Federated Learning via Out-of-Distribution Detection [53.45696787935487]
Federated Learning (FL) enables collaborative model training across large-scale distributed service nodes. In real-world service-oriented deployments, data generated by heterogeneous users, devices, and application scenarios are inherently non-IID. We propose FLood, a novel FL framework inspired by out-of-distribution (OOD) detection.
arXiv Detail & Related papers (2026-02-01T05:54:59Z)
Federated Loss Exploration for Improved Convergence on Non-IID Data [20.979550470097823]
Federated Loss Exploration (FedLEx) is an innovative approach specifically designed to tackle these challenges. FedLEx distinctively addresses the shortcomings of existing FL methods in non-IID settings. Our experiments with state-of-the-art FL algorithms demonstrate significant improvements in performance.
arXiv Detail & Related papers (2025-06-23T13:42:07Z)
Asynchronous Personalized Federated Learning through Global Memorization [16.630360485032163]
Federated Learning offers a privacy-preserving solution by enabling collaborative model training across decentralized devices without centralizing sensitive data. We propose the Asynchronous Personalized Federated Learning framework, which empowers clients to develop personalized models using a server-side semantic generator. This generator, trained via data-free knowledge transfer under global model supervision, enhances client data diversity by producing both seen and unseen samples. To counter the risks of synthetic data impairing training, we introduce a decoupled model method, ensuring robust personalization.
arXiv Detail & Related papers (2025-03-01T09:00:33Z)
FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data. Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena. We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
arXiv Detail & Related papers (2024-05-29T11:28:06Z)
An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets. Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round. We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
FLIGAN: Enhancing Federated Learning with Incomplete Data using GAN [1.5749416770494706]
Federated Learning (FL) provides a privacy-preserving mechanism for distributed training of machine learning models on networked devices. We propose FLIGAN, a novel approach to address the issue of data incompleteness in FL. Our methodology adheres to FL's privacy requirements by generating synthetic data in a federated manner without sharing the actual data in the process.
arXiv Detail & Related papers (2024-03-25T16:49:38Z)
FLASH: Federated Learning Across Simultaneous Heterogeneities [54.80435317208111]
FLASH (Federated Learning Across Simultaneous Heterogeneities) is a lightweight and flexible client selection algorithm. It outperforms state-of-the-art FL frameworks under extensive sources of heterogeneity. It achieves substantial and consistent improvements over state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-13T20:04:39Z)
One-Shot Federated Learning with Classifier-Guided Diffusion Models [44.604485649167216]
One-shot federated learning (OSFL) has gained attention in recent years due to its low communication cost. In this paper, we explore the novel opportunities that diffusion models bring to OSFL and propose FedCADO. FedCADO generates data that complies with clients' distributions and subsequently trains the aggregated model on the server.
arXiv Detail & Related papers (2023-11-15T11:11:25Z)
Leveraging Foundation Models to Improve Lightweight Clients in Federated Learning [16.684749528240587]
Federated Learning (FL) is a distributed training paradigm that enables clients scattered across the world to cooperatively learn a global model without divulging confidential data. FL faces a significant challenge in the form of heterogeneous data distributions among clients, which leads to a reduction in performance and robustness. We introduce foundation model distillation to assist in the federated training of lightweight client models and increase their performance under heterogeneous data settings while keeping inference costs low.
arXiv Detail & Related papers (2023-11-14T19:10:56Z)
FedFed: Feature Distillation against Data Heterogeneity in Federated Learning [88.36513907827552]
Federated learning (FL) typically faces data heterogeneity, i.e., distribution shifting among clients. We propose a novel approach called Federated Feature distillation (FedFed). FedFed partitions data into performance-sensitive features (i.e., greatly contributing to model performance) and performance-robust features (i.e., limitedly contributing to model performance). Comprehensive experiments demonstrate the efficacy of FedFed in promoting model performance.
arXiv Detail & Related papers (2023-10-08T09:00:59Z)
PS-FedGAN: An Efficient Federated Learning Framework Based on Partially Shared Generative Adversarial Networks For Data Privacy [56.347786940414935]
Federated Learning (FL) has emerged as an effective learning paradigm for distributed computation. This work proposes a novel FL framework that requires only partial GAN model sharing. Named PS-FedGAN, this new framework enhances the GAN releasing and training mechanism to address heterogeneous data distributions.
arXiv Detail & Related papers (2023-05-19T05:39:40Z)
FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning (FL) has recently attracted increasing attention from academia and industry. We propose FedDM to build the global training objective from multiple local surrogate functions. In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraints. We propose a data-free knowledge distillation method to fine-tune the global model in the server (FedFTG). Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
arXiv Detail & Related papers (2022-03-17T11:18:17Z)
Towards Fair Federated Learning with Zero-Shot Data Augmentation [123.37082242750866]
Federated learning has emerged as an important distributed learning paradigm, where a server aggregates a global model from many client-trained models while having no access to the client data. We propose a novel federated learning system that employs zero-shot data augmentation on under-represented data to mitigate statistical heterogeneity and encourage more uniform accuracy performance across clients in federated networks. We study two variants of this scheme, Fed-ZDAC (federated learning with zero-shot data augmentation at the clients) and Fed-ZDAS (federated learning with zero-shot data augmentation at the server).
arXiv Detail & Related papers (2021-04-27T18:23:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.