Unlocking the Potential of Federated Learning: The Symphony of Dataset
Distillation via Deep Generative Latents
- URL: http://arxiv.org/abs/2312.01537v1
- Date: Sun, 3 Dec 2023 23:30:48 GMT
- Title: Unlocking the Potential of Federated Learning: The Symphony of Dataset
Distillation via Deep Generative Latents
- Authors: Yuqi Jia and Saeed Vahidian and Jingwei Sun and Jianyi Zhang and
Vyacheslav Kungurtsev and Neil Zhenqiang Gong and Yiran Chen
- Abstract summary: We propose a highly efficient FL dataset distillation framework on the server side.
Unlike previous strategies, our technique enables the server to leverage prior knowledge from pre-trained deep generative models.
Our framework converges faster than the baselines because the server trains on a single multi-modal distribution rather than on several sets of heterogeneous data distributions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data heterogeneity presents significant challenges for federated learning
(FL). Recently, dataset distillation techniques have been introduced and
applied at the client level to mitigate some of these challenges.
In this paper, we propose a highly efficient FL dataset distillation framework
on the server side, significantly reducing both the computational and
communication demands on local devices while enhancing the clients' privacy.
Unlike previous strategies that perform dataset distillation on local devices
and upload synthetic data to the server, our technique enables the server to
leverage prior knowledge from pre-trained deep generative models to synthesize
essential data representations from a heterogeneous model architecture. This
process allows local devices to train smaller surrogate models while enabling
the training of a larger global model on the server, effectively minimizing
resource utilization. We substantiate our claim with a theoretical analysis,
demonstrating the asymptotic resemblance of the process to the hypothetical
ideal of completely centralized training on a heterogeneous dataset. Empirical
evidence from our comprehensive experiments indicates our method's superiority,
delivering an accuracy enhancement of up to 40% over non-dataset-distillation
techniques in highly heterogeneous FL contexts, and surpassing existing
dataset-distillation methods by 18%. In addition to the high accuracy, our
framework converges faster than the baselines because the server trains on a
single multi-modal distribution rather than on several sets of heterogeneous
data distributions. Our code is available at
https://github.com/FedDG23/FedDG-main.git
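The core idea sketched in the abstract — the server distilling a dataset by optimizing latent codes of a pre-trained generative model so that the generated samples match representations aggregated from clients — can be illustrated with a minimal NumPy toy. This is not the authors' implementation: the linear `tanh` "generator", the random client targets, and all parameter names are placeholder assumptions standing in for a real deep generative model and real aggregated statistics.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder pre-trained "generator": a fixed linear decoder with a tanh
# nonlinearity, mapping latent space (d_z) to data space (d_x). In the paper's
# setting this would be a deep generative model whose weights stay frozen.
d_z, d_x, n_latents = 8, 32, 4
W = rng.normal(size=(d_z, d_x))

def generate(z):
    return np.tanh(z @ W)

# Placeholder server-side targets: per-class representations the server would
# aggregate from heterogeneous clients (random stand-ins here).
targets = rng.normal(size=(n_latents, d_x)) * 0.5

# Distillation step: optimize only the latent codes z (the generator is fixed)
# so that generated samples approximate the aggregated targets.
z = rng.normal(size=(n_latents, d_z)) * 0.1
init_loss = float(np.mean((generate(z) - targets) ** 2))

lr = 0.01
for _ in range(1000):
    x = generate(z)
    err = x - targets                     # residual, shape (n_latents, d_x)
    grad = (err * (1.0 - x ** 2)) @ W.T   # d/dz of 0.5 * ||tanh(zW) - t||^2
    z -= lr * grad

final_loss = float(np.mean((generate(z) - targets) ** 2))
print(f"loss: {init_loss:.4f} -> {final_loss:.4f}")
```

The optimized latents (not raw data) are what the server keeps, which is why the paper can train a larger global model on a compact multi-modal distilled set while clients only ever train small surrogate models.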
Related papers
- Modality Alignment Meets Federated Broadcasting [9.752555511824593]
Federated learning (FL) has emerged as a powerful approach to safeguard data privacy by training models across distributed edge devices without centralizing local data.
This paper introduces a novel FL framework leveraging modality alignment, where a text encoder resides on the server, and image encoders operate on local devices.
arXiv Detail & Related papers (2024-11-24T13:30:03Z)
- FedHPL: Efficient Heterogeneous Federated Learning with Prompt Tuning and Logit Distillation [32.305134875959226]
Federated learning (FL) is a privacy-preserving paradigm that enables distributed clients to collaboratively train models with a central server.
We propose FedHPL, a parameter-efficient unified Federated learning framework for Heterogeneous settings.
We show that our framework outperforms state-of-the-art FL approaches, with less overhead and training rounds.
arXiv Detail & Related papers (2024-05-27T15:25:32Z)
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
- FLIGAN: Enhancing Federated Learning with Incomplete Data using GAN [1.5749416770494706]
Federated Learning (FL) provides a privacy-preserving mechanism for distributed training of machine learning models on networked devices.
We propose FLIGAN, a novel approach to address the issue of data incompleteness in FL.
Our methodology adheres to FL's privacy requirements by generating synthetic data in a federated manner without sharing the actual data in the process.
arXiv Detail & Related papers (2024-03-25T16:49:38Z)
- FLASH: Federated Learning Across Simultaneous Heterogeneities [54.80435317208111]
FLASH (Federated Learning Across Simultaneous Heterogeneities) is a lightweight and flexible client selection algorithm.
It achieves substantial and consistent improvements over state-of-the-art FL baselines under extensive sources of heterogeneity.
arXiv Detail & Related papers (2024-02-13T20:04:39Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- DFRD: Data-Free Robustness Distillation for Heterogeneous Federated Learning [20.135235291912185]
Federated Learning (FL) is a privacy-constrained decentralized machine learning paradigm.
We propose a new FL method (namely DFRD) to learn a robust global model in the data-heterogeneous and model-heterogeneous FL scenarios.
arXiv Detail & Related papers (2023-09-24T04:29:22Z)
- Federated Virtual Learning on Heterogeneous Data with Local-global Distillation [17.998623216905496]
We propose a new method, called Federated Virtual Learning on Heterogeneous Data with Local-Global Distillation (FedLGD).
Our method outperforms state-of-the-art heterogeneous FL algorithms under various settings with a very limited amount of distilled virtual data.
arXiv Detail & Related papers (2023-03-04T00:35:29Z)
- Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraint.
We propose FedFTG, a data-free knowledge distillation method to fine-tune the global model on the server.
Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
arXiv Detail & Related papers (2022-03-17T11:18:17Z)
- Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.