Federated Virtual Learning on Heterogeneous Data with Local-global
Distillation
- URL: http://arxiv.org/abs/2303.02278v2
- Date: Mon, 5 Jun 2023 18:43:26 GMT
- Title: Federated Virtual Learning on Heterogeneous Data with Local-global
Distillation
- Authors: Chun-Yin Huang, Ruinan Jin, Can Zhao, Daguang Xu, and Xiaoxiao Li
- Abstract summary: We propose a new method, called Federated Virtual Learning on Heterogeneous Data with Local-Global Distillation (FedLGD), which trains FL on small distilled virtual datasets.
Our method outperforms state-of-the-art heterogeneous FL algorithms under various settings with a very limited amount of distilled virtual data.
- Score: 17.998623216905496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although Federated Learning (FL) is increasingly used to train machine learning models in a distributed manner, it is susceptible to performance drops when training on heterogeneous data. In addition, FL inevitably faces the challenges of synchronization, efficiency, and privacy. Recently, dataset distillation has been explored as a way to improve the efficiency and scalability of FL by creating a smaller, synthetic dataset that retains the performance of a model trained on the local private datasets. We discover that using distilled local datasets can amplify the heterogeneity issue in FL. To address this, we propose a new method, called Federated Virtual Learning on Heterogeneous Data with Local-Global Distillation (FedLGD), which trains FL on a smaller synthetic dataset (referred to as virtual data) created through a combination of local and global dataset distillation. Specifically, to handle synchronization and class imbalance, we propose iterative distribution matching so that clients hold the same amount of class-balanced local virtual data; to harmonize domain shifts, we use federated gradient matching to distill global virtual data that are shared with clients without compromising data privacy, and rectify heterogeneous local training by enforcing local-global feature similarity. We experiment on both benchmark and real-world datasets that contain heterogeneous data from different sources, and further scale up to an FL scenario with a large number of clients holding heterogeneous and class-imbalanced data. Our method outperforms state-of-the-art heterogeneous FL algorithms under various settings with a very limited amount of distilled virtual data.
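For a concrete picture of the two ingredients named in the abstract, the following is a minimal PyTorch sketch, not the authors' implementation: a distribution-matching loss a client could use to distill a balanced local virtual batch, and a local-global feature-similarity term applied during local training on the shared global virtual data. The toy network, the MSE-based alignment, and the weight `lam` are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallNet(nn.Module):
    """Toy feature extractor + classifier, used only for illustration."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.classifier = nn.Linear(32 * 16, num_classes)

    def forward(self, x):
        z = self.features(x)
        return self.classifier(z), z

def distribution_matching_loss(net, real_x, virtual_x):
    """Pull the mean embedding of a client's virtual batch toward the mean
    embedding of its real batch (the core of distribution-matching-style
    local distillation)."""
    with torch.no_grad():
        _, z_real = net(real_x)          # real features act as fixed targets
    _, z_virt = net(virtual_x)           # gradients flow back into virtual_x
    return F.mse_loss(z_virt.mean(0), z_real.mean(0))

def local_training_loss(local_net, global_net, x, y, global_virtual_x, lam=0.1):
    """Client update: task loss on private data plus a feature-similarity term
    that aligns the local model's features on the shared global virtual data
    with the received global model's features (MSE is an assumed stand-in
    for the paper's regularizer)."""
    logits, _ = local_net(x)
    task = F.cross_entropy(logits, y)
    _, z_local = local_net(global_virtual_x)
    with torch.no_grad():
        _, z_global = global_net(global_virtual_x)
    return task + lam * F.mse_loss(z_local, z_global)

# Toy usage: one distillation step for a client's virtual batch.
net = SmallNet()
real_x = torch.randn(64, 3, 32, 32)
virtual_x = torch.randn(10, 3, 32, 32, requires_grad=True)
opt = torch.optim.SGD([virtual_x], lr=1.0)
opt.zero_grad()
distribution_matching_loss(net, real_x, virtual_x).backward()
opt.step()
```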
Related papers
- Federated Impression for Learning with Distributed Heterogeneous Data [19.50235109938016]
Federated learning (FL) provides a paradigm that can learn from distributed datasets across clients without requiring them to share data.
In FL, sub-optimal convergence is common when training on data from different health centers, owing to the variety in data collection protocols and patient demographics across centers.
We propose FedImpres, which alleviates catastrophic forgetting by restoring synthetic data that represents the global information as a federated impression.
arXiv Detail & Related papers (2024-09-11T15:37:52Z)
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
- FLASH: Federated Learning Across Simultaneous Heterogeneities [54.80435317208111]
FLASH (Federated Learning Across Simultaneous Heterogeneities) is a lightweight and flexible client selection algorithm.
It outperforms state-of-the-art FL frameworks under extensive sources of heterogeneity, achieving substantial and consistent improvements over state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-13T20:04:39Z)
- Fake It Till Make It: Federated Learning with Consensus-Oriented Generation [52.82176415223988]
We propose federated learning with consensus-oriented generation (FedCOG).
FedCOG consists of two key components at the client side: complementary data generation and knowledge-distillation-based model training.
Experiments on classical and real-world FL datasets show that FedCOG consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-12-10T18:49:59Z)
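FedCOG's client side is described above as pairing complementary data generation with knowledge-distillation-based model training. The snippet below sketches only the distillation part in its common generic form (cross-entropy on hard labels plus a KL term toward the frozen global model's soft predictions); the temperature `T`, weight `alpha`, and linear stand-in models are assumptions, and the complementary data generation step is omitted.

```python
import torch
import torch.nn.functional as F

def kd_local_step(local_net, global_net, x, y, T=2.0, alpha=0.5):
    """One client-side loss mixing hard-label cross-entropy with distillation
    from the frozen global model (generic KD, not FedCOG's exact objective)."""
    student_logits = local_net(x)
    with torch.no_grad():
        teacher_logits = global_net(x)          # received global model as teacher
    ce = F.cross_entropy(student_logits, y)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return (1 - alpha) * ce + alpha * kd

# Toy usage with linear models standing in for the client and global networks.
local_net = torch.nn.Linear(20, 5)
global_net = torch.nn.Linear(20, 5)
x, y = torch.randn(8, 20), torch.randint(0, 5, (8,))
kd_local_step(local_net, global_net, x, y).backward()
```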
- Unlocking the Potential of Federated Learning: The Symphony of Dataset Distillation via Deep Generative Latents [43.282328554697564]
We propose a highly efficient FL dataset distillation framework on the server side.
Unlike previous strategies, our technique enables the server to leverage prior knowledge from pre-trained deep generative models.
Our framework converges faster than the baselines because, rather than training on several sets of heterogeneous data distributions, the server trains on a single multi-modal distribution.
arXiv Detail & Related papers (2023-12-03T23:30:48Z)
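The framework above distills a dataset on the server in the latent space of a pre-trained deep generative model. Below is a minimal sketch of that idea under stated assumptions: only latent codes are optimized while a frozen decoder maps them to synthetic samples whose features are matched to aggregated target statistics. The decoder, feature extractor, and random targets are toy stand-ins, not the paper's components.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Frozen decoder standing in for a pre-trained deep generative model (assumption).
decoder = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 3 * 8 * 8))
for p in decoder.parameters():
    p.requires_grad_(False)

feature_net = nn.Sequential(nn.Linear(3 * 8 * 8, 32))   # toy feature extractor

# One latent code per distilled sample; optimizing latents instead of pixels is
# what keeps server-side distillation cheap.
latents = torch.randn(10, 8, requires_grad=True)
opt = torch.optim.Adam([latents], lr=0.05)

# Target per-class feature statistics, e.g. aggregated from clients
# (random here purely for illustration).
target_means = torch.randn(10, 32)

for step in range(100):
    synth = decoder(latents)                  # decode latents into synthetic samples
    feats = feature_net(synth)
    loss = F.mse_loss(feats, target_means)    # match aggregated feature statistics
    opt.zero_grad(); loss.backward(); opt.step()
```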
- Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning [60.41501515192088]
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively.
Data samples usually follow a long-tailed distribution in the real world, and FL on decentralized, long-tailed data yields a poorly behaved global model.
In this work, we integrate local real data with global gradient prototypes to form locally balanced datasets.
arXiv Detail & Related papers (2023-01-25T03:18:10Z)
- The Best of Both Worlds: Accurate Global and Personalized Models through Federated Learning with Data-Free Hyper-Knowledge Distillation [17.570719572024608]
FedHKD (Federated Hyper-Knowledge Distillation) is a novel FL algorithm in which clients rely on knowledge distillation to train local models.
Unlike other KD-based pFL methods, FedHKD neither relies on a public dataset nor deploys a generative model at the server.
We conduct extensive experiments on visual datasets in a variety of scenarios, demonstrating that FedHKD provides significant improvements in both personalized and global model performance.
arXiv Detail & Related papers (2023-01-21T16:20:57Z)
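FedHKD is summarized above as sharing "hyper-knowledge" rather than a public dataset or a server-side generator. The sketch below illustrates one plausible form of such knowledge: per-class mean representations and mean soft predictions computed on a client's local data. The exact quantities, shapes, and aggregation in the paper may differ; this is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def extract_hyper_knowledge(features, logits, labels, num_classes):
    """Per-class mean representation and mean soft prediction, a sketch of the
    compact knowledge a client could share instead of raw data."""
    feat_means, soft_means = [], []
    probs = F.softmax(logits, dim=1)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            feat_means.append(features[mask].mean(0))
            soft_means.append(probs[mask].mean(0))
        else:  # class absent locally: send zeros as a placeholder
            feat_means.append(torch.zeros(features.size(1)))
            soft_means.append(torch.zeros(num_classes))
    return torch.stack(feat_means), torch.stack(soft_means)

# Toy usage: 32 samples, 16-dim features, 5 classes.
feats = torch.randn(32, 16)
logits = torch.randn(32, 5)
labels = torch.randint(0, 5, (32,))
class_feats, class_soft = extract_hyper_knowledge(feats, logits, labels, 5)
print(class_feats.shape, class_soft.shape)  # torch.Size([5, 16]) torch.Size([5, 5])
```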
- Virtual Homogeneity Learning: Defending against Data Heterogeneity in Federated Learning [34.97057620481504]
We propose a new approach named virtual homogeneity learning (VHL) to "rectify" the data heterogeneity.
VHL conducts federated learning with a virtual homogeneous dataset crafted to satisfy two conditions: containing no private information and being separable.
Empirically, we demonstrate that VHL endows federated learning with drastically improved convergence speed and generalization performance.
arXiv Detail & Related papers (2022-06-06T10:02:21Z)
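VHL, summarized above, trains every client against a shared virtual homogeneous dataset that contains no private information and is separable. The sketch below shows one plausible construction (fixed class-wise Gaussians generated from a common seed) and a simple local loss that also fits the virtual data; the paper's actual construction and calibration objective may differ, so treat this as an assumption-laden stand-in.

```python
import torch
import torch.nn.functional as F

def make_virtual_dataset(num_classes, per_class, dim, seed=0):
    """Shared, label-separable virtual data drawn from fixed class-wise
    Gaussians; the same seed on every client yields an identical dataset."""
    g = torch.Generator().manual_seed(seed)
    centers = 4.0 * torch.randn(num_classes, dim, generator=g)
    x = centers.repeat_interleave(per_class, dim=0) + \
        torch.randn(num_classes * per_class, dim, generator=g)
    y = torch.arange(num_classes).repeat_interleave(per_class)
    return x, y

def vhl_style_loss(net, x, y, vx, vy, lam=1.0):
    """Local loss on private data plus the same loss on the shared virtual
    data, a rough stand-in for anchoring all clients to a common distribution."""
    return F.cross_entropy(net(x), y) + lam * F.cross_entropy(net(vx), vy)

# Toy usage.
net = torch.nn.Linear(16, 5)
vx, vy = make_virtual_dataset(num_classes=5, per_class=8, dim=16)
x, y = torch.randn(12, 16), torch.randint(0, 5, (12,))
vhl_style_loss(net, x, y, vx, vy).backward()
```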
- Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraints.
We propose a data-free knowledge distillation method, FedFTG, to fine-tune the global model at the server.
Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
arXiv Detail & Related papers (2022-03-17T11:18:17Z)
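FedFTG, summarized above, fine-tunes the global model on the server with data-free knowledge distillation. The sketch below captures the usual shape of such a scheme: a generator is pushed toward pseudo samples where the global model and the client ensemble disagree, and the global model then distills the ensemble on fresh pseudo samples. The linear stand-in networks, optimizers, and loss weights are assumptions, not the paper's exact objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def server_finetune_step(global_net, client_nets, generator, g_opt, s_opt,
                         z_dim=32, batch=64):
    """One simplified round of data-free fine-tuning at the server."""
    def ensemble_logits(x):
        with torch.no_grad():                    # client models act as fixed teachers
            return torch.stack([c(x) for c in client_nets]).mean(0)

    # 1) Generator step: maximize student/teacher disagreement (hard samples).
    x_fake = generator(torch.randn(batch, z_dim))
    g_loss = -F.kl_div(F.log_softmax(global_net(x_fake), dim=1),
                       F.softmax(ensemble_logits(x_fake), dim=1),
                       reduction="batchmean")
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

    # 2) Student step: distill the client ensemble on new pseudo samples.
    x_fake = generator(torch.randn(batch, z_dim)).detach()
    s_loss = F.kl_div(F.log_softmax(global_net(x_fake), dim=1),
                      F.softmax(ensemble_logits(x_fake), dim=1),
                      reduction="batchmean")
    s_opt.zero_grad(); s_loss.backward(); s_opt.step()
    return s_loss.item()

# Toy usage with linear stand-ins for the generator and classifiers.
gen = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 20))
global_net = nn.Linear(20, 5)
client_nets = [nn.Linear(20, 5) for _ in range(3)]
g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
s_opt = torch.optim.SGD(global_net.parameters(), lr=0.01)
server_finetune_step(global_net, client_nets, gen, g_opt, s_opt)
```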
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)