Virtual Homogeneity Learning: Defending against Data Heterogeneity in
Federated Learning
- URL: http://arxiv.org/abs/2206.02465v1
- Date: Mon, 6 Jun 2022 10:02:21 GMT
- Title: Virtual Homogeneity Learning: Defending against Data Heterogeneity in
Federated Learning
- Authors: Zhenheng Tang, Yonggang Zhang, Shaohuai Shi, Xin He, Bo Han, Xiaowen
Chu
- Abstract summary: We propose a new approach named virtual homogeneity learning (VHL) to "rectify" the data heterogeneity.
VHL conducts federated learning with a virtual homogeneous dataset crafted to satisfy two conditions: containing no private information and being separable.
Empirically, we demonstrate that VHL endows federated learning with drastically improved convergence speed and generalization performance.
- Score: 34.97057620481504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In federated learning (FL), model performance typically suffers from client
drift induced by data heterogeneity, and mainstream works focus on correcting
client drift. We propose a different approach named virtual homogeneity
learning (VHL) to directly "rectify" the data heterogeneity. In particular, VHL
conducts FL with a virtual homogeneous dataset crafted to satisfy two
conditions: containing no private information and being separable. The virtual
dataset can be generated from pure noise shared across clients, aiming to
calibrate the features from the heterogeneous clients. Theoretically, we prove
that VHL can achieve provable generalization performance on the natural
distribution. Empirically, we demonstrate that VHL endows FL with drastically
improved convergence speed and generalization performance. VHL is the first
attempt to use a virtual dataset to address data heterogeneity, offering a new
and effective approach to FL.
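The abstract gives the recipe only at a high level, so the sketch below illustrates the idea under stated assumptions: every client regenerates the same noise-based virtual dataset from a shared seed (so it is homogeneous across clients and contains no private information), and local training adds a supervised term on that virtual data to calibrate features. The data shape, the per-class noise-anchor construction, and the weight `lam` are illustrative choices, not the paper's exact design.

```python
import torch
import torch.nn.functional as F

def make_virtual_dataset(num_classes=10, per_class=100, shape=(3, 32, 32), seed=0):
    """Craft a shared virtual dataset from pure noise.

    Every client calls this with the same seed, so the virtual data is
    identical (homogeneous) across clients and carries no private
    information. One fixed noise pattern per class keeps classes separable.
    """
    g = torch.Generator().manual_seed(seed)
    xs, ys = [], []
    for c in range(num_classes):
        anchor = torch.randn(shape, generator=g)            # class-specific noise pattern
        jitter = 0.1 * torch.randn(per_class, *shape, generator=g)
        xs.append(anchor.unsqueeze(0) + jitter)             # noisy copies of the anchor
        ys.append(torch.full((per_class,), c, dtype=torch.long))
    return torch.cat(xs), torch.cat(ys)

def local_step(model, opt, x_nat, y_nat, x_vir, y_vir, lam=1.0):
    """One client update: supervised loss on natural *and* virtual data.

    Training every client on the same virtual distribution calibrates the
    shared feature extractor; `lam` is an assumed balancing hyperparameter.
    """
    opt.zero_grad()
    loss = F.cross_entropy(model(x_nat), y_nat) \
         + lam * F.cross_entropy(model(x_vir), y_vir)
    loss.backward()
    opt.step()
    return loss.item()
```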
Related papers
- Lightweight Industrial Cohorted Federated Learning for Heterogeneous Assets [0.0]
Federated Learning (FL) is the most widely adopted collaborative learning approach for training decentralized Machine Learning (ML) models.
However, because most FL methods take a high degree of data similarity, or homogeneity, for granted, FL is not specifically designed for industrial settings.
We propose a Lightweight Industrial Cohorted FL (LICFL) algorithm that uses model parameters for cohorting without any additional on-edge (client-level) computation or communication.
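Cohorting from model parameters with no extra on-edge work suggests the server can cluster the parameter vectors clients already upload; the sketch below shows that reading. k-means, `n_cohorts`, and the flattening scheme are assumptions for illustration, not details from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def cohort_clients(client_state_dicts, n_cohorts=3, seed=0):
    """Group clients into cohorts by clustering their uploaded parameters.

    Runs entirely on the server, so clients incur no computation or
    communication beyond the model updates they already send. k-means and
    n_cohorts are illustrative choices, not necessarily LICFL's.
    """
    # Flatten each client's parameters into a single feature vector.
    vecs = np.stack([
        np.concatenate([np.asarray(p).ravel() for p in sd.values()])
        for sd in client_state_dicts
    ])
    labels = KMeans(n_clusters=n_cohorts, random_state=seed, n_init=10).fit_predict(vecs)
    return {c: [i for i, l in enumerate(labels) if l == c] for c in range(n_cohorts)}
```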
arXiv Detail & Related papers (2024-07-25T12:48:56Z)
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on effectively utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
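For contrast, the aggregate-then-adapt baseline that FedAF dispenses with is essentially the FedAvg loop sketched below: the server averages the clients' parameters, and clients adapt the averaged model in the next round. The `local_train` client method is a hypothetical interface used only for illustration, and floating-point parameter tensors are assumed.

```python
import copy

def fedavg_round(global_model, clients, weights):
    """One aggregate-then-adapt round (the framework FedAF removes).

    `clients` is a list of objects with a hypothetical
    `local_train(model) -> state_dict` method; `weights` are the clients'
    relative data sizes and sum to 1.
    """
    local_states = [c.local_train(copy.deepcopy(global_model)) for c in clients]
    new_state = {}
    for key in local_states[0]:
        # Weighted average of each parameter tensor across clients.
        new_state[key] = sum(w * s[key] for w, s in zip(weights, local_states))
    global_model.load_state_dict(new_state)
    return global_model
```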
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
- FLASH: Federated Learning Across Simultaneous Heterogeneities [54.80435317208111]
FLASH (Federated Learning Across Simultaneous Heterogeneities) is a lightweight and flexible client selection algorithm.
It achieves substantial and consistent improvements over state-of-the-art FL frameworks under a wide range of simultaneous sources of heterogeneity.
arXiv Detail & Related papers (2024-02-13T20:04:39Z)
- Fake It Till Make It: Federated Learning with Consensus-Oriented Generation [52.82176415223988]
We propose federated learning with consensus-oriented generation (FedCOG).
FedCOG consists of two key components at the client side: complementary data generation and knowledge-distillation-based model training.
Experiments on classical and real-world FL datasets show that FedCOG consistently outperforms state-of-the-art methods.
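The knowledge-distillation-based training component can be sketched generically: alongside the task loss, the local model matches the softened predictions of the frozen global model received from the server. The temperature `T` and mixing weight `alpha` below are assumed hyperparameters, and the complementary data generation component is omitted; this is a standard distillation loop, not FedCOG's exact objective.

```python
import torch
import torch.nn.functional as F

def distill_step(local_model, global_model, opt, x, y, T=2.0, alpha=0.5):
    """Client update combining task loss with distillation from the global
    model, in the spirit of FedCOG's knowledge-distillation-based training.
    """
    opt.zero_grad()
    logits = local_model(x)
    with torch.no_grad():
        teacher = global_model(x)                    # frozen global model as teacher
    task = F.cross_entropy(logits, y)
    kd = F.kl_div(F.log_softmax(logits / T, dim=1),
                  F.softmax(teacher / T, dim=1),
                  reduction="batchmean") * (T * T)   # softened-logit matching
    loss = (1 - alpha) * task + alpha * kd
    loss.backward()
    opt.step()
    return loss.item()
```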
arXiv Detail & Related papers (2023-12-10T18:49:59Z)
- PFL-GAN: When Client Heterogeneity Meets Generative Models in Personalized Federated Learning [55.930403371398114]
We propose a novel generative adversarial network (GAN) sharing and aggregation strategy for personalized federated learning (PFL).
PFL-GAN addresses client heterogeneity in different scenarios. More specifically, we first learn the similarity among clients and then develop a weighted collaborative data aggregation.
Empirical results from rigorous experiments on several well-known datasets demonstrate the effectiveness of PFL-GAN.
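The "learn similarity, then aggregate with weights" step can be made concrete with a generic stand-in: measure pairwise similarity between client updates and build each client's personalized model as a similarity-weighted average. Cosine similarity and row-wise softmax below are illustrative substitutes for PFL-GAN's learned similarity.

```python
import numpy as np

def similarity_weighted_aggregate(client_vecs):
    """Per-client weighted aggregation driven by pairwise similarity.

    `client_vecs` is an (n_clients, dim) array of flattened model updates;
    the function returns one personalized aggregate per client.
    """
    normed = client_vecs / np.linalg.norm(client_vecs, axis=1, keepdims=True)
    sim = normed @ normed.T                                          # pairwise cosine similarity
    weights = np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ client_vecs                                     # personalized aggregates
```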
arXiv Detail & Related papers (2023-08-23T22:38:35Z)
- Federated Virtual Learning on Heterogeneous Data with Local-Global Distillation [17.998623216905496]
We propose a new method, called Federated Virtual Learning on Heterogeneous Data with Local-Global Distillation (FedLGD).
Our method outperforms state-of-the-art heterogeneous FL algorithms under various settings with a very limited amount of distilled virtual data.
arXiv Detail & Related papers (2023-03-04T00:35:29Z)
- Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem; in fact, it can be beneficial for FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
- Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraints.
We propose FedFTG, a data-free knowledge distillation method to fine-tune the global model on the server.
Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
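"Data-free" distillation on the server means no real samples are available there; a common realization, sketched below as an assumption about the general scheme rather than FedFTG's exact objective, trains a generator adversarially to produce pseudo-data and fine-tunes the global (student) model toward the averaged client (teacher) logits on that data.

```python
import torch
import torch.nn.functional as F

def server_finetune(global_model, client_models, generator, g_opt, s_opt,
                    steps=100, batch=64, z_dim=100):
    """Data-free fine-tuning of the global model on the server.

    A generator synthesizes pseudo-samples from noise; the global model is
    distilled toward the averaged client logits on those samples. Generic
    adversarial data-free KD loop, not FedFTG's exact losses.
    """
    for m in client_models:
        m.requires_grad_(False)                     # teachers stay fixed
    for _ in range(steps):
        z = torch.randn(batch, z_dim)

        # Generator step: synthesize pseudo-data where the student and the
        # client ensemble disagree most (negative KD as the generator loss).
        g_opt.zero_grad()
        x = generator(z)
        teacher = torch.stack([m(x) for m in client_models]).mean(0)
        kd = F.kl_div(F.log_softmax(global_model(x), dim=1),
                      F.softmax(teacher, dim=1), reduction="batchmean")
        (-kd).backward()
        g_opt.step()

        # Student step: fine-tune the global model to match the ensemble
        # on freshly generated (detached) pseudo-data.
        s_opt.zero_grad()
        x = generator(z).detach()
        teacher = torch.stack([m(x) for m in client_models]).mean(0)
        kd = F.kl_div(F.log_softmax(global_model(x), dim=1),
                      F.softmax(teacher, dim=1), reduction="batchmean")
        kd.backward()
        s_opt.step()
```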
arXiv Detail & Related papers (2022-03-17T11:18:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.