Federated Learning on Non-IID Data: A Survey
- URL: http://arxiv.org/abs/2106.06843v1
- Date: Sat, 12 Jun 2021 19:45:35 GMT
- Title: Federated Learning on Non-IID Data: A Survey
- Authors: Hangyu Zhu, Jinjin Xu, Shiqing Liu and Yaochu Jin
- Abstract summary: Federated learning is an emerging distributed machine learning framework for privacy preservation.
Models trained in federated learning usually have worse performance than those trained in the standard centralized learning mode.
- Score: 11.431837357827396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning is an emerging distributed machine learning framework for
privacy preservation. However, models trained in federated learning usually
have worse performance than those trained in the standard centralized learning
mode, especially when the training data are not independent and identically
distributed (Non-IID) across the local devices. In this survey, we provide a
detailed analysis of the influence of Non-IID data on both parametric and
non-parametric machine learning models in both horizontal and vertical
federated learning. In addition, current research on handling the challenges
of Non-IID data in federated learning is reviewed, and both advantages and
disadvantages of these approaches are discussed. Finally, we suggest several
future research directions before concluding the paper.
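The Non-IID setting discussed in the abstract is commonly simulated in experiments by a label-skew partition drawn from a Dirichlet distribution. The sketch below is illustrative, not from the survey itself; the client count, concentration parameter `alpha`, and the pure-Python Dirichlet sampling (normalized Gamma draws) are all assumptions for the example.

```python
import random

def dirichlet_partition(labels, n_clients=5, alpha=0.5, seed=0):
    """Split sample indices across clients with label skew (Non-IID).

    For each class, the fraction of its samples given to each client is
    drawn from Dirichlet(alpha); small alpha yields highly skewed splits.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    client_indices = [[] for _ in range(n_clients)]
    for c in classes:
        idx = [i for i, y in enumerate(labels) if y == c]
        rng.shuffle(idx)
        # Dirichlet(alpha) proportions via normalized Gamma samples
        g = [rng.gammavariate(alpha, 1.0) for _ in range(n_clients)]
        total = sum(g)
        props = [x / total for x in g]
        start, cum = 0, 0.0
        for client in range(n_clients):
            cum += props[client]
            end = len(idx) if client == n_clients - 1 else int(cum * len(idx))
            client_indices[client].extend(idx[start:end])
            start = end
    return client_indices

# toy example: 1000 samples, 10 balanced classes
labels = [i // 100 for i in range(1000)]
parts = dirichlet_partition(labels, n_clients=5, alpha=0.1)
```

With a small `alpha` (e.g. 0.1), most clients end up holding only a few classes, which is the label-distribution skew that degrades federated training relative to the centralized mode.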
Related papers
- Understanding Federated Learning from IID to Non-IID dataset: An Experimental Study [5.680416078423551]
Federated learning (FL) has emerged as a promising approach for training machine learning models across decentralized data sources without sharing raw data.
A significant challenge in FL is that client data are often non-IID (non-independent and identically distributed), leading to reduced performance compared to centralized learning.
arXiv Detail & Related papers (2025-01-31T21:58:15Z)
- Non-IID data in Federated Learning: A Survey with Taxonomy, Metrics, Methods, Frameworks and Future Directions [2.9434966603161072]
Federated Learning (FL) enables users to collectively train ML models without sharing private data.
FL struggles when data across clients are not independent and identically distributed (non-IID).
This technical survey aims to fill that gap by providing a detailed taxonomy for non-IID data, partition protocols, and metrics.
arXiv Detail & Related papers (2024-11-19T09:53:28Z)
- Continual Learning with Pre-Trained Models: A Survey [61.97613090666247]
Continual Learning (CL) aims to overcome catastrophic forgetting of former knowledge when learning new tasks.
This paper presents a comprehensive survey of the latest advancements in pre-trained model (PTM)-based CL.
arXiv Detail & Related papers (2024-01-29T18:27:52Z)
- A review on different techniques used to combat the non-IID and heterogeneous nature of data in FL [0.0]
Federated Learning (FL) is a machine-learning approach enabling collaborative model training across multiple edge devices.
The significance of FL is particularly pronounced in industries such as healthcare and finance, where data privacy holds paramount importance.
This report delves into the issues arising from non-IID and heterogeneous data and explores current algorithms designed to address these challenges.
arXiv Detail & Related papers (2024-01-01T16:34:00Z)
- Exploring Federated Unlearning: Analysis, Comparison, and Insights [101.64910079905566]
Federated unlearning enables the selective removal of data from models trained in federated systems.
This paper surveys existing federated unlearning approaches, examining their algorithmic efficiency, impact on model accuracy, and effectiveness in preserving privacy.
We propose the OpenFederatedUnlearning framework, a unified benchmark for evaluating federated unlearning methods.
arXiv Detail & Related papers (2023-10-30T01:34:33Z)
- Towards Federated Long-Tailed Learning [76.50892783088702]
Data privacy and class imbalance are the norm rather than the exception in many machine learning tasks.
Recent work has attempted, on the one hand, to address the problem of learning from pervasive private data and, on the other, to learn from long-tailed data.
This paper focuses on learning with long-tailed (LT) data distributions under the context of the popular privacy-preserved federated learning (FL) framework.
arXiv Detail & Related papers (2022-06-30T02:34:22Z)
- FEDIC: Federated Learning on Non-IID and Long-Tailed Data via Calibrated Distillation [54.2658887073461]
Dealing with non-IID data is one of the most challenging problems for federated learning.
This paper studies the joint problem of non-IID and long-tailed data in federated learning and proposes a corresponding solution called Federated Ensemble Distillation with Imbalance Calibration (FEDIC).
FEDIC uses model ensemble to take advantage of the diversity of models trained on non-IID data.
arXiv Detail & Related papers (2022-04-30T06:17:36Z)
- Non-IID data and Continual Learning processes in Federated Learning: A long road ahead [58.720142291102135]
Federated Learning is a novel framework that allows multiple devices or institutions to train a machine learning model collaboratively while keeping their data private.
In this work, we formally classify data statistical heterogeneity and review the most remarkable learning strategies that are able to face it.
At the same time, we introduce approaches from other machine learning frameworks, such as Continual Learning, that also deal with data heterogeneity and could be easily adapted to the Federated Learning settings.
arXiv Detail & Related papers (2021-11-26T09:57:11Z)
- Federated Learning on Non-IID Data Silos: An Experimental Study [34.28108345251376]
Training data have been increasingly fragmented, forming distributed databases of multiple data silos.
In this paper, we propose comprehensive data partitioning strategies to cover the typical non-IID data cases.
We find that non-IID data do bring significant challenges to the learning accuracy of FL algorithms, and no existing state-of-the-art FL algorithm outperforms the others in all cases.
arXiv Detail & Related papers (2021-02-03T14:29:09Z)
- Decentralized Federated Learning Preserves Model and Data Privacy [77.454688257702]
We propose a fully decentralized approach, which allows knowledge to be shared between trained models.
Students are trained on the output of their teachers via synthetically generated input data.
The results show that an initially untrained student model, trained on the teacher's output, reaches F1-scores comparable to the teacher's.
arXiv Detail & Related papers (2021-02-01T14:38:54Z)
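The teacher-student scheme in the last entry, where a student learns from a teacher's outputs on synthetically generated inputs rather than on any real data, can be illustrated with a minimal sketch. The tiny linear teacher, the Gaussian synthetic inputs, and the SGD loop below are illustrative assumptions for this example, not the paper's actual implementation.

```python
import random

def teacher(x):
    # stand-in for a trained teacher model: a fixed linear map
    return 2.0 * x[0] - 1.0 * x[1] + 0.5

def train_student(n_steps=5000, lr=0.05, seed=0):
    """Fit a linear student to the teacher using only synthetic inputs.

    No real training data is shared: the student queries the teacher on
    randomly generated inputs, mimicking data-free knowledge transfer.
    """
    rng = random.Random(seed)
    w, b = [0.0, 0.0], 0.0
    for _ in range(n_steps):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]  # synthetic input
        y_t = teacher(x)                        # teacher's target output
        y_s = w[0] * x[0] + w[1] * x[1] + b     # student's prediction
        err = y_s - y_t
        # SGD step on squared error
        w[0] -= lr * err * x[0]
        w[1] -= lr * err * x[1]
        b -= lr * err
    return w, b

w, b = train_student()
```

After training, the student's parameters closely match the teacher's (here, weights near 2.0 and -1.0 and bias near 0.5), even though the student never saw the teacher's original training data.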
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.