Non-IID data and Continual Learning processes in Federated Learning: A
long road ahead
- URL: http://arxiv.org/abs/2111.13394v1
- Date: Fri, 26 Nov 2021 09:57:11 GMT
- Title: Non-IID data and Continual Learning processes in Federated Learning: A
long road ahead
- Authors: Marcos F. Criado, Fernando E. Casado, Roberto Iglesias, Carlos V.
Regueiro and Senén Barro
- Abstract summary: Federated Learning is a novel framework that allows multiple devices or institutions to train a machine learning model collaboratively while keeping their data private.
In this work, we formally classify data statistical heterogeneity and review the most notable learning strategies able to address it.
At the same time, we introduce approaches from other machine learning frameworks, such as Continual Learning, that also deal with data heterogeneity and could easily be adapted to Federated Learning settings.
- Score: 58.720142291102135
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Federated Learning is a novel framework that allows multiple devices or
institutions to train a machine learning model collaboratively while keeping
their data private. This decentralized approach is prone to suffering the
consequences of data statistical heterogeneity, both across the different
entities and over time, which may lead to a lack of convergence. To avoid such
issues, different methods have been proposed in the past few years. However,
data may be heterogeneous in many different ways, and current proposals do
not always specify the kind of heterogeneity they consider. In this
work, we formally classify data statistical heterogeneity and review the most
notable learning strategies able to address it. At the same time, we
introduce approaches from other machine learning frameworks, such as Continual
Learning, that also deal with data heterogeneity and could easily be adapted to
Federated Learning settings.
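The heterogeneity the abstract describes can be made concrete with a toy federated averaging run over clients whose data distributions differ. The sketch below is illustrative only: the helper names (`make_skewed_clients`, `fedavg_round`), the linear model, and the feature-skew construction are assumptions, not taken from the paper.

```python
# Minimal FedAvg sketch on feature-skewed (non-IID) clients.
# Each client observes a different region of the input domain,
# a simple form of the statistical heterogeneity the paper classifies.
import numpy as np

rng = np.random.default_rng(0)

def make_skewed_clients(n_clients=4, n_per_client=50):
    """Give each client inputs from a disjoint range (feature skew)."""
    clients = []
    for k in range(n_clients):
        x = rng.uniform(k, k + 1, size=n_per_client)  # disjoint input ranges
        y = 3.0 * x + 1.0 + rng.normal(0, 0.1, size=n_per_client)
        clients.append((x, y))
    return clients

def local_sgd(w, b, x, y, lr=0.01, epochs=5):
    """A few epochs of local gradient descent on one client's data."""
    for _ in range(epochs):
        err = w * x + b - y
        w -= lr * 2 * np.mean(err * x)
        b -= lr * 2 * np.mean(err)
    return w, b

def fedavg_round(w, b, clients):
    """One communication round: broadcast, train locally, average."""
    updates = [local_sgd(w, b, x, y) for x, y in clients]
    ws, bs = zip(*updates)
    return float(np.mean(ws)), float(np.mean(bs))

w, b = 0.0, 0.0
clients = make_skewed_clients()
for _ in range(200):
    w, b = fedavg_round(w, b, clients)
print(w, b)  # parameters approach the shared ground truth w=3.0, b=1.0
```

Because all clients here share the same underlying function, plain averaging still converges; the surveyed methods target the harder cases where local optima genuinely disagree.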
Related papers
- Non-IID data in Federated Learning: A Systematic Review with Taxonomy, Metrics, Methods, Frameworks and Future Directions [2.9434966603161072]
This systematic review aims to fill a gap by providing a detailed taxonomy for non-IID data, partition protocols, and metrics.
We describe popular solutions to address non-IID data and standardized frameworks employed in Federated Learning with heterogeneous data.
arXiv Detail & Related papers (2024-11-19T09:53:28Z)
- Accelerated Stochastic ExtraGradient: Mixing Hessian and Gradient Similarity to Reduce Communication in Distributed and Federated Learning [50.382793324572845]
Distributed computing involves communication between devices, which requires solving two key problems: efficiency and privacy.
In this paper, we analyze a new method that incorporates the ideas of using data similarity and clients sampling.
To address privacy concerns, we apply the technique of additional noise and analyze its impact on the convergence of the proposed method.
arXiv Detail & Related papers (2024-09-22T00:49:10Z)
- A review on different techniques used to combat the non-IID and heterogeneous nature of data in FL [0.0]
Federated Learning (FL) is a machine-learning approach enabling collaborative model training across multiple edge devices.
The significance of FL is particularly pronounced in industries such as healthcare and finance, where data privacy holds paramount importance.
This report delves into the issues arising from non-IID and heterogeneous data and explores current algorithms designed to address these challenges.
arXiv Detail & Related papers (2024-01-01T16:34:00Z)
- Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution [77.85461690214551]
Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution.
Recent self-supervised learning methods have shown to be effective when dealing with fully-unlabeled data in cases where the underlying classes have significant semantic differences.
We propose a strategy to tackle Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse.
arXiv Detail & Related papers (2022-02-07T13:08:11Z)
- Learning from Heterogeneous Data Based on Social Interactions over Graphs [58.34060409467834]
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions.
We show that the proposed strategy enables the agents to learn consistently under this highly heterogeneous setting.
arXiv Detail & Related papers (2021-12-17T12:47:18Z)
- Federated Learning on Non-IID Data: A Survey [11.431837357827396]
Federated learning is an emerging distributed machine learning framework for privacy preservation.
Models trained in federated learning usually have worse performance than those trained in the standard centralized learning mode.
arXiv Detail & Related papers (2021-06-12T19:45:35Z)
- Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning [53.73083199055093]
We show that attention-based architectures (e.g., Transformers) are fairly robust to distribution shifts.
Our experiments show that replacing convolutional networks with Transformers can greatly reduce catastrophic forgetting of previous devices.
arXiv Detail & Related papers (2021-06-10T21:04:18Z)
- Enhancing ensemble learning and transfer learning in multimodal data analysis by adaptive dimensionality reduction [10.646114896709717]
In multimodal data analysis, not all observations show the same level of reliability or information quality.
We propose an adaptive approach for dimensionality reduction to overcome this issue.
We test our approach on multimodal datasets acquired in diverse research fields.
arXiv Detail & Related papers (2021-05-08T11:53:12Z)
- Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z)
- Additively Homomorphical Encryption based Deep Neural Network for Asymmetrically Collaborative Machine Learning [12.689643742151516]
Privacy-preserving machine learning creates constraints that limit further applications in finance sectors.
We propose a new practical scheme of collaborative machine learning that one party owns data, but another party owns labels only.
Our experiments on different datasets demonstrate not only stable training without accuracy loss, but also a speedup of more than 100 times.
arXiv Detail & Related papers (2020-07-14T06:43:25Z)
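The asymmetric setup in the last entry, where one party owns the features and another owns only the labels, can be sketched as a gradient exchange. In the paper the exchanged scores and errors would be protected with an additively homomorphic scheme (Paillier is a common choice); the plaintext version below only illustrates the information flow and is NOT privacy-preserving. All names and the logistic model are illustrative assumptions.

```python
# Asymmetric collaboration sketch: party A owns features X, party B owns labels y.
# A never sees y; B never sees X or the model weights. Only per-sample
# scores and per-sample errors cross the boundary (unencrypted here).
import numpy as np

rng = np.random.default_rng(1)

X = rng.normal(size=(200, 3))                 # private to party A
true_w = np.array([1.5, -2.0, 0.5])
y = (X @ true_w + rng.normal(0, 0.1, 200) > 0).astype(float)  # private to party B

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(3)  # logistic model, held by party A
for _ in range(300):
    z = X @ w                          # A: forward pass, sends scores to B
    grad_z = sigmoid(z) - y            # B: applies labels, returns errors
    w -= 0.1 * X.T @ grad_z / len(y)   # A: updates weights with own features

acc = np.mean((sigmoid(X @ w) > 0.5) == y)
print(f"training accuracy: {acc:.2f}")
```

Encrypting `z` and `grad_z` under an additively homomorphic scheme is what allows this exchange to proceed without either party revealing its asset, at the cost the paper's speedup results address.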
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.