Accelerating Federated Learning in Heterogeneous Data and Computational
Environments
- URL: http://arxiv.org/abs/2008.11281v1
- Date: Tue, 25 Aug 2020 21:28:38 GMT
- Title: Accelerating Federated Learning in Heterogeneous Data and Computational
Environments
- Authors: Dimitris Stripelis and Jose Luis Ambite
- Abstract summary: We introduce a novel distributed validation weighting scheme (DVW), which evaluates the performance of a learner in the federation against a distributed validation set.
We empirically show that DVW results in better performance compared to established methods, such as FedAvg.
- Score: 0.7106986689736825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There are situations where data relevant to a machine learning problem are
distributed among multiple locations that cannot share the data due to
regulatory, competitive, or privacy reasons. Examples include data present on
users' cellphones, manufacturing data of companies in a given industrial
sector, and medical records held at different hospitals. Moreover,
participating sites often have different data distributions and computational
capabilities. Federated Learning provides an approach to learn a joint model
over all the available data in these environments. In this paper, we introduce
a novel distributed validation weighting scheme (DVW), which evaluates the
performance of a learner in the federation against a distributed validation
set. Each learner reserves a small portion (e.g., 5%) of its local training
examples as a validation dataset and allows other learners' models to be
evaluated against it. We empirically show that DVW results in better
performance compared to established methods, such as FedAvg, both under
synchronous and asynchronous communication protocols in data and
computationally heterogeneous environments.
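To make the weighting concrete, here is a minimal sketch contrasting FedAvg's size-based aggregation weights with a DVW-style performance-based weighting. The function names, the mean-score metric, and the plain normalization are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def fedavg_weights(num_examples):
    """FedAvg baseline: weight each learner by its local training set size."""
    sizes = np.asarray(num_examples, dtype=float)
    return sizes / sizes.sum()

def dvw_weights(models, validation_shards, evaluate):
    """DVW-style sketch (names assumed): every learner's model is scored
    against each learner's reserved validation shard (e.g., the 5% of local
    training data it set aside), and the mean scores are normalized into
    aggregation weights."""
    scores = np.array([
        np.mean([evaluate(model, shard) for shard in validation_shards])
        for model in models
    ])
    return scores / scores.sum()

def aggregate(weights, parameters):
    """Community model as the weighted average of learner parameter vectors."""
    return sum(w * p for w, p in zip(weights, parameters))
```

The contrast is the point: FedAvg rewards data quantity, whereas a validation-weighted scheme rewards models that generalize across the federation's reserved shards, which is what matters when local distributions differ.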
Related papers
- Federated Impression for Learning with Distributed Heterogeneous Data [19.50235109938016]
Federated learning (FL) provides a paradigm that can learn from distributed datasets across clients without requiring them to share data.
In FL, sub-optimal convergence is common when training on data from different health centers, due to the variety in data collection protocols and patient demographics across centers.
We propose FedImpres, which alleviates catastrophic forgetting by restoring synthetic data that represents the global information as a federated impression.
arXiv Detail & Related papers (2024-09-11T15:37:52Z)
- Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyze a novel aggregation framework that allows for formalizing and tackling computationally heterogeneous data.
The proposed aggregation algorithms are extensively analyzed from both a theoretical and an experimental perspective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
- Benchmarking FedAvg and FedCurv for Image Classification Tasks [1.376408511310322]
This paper focuses on the problem of statistical heterogeneity of the data in the same federated network.
Several Federated Learning algorithms, such as FedAvg, FedProx and Federated Curvature (FedCurv) have already been proposed.
As a side product of this work, we release the non-IID versions of the datasets we used, so as to facilitate further comparisons from the FL community.
arXiv Detail & Related papers (2023-03-31T10:13:01Z)
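FedProx, mentioned in the entry above, differs from FedAvg only in the client's local objective, which adds a proximal pull toward the current global model. A minimal sketch, assuming numpy-like parameter vectors and an externally supplied gradient function:

```python
def fedprox_local_step(w_local, w_global, grad_fn, lr=0.01, mu=0.1):
    """One FedProx-style local step: the local loss gains a proximal term
    (mu/2) * ||w - w_global||^2, whose gradient mu * (w - w_global) keeps
    heterogeneous clients from drifting far from the global model.
    With mu = 0 this reduces to a plain FedAvg local SGD step."""
    grad = grad_fn(w_local) + mu * (w_local - w_global)
    return w_local - lr * grad
```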
- Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem; in fact, it can be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
- FedILC: Weighted Geometric Mean and Invariant Gradient Covariance for Federated Learning on Non-IID Data [69.0785021613868]
Federated learning is a distributed machine learning approach which enables a shared server model to learn by aggregating the parameter updates computed locally on the training data of spatially distributed client silos.
We propose the Federated Invariant Learning Consistency (FedILC) approach, which leverages the gradient covariance and the geometric mean of Hessians to capture both inter-silo and intra-silo consistencies.
This is relevant to various fields such as medical healthcare, computer vision, and the Internet of Things (IoT).
arXiv Detail & Related papers (2022-05-19T03:32:03Z)
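FedILC's full update involves gradient covariance and Hessian statistics; the sketch below illustrates only the element-wise geometric-mean idea behind invariant aggregation, with the sign-agreement rule and all names being assumptions for illustration.

```python
import numpy as np

def geometric_mean_gradient(grads, eps=1e-12):
    """Element-wise geometric mean of per-silo gradients: coordinates on
    which all silos agree in sign keep their geometric-mean magnitude;
    conflicting coordinates are suppressed to zero, so only directions
    consistent across silos drive the update."""
    grads = np.stack(grads)                          # shape: (num_silos, dim)
    signs = np.sign(grads)
    agree = np.abs(signs.sum(axis=0)) == len(grads)  # unanimous sign per coord
    log_mags = np.log(np.abs(grads) + eps)
    geo_mag = np.exp(log_mags.mean(axis=0))          # geometric mean of magnitudes
    return np.where(agree, signs[0] * geo_mag, 0.0)
```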
- Data-SUITE: Data-centric identification of in-distribution incongruous examples [81.21462458089142]
Data-SUITE is a data-centric framework to identify incongruous regions of in-distribution (ID) data.
We empirically validate Data-SUITE's performance and coverage guarantees.
arXiv Detail & Related papers (2022-02-17T18:58:31Z)
- Jointly Learning from Decentralized (Federated) and Centralized Data to Mitigate Distribution Shift [2.9965560298318468]
Federated Learning (FL) is an increasingly used paradigm where learning takes place collectively on edge devices.
Yet a distribution shift may still exist; the on-device training examples may lack some data inputs that are expected to be encountered at inference time.
This paper proposes a way to mitigate this shift: selective usage of datacenter data, mixed in with FL.
arXiv Detail & Related papers (2021-11-23T20:51:24Z)
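A minimal sketch of one plausible way to mix datacenter data into a federated round, as the entry above proposes; the blending rule, `alpha`, and every name here are assumptions rather than the paper's method.

```python
def mixed_update(w_global, client_updates, datacenter_grad, alpha=0.1, lr=0.01):
    """Blend the averaged federated update with a gradient step computed on
    held-out datacenter data, so inputs that are missing on-device can still
    shape the model. All parameters are numpy-like arrays."""
    avg_update = sum(client_updates) / len(client_updates)
    return w_global + (1 - alpha) * avg_update - alpha * lr * datacenter_grad(w_global)
```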
- Federated Learning from Small Datasets [48.879172201462445]
Federated learning allows multiple parties to collaboratively train a joint model without sharing local data.
We propose a novel approach that intertwines model aggregations with permutations of local models.
The permutations expose each local model to a daisy chain of local datasets, resulting in more efficient training in data-sparse domains.
arXiv Detail & Related papers (2021-10-07T13:49:23Z)
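The daisy-chaining described in the entry above can be sketched as follows; the hop count, permutation schedule, and function names are illustrative assumptions.

```python
import random

def daisy_chain_round(models, train_one_epoch, num_hops=3, seed=0):
    """Between aggregations, pass each model along a random permutation of
    clients and train it on each client's local data in turn, so every
    model effectively sees several small datasets per round."""
    rng = random.Random(seed)
    clients = list(range(len(models)))
    for _ in range(num_hops):
        rng.shuffle(clients)
        # model i hops to client clients[i] and trains there for one epoch
        models = [train_one_epoch(model, client)
                  for model, client in zip(models, clients)]
    return models
```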
- Decentralized federated learning of deep neural networks on non-iid data [0.6335848702857039]
We tackle the non-IID problem of learning a personalized deep learning model in a decentralized setting.
We propose a method named Performance-Based Neighbor Selection (PENS) where clients with similar data detect each other and cooperate.
PENS is able to achieve higher accuracies as compared to strong baselines.
arXiv Detail & Related papers (2021-07-18T19:05:44Z)
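A minimal sketch of the neighbor-selection step in a PENS-like scheme: each client ranks peers by how well their models fit its own local data and keeps the best-fitting ones as collaborators. The scoring rule and `k` are assumptions.

```python
def select_neighbors(peer_models, local_loss, k=3):
    """A peer whose model already fits my local data well probably holds
    similarly distributed data, so rank peers by the loss of their model
    on my data and cooperate with the k lowest-loss peers."""
    ranked = sorted(peer_models, key=local_loss)
    return ranked[:k]
```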
- Semi-Synchronous Federated Learning [1.1168121941015012]
We introduce a novel Semi-Synchronous Federated Learning protocol that mixes local models periodically with minimal idle time and fast convergence.
We show through extensive experiments that our approach significantly outperforms previous work in data and computationally heterogeneous environments.
arXiv Detail & Related papers (2021-02-04T19:33:35Z)
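The semi-synchronous idea above reduces to one scheduling rule, sketched here under assumed names: fix a wall-clock round length and size each learner's local work to its measured speed.

```python
import math

def local_batches_for_round(batches_per_second, round_seconds=60.0):
    """Rather than waiting on the slowest learner (synchronous) or mixing
    whenever anyone finishes (asynchronous), every learner trains for the
    same wall-clock window and contributes however many local batches its
    hardware completes in that time, minimizing idle time."""
    return max(1, math.floor(batches_per_second * round_seconds))
```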
- A Principled Approach to Data Valuation for Federated Learning [73.19984041333599]
Federated learning (FL) is a popular technique to train machine learning (ML) models on decentralized data sources.
The Shapley value (SV) defines a unique payoff scheme that satisfies many desiderata for a data value notion.
This paper proposes a variant of the SV amenable to FL, which we call the federated Shapley value.
arXiv Detail & Related papers (2020-09-14T04:37:54Z)
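For reference, below is the classical (exponential-time) data Shapley value that the federated variant above adapts to the FL setting; this is a definitional sketch, not the paper's federated formulation.

```python
import itertools
import math

def shapley_values(players, utility):
    """Exact data Shapley value: each player's average marginal contribution
    to `utility` over all orderings of the players. `utility` maps a
    frozenset of players to a score, e.g., the test accuracy of a model
    trained on their combined data."""
    values = {p: 0.0 for p in players}
    for perm in itertools.permutations(players):
        coalition = frozenset()
        for p in perm:
            values[p] += utility(coalition | {p}) - utility(coalition)
            coalition = coalition | {p}
    return {p: v / math.factorial(len(players)) for p, v in values.items()}
```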