TimelyFL: Heterogeneity-aware Asynchronous Federated Learning with
Adaptive Partial Training
- URL: http://arxiv.org/abs/2304.06947v1
- Date: Fri, 14 Apr 2023 06:26:08 GMT
- Title: TimelyFL: Heterogeneity-aware Asynchronous Federated Learning with
Adaptive Partial Training
- Authors: Tuo Zhang, Lei Gao, Sunwoo Lee, Mi Zhang and Salman Avestimehr
- Abstract summary: TimelyFL is a heterogeneity-aware asynchronous Federated Learning framework with adaptive partial training.
We show that TimelyFL improves participation rate by 21.13%, harvests 1.28x - 2.89x more efficiency on convergence rate, and provides a 6.25% increment on test accuracy.
- Score: 17.84692242938424
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In cross-device Federated Learning (FL) environments, scaling synchronous FL
methods is challenging as stragglers hinder the training process. Moreover, the
availability of each client to join the training is highly variable over time
due to system heterogeneities and intermittent connectivity. Recent
asynchronous FL methods (e.g., FedBuff) have been proposed to overcome these
issues by allowing slower users to continue their work on local training based
on stale models and to contribute to aggregation when ready. However, we show
empirically that this method can lead to a substantial drop in training
accuracy as well as a slower convergence rate. The primary reason is that
fast-speed devices contribute to many more rounds of aggregation while others
join more intermittently or not at all, and with stale model updates. To
overcome this barrier, we propose TimelyFL, a heterogeneity-aware asynchronous
FL framework with adaptive partial training. During the training, TimelyFL
adjusts the local training workload based on the real-time resource
capabilities of each client, aiming to allow more available clients to join in
the global update without staleness. We demonstrate the performance benefits of
TimelyFL by conducting extensive experiments on various datasets (e.g.,
CIFAR-10, Google Speech, and Reddit) and models (e.g., ResNet20, VGG11, and
ALBERT). In comparison with the state-of-the-art (i.e., FedBuff), our
evaluations reveal that TimelyFL improves participation rate by 21.13%,
harvests 1.28x - 2.89x more efficiency on convergence rate, and provides a
6.25% increment on test accuracy.
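To make the adaptive partial-training idea above concrete, here is a minimal sketch (not the authors' implementation; the names, the per-client profiling fields, and the exact workload rule are illustrative assumptions). It assigns each client the largest fraction of the model it can train and upload within a global round deadline, so slower devices still contribute fresh, non-stale partial updates instead of being dropped.

```python
# Minimal sketch of heterogeneity-aware adaptive partial training.
# All names and the workload rule are illustrative assumptions,
# not the TimelyFL implementation.

from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClientProfile:
    client_id: int
    sec_per_full_pass: float  # measured time to train the full model locally
    upload_sec: float         # measured time to upload a full update


def partial_training_ratios(clients: List[ClientProfile],
                            round_deadline_sec: float) -> Dict[int, float]:
    """Give each client the largest fraction of the model it can train and
    upload before the round deadline, clipped to [0, 1]."""
    ratios: Dict[int, float] = {}
    for c in clients:
        compute_budget = round_deadline_sec - c.upload_sec
        if compute_budget <= 0:
            ratios[c.client_id] = 0.0  # cannot even upload in time this round
            continue
        ratios[c.client_id] = max(0.0, min(1.0, compute_budget / c.sec_per_full_pass))
    return ratios


if __name__ == "__main__":
    profiles = [
        ClientProfile(0, sec_per_full_pass=20.0, upload_sec=5.0),    # fast device
        ClientProfile(1, sec_per_full_pass=60.0, upload_sec=10.0),   # mid-range device
        ClientProfile(2, sec_per_full_pass=200.0, upload_sec=30.0),  # straggler
    ]
    print(partial_training_ratios(profiles, round_deadline_sec=50.0))
    # {0: 1.0, 1: 0.666..., 2: 0.1}: the straggler trains only 10% of the
    # model but still joins the global update without a stale base model.
```

In a real deployment the per-client timings would be re-estimated every round, since the abstract stresses that client availability and speed vary over time.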
Related papers
- FedAST: Federated Asynchronous Simultaneous Training [27.492821176616815]
Federated Learning (FL) enables devices or clients to collaboratively train machine learning (ML) models without sharing their private data.
Much of the existing work in FL focuses on efficiently learning a model for a single task.
In this paper, we propose simultaneous training of multiple FL models using a common set of datasets.
arXiv Detail & Related papers (2024-06-01T05:14:20Z)
- Prune at the Clients, Not the Server: Accelerated Sparse Training in Federated Learning [56.21666819468249]
Resource constraints of clients and communication costs pose major problems for training large models in Federated Learning.
We introduce Sparse-ProxSkip, which combines training and acceleration in a sparse setting.
We demonstrate the good performance of Sparse-ProxSkip in extensive experiments.
arXiv Detail & Related papers (2024-05-31T05:21:12Z)
- Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates [71.81037644563217]
Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning.
As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers.
We propose straggler-aware layer-wise federated learning (SALF) that leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion.
arXiv Detail & Related papers (2024-03-27T09:14:36Z)
- Achieving Linear Speedup in Asynchronous Federated Learning with Heterogeneous Clients [30.135431295658343]
Federated learning (FL) aims to learn a common global model without exchanging or transferring the data that are stored locally at different clients.
In this paper, we propose an efficient asynchronous federated learning (AFL) framework called DeFedAvg.
DeFedAvg is the first AFL algorithm that achieves the desirable linear speedup property, which indicates its high scalability.
arXiv Detail & Related papers (2024-02-17T05:22:46Z)
- Enhancing Convergence in Federated Learning: A Contribution-Aware Asynchronous Approach [0.0]
Federated Learning (FL) is a distributed machine learning paradigm that allows clients to train models on their data while preserving their privacy.
FL algorithms, such as Federated Averaging (FedAvg) and its variants, have been shown to converge well in many scenarios.
However, these methods require clients to upload their local updates to the server in a synchronous manner, which can be slow and unreliable in realistic FL settings.
We propose a contribution-aware asynchronous FL method that takes into account the staleness and statistical heterogeneity of the received updates (a rough sketch of this style of staleness-weighted aggregation appears after this list).
arXiv Detail & Related papers (2024-02-16T12:10:53Z)
- Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting increasing attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed, which exploits edge devices to asynchronously participate in the training process by actively applying for tasks.
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
- Speed Up Federated Learning in Heterogeneous Environment: A Dynamic Tiering Approach [5.504000607257414]
Federated learning (FL) enables collaboratively training a model while keeping the training data decentralized and private.
One significant impediment to training a model using FL, especially large models, is the resource constraints of devices with heterogeneous computation and communication capacities as well as varying task sizes.
We propose the Dynamic Tiering-based Federated Learning (DTFL) system where slower clients dynamically offload part of the model to the server to alleviate resource constraints and speed up training.
arXiv Detail & Related papers (2023-12-09T19:09:19Z)
- FL Games: A Federated Learning Framework for Distribution Shifts [71.98708418753786]
Federated learning aims to train predictive models for data that is distributed across clients, under the orchestration of a server.
We propose FL GAMES, a game-theoretic framework for federated learning that learns causal features that are invariant across clients.
arXiv Detail & Related papers (2022-10-31T22:59:03Z)
- Semi-Synchronous Personalized Federated Learning over Mobile Edge Networks [88.50555581186799]
We propose a semi-synchronous PFL algorithm, termed Semi-Synchronous Personalized Federated Averaging (PerFedS2), over mobile edge networks.
We derive an upper bound of the convergence rate of PerFedS2 in terms of the number of participants per global round and the number of rounds.
Experimental results verify the effectiveness of PerFedS2 in saving training time as well as guaranteeing the convergence of training loss.
arXiv Detail & Related papers (2022-09-27T02:12:43Z)
- Pisces: Efficient Federated Learning via Guided Asynchronous Training [42.46549526793953]
Federated learning (FL) is typically performed in a synchronous parallel manner, where the involvement of a slow client delays a training iteration.
Current FL systems employ a participant selection strategy to select fast clients with quality data in each iteration.
We present Pisces, an asynchronous FL system with intelligent participant selection and model aggregation for accelerated training.
arXiv Detail & Related papers (2022-06-18T18:25:30Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
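The abstract's comparison point, FedBuff-style buffered asynchronous aggregation, and the contribution-aware entry in the list above both revolve around how stale client updates are folded into the global model. Below is a rough sketch of that pattern, assuming a simple 1/(1 + staleness) discount and a fixed buffer size; these choices, and all names, are illustrative rather than taken from either paper.

```python
# Rough sketch of buffered asynchronous aggregation with a staleness
# discount. The 1/(1 + staleness) weighting and the buffer size are
# illustrative assumptions, not the published algorithms.

import numpy as np


class AsyncBufferedServer:
    def __init__(self, model_dim: int, buffer_size: int = 10, lr: float = 1.0):
        self.global_model = np.zeros(model_dim)
        self.version = 0          # incremented at every global update
        self.buffer = []          # list of (delta, weight) pairs
        self.buffer_size = buffer_size
        self.lr = lr

    def receive_update(self, delta: np.ndarray, client_version: int) -> None:
        """Accept a client delta computed from global model `client_version`,
        discounting it by how stale that base model is."""
        staleness = self.version - client_version
        weight = 1.0 / (1.0 + staleness)
        self.buffer.append((delta, weight))
        if len(self.buffer) >= self.buffer_size:
            self._aggregate()

    def _aggregate(self) -> None:
        total_w = sum(w for _, w in self.buffer)
        avg_delta = sum(w * d for d, w in self.buffer) / total_w
        self.global_model += self.lr * avg_delta
        self.version += 1
        self.buffer.clear()


if __name__ == "__main__":
    server = AsyncBufferedServer(model_dim=4, buffer_size=2)
    rng = np.random.default_rng(0)
    server.receive_update(rng.normal(size=4), client_version=0)  # fresh update
    server.receive_update(rng.normal(size=4), client_version=0)  # triggers aggregation
    print(server.version, server.global_model)
```

Under this scheme a fresh update gets full weight while one computed from a model several versions old is down-weighted rather than dropped; the TimelyFL abstract argues that fast devices still end up dominating aggregation, which is what its deadline-based partial training is meant to counteract.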