Straggler-Resilient Federated Learning: Leveraging the Interplay Between
Statistical Accuracy and System Heterogeneity
- URL: http://arxiv.org/abs/2012.14453v1
- Date: Mon, 28 Dec 2020 19:21:14 GMT
- Title: Straggler-Resilient Federated Learning: Leveraging the Interplay Between
Statistical Accuracy and System Heterogeneity
- Authors: Amirhossein Reisizadeh, Isidoros Tziotis, Hamed Hassani, Aryan
Mokhtari, Ramtin Pedarsani
- Abstract summary: Federated learning involves learning from data samples distributed across a network of clients while the data remains local.
In this paper, we propose a novel straggler-resilient federated learning method that incorporates statistical characteristics of the clients' data to adaptively select the clients in order to speed up the learning procedure.
- Score: 57.275753974812666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning is a novel paradigm that involves learning from data
samples distributed across a large network of clients while the data remains
local. It is, however, known that federated learning is prone to several
system challenges, including system heterogeneity, where clients have different
computation and communication capabilities. Such heterogeneity in clients'
computation speeds degrades the scalability of federated learning algorithms
and significantly slows their runtime due to the presence of stragglers. In
this paper, we propose a novel
straggler-resilient federated learning method that incorporates statistical
characteristics of the clients' data to adaptively select the clients in order
to speed up the learning procedure. The key idea of our algorithm is to start
the training procedure with faster nodes and gradually involve the slower nodes
in the model training once the statistical accuracy of the data corresponding
to the current participating nodes is reached. The proposed approach reduces
the overall runtime required to achieve the statistical accuracy of the data of
all nodes, since the solution at each stage is close to that of the subsequent
stage with more samples and can therefore serve as a warm start. Our
theoretical results characterize the speedup gain in comparison to standard
federated benchmarks for strongly convex objectives, and our numerical
experiments also demonstrate significant speedups in wall-clock time of our
straggler-resilient method compared to federated learning benchmarks.
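To make the staged scheme concrete, the sketch below is a minimal reading of the idea, not the authors' exact algorithm: clients are assumed pre-sorted from fastest to slowest, each stage runs gradient rounds over only the current (fast) subset until a tolerance standing in for that subset's statistical accuracy is met, and the resulting iterate warm-starts the next stage once slower clients join. The `stage_tol` proxy, the doubling schedule, and all constants are illustrative.

```python
import numpy as np

# Toy instance: strongly convex regularized least squares, one shard per client.
rng = np.random.default_rng(0)
d, n_clients, n_local = 5, 16, 20
w_star = rng.normal(size=d)
clients = []
for _ in range(n_clients):            # assume clients are pre-sorted fastest-first
    X = rng.normal(size=(n_local, d))
    clients.append((X, X @ w_star + 0.1 * rng.normal(size=n_local)))

def avg_grad(w, active):
    # Average per-client least-squares gradient plus a small ridge term.
    return np.mean([X.T @ (X @ w - y) / len(y) for X, y in active], axis=0) + 1e-2 * w

w = np.zeros(d)
m = 2                                 # stage 1: only the two fastest clients
while True:
    active = clients[:m]
    stage_tol = 1.0 / (m * n_local)   # assumed proxy for this stage's statistical accuracy
    while np.linalg.norm(g := avg_grad(w, active)) ** 2 > stage_tol:
        w -= 0.1 * g                  # rounds that never wait on the slow nodes
    if m == n_clients:
        break
    m = min(2 * m, n_clients)         # slower nodes join; w warm-starts the new stage
print("distance to w*:", float(np.linalg.norm(w - w_star)))
```

Because each stage's solution is statistically close to the next stage's optimum, the later inner loops start near their targets, which is where the runtime saving over training with all (straggling) nodes from the outset comes from.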
Related papers
- Aiding Global Convergence in Federated Learning via Local Perturbation and Mutual Similarity Information [6.767885381740953]
Federated learning has emerged as a distributed optimization paradigm.
We propose a novel modified framework wherein each client locally performs a perturbed gradient step.
We show that our algorithm accelerates convergence by a margin of up to 30 global rounds compared with FedAvg.
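The summary reveals only that each client takes perturbed gradient steps before aggregation; the sketch below shows that skeleton under assumed details (Gaussian perturbation of scale `sigma`, quadratic client objectives, plain FedAvg averaging) and omits the paper's mutual-similarity information.

```python
import numpy as np

rng = np.random.default_rng(1)
d, sigma, lr = 4, 0.01, 0.05              # sigma: assumed perturbation scale
# Hypothetical quadratic client objectives 0.5 * w^T A w - b^T w.
As = [np.eye(d) * s for s in (1.0, 2.0, 3.0)]
bs = [rng.normal(size=d) for _ in range(3)]

w = np.zeros(d)
for rnd in range(200):                    # global rounds
    updates = []
    for A, b in zip(As, bs):
        w_loc = w.copy()
        for _ in range(5):                # perturbed local gradient steps
            g = A @ w_loc - b + sigma * rng.normal(size=d)
            w_loc -= lr * g
        updates.append(w_loc)
    w = np.mean(updates, axis=0)          # FedAvg-style aggregation
```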
arXiv Detail & Related papers (2024-10-07T23:14:05Z)
- Federated Learning based on Pruning and Recovery [0.0]
This framework integrates asynchronous learning algorithms and pruning techniques.
It addresses the inefficiencies of traditional federated learning algorithms in scenarios involving heterogeneous devices.
It also tackles the staleness issue and inadequate training of certain clients in asynchronous algorithms.
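No algorithmic details are given beyond "pruning", "recovery", and staleness handling, so the following is a loose sketch under assumptions: magnitude pruning selects a submodel for a weak device, a staleness-discounted merge stands in for asynchronous aggregation, and "recovery" is read as writing the trained coordinates back into the full model.

```python
import numpy as np

def prune_mask(w, keep_frac):
    # Keep the largest-magnitude coordinates, zero out the rest.
    k = max(1, int(keep_frac * w.size))
    return np.abs(w) >= np.sort(np.abs(w))[-k]

rng = np.random.default_rng(2)
w_global = rng.normal(size=10)
mask = prune_mask(w_global, keep_frac=0.5)   # pruned submodel for a weak device
w_local = np.where(mask, w_global, 0.0)
w_local[mask] -= 0.1                         # stand-in for local training on the submodel
staleness = 3                                # rounds elapsed since the client pulled w_global
alpha = 1.0 / (1.0 + staleness)              # discount stale asynchronous updates
# Recovery: merge only the trained coordinates back into the full global model.
w_global[mask] = (1 - alpha) * w_global[mask] + alpha * w_local[mask]
```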
arXiv Detail & Related papers (2024-03-16T14:35:03Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
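A minimal sketch of the client-specific adaptive learning-rate idea, assuming AMSGrad as the summary describes: each client maintains its own optimizer state, so its effective coordinate-wise step size is tuned by its own gradient history. The function name and hyperparameters are illustrative, not the paper's.

```python
import numpy as np

def amsgrad_local_steps(w, grad_fn, steps=5, lr=0.01, b1=0.9, b2=0.99, eps=1e-8):
    # Each client keeps its own (m, v, v_hat) state, so the effective
    # coordinate-wise learning rate lr / (sqrt(v_hat) + eps) is client-specific.
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    v_hat = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        v_hat = np.maximum(v_hat, v)       # AMSGrad: monotone second-moment estimate
        w = w - lr * m / (np.sqrt(v_hat) + eps)
    return w

# Example round: two clients with different curvature, then server averaging.
w0 = np.ones(3)
w1 = amsgrad_local_steps(w0, lambda w: 2.0 * w)   # "steep" client
w2 = amsgrad_local_steps(w0, lambda w: 0.1 * w)   # "flat" client
w_next = 0.5 * (w1 + w2)
```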
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- Effectively Heterogeneous Federated Learning: A Pairing and Split Learning Based Approach [16.093068118849246]
This paper presents a novel split federated learning (SFL) framework that pairs clients with different computational resources.
A greedy algorithm is proposed by recasting the minimization of training latency as a graph edge-selection problem.
Simulation results show the proposed method can significantly improve the FL training speed and achieve high performance.
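A toy rendering of the edge-selection view, with assumed per-client step times: clients are graph nodes, candidate pairs are edges, an edge's weight is a stand-in latency for that pairing, and a greedy pass picks disjoint cheap edges. The paper's actual latency model is richer than this.

```python
import itertools

step_time = {"a": 1.0, "b": 4.0, "c": 2.0, "d": 8.0}  # hypothetical per-client step times

# Edge weight: a stand-in latency if two clients are paired to split one model
# between them (here the slower partner dominates).
edges = sorted((max(step_time[u], step_time[v]), u, v)
               for u, v in itertools.combinations(step_time, 2))

paired, pairs = set(), []
for latency, u, v in edges:            # greedy: take the cheapest feasible edge
    if u not in paired and v not in paired:
        pairs.append((u, v))
        paired |= {u, v}
print(pairs)                           # [('a', 'c'), ('b', 'd')]
```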
arXiv Detail & Related papers (2023-08-26T11:10:54Z)
- Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
arXiv Detail & Related papers (2022-06-05T01:14:46Z)
- Towards Fair Federated Learning with Zero-Shot Data Augmentation [123.37082242750866]
Federated learning has emerged as an important distributed learning paradigm, where a server aggregates a global model from many client-trained models while having no access to the client data.
We propose a novel federated learning system that employs zero-shot data augmentation on under-represented data to mitigate statistical heterogeneity and encourage more uniform accuracy performance across clients in federated networks.
We study two variants of this scheme, Fed-ZDAC (federated learning with zero-shot data augmentation at the clients) and Fed-ZDAS (federated learning with zero-shot data augmentation at the server).
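The sketch below only illustrates where augmentation slots into a client's pipeline (the Fed-ZDAC placement). The paper's generator is zero-shot, synthesizing samples without direct access to raw data; this stand-in simply balances classes by sampling around local class statistics, which is an assumption, not the paper's method.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 8))                  # a client's local features
y = np.array([0] * 95 + [1] * 5)               # class 1 is under-represented

counts = Counter(y.tolist())
target = max(counts.values())
X_parts, y_parts = [X], [y]
for cls, c in counts.items():
    if c < target:
        # Stand-in for the zero-shot generator: synthesize pseudo-samples
        # around the local class mean until classes are balanced.
        mu = X[y == cls].mean(axis=0)
        n_fake = target - c
        X_parts.append(mu + 0.5 * rng.normal(size=(n_fake, X.shape[1])))
        y_parts.append(np.full(n_fake, cls))
X_bal, y_bal = np.vstack(X_parts), np.concatenate(y_parts)
```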
arXiv Detail & Related papers (2021-04-27T18:23:54Z)
- Exploiting Shared Representations for Personalized Federated Learning [54.65133770989836]
We propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.
Our algorithm harnesses the distributed computational power across clients to perform many local updates with respect to the low-dimensional local parameters for every update of the representation.
This result is of interest beyond federated learning to a broad class of problems in which we aim to learn a shared low-dimensional representation among data distributions.
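A minimal alternating scheme in the spirit of this description, with a linear representation `B` shared across clients and a low-dimensional head per client: each round performs several cheap head updates per client and a single averaged update of the representation. Dimensions, learning rates, and the synthetic data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
d, k, n_clients = 6, 2, 4
B_true = rng.normal(size=(d, k))
clients = []
for _ in range(n_clients):                     # synthetic data sharing one representation
    X = rng.normal(size=(40, d))
    h_true = rng.normal(size=k)
    clients.append((X, X @ B_true @ h_true))

B = rng.normal(size=(d, k))                    # shared representation
heads = [rng.normal(size=k) for _ in range(n_clients)]
for rnd in range(100):
    B_grads = []
    for i, (X, y) in enumerate(clients):
        for _ in range(10):                    # many cheap low-dimensional head updates
            r = X @ B @ heads[i] - y
            heads[i] = heads[i] - 0.01 * (B.T @ (X.T @ r)) / len(y)
        r = X @ B @ heads[i] - y
        B_grads.append(np.outer(X.T @ r, heads[i]) / len(y))
    B -= 0.01 * np.mean(B_grads, axis=0)       # one averaged representation update
```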
arXiv Detail & Related papers (2021-02-14T05:36:25Z)
- Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
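One plausible reading of a quasi-global momentum buffer, under assumptions: each worker takes a momentum-steered local step, gossip-averages with its neighbors, and then updates the buffer from the normalized model difference, which acts as a communication-free proxy for the globally averaged gradient. The gossip matrix, toy objectives, and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, eta, mu = 4, 3, 0.05, 0.9
W = np.full((n, n), 1.0 / n)          # doubly stochastic gossip matrix (illustrative)
x = rng.normal(size=(n, d))           # one model copy per worker
m = np.zeros((n, d))                  # quasi-global momentum buffers
targets = rng.normal(size=(n, d))     # heterogeneous local optima (toy objectives)

for t in range(300):
    x_old = x.copy()
    g = x - targets                   # local gradients of 0.5 * ||x_i - target_i||^2
    x = W @ (x - eta * (g + mu * m))  # momentum-steered local step, then gossip
    # Buffer update from the normalized model difference: a local,
    # communication-free estimate of the globally averaged gradient.
    m = mu * m + (1 - mu) * (x_old - x) / eta
```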
arXiv Detail & Related papers (2021-02-09T11:27:14Z)