Aergia: Leveraging Heterogeneity in Federated Learning Systems
- URL: http://arxiv.org/abs/2210.06154v1
- Date: Wed, 12 Oct 2022 12:59:18 GMT
- Title: Aergia: Leveraging Heterogeneity in Federated Learning Systems
- Authors: Bart Cox, Lydia Y. Chen, Jérémie Decouchant
- Abstract summary: Federated Learning (FL) relies on clients to update a global model using their local datasets.
Aergia is a novel approach where slow clients freeze the part of their model that is the most computationally intensive to train.
Aergia significantly reduces the training time under heterogeneous settings by up to 27% and 53% compared to FedAvg and TiFL, respectively.
- Score: 5.0650178943079
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated Learning (FL) is a popular approach for distributed deep learning
that prevents the pooling of large amounts of data in a central server. FL
relies on clients to update a global model using their local datasets.
Classical FL algorithms use a central federator that, for each training round,
waits for all clients to send their model updates before aggregating them. In
practical deployments, clients might have different computing powers and
network capabilities, which might lead slow clients to become performance
bottlenecks. Previous works have suggested using a deadline for each learning
round so that the federator ignores the late updates of slow clients, or so
that clients send partially trained models before the deadline. To speed up the
training process, we instead propose Aergia, a novel approach where slow
clients (i) freeze the part of their model that is the most computationally
intensive to train; (ii) train the unfrozen part of their model; and (iii)
offload the training of the frozen part of their model to a faster client that
trains it using its own dataset. The offloading decisions are orchestrated by
the federator based on the training speed that clients report and on the
similarities between their datasets, which are privately evaluated thanks to a
trusted execution environment. We show through extensive experiments that
Aergia maintains high accuracy and significantly reduces the training time
under heterogeneous settings by up to 27% and 53% compared to FedAvg and TiFL,
respectively.
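The freeze-and-offload idea described in the abstract can be illustrated with a short sketch. The following PyTorch snippet is not the authors' code; the layer split, the offloading target, and the merge rule are illustrative assumptions. It shows a slow client freezing its compute-heavy feature layers and training only its classifier, while a fast client trains the offloaded frozen layers on its own data.

```python
# Minimal sketch of Aergia's freeze-and-offload idea (illustrative assumptions only).
import copy
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # "features" stands in for the compute-heavy part a slow client would freeze.
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Linear(16 * 4 * 4, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def local_step(model, batch, lr=0.01):
    """One plain SGD step on one batch (only parameters with requires_grad=True move)."""
    x, y = batch
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p.add_(p.grad, alpha=-lr)
            p.grad = None
    return loss.item()

# Synthetic local data for one slow and one fast client.
slow_batch = (torch.randn(16, 1, 28, 28), torch.randint(0, 10, (16,)))
fast_batch = (torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,)))

slow_model = SmallCNN()

# (i) The slow client freezes the part of its model that is costliest to train...
for p in slow_model.features.parameters():
    p.requires_grad = False
# (ii) ...and keeps training only the unfrozen part on its own data.
local_step(slow_model, slow_batch)

# (iii) The federator offloads the frozen part to a fast client (chosen, per the
# abstract, from reported training speeds and dataset similarity); that client
# trains it on its own data. The merge rule below (simply adopting the returned
# feature weights) is an illustrative assumption.
offloaded = copy.deepcopy(slow_model)
for p in offloaded.features.parameters():
    p.requires_grad = True
for p in offloaded.classifier.parameters():
    p.requires_grad = False
local_step(offloaded, fast_batch)
slow_model.features.load_state_dict(offloaded.features.state_dict())
```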
Related papers
- Prune at the Clients, Not the Server: Accelerated Sparse Training in Federated Learning [56.21666819468249]
Resource constraints of clients and communication costs pose major problems for training large models in Federated Learning.
We introduce Sparse-ProxSkip, which combines training and acceleration in a sparse setting.
We demonstrate the good performance of Sparse-ProxSkip in extensive experiments.
arXiv Detail & Related papers (2024-05-31T05:21:12Z) - An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
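The "aggregate-then-adapt" round that FedAF avoids is, in its simplest form, FedAvg-style weighted averaging at the server. A minimal sketch of that baseline step, with illustrative names and toy parameter vectors:

```python
# Sketch of the classical server-side aggregation step (FedAvg-style), which
# aggregation-free methods such as FedAF do away with. Shapes are toy values.
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """Weighted average of client parameter vectors, weighted by dataset size."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()
    return sum(c * w for c, w in zip(coeffs, client_weights))

# Three clients with toy 4-dimensional "models" and unequal dataset sizes.
clients = [np.random.randn(4) for _ in range(3)]
global_model = fedavg_aggregate(clients, client_sizes=[100, 400, 50])
# In the next round, every client adapts (trains) starting from `global_model`.
```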
arXiv Detail & Related papers (2024-04-29T05:55:23Z) - Achieving Linear Speedup in Asynchronous Federated Learning with
Heterogeneous Clients [30.135431295658343]
Federated learning (FL) aims to learn a common global model without exchanging or transferring the data that are stored locally at different clients.
In this paper, we propose an efficient asynchronous federated learning (AFL) framework called DeFedAvg.
DeFedAvg is the first AFL algorithm that achieves the desirable linear speedup property, which indicates its high scalability.
arXiv Detail & Related papers (2024-02-17T05:22:46Z) - Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating locally trained models.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z) - Federated Learning for Semantic Parsing: Task Formulation, Evaluation
Setup, New Algorithms [29.636944156801327]
Multiple clients collaboratively train one global model without sharing their semantic parsing data.
Lorar adjusts each client's contribution to the global model update based on its training loss reduction during each round.
Clients with smaller datasets enjoy larger performance gains.
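A rough sketch of the weighting idea attributed to Lorar above; the exact rule is an assumption (here, contributions are made proportional to each client's per-round training-loss reduction):

```python
# Assumed form of loss-reduction-weighted aggregation, not the paper's exact rule.
import numpy as np

def loss_weighted_update(global_model, client_updates, loss_before, loss_after):
    reductions = np.maximum(np.asarray(loss_before) - np.asarray(loss_after), 0.0)
    if reductions.sum() == 0:
        return global_model  # no client improved this round
    coeffs = reductions / reductions.sum()
    return global_model + sum(c * u for c, u in zip(coeffs, client_updates))

g = np.zeros(4)
updates = [np.random.randn(4) * 0.1 for _ in range(3)]
# The second and third clients barely reduced their loss, so they contribute less.
g = loss_weighted_update(g, updates, loss_before=[2.3, 2.3, 2.3], loss_after=[1.9, 2.2, 2.3])
```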
arXiv Detail & Related papers (2023-05-26T19:25:49Z) - Faster Federated Learning with Decaying Number of Local SGD Steps [23.447883712141422]
In Federated Learning (FL), devices collaboratively train a machine learning model without sharing their private data with a central server or with other clients.
In this work we propose decaying the number of local SGD steps $K$ as training progresses, which can jointly improve the final performance of the FL model and reduce the total training time.
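A minimal sketch of a decaying-$K$ schedule; the exponential decay and its constants are arbitrary assumptions, not the paper's schedule:

```python
# Illustrative schedule: clients run fewer local SGD steps K in later rounds.
def local_steps(round_idx, k_initial=64, k_min=1, decay=0.9):
    return max(k_min, int(k_initial * (decay ** round_idx)))

for r in range(0, 50, 10):
    print(f"round {r}: K = {local_steps(r)}")
```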
arXiv Detail & Related papers (2023-05-16T17:36:34Z) - Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and back propagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
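The cut-layer ("smashed data") exchange of Split Learning mentioned above can be sketched as follows in PyTorch. The cut point and shapes are illustrative, and this is the baseline exchange the paper replaces with contrastive knowledge distillation, not the paper's own method:

```python
# Sketch of the Split Learning exchange: the client computes activations up to a
# cut layer and the server finishes the forward/backward pass. Here both halves
# live in one process for simplicity; in a real deployment the activations and
# their gradients would be sent over the network.
import torch
import torch.nn as nn

client_part = nn.Sequential(nn.Linear(32, 64), nn.ReLU())  # runs on the client
server_part = nn.Sequential(nn.Linear(64, 10))             # runs on the server

x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))

smashed = client_part(x)                       # client -> server: cut-layer activations
logits = server_part(smashed)                  # server finishes the forward pass
loss = nn.functional.cross_entropy(logits, y)
loss.backward()                                # gradients flow back through `smashed`
print(loss.item())
```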
arXiv Detail & Related papers (2022-11-20T10:49:22Z) - Latency Aware Semi-synchronous Client Selection and Model Aggregation
for Wireless Federated Learning [0.6882042556551609]
Federated learning (FL) is a collaborative machine learning framework that requires different clients (e.g., Internet of Things devices) to participate in the machine learning model training process.
The traditional FL process may suffer from the straggler problem in heterogeneous client settings.
We propose LESSON, a latency-aware semi-synchronous client selection and model aggregation method for federated learning that allows all clients to participate in the whole FL process but with different frequencies.
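A rough, assumption-only sketch of letting every client participate but at a frequency tied to its latency (this is not LESSON's actual selection rule):

```python
# Illustrative tiering: a client whose round takes ~k base periods contributes
# roughly every k-th aggregation round instead of being dropped as a straggler.
def participates(client_latency_s, round_idx, base_period_s=10.0):
    period = max(1, round(client_latency_s / base_period_s))
    return round_idx % period == 0

for r in range(6):
    print(r, [participates(lat, r) for lat in (8.0, 21.0, 39.0)])
```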
arXiv Detail & Related papers (2022-10-19T05:59:22Z) - Acceleration of Federated Learning with Alleviated Forgetting in Local
Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z) - Blockchain Assisted Decentralized Federated Learning (BLADE-FL):
Performance Analysis and Resource Allocation [119.19061102064497]
We propose a decentralized FL framework by integrating blockchain into FL, namely, blockchain assisted decentralized federated learning (BLADE-FL).
In a round of the proposed BLADE-FL, each client broadcasts its trained model to other clients, competes to generate a block based on the received models, and then aggregates the models from the generated block before its local training of the next round.
We explore the impact of lazy clients on the learning performance of BLADE-FL, and characterize the relationship among the optimal K, the learning parameters, and the proportion of lazy clients.
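A simplified simulation of one BLADE-FL round as summarized above, with all blockchain details (mining, validation, incentives) omitted and the block producer chosen at random purely for illustration:

```python
# Toy simulation of a BLADE-FL round: clients broadcast their trained models,
# one client produces a block holding the received models, and every client
# aggregates the block's models before its next round of local training.
import random
import numpy as np

def blade_fl_round(models):
    broadcast = list(models)                         # each client broadcasts its model
    winner = random.randrange(len(models))           # one client wins block generation
    block = {"producer": winner, "models": broadcast}
    aggregated = np.mean(block["models"], axis=0)    # aggregate the models in the block
    return [aggregated.copy() for _ in models]       # starting point for local training

clients = [np.random.randn(4) for _ in range(5)]
clients = blade_fl_round(clients)
```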
arXiv Detail & Related papers (2021-01-18T07:19:08Z)