FedFetch: Faster Federated Learning with Adaptive Downstream Prefetching
- URL: http://arxiv.org/abs/2504.15366v1
- Date: Mon, 21 Apr 2025 18:17:05 GMT
- Title: FedFetch: Faster Federated Learning with Adaptive Downstream Prefetching
- Authors: Qifan Yan, Andrew Liu, Shiqi He, Mathias Lécuyer, Ivan Beschastnikh
- Abstract summary: Federated learning (FL) is a machine learning paradigm that facilitates massively distributed model training with end-user data on edge devices directed by a central server. We introduce FedFetch, a strategy to mitigate the download time overhead caused by combining client sampling and compression techniques. We empirically show that adding FedFetch to communication efficient FL techniques reduces end-to-end training time by 1.26$\times$ and download time by 4.49$\times$ across compression techniques with heterogeneous client settings.
- Score: 7.264549907717153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is a machine learning paradigm that facilitates massively distributed model training with end-user data on edge devices directed by a central server. However, the large number of heterogeneous clients in FL deployments leads to a communication bottleneck between the server and the clients. This bottleneck is made worse by straggling clients, any one of which will further slow down training. To tackle these challenges, researchers have proposed techniques like client sampling and update compression. These techniques work well in isolation but combine poorly in the downstream, server-to-client direction. This is because unselected clients have outdated local model states and need to synchronize these states with the server first. We introduce FedFetch, a strategy to mitigate the download time overhead caused by combining client sampling and compression techniques. FedFetch achieves this with an efficient prefetch schedule for clients to prefetch model states multiple rounds before a stated training round. We empirically show that adding FedFetch to communication efficient FL techniques reduces end-to-end training time by 1.26$\times$ and download time by 4.49$\times$ across compression techniques with heterogeneous client settings. Our implementation is available at https://github.com/DistributedML/FedFetch
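As a rough illustration of the prefetch idea described in the abstract, the sketch below fixes the participant set for future rounds ahead of time so that each selected client can start downloading the (compressed) model state it is missing several rounds before it trains. This is a minimal sketch under assumed names (`build_prefetch_schedule`, `prefetch_horizon`); it is not FedFetch's actual implementation or API.

```python
import random

def build_prefetch_schedule(clients, num_rounds, sample_size, prefetch_horizon=3, seed=0):
    """Pre-sample the participants of every round so that each selected client
    can be told `prefetch_horizon` rounds in advance to begin downloading the
    global model state it will need (hypothetical sketch, not FedFetch's API)."""
    rng = random.Random(seed)
    # Fix the participant set for each future round up front.
    participants = {t: rng.sample(clients, sample_size) for t in range(num_rounds)}

    schedule = {t: [] for t in range(num_rounds)}
    for t, selected in participants.items():
        start = max(0, t - prefetch_horizon)
        # From round `start`, these clients prefetch the state needed at round `t`,
        # spreading the download over earlier rounds instead of stalling at round `t`.
        schedule[start].extend((client, t) for client in selected)
    return participants, schedule

# Example: 100 clients, 20 rounds, 10 participants per round.
participants, schedule = build_prefetch_schedule(list(range(100)), num_rounds=20, sample_size=10)
print(schedule[0][:3])  # (client, target_round) pairs that start prefetching at round 0
```

The point of scheduling downloads early is that clients with outdated local model state overlap their synchronization with earlier rounds instead of delaying the round in which they actually participate.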
Related papers
- The More is not the Merrier: Investigating the Effect of Client Size on Federated Learning [1.6258045262919332]
Federated Learning (FL) has been introduced as a way to keep data local to clients while training a shared machine learning model.
In this paper, we focus on the widely used FedAvg algorithm to explore the effect of the number of clients in FL.
We propose a method called Knowledgeable Client Insertion (KCI) that introduces a very small number of knowledgeable clients to the MEC setting.
arXiv Detail & Related papers (2025-04-11T02:01:38Z)
- Sparse-ProxSkip: Accelerated Sparse-to-Sparse Training in Federated Learning [56.21666819468249]
In Federated Learning (FL), both client resource constraints and communication costs pose major problems for training large models.
Recent work has shown that local training provably improves communication complexity through acceleration.
We introduce Sparse-ProxSkip, which addresses this issue by integrating the efficient Straight-Through Estimator pruning technique into sparse training.
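Straight-Through Estimator (STE) pruning, as named above, typically applies a sparsity mask in the forward pass while letting gradients flow to the dense weights as if no mask were applied. The PyTorch sketch below shows that generic pattern, not Sparse-ProxSkip's exact formulation.

```python
import torch

def ste_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero all but the largest-magnitude entries, but route gradients straight
    through to the dense weight (generic STE pruning pattern, not the paper's code)."""
    k = max(1, int(weight.numel() * (1.0 - sparsity)))          # entries to keep
    threshold = weight.abs().flatten().kthvalue(weight.numel() - k + 1).values
    pruned = weight * (weight.abs() >= threshold).float()
    # Forward uses the pruned values; backward treats the masking as identity.
    return weight + (pruned - weight).detach()

w = torch.randn(8, 8, requires_grad=True)
ste_prune(w, sparsity=0.9).sum().backward()
print(w.grad.abs().mean())  # gradients reach every dense weight despite 90% sparsity
```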
arXiv Detail & Related papers (2024-05-31T05:21:12Z)
- Communication Efficient ConFederated Learning: An Event-Triggered SAGA Approach [67.27031215756121]
Federated learning (FL) is a machine learning paradigm that targets model training without gathering the local data over various data sources.
Standard FL, which employs a single server, can only support a limited number of users, leading to degraded learning capability.
In this work, we consider a multi-server FL framework, referred to as ConFederated Learning (CFL), in order to accommodate a larger number of users.
arXiv Detail & Related papers (2024-02-28T03:27:10Z)
- Adaptive Compression in Federated Learning via Side Information [28.401993810064255]
We propose a framework that requires approximately $D_{KL}(q_\phi^{(n)} \| p_\theta)$ bits of communication.
We show that our method can be integrated into many existing compression frameworks to attain the same (and often higher) test accuracy with up to $82$ times less communication than the prior work -- corresponding to 2,650 times overall compression.
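The $D_{KL}$ term above is, roughly, the number of bits needed to convey a sample from the client's local distribution $q_\phi^{(n)}$ when client and server share the side-information prior $p_\theta$. A small numeric illustration for one-dimensional Gaussians (not the paper's code):

```python
import math

def kl_gaussians_bits(mu_q, sigma_q, mu_p, sigma_p):
    """D_KL(N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2)), converted from nats to bits."""
    kl_nats = (math.log(sigma_p / sigma_q)
               + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
               - 0.5)
    return kl_nats / math.log(2)

# When the client's distribution q is close to the shared prior p, few bits are
# needed; the cost grows with the mismatch between q and p.
print(kl_gaussians_bits(0.1, 1.0, 0.0, 1.0))  # ~0.007 bits per dimension
print(kl_gaussians_bits(2.0, 1.0, 0.0, 1.0))  # ~2.9 bits per dimension
```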
arXiv Detail & Related papers (2023-06-22T01:29:50Z)
- BAFFLE: A Baseline of Backpropagation-Free Federated Learning [71.09425114547055]
Federated learning (FL) is a general principle for decentralized clients to train a server model collectively without sharing local data.
We develop backpropagation-free federated learning, dubbed BAFFLE, in which backpropagation is replaced by multiple forward processes to estimate gradients.
BAFFLE is 1) memory-efficient and easily fits within upload bandwidth limits; 2) compatible with inference-only hardware optimization and model quantization or pruning; and 3) well-suited to trusted execution environments.
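The summary states that backpropagation is replaced by multiple forward passes to estimate gradients. A generic zeroth-order (finite-difference) estimator of that flavour looks like the sketch below; it illustrates the idea only and is not necessarily BAFFLE's exact scheme.

```python
import numpy as np

def forward_only_grad(loss_fn, params, num_probes=32, eps=1e-3, seed=0):
    """Estimate the gradient of `loss_fn` at `params` using only forward passes,
    by averaging central finite differences along random probe directions."""
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(params)
    for _ in range(num_probes):
        u = rng.standard_normal(params.shape)                    # random direction
        delta = loss_fn(params + eps * u) - loss_fn(params - eps * u)
        grad += (delta / (2 * eps)) * u                          # directional derivative times direction
    return grad / num_probes

# Toy check on a quadratic, where the true gradient is 2 * x.
loss = lambda x: float(np.sum(x ** 2))
x = np.array([1.0, -2.0, 0.5])
print(forward_only_grad(loss, x, num_probes=500))  # close to [2.0, -4.0, 1.0]
```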
arXiv Detail & Related papers (2023-01-28T13:34:36Z)
- MDA: Availability-Aware Federated Learning Client Selection [1.9422756778075616]
This study focuses on an FL setting called cross-device FL, which trains models across a large number of clients.
In vanilla FL, clients are selected randomly, which results in an acceptable accuracy but is not ideal from the overall training time perspective.
New client selection techniques have been proposed to improve the training time by considering individual clients' resources and speed.
arXiv Detail & Related papers (2022-11-25T22:18:24Z)
- Optimizing Server-side Aggregation For Robust Federated Learning via Subspace Training [80.03567604524268]
Non-IID data distribution across clients and poisoning attacks are two main challenges in real-world federated learning systems.
We propose SmartFL, a generic approach that optimizes the server-side aggregation process.
We provide theoretical analyses of the convergence and generalization capacity for SmartFL.
arXiv Detail & Related papers (2022-11-10T13:20:56Z)
- Aergia: Leveraging Heterogeneity in Federated Learning Systems [5.0650178943079]
Federated Learning (FL) relies on clients to update a global model using their local datasets.
Aergia is a novel approach where slow clients freeze the part of their model that is the most computationally intensive to train.
Aergia significantly reduces the training time under heterogeneous settings by up to 27% and 53% compared to FedAvg and TiFL, respectively.
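The freezing step described above can be sketched with standard PyTorch parameter freezing: the slow client marks its compute-heavy block as non-trainable and keeps updating only the cheap remainder. This is a simplified illustration of the freezing idea, not Aergia's implementation.

```python
import torch
import torch.nn as nn

# Toy client model: an expensive feature extractor followed by a light head.
model = nn.Sequential(
    nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 512), nn.ReLU()),  # heavy part
    nn.Linear(512, 10),                                                              # light part
)

def freeze_heavy_part(model: nn.Sequential) -> None:
    """Slow client: stop computing gradients for the compute-heavy block,
    so local training only updates the cheap remainder."""
    for p in model[0].parameters():
        p.requires_grad_(False)

freeze_heavy_part(model)
optimizer = torch.optim.SGD([p for p in model.parameters() if p.requires_grad], lr=0.01)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()  # only the light head moves; the frozen part stays fixed
```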
arXiv Detail & Related papers (2022-10-12T12:59:18Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training [65.68511423300812]
We propose ProgFed, a progressive training framework for efficient and effective federated learning.
ProgFed inherently reduces computation and two-way communication costs while maintaining the strong performance of the final models.
Our results show that ProgFed converges at the same rate as standard training on full models.
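Progressive training, as summarized above, starts from a shallow sub-model and periodically grows it toward the full architecture, so early rounds train and communicate a much smaller model. The sketch below shows the generic pattern with an assumed temporary head per stage; it is not ProgFed's actual code.

```python
import torch
import torch.nn as nn

blocks = nn.ModuleList([nn.Sequential(nn.Linear(32, 32), nn.ReLU()) for _ in range(4)])

def submodel(active_blocks: int) -> nn.Sequential:
    """Return the first `active_blocks` blocks plus a temporary classifier head;
    early stages train (and communicate) only this smaller model."""
    return nn.Sequential(*blocks[:active_blocks], nn.Linear(32, 10))

x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
for stage, active in enumerate([1, 2, 3, 4], start=1):       # progressively deepen
    model = submodel(active)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"stage {stage}: trained {active}/4 blocks, loss {loss.item():.3f}")
```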
arXiv Detail & Related papers (2021-10-11T14:45:00Z)
- FEDZIP: A Compression Framework for Communication-Efficient Federated Learning [2.334824705384299]
Federated Learning is an implementation of decentralized machine learning for wireless devices.
It assigns the learning process independently to each client.
We propose a novel framework, FedZip, that significantly decreases the size of updates while transferring weights from the deep learning model between clients and their servers.
arXiv Detail & Related papers (2021-02-02T16:33:44Z)
- Blockchain Assisted Decentralized Federated Learning (BLADE-FL): Performance Analysis and Resource Allocation [119.19061102064497]
We propose a decentralized FL framework by integrating blockchain into FL, namely, blockchain assisted decentralized federated learning (BLADE-FL).
In a round of the proposed BLADE-FL, each client broadcasts its trained model to other clients, competes to generate a block based on the received models, and then aggregates the models from the generated block before its local training of the next round.
We explore the impact of lazy clients on the learning performance of BLADE-FL, and characterize the relationship among the optimal K, the learning parameters, and the proportion of lazy clients.
arXiv Detail & Related papers (2021-01-18T07:19:08Z)
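The BLADE-FL round described above (broadcast, block generation, aggregate from the block) can be sketched as follows. The block-generation competition is reduced to a stub and all names are illustrative; this is not the paper's protocol implementation.

```python
import random
import numpy as np

def blade_fl_round(local_models, rng):
    """One simplified round: every client sees all broadcast models, one client
    wins the block-generation competition, and everyone aggregates the models
    recorded in the winning block before the next round of local training."""
    received = list(local_models.values())            # 1) broadcast to all clients
    winner = rng.choice(list(local_models))           # 2) stand-in for the mining competition
    block = {"generated_by": winner, "models": received}
    aggregated = np.mean(block["models"], axis=0)     # 3) aggregate from the block
    return block, aggregated

rng = random.Random(0)
local_models = {f"client{i}": np.random.randn(4) for i in range(5)}  # toy model vectors
block, new_global = blade_fl_round(local_models, rng)
print(block["generated_by"], new_global)
```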
This list is automatically generated from the titles and abstracts of the papers on this site.