Apodotiko: Enabling Efficient Serverless Federated Learning in Heterogeneous Environments
- URL: http://arxiv.org/abs/2404.14033v1
- Date: Mon, 22 Apr 2024 09:50:11 GMT
- Title: Apodotiko: Enabling Efficient Serverless Federated Learning in Heterogeneous Environments
- Authors: Mohak Chadha, Alexander Jensen, Jianfeng Gu, Osama Abboud, Michael Gerndt
- Abstract summary: Federated Learning (FL) is an emerging machine learning paradigm that enables the collaborative training of a shared global model across distributed clients.
We present Apodotiko, a novel asynchronous training strategy designed for serverless FL.
Our strategy incorporates a scoring mechanism that evaluates each client's hardware capacity and dataset size to intelligently prioritize and select clients for each training round.
- Score: 40.06788591656159
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) is an emerging machine learning paradigm that enables the collaborative training of a shared global model across distributed clients while keeping the data decentralized. Recent works on designing systems for efficient FL have shown that utilizing serverless computing technologies, particularly Function-as-a-Service (FaaS) for FL, can enhance resource efficiency, reduce training costs, and alleviate the complex infrastructure management burden on data holders. However, current serverless FL systems still suffer from the presence of stragglers, i.e., slow clients that impede the collaborative training process. While strategies aimed at mitigating stragglers in these systems have been proposed, they overlook the diverse hardware resource configurations among FL clients. To this end, we present Apodotiko, a novel asynchronous training strategy designed for serverless FL. Our strategy incorporates a scoring mechanism that evaluates each client's hardware capacity and dataset size to intelligently prioritize and select clients for each training round, thereby minimizing the effects of stragglers on system performance. We comprehensively evaluate Apodotiko across diverse datasets, considering a mix of CPU and GPU clients, and compare its performance against five other FL training strategies. Results from our experiments demonstrate that Apodotiko outperforms other FL training strategies, achieving an average speedup of 2.75x and a maximum speedup of 7.03x. Furthermore, our strategy significantly reduces cold starts by a factor of four on average, demonstrating suitability in serverless environments.
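The abstract does not spell out the scoring formula, so the following is only a minimal sketch of a hardware- and data-aware selection step, assuming a score that favors data-rich clients with short observed round times. The names `score` and `select_clients` and the per-client `(num_samples, avg_round_seconds)` profile are illustrative assumptions, not Apodotiko's published algorithm.

```python
import random

# Hypothetical scoring sketch: capacity is proxied by the client's
# observed round time, so the score favors fast, data-rich clients.
def score(num_samples: int, avg_round_seconds: float) -> float:
    return num_samples / max(avg_round_seconds, 1e-6)

def select_clients(profiles: dict, k: int) -> list:
    """Weighted sampling without replacement (Efraimidis-Spirakis):
    draw key u**(1/w) per client and keep the k largest, so slower
    clients still participate occasionally instead of starving."""
    keys = {
        cid: random.random() ** (1.0 / score(*p))
        for cid, p in profiles.items()
    }
    return sorted(keys, key=keys.get, reverse=True)[:k]

# (num_samples, avg_round_seconds) per client -- toy values
profiles = {"gpu-0": (5000, 12.0), "cpu-1": (5000, 95.0), "cpu-2": (800, 40.0)}
print(select_clients(profiles, k=2))
```

Sampling proportionally to the score, rather than always taking the top-k, is one simple way to keep stragglers from being excluded entirely while still biasing rounds toward capable clients.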
Related papers
- Seamless Integration: Sampling Strategies in Federated Learning Systems [0.0]
Federated Learning (FL) represents a paradigm shift in the field of machine learning.
The seamless integration of new clients is imperative to sustain and enhance the performance of FL systems.
This paper outlines effective client selection strategies and solutions for ensuring system scalability and stability.
arXiv Detail & Related papers (2024-08-18T17:16:49Z)
- Embracing Federated Learning: Enabling Weak Client Participation via Partial Model Training [21.89214794178211]
In Federated Learning (FL), clients may have weak devices that cannot train the full model or even hold it in their memory space.
We propose EmbracingFL, a general FL framework that allows all available clients to join the distributed training.
Our empirical study shows that EmbracingFL consistently achieves high accuracy, as if all clients were strong, outperforming state-of-the-art width reduction methods.
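A minimal sketch of the partial-model idea, assuming width slicing on a single dense layer: a weak client trains only the leading fraction of the layer's units, and the server merges that slice back. `slice_layer` and `merge_slice` are hypothetical names, not EmbracingFL's API.

```python
import numpy as np

def slice_layer(w: np.ndarray, ratio: float) -> np.ndarray:
    """Return the leading `ratio` fraction of the layer's rows,
    i.e. the partial-width sub-layer a weak client can afford."""
    k = max(1, int(w.shape[0] * ratio))
    return w[:k, :].copy()

def merge_slice(w_full: np.ndarray, w_part: np.ndarray) -> np.ndarray:
    """Overwrite the trained rows of the full layer with the
    weak client's updated slice."""
    merged = w_full.copy()
    merged[: w_part.shape[0], :] = w_part
    return merged

w_server = np.random.randn(128, 64)          # full hidden layer
w_weak = slice_layer(w_server, ratio=0.25)   # weak client holds 25%
w_weak += 0.01                               # stand-in for local training
w_server = merge_slice(w_server, w_weak)
```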
arXiv Detail & Related papers (2024-06-21T13:19:29Z)
- Training Heterogeneous Client Models using Knowledge Distillation in Serverless Federated Learning [0.5510212613486574]
Federated Learning (FL) is an emerging machine learning paradigm that enables the collaborative training of a shared global model across distributed clients.
Recent works on designing systems for efficient FL have shown that utilizing serverless computing technologies can enhance resource efficiency, reduce training costs, and alleviate the complex infrastructure management burden on data holders.
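A generic knowledge-distillation loss in the spirit of this summary (not necessarily the paper's exact recipe): a small client model matches the teacher's softened outputs, so client architectures need not be identical.

```python
import numpy as np

def softmax(z: np.ndarray, T: float = 1.0) -> np.ndarray:
    """Temperature-scaled softmax; higher T softens the distribution."""
    e = np.exp((z - z.max(axis=-1, keepdims=True)) / T)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T: float = 2.0) -> float:
    """KL(teacher || student) on softened logits, scaled by T^2
    as in standard knowledge distillation."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    return (T * T) * kl.mean()

teacher = np.array([[2.0, 0.5, -1.0]])
student = np.array([[1.0, 0.8, -0.5]])
print(distillation_loss(student, teacher))
```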
arXiv Detail & Related papers (2024-02-11T20:15:52Z)
- FLrce: Resource-Efficient Federated Learning with Early-Stopping Strategy [7.963276533979389]
Federated Learning (FL) has gained great popularity in the Internet of Things (IoT).
We present FLrce, an efficient FL framework with a relationship-based client selection and early-stopping strategy.
Experiment results show that, compared with existing efficient FL frameworks, FLrce improves the computation and communication efficiency by at least 30% and 43% respectively.
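A sketch of the early-stopping half only, assuming a standard patience-based plateau check on a global validation metric; FLrce's relationship-based client selection is omitted here.

```python
class EarlyStopper:
    """Stop FL training once validation accuracy has not improved
    by at least `min_delta` for `patience` consecutive rounds."""

    def __init__(self, patience: int = 5, min_delta: float = 1e-3):
        self.patience, self.min_delta = patience, min_delta
        self.best, self.stale = float("-inf"), 0

    def should_stop(self, val_accuracy: float) -> bool:
        if val_accuracy > self.best + self.min_delta:
            self.best, self.stale = val_accuracy, 0
        else:
            self.stale += 1
        return self.stale >= self.patience

stopper = EarlyStopper(patience=3)
for acc in (0.60, 0.71, 0.72, 0.72, 0.72, 0.72):
    if stopper.should_stop(acc):
        print("stopping early at accuracy", acc)
        break
```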
arXiv Detail & Related papers (2023-10-15T10:13:44Z)
- FedLesScan: Mitigating Stragglers in Serverless Federated Learning [0.7388859384645262]
Federated Learning (FL) is a machine learning paradigm that enables the training of a shared global model across distributed clients.
We propose FedLesScan, a novel clustering-based semi-asynchronous training strategy specifically tailored for serverless FL.
We show that FedLesScan reduces training time and cost by an average of 8% and 20% respectively while utilizing clients better with an average increase in the effective update ratio of 17.75%.
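One way such a clustering step could look, assuming clients are grouped by observed round duration so the scheduler draws from faster groups first; the feature choice and cluster count are assumptions, not FedLesScan's published configuration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Observed per-client round durations in seconds (toy values).
durations = np.array([[12.0], [14.5], [95.0], [88.0], [40.0], [43.0]])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(durations)

clusters: dict = {}
for client_id, lab in enumerate(labels):
    clusters.setdefault(lab, []).append(client_id)

# Prefer the cluster with the smallest mean duration this round.
fastest = min(clusters, key=lambda l: durations[clusters[l]].mean())
print("schedule first:", clusters[fastest])
```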
arXiv Detail & Related papers (2022-11-10T18:17:41Z)
- FL Games: A Federated Learning Framework for Distribution Shifts [71.98708418753786]
Federated learning aims to train predictive models for data that is distributed across clients, under the orchestration of a server.
We propose FL GAMES, a game-theoretic framework for federated learning that learns causal features that are invariant across clients.
arXiv Detail & Related papers (2022-10-31T22:59:03Z)
- Improving Privacy-Preserving Vertical Federated Learning by Efficient Communication with ADMM [62.62684911017472]
Federated learning (FL) enables devices to jointly train shared models while keeping the training data local for privacy purposes.
We introduce a VFL framework with multiple heads (VIM), which takes the separate contribution of each client into account.
VIM achieves significantly higher performance and faster convergence compared with the state-of-the-art.
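A toy forward pass showing why per-client heads keep each client's contribution separable in vertical FL; the shapes and names here are illustrative, not VIM's API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two clients hold disjoint feature partitions of the same samples.
x_a = rng.normal(size=(4, 3))   # client A: 3 features
x_b = rng.normal(size=(4, 5))   # client B: 5 features

# Each client embeds its slice with its own head.
head_a = rng.normal(size=(3, 2))
head_b = rng.normal(size=(5, 2))

# The server combines per-client contributions additively, so each
# client's share of the prediction stays individually attributable.
logits = x_a @ head_a + x_b @ head_b
print(logits.shape)  # (4, 2)
```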
arXiv Detail & Related papers (2022-07-20T23:14:33Z)
- Dynamic Attention-based Communication-Efficient Federated Learning [85.18941440826309]
Federated learning (FL) offers a solution to train a global machine learning model.
FL suffers from performance degradation when client data distributions are non-IID.
We propose a new adaptive training algorithm $\texttt{AdaFL}$ to combat this degradation.
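A hedged sketch of attention-style aggregation, assuming weights come from a softmax over each update's distance to the current global model; AdaFL's actual attention rule may differ.

```python
import numpy as np

def attention_aggregate(global_w: np.ndarray, client_ws: list,
                        temperature: float = 1.0) -> np.ndarray:
    """Weight each client update by softmax(-distance / T), so updates
    far from the global model (e.g. from highly non-IID clients)
    contribute less to the aggregate."""
    dists = np.array([np.linalg.norm(w - global_w) for w in client_ws])
    attn = np.exp(-dists / temperature)
    attn /= attn.sum()
    return sum(a * w for a, w in zip(attn, client_ws))

g = np.zeros(10)
updates = [np.random.randn(10) * s for s in (0.1, 0.1, 2.0)]  # one outlier
print(attention_aggregate(g, updates))
```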
arXiv Detail & Related papers (2021-08-12T14:18:05Z)
- A Framework for Energy and Carbon Footprint Analysis of Distributed and Federated Edge Learning [48.63610479916003]
This article breaks down and analyzes the main factors that influence the environmental footprint of distributed learning policies.
It models both vanilla and decentralized FL policies driven by consensus.
Results show that FL allows remarkable end-to-end energy savings (30%-40%) for wireless systems characterized by low bit/Joule efficiency.
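A back-of-the-envelope version of such an energy breakdown, splitting the footprint into compute and communication terms; all constants below are placeholders, not the article's measured values.

```python
def fl_energy_joules(rounds: int, clients: int, e_compute_j: float,
                     model_bits: float, bits_per_joule: float) -> float:
    """Toy end-to-end energy model: per round, every client spends
    e_compute_j on local training plus uplink + downlink transfer of
    the model at the link's bit/Joule efficiency. A low bits_per_joule
    (inefficient wireless link) makes the communication term dominate,
    which is where FL's reduced data movement pays off."""
    per_round = clients * (e_compute_j + 2 * model_bits / bits_per_joule)
    return rounds * per_round

print(fl_energy_joules(rounds=100, clients=20, e_compute_j=50.0,
                       model_bits=8e6, bits_per_joule=1e5))
```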
arXiv Detail & Related papers (2021-03-18T16:04:42Z)
- Blockchain Assisted Decentralized Federated Learning (BLADE-FL): Performance Analysis and Resource Allocation [119.19061102064497]
We propose a decentralized FL framework by integrating blockchain into FL, namely, blockchain assisted decentralized federated learning (BLADE-FL).
In a round of the proposed BLADE-FL, each client broadcasts its trained model to other clients, competes to generate a block based on the received models, and then aggregates the models from the generated block before its local training of the next round.
We explore the impact of lazy clients on the learning performance of BLADE-FL, and characterize the relationship among the optimal K, the learning parameters, and the proportion of lazy clients.
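A highly simplified round in that spirit, with block generation reduced to picking a random winner; real mining, block validation, and the paper's lazy-client analysis are omitted.

```python
import random

def blade_fl_round(local_models: dict) -> dict:
    """One toy BLADE-FL-style round with scalar stand-ins for models."""
    # 1. Every client broadcasts its locally trained model to the others.
    received = {c: dict(local_models) for c in local_models}
    # 2. One client "wins" block generation and packs the models it saw.
    winner = random.choice(list(local_models))
    block = received[winner]
    # 3. Each client aggregates the block's models before its next
    #    round of local training.
    agg = sum(block.values()) / len(block)
    return {c: agg for c in local_models}

models = {"c0": 1.0, "c1": 3.0, "c2": 5.0}
print(blade_fl_round(models))
```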
arXiv Detail & Related papers (2021-01-18T07:19:08Z)