Resource Utilization Optimized Federated Learning
- URL: http://arxiv.org/abs/2504.13850v1
- Date: Mon, 10 Mar 2025 20:23:39 GMT
- Title: Resource Utilization Optimized Federated Learning
- Authors: Zihan Zhang, Leon Wong, Blesson Varghese
- Abstract summary: Federated learning (FL) systems facilitate distributed machine learning across a server and multiple devices. This paper introduces FedOptima, a resource-optimized FL system designed to simultaneously minimize both types of idle time.
- Score: 19.564340315424413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning (FL) systems facilitate distributed machine learning across a server and multiple devices. However, FL systems have low resource utilization, limiting their practical use in the real world. This inefficiency primarily arises from two types of idle time: (i) task dependency between the server and devices, and (ii) stragglers among heterogeneous devices. This paper introduces FedOptima, a resource-optimized FL system designed to simultaneously minimize both types of idle time; existing systems do not eliminate or reduce both at the same time. FedOptima offloads the training of certain layers of a neural network from a device to the server using three innovations. First, devices operate independently of each other using asynchronous aggregation to eliminate straggler effects, and independently of the server by utilizing auxiliary networks to minimize idle time caused by task dependency. Second, the server performs centralized training using a task scheduler that ensures balanced contributions from all devices, improving model accuracy. Third, an efficient memory management mechanism on the server increases the scalability of the number of participating devices. Four state-of-the-art offloading-based and asynchronous FL methods are chosen as baselines. Experimental results show that compared to the best results of the baselines on convolutional neural networks and transformers on multiple lab-based testbeds, FedOptima (i) achieves higher or comparable accuracy, (ii) accelerates training by 1.9x to 21.8x, (iii) reduces server and device idle time by up to 93.9% and 81.8%, respectively, and (iv) increases throughput by 1.1x to 2.0x.
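The abstract describes a control flow rather than a closed-form algorithm. The toy, event-driven simulation below illustrates that flow under stated assumptions: devices push updates whenever they finish (asynchronous aggregation, so no device waits for a straggler) and a simple scheduler down-weights devices that have contributed more than average. The device step times, the weighting rule, and the scalar stand-in for the offloaded layers are all illustrative assumptions, not the authors' implementation.

```python
# Toy simulation of the control flow sketched in the abstract: devices train
# independently and push updates whenever they finish, while the server
# aggregates asynchronously and a simple scheduler keeps per-device
# contributions balanced. All names and numbers are illustrative.
import heapq
import random

random.seed(0)
DEVICES = {"d0": 1.0, "d1": 2.5, "d2": 6.0}   # hypothetical step times (s); d2 is a straggler
HORIZON = 30.0                                 # simulated seconds of training

server_model = 0.0                             # stand-in for the offloaded layers' parameters
contributions = {d: 0 for d in DEVICES}        # scheduler state: updates accepted per device

# Event queue of (finish_time, device): each device works independently,
# so no device ever waits for another (no straggler-induced idle time).
events = [(step, d) for d, step in DEVICES.items()]
heapq.heapify(events)

while events:
    t, d = heapq.heappop(events)
    if t > HORIZON:
        break
    # Asynchronous aggregation: apply this device's update immediately,
    # down-weighting devices that have already contributed more than average.
    avg = sum(contributions.values()) / len(contributions)
    weight = 1.0 / (1.0 + max(0, contributions[d] - avg))
    update = random.gauss(1.0, 0.1)            # placeholder for a real gradient or model delta
    server_model += weight * update
    contributions[d] += 1
    heapq.heappush(events, (t + DEVICES[d], d))  # device starts its next local step right away

print("accepted updates per device:", contributions)
```

Running this shows the fast devices contributing many more updates than the straggler, while the weighting keeps any single device from dominating the aggregated model.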
Related papers
- Ampere: Communication-Efficient and High-Accuracy Split Federated Learning [19.564340315424413]
A Federated Learning (FL) system collaboratively trains neural networks across devices and a server but is limited by significant on-device computation costs.
We propose Ampere, a novel collaborative training system that simultaneously minimizes on-device computation and device-server communication.
A lightweight auxiliary network generation method decouples training between the device and server, reducing frequent intermediate exchanges to a single transfer.
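The communication saving claimed above comes from replacing per-batch activation/gradient exchanges with one transfer at the end of local training. A minimal back-of-the-envelope comparison, with hypothetical payload sizes and batch counts rather than figures from the paper:

```python
# Toy comparison of communication events: per-batch split training exchanges
# activations and gradients every batch, while the auxiliary-head pattern
# defers everything to a single transfer. Numbers are illustrative assumptions.
EPOCHS, BATCHES_PER_EPOCH = 5, 100
ACTIVATION_KB, GRADIENT_KB = 64, 64            # hypothetical per-batch payload sizes

# Classic split learning: one activation upload and one gradient download per batch.
per_batch_transfers = EPOCHS * BATCHES_PER_EPOCH * 2
per_batch_traffic_kb = EPOCHS * BATCHES_PER_EPOCH * (ACTIVATION_KB + GRADIENT_KB)

# Auxiliary-head training: the device trains against a local loss, then sends
# its final activations for the whole dataset once.
single_transfers = 1
single_traffic_kb = BATCHES_PER_EPOCH * ACTIVATION_KB

print(f"per-batch split learning: {per_batch_transfers} transfers, {per_batch_traffic_kb} KB")
print(f"single-transfer pattern:  {single_transfers} transfer,  {single_traffic_kb} KB")
```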
arXiv Detail & Related papers (2025-07-08T20:54:43Z)
- DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs [11.966335602618933]
We study the problem of assigning operations in a dataflow graph to devices to minimize execution time in a work-conserving system.
Our experiments show that DOPPLER outperforms all baseline methods across tasks.
arXiv Detail & Related papers (2025-05-29T06:04:32Z)
- FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs).
We show that our system and method can achieve 1.45 - 9.39x speedup compared to baseline methods while ensuring convergence.
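The title's "adaptive compression" generally means sizing gradient payloads to the link. The sketch below shows one generic way to do that (pick a top-k budget from measured bandwidth and a time budget); the sizing formula and numbers are assumptions for illustration, not the paper's policy.

```python
# Bandwidth-adaptive top-k compression: keep only as many gradient entries as
# the link can move within a time budget. Illustrative, not FusionLLM's scheme.
import random

random.seed(1)
gradient = [random.gauss(0, 1) for _ in range(10_000)]   # toy gradient vector
BYTES_PER_VALUE = 4
TIME_BUDGET_S = 0.5

def adaptive_topk(grad, bandwidth_bytes_per_s):
    """Pick k so the sparsified payload (index + value per entry) fits the
    time budget, then keep the k largest-magnitude entries."""
    budget_values = int(bandwidth_bytes_per_s * TIME_BUDGET_S / (2 * BYTES_PER_VALUE))
    k = max(1, min(len(grad), budget_values))
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    return [(i, grad[i]) for i in idx]

for bw in (10_000, 100_000, 1_000_000):                  # bytes per second, hypothetical
    payload = adaptive_topk(gradient, bw)
    print(f"bandwidth {bw:>9} B/s -> send {len(payload)} of {len(gradient)} entries")
```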
arXiv Detail & Related papers (2024-10-16T16:13:19Z)
- Efficient Federated Learning Using Dynamic Update and Adaptive Pruning with Momentum on Shared Server Data [59.6985168241067]
Federated Learning (FL) encounters two important problems, i.e., low training efficiency and limited computational resources.
We propose a new FL framework, FedDUMAP, to leverage the shared insensitive data on the server and the distributed data in edge devices.
Our proposed FL model, FedDUMAP, combines the three original techniques and achieves significantly better performance than baseline approaches.
arXiv Detail & Related papers (2024-08-11T02:59:11Z)
- Communication Efficient ConFederated Learning: An Event-Triggered SAGA Approach [67.27031215756121]
Federated learning (FL) is a machine learning paradigm that targets model training without gathering the local data over various data sources.
Standard FL, which employs a single server, can only support a limited number of users, leading to degraded learning capability.
In this work, we consider a multi-server FL framework, referred to as Confederated Learning (CFL), in order to accommodate a larger number of users.
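The "event-triggered" part of the title refers to clients communicating only when their local state has changed enough. A minimal illustration of that trigger rule follows; the threshold and drift model are assumptions, and the paper's event-triggered SAGA update is more involved than this.

```python
# Event-triggered communication: a client uploads only when its accumulated
# local change exceeds a threshold, so most steps trigger no transmission.
import random

random.seed(2)
THRESHOLD = 0.5
local_param, last_sent = 0.0, 0.0
uploads = 0

for step in range(200):
    local_param += random.gauss(0.01, 0.05)       # toy local SGD drift
    if abs(local_param - last_sent) > THRESHOLD:  # event condition
        last_sent = local_param                   # "transmit" the new value to the server
        uploads += 1

print(f"200 local steps, {uploads} uploads (event-triggered)")
```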
arXiv Detail & Related papers (2024-02-28T03:27:10Z)
- Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting increasing attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed to exploit edge devices to asynchronously participate in the training process by actively applying for tasks.
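The entry above names two standard compression primitives, sparsification and quantization. The sketch below shows textbook versions of both (top-k sparsification, then uniform 8-bit quantization) purely for illustration; it is not TEASQ-Fed's exact scheme.

```python
# Top-k sparsification followed by uniform quantization of a model update.
import random

random.seed(3)

def sparsify_topk(update, k):
    idx = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)[:k]
    return {i: update[i] for i in idx}            # index -> value; the rest treated as zero

def quantize_uniform(values, bits=8):
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale                       # ints plus the info needed to dequantize

update = [random.gauss(0, 1) for _ in range(1000)]
sparse = sparsify_topk(update, k=50)
codes, lo, scale = quantize_uniform(list(sparse.values()))
recovered = [lo + c * scale for c in codes]
print(f"kept {len(sparse)}/{len(update)} values, max dequantization error "
      f"{max(abs(a - b) for a, b in zip(sparse.values(), recovered)):.4f}")
```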
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
- AEDFL: Efficient Asynchronous Decentralized Federated Learning with Heterogeneous Devices [61.66943750584406]
We propose AEDFL, an Asynchronous Efficient Decentralized FL framework for heterogeneous environments.
First, we propose an asynchronous FL system model with an efficient model aggregation method for improving the FL convergence.
Second, we propose a dynamic staleness-aware model update approach to achieve superior accuracy.
Third, we propose an adaptive sparse training method to reduce communication and computation costs without significant accuracy degradation.
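A "staleness-aware model update" usually means down-weighting a delta by how many model versions old it is. The snippet below uses a common polynomial decay as a stand-in; it is a standard choice for illustration, not necessarily AEDFL's exact function.

```python
# Staleness-aware aggregation: older deltas get smaller weights.
def staleness_weight(server_version, client_version, alpha=0.5):
    staleness = max(0, server_version - client_version)
    return 1.0 / (1.0 + staleness) ** alpha

global_model, version = 0.0, 10
for client_version, delta in [(10, 0.4), (7, 0.4), (2, 0.4)]:   # same delta, growing staleness
    w = staleness_weight(version, client_version)
    global_model += w * delta
    version += 1
    print(f"staleness {version - 1 - client_version}: applied {w * delta:.3f} of a {delta} delta")
```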
arXiv Detail & Related papers (2023-12-18T05:18:17Z)
- Time-sensitive Learning for Heterogeneous Federated Edge Intelligence [52.83633954857744]
We investigate real-time machine learning in a federated edge intelligence (FEI) system.
FEI systems exhibit heterogeneous communication and computational resource distribution.
We propose a time-sensitive federated learning (TS-FL) framework to minimize the overall run-time for collaboratively training a shared ML model.
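The run time such a framework minimizes is typically dominated by the slowest participant in each synchronous round. The toy estimate below makes that explicit; the device profiles are made-up numbers for illustration, not measurements from the paper.

```python
# Per-round time under synchronous FL = max over selected devices of (compute + upload).
devices = {                      # name: (seconds per local epoch, upload seconds), hypothetical
    "phone_a": (3.0, 1.0),
    "phone_b": (5.0, 2.0),
    "gateway": (1.5, 0.5),
    "old_tablet": (12.0, 4.0),   # straggler
}

def round_time(selected):
    return max(compute + upload for compute, upload in (devices[d] for d in selected))

all_devices = list(devices)
fast_subset = [d for d in all_devices if d != "old_tablet"]
print("round with every device :", round_time(all_devices), "s")
print("round without straggler :", round_time(fast_subset), "s")
```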
arXiv Detail & Related papers (2023-01-26T08:13:22Z)
- PiPar: Pipeline Parallelism for Collaborative Machine Learning [16.131285496487678]
Collaborative machine learning (CML) techniques have been proposed to train deep learning models across multiple mobile devices and a server.
CML techniques preserve privacy because a locally trained model, rather than the raw data from each device, is shared with the server.
We identify idling resources on the server and devices due to sequential computation and communication as the principal cause of low resource utilization.
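Pipelining addresses exactly this idling by overlapping computation with communication. A back-of-the-envelope comparison for a two-stage pipeline follows; the timings are hypothetical and the model is deliberately simplified, not PiPar's scheduler.

```python
# Sequential vs. pipelined processing of micro-batches across compute and communication.
COMPUTE_S, COMM_S, BATCHES = 0.8, 0.5, 20

sequential = BATCHES * (COMPUTE_S + COMM_S)
# Two-stage pipeline: fill the pipe once, then one batch completes every
# max(compute, comm) seconds while the other stage works in parallel.
pipelined = COMPUTE_S + COMM_S + (BATCHES - 1) * max(COMPUTE_S, COMM_S)

print(f"sequential: {sequential:.1f} s")
print(f"pipelined : {pipelined:.1f} s  ({sequential / pipelined:.2f}x faster)")
```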
arXiv Detail & Related papers (2022-12-01T20:51:47Z)
- FedLesScan: Mitigating Stragglers in Serverless Federated Learning [0.7388859384645262]
Federated Learning (FL) is a machine learning paradigm that enables the training of a shared global model across distributed clients.
We propose FedLesScan, a novel clustering-based semi-asynchronous training strategy specifically tailored for serverless FL.
We show that FedLesScan reduces training time and cost by an average of 8% and 20%, respectively, while utilizing clients more effectively, with an average increase of 17.75% in the effective update ratio.
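The clustering idea behind such semi-asynchronous strategies is to group clients by observed behaviour so slow clients do not stall fast ones. The sketch below uses a crude two-bucket split on past round durations as a stand-in for the paper's clustering; the durations are made up.

```python
# Group clients by observed round duration so each wave waits only for
# similarly fast clients. Illustrative stand-in, not FedLesScan's algorithm.
durations = {"c1": 4, "c2": 5, "c3": 21, "c4": 6, "c5": 25, "c6": 5}   # seconds, hypothetical

median = sorted(durations.values())[len(durations) // 2]
fast = [c for c, t in durations.items() if t <= median]
slow = [c for c, t in durations.items() if t > median]

# Each cluster is scheduled as its own (semi-asynchronous) wave, so the fast
# wave finishes without waiting for the stragglers in the slow wave.
print("fast wave:", fast, "finishes after", max(durations[c] for c in fast), "s")
print("slow wave:", slow, "finishes after", max(durations[c] for c in slow), "s")
print("single synchronous round would take", max(durations.values()), "s for everyone")
```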
arXiv Detail & Related papers (2022-11-10T18:17:41Z)
- DISTREAL: Distributed Resource-Aware Learning in Heterogeneous Systems [2.1506382989223782]
We study the problem of distributed training of neural networks (NNs) on devices with heterogeneous, limited, and time-varying availability of computational resources.
We present an adaptive, resource-aware, on-device learning mechanism, DISTREAL, which is able to fully and efficiently utilize the available resources.
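One simple way to act on time-varying resources, in the spirit of the entry above, is for the device to scale down how much of the network it trains so each round fits the currently available budget. The cost model and per-round width selection below are simplifying assumptions, not DISTREAL's exact mechanism.

```python
# Resource-aware width selection: pick the largest sub-network whose assumed
# (quadratic in width) training cost fits the current compute budget.
import random

random.seed(4)
FULL_COST = 100.0                      # hypothetical cost of training the full model

def pick_width(available_budget, scales=(0.25, 0.5, 0.75, 1.0)):
    feasible = [s for s in scales if FULL_COST * s * s <= available_budget]
    return max(feasible) if feasible else min(scales)

for round_id in range(5):
    budget = random.uniform(10, 110)   # resources fluctuate over time
    width = pick_width(budget)
    print(f"round {round_id}: budget {budget:5.1f} -> train at width {width:.2f}")
```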
arXiv Detail & Related papers (2021-12-16T10:15:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.