Towards Dynamic Resource Allocation and Client Scheduling in Hierarchical Federated Learning: A Two-Phase Deep Reinforcement Learning Approach
- URL: http://arxiv.org/abs/2406.14910v1
- Date: Fri, 21 Jun 2024 07:01:23 GMT
- Title: Towards Dynamic Resource Allocation and Client Scheduling in Hierarchical Federated Learning: A Two-Phase Deep Reinforcement Learning Approach
- Authors: Xiaojing Chen, Zhenyuan Li, Wei Ni, Xin Wang, Shunqing Zhang, Yanzan Sun, Shugong Xu, Qingqi Pei
- Abstract summary: Federated learning is a viable technique to train a shared machine learning model without sharing data.
This paper presents a new two-phase deep deterministic policy gradient (DDPG) framework to balance the learning delay and model accuracy of an FL process online.
- Score: 40.082601481580426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is a viable technique to train a shared machine learning model without sharing data. Hierarchical FL (HFL) systems have yet to be studied regarding their multiple levels of energy, computation, communication, and client scheduling, especially when the clients rely on energy harvesting to power their operations. This paper presents a new two-phase deep deterministic policy gradient (DDPG) framework, referred to as "TP-DDPG", to balance the learning delay and model accuracy of an FL process online in an energy harvesting-powered HFL system. The key idea is that we divide the optimization decisions into two groups, and employ DDPG to learn one group in the first phase, while interpreting the other group as part of the environment that provides the reward for training the DDPG in the second phase. Specifically, the DDPG learns the selection of the participating clients, their CPU configurations, and their transmission powers. A new straggler-aware client association and bandwidth allocation (SCABA) algorithm efficiently optimizes the remaining decisions and evaluates the reward for the DDPG. Experiments demonstrate that, with a substantially reduced number of learnable parameters, TP-DDPG quickly converges to effective policies that can shorten the training time of HFL by 39.4% compared to its benchmarks, when the required test accuracy of HFL is 0.9.
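The two-phase split described in the abstract can be made concrete with a short sketch. The Python toy below is illustrative only: `DDPGAgentStub`, the `scaba` placeholder, and all constants are hypothetical stand-ins, not the paper's code. In particular, the stub samples random actions instead of training an actor-critic, and the placeholder uses a greedy best-gain association with an equal bandwidth split rather than the paper's optimized SCABA procedure. The point is only to show how phase one (the agent's decision group) and phase two (the environment-side association/bandwidth step that returns the reward) interlock.

```python
import numpy as np

rng = np.random.default_rng(0)
N_CLIENTS, N_EDGES, BANDWIDTH = 12, 3, 10e6   # toy sizes (assumed)

def scaba(selected, cpu_freq, tx_power, gains):
    """Greedy placeholder for the paper's SCABA step: associate each
    selected client with its best-gain edge server, split that edge's
    bandwidth equally, and score the straggler (max) round delay."""
    delays = []
    for e in range(N_EDGES):
        members = [i for i in selected if gains[i].argmax() == e]
        if not members:
            continue
        bw = BANDWIDTH / len(members)             # equal split, not optimized
        for i in members:
            t_comp = 1e9 / cpu_freq[i]            # cycles / (cycles per second)
            rate = bw * np.log2(1 + tx_power[i] * gains[i, e])
            delays.append(t_comp + 1e6 / rate)    # compute + upload of 1e6 bits
    return -max(delays)                           # reward: shorter delay is better

class DDPGAgentStub:
    """Stand-in for a DDPG actor-critic: samples random actions and
    discards transitions; a real agent would train on a replay buffer."""
    def act(self, state):
        selected = [i for i in range(N_CLIENTS) if rng.random() < 0.5] or [0]
        cpu_freq = rng.uniform(1e9, 2e9, N_CLIENTS)   # CPU frequency (Hz)
        tx_power = rng.uniform(0.1, 1.0, N_CLIENTS)   # transmission power (W)
        return selected, cpu_freq, tx_power

    def learn(self, transition):
        pass  # actor/critic updates would go here

agent = DDPGAgentStub()
state = rng.random(N_CLIENTS)                     # e.g. battery/channel features
for rnd in range(5):
    gains = rng.random((N_CLIENTS, N_EDGES))
    action = agent.act(state)                     # phase 1: DDPG's decision group
    reward = scaba(*action, gains)                # phase 2: remaining decisions
    next_state = rng.random(N_CLIENTS)
    agent.learn((state, action, reward, next_state))
    state = next_state
    print(f"round {rnd}: straggler delay {-reward:.2f} s")
```

Treating the SCABA step as part of the environment keeps the agent's action space small (client selection, CPU frequency, transmission power), which is what the abstract credits for the reduced number of learnable parameters and fast convergence.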
Related papers
- FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs)
We show that our system and method can achieve 1.45 - 9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z)
- Scheduling and Communication Schemes for Decentralized Federated Learning [0.31410859223862103]
A decentralized federated learning (DFL) model with the stochastic gradient descent (SGD) algorithm has been introduced.
Three scheduling policies for DFL have been proposed for communications between the clients and the parallel servers.
Results show that the proposed scheduling policies affect both the speed of convergence and the final global model.
arXiv Detail & Related papers (2023-11-27T17:35:28Z)
- Client Orchestration and Cost-Efficient Joint Optimization for NOMA-Enabled Hierarchical Federated Learning [55.49099125128281]
We propose a non-orthogonal multiple access (NOMA) enabled HFL system under semi-synchronous cloud model aggregation.
We show that the proposed scheme outperforms the considered benchmarks regarding HFL performance improvement and total cost reduction.
arXiv Detail & Related papers (2023-11-03T13:34:44Z)
- Hierarchical Personalized Federated Learning Over Massive Mobile Edge Computing Networks [95.39148209543175]
We propose hierarchical PFL (HPFL), an algorithm for deploying PFL over massive MEC networks.
HPFL combines the objectives of training loss minimization and round latency minimization while jointly determining the optimal bandwidth allocation.
arXiv Detail & Related papers (2023-03-19T06:00:05Z)
- Online Hyperparameter Optimization for Class-Incremental Learning [99.70569355681174]
Class-incremental learning (CIL) aims to train a classification model while the number of classes increases phase-by-phase.
An inherent challenge of CIL is the stability-plasticity tradeoff, i.e., CIL models should keep stable to retain old knowledge and keep plastic to absorb new knowledge.
We propose an online learning method that can adaptively optimize the tradeoff without knowing the setting a priori.
arXiv Detail & Related papers (2023-01-11T17:58:51Z)
- GTFLAT: Game Theory Based Add-On For Empowering Federated Learning Aggregation Techniques [0.3867363075280543]
GTFLAT, as a game theory-based add-on, addresses an important research question:
How can a federated learning algorithm achieve better performance and training efficiency by setting more effective adaptive weights for averaging in the model aggregation phase? (A minimal weighted-aggregation sketch follows this list.)
The results reveal that, on average, using GTFLAT increases the top-1 test accuracy by 1.38%, while needing 21.06% fewer communication rounds to reach that accuracy.
arXiv Detail & Related papers (2022-12-08T06:39:51Z)
- Semi-Synchronous Personalized Federated Learning over Mobile Edge Networks [88.50555581186799]
We propose a semi-synchronous PFL algorithm, termed Semi-Synchronous Personalized Federated Averaging (PerFedS$^2$), over mobile edge networks.
We derive an upper bound on the convergence rate of PerFedS$^2$ in terms of the number of participants per global round and the number of rounds.
Experimental results verify the effectiveness of PerFedS$^2$ in saving training time as well as guaranteeing the convergence of the training loss.
arXiv Detail & Related papers (2022-09-27T02:12:43Z)
- A Fair Federated Learning Framework With Reinforcement Learning [23.675056844328]
Federated learning (FL) is a paradigm where many clients collaboratively train a model under the coordination of a central server.
We propose a reinforcement learning framework, called PG-FFL, which automatically learns a policy to assign aggregation weights to clients.
We conduct extensive experiments over diverse datasets to verify the effectiveness of our framework.
arXiv Detail & Related papers (2022-05-26T15:10:16Z)
- Behavior Mimics Distribution: Combining Individual and Group Behaviors for Federated Learning [26.36851197666568]
Federated Learning (FL) has become an active and promising distributed machine learning paradigm.
Recent studies show that the performance of popular FL methods deteriorates dramatically due to the client drift caused by local updates.
This paper proposes a novel Federated Learning algorithm (called IGFL), which leverages both Individual and Group behaviors to mimic distribution.
arXiv Detail & Related papers (2021-06-23T10:42:37Z)
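Several entries above (GTFLAT and PG-FFL in particular) share one mechanism: replacing FedAvg's fixed, dataset-size-proportional averaging weights with adaptively chosen ones. Below is a minimal, hypothetical sketch of such weight-based aggregation; the `aggregate` helper and its toy inputs are illustrative and do not come from either paper's code.

```python
import numpy as np

def aggregate(client_models, weights):
    """Weighted model averaging. Plain FedAvg sets the weights proportional
    to local dataset sizes; GTFLAT derives them from a game among clients,
    and PG-FFL learns them with a policy-gradient agent."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()                                   # normalize onto the simplex
    return [sum(wi * layer for wi, layer in zip(w, layers))
            for layers in zip(*client_models)]     # average layer by layer

# Three toy two-layer "models" whose parameters all equal 1.0, 2.0, and 3.0.
models = [[np.full(3, c), np.full(2, c)] for c in (1.0, 2.0, 3.0)]
print(aggregate(models, weights=[50, 30, 20])[0])     # size-proportional (FedAvg)
print(aggregate(models, weights=[0.2, 0.5, 0.3])[0])  # adaptively chosen weights
```

The adaptive methods differ only in how the `weights` vector is produced each round; the aggregation step itself stays this simple.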