Federated Learning over Hierarchical Wireless Networks: Training Latency Minimization via Submodel Partitioning
- URL: http://arxiv.org/abs/2310.17890v2
- Date: Mon, 27 Jan 2025 00:23:13 GMT
- Title: Federated Learning over Hierarchical Wireless Networks: Training Latency Minimization via Submodel Partitioning
- Authors: Wenzhi Fang, Dong-Jun Han, Christopher G. Brinton
- Abstract summary: Hierarchical independent submodel training (HIST) is a new FL methodology that aims to reduce the computation, communication, and storage burdens of hierarchical FL in cloud-edge-client networks.
We demonstrate how HIST can be augmented with over-the-air computation (AirComp) to further enhance the efficiency of the model aggregation over the edge cells.
- Score: 15.311309249848739
- Abstract: Hierarchical federated learning (HFL) has demonstrated promising scalability advantages over the traditional "star-topology" architecture-based federated learning (FL). However, HFL still imposes significant computation, communication, and storage burdens on the edge, especially when training a large-scale model over resource-constrained wireless devices. In this paper, we propose hierarchical independent submodel training (HIST), a new FL methodology that aims to address these issues in hierarchical cloud-edge-client networks. The key idea behind HIST is to divide the global model into disjoint partitions (or submodels) per round so that each group of clients (i.e., cells) is responsible for training only one partition of the model. We characterize the convergence behavior of HIST under mild assumptions, showing the impacts of several key attributes (e.g., submodel sizes, number of cells, edge and global aggregation frequencies) on the rate and stationarity gap. Building upon the theoretical results, we propose a submodel partitioning strategy to minimize the training latency depending on network resource availability and a target learning performance guarantee. We then demonstrate how HIST can be augmented with over-the-air computation (AirComp) to further enhance the efficiency of the model aggregation over the edge cells. Through numerical evaluations, we verify that HIST reduces training time and communication costs by wide margins while achieving accuracy comparable to that of conventional HFL. Moreover, our experiments demonstrate that AirComp-assisted HIST provides further improvements in training latency.
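The partitioning idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a model flattened into a list of weights, a uniformly random split into near-equal disjoint index sets (one per cell), and a single update per cell. It omits the clients inside each cell, the edge aggregation steps, and the latency-aware partition sizing that the paper proposes. The names `partition_indices`, `hist_round`, and `local_update` are hypothetical.

```python
import random


def partition_indices(num_params, num_cells, seed=None):
    """Split parameter indices into disjoint, near-equal groups, one per cell.

    Assumption: a uniformly random split; the paper instead sizes the
    partitions according to network resources.
    """
    rng = random.Random(seed)
    idx = list(range(num_params))
    rng.shuffle(idx)
    # Strided slices of the shuffled list are disjoint and cover every index.
    return [idx[c::num_cells] for c in range(num_cells)]


def hist_round(global_model, num_cells, local_update, seed=None):
    """One illustrative global round: each cell updates only its own partition."""
    parts = partition_indices(len(global_model), num_cells, seed)
    new_model = list(global_model)
    for cell, idx in enumerate(parts):
        # The cell receives only its submodel, not the full model.
        submodel = [global_model[i] for i in idx]
        updated = local_update(cell, submodel)
        # Partitions are disjoint, so global aggregation is a plain write-back.
        for i, w in zip(idx, updated):
            new_model[i] = w
    return new_model, parts


if __name__ == "__main__":
    # Hypothetical stand-in for local training: add (cell + 1) to each weight.
    model, parts = hist_round(
        [0.0] * 10, 3, lambda cell, w: [x + cell + 1 for x in w], seed=0
    )
    print([len(p) for p in parts])
    print(model)
```

Because each cell touches a disjoint set of indices, the per-cell communication and storage scale with the submodel size rather than the full model size, which is the source of the savings the abstract describes.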
Related papers
- Heterogeneity-Aware Resource Allocation and Topology Design for Hierarchical Federated Edge Learning [9.900317349372383]
Federated Learning (FL) provides a privacy-preserving framework for training machine learning models on mobile edge devices.
Traditional FL algorithms, e.g., FedAvg, impose a heavy communication workload on these devices.
We propose a two-tier HFEL system, where edge devices are connected to edge servers and edge servers are interconnected through peer-to-peer (P2P) edge backhauls.
Our goal is to enhance the training efficiency of the HFEL system through strategic resource allocation and topology design.
arXiv Detail & Related papers (2024-09-29T01:48:04Z)
- Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates [71.81037644563217]
Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning.
As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers.
We propose straggler-aware layer-wise federated learning (SALF) that leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion.
arXiv Detail & Related papers (2024-03-27T09:14:36Z)
- Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider an FL framework with partial model pruning and personalization to overcome these challenges.
This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
arXiv Detail & Related papers (2023-09-04T21:10:45Z)
- NeFL: Nested Model Scaling for Federated Learning with System Heterogeneous Clients [44.89061671579694]
Federated learning (FL) enables distributed training while preserving data privacy, but stragglers (slow or incapable clients) can significantly increase the total training time and degrade performance.
We propose nested federated learning (NeFL), a framework that efficiently divides deep neural networks into submodels using both depthwise and widthwise scaling.
NeFL achieves performance gains over baseline approaches, especially for the worst-case submodel.
arXiv Detail & Related papers (2023-08-15T13:29:14Z)
- Vertical Federated Learning over Cloud-RAN: Convergence Analysis and System Optimization [82.12796238714589]
We propose a novel cloud radio access network (Cloud-RAN) based vertical FL system to enable fast and accurate model aggregation.
We characterize the convergence behavior of the vertical FL algorithm considering both uplink and downlink transmissions.
We establish a system optimization framework by joint transceiver and fronthaul quantization design, for which successive convex approximation and alternate convex search based system optimization algorithms are developed.
arXiv Detail & Related papers (2023-05-04T09:26:03Z)
- Time-sensitive Learning for Heterogeneous Federated Edge Intelligence [52.83633954857744]
We investigate real-time machine learning in a federated edge intelligence (FEI) system.
FEI systems exhibit heterogeneous communication and computational resource distribution.
We propose a time-sensitive federated learning (TS-FL) framework to minimize the overall run-time for collaboratively training a shared ML model.
arXiv Detail & Related papers (2023-01-26T08:13:22Z)
- Stochastic Coded Federated Learning: Theoretical Analysis and Incentive Mechanism Design [18.675244280002428]
We propose a novel FL framework named stochastic coded federated learning (SCFL) that leverages coded computing techniques.
In SCFL, each edge device uploads a privacy-preserving coded dataset to the server, which is generated by adding noise to the projected local dataset.
We show that SCFL learns a better model within the given time and achieves a better privacy-performance tradeoff than the baseline methods.
arXiv Detail & Related papers (2022-11-08T09:58:36Z)
- Supernet Training for Federated Image Classification under System Heterogeneity [15.2292571922932]
In this work, we propose a novel framework to consider both scenarios, namely Federation of Supernet Training (FedSup).
It is inspired by how averaging parameters in the model aggregation stage of Federated Learning (FL) is similar to weight-sharing in supernet training.
Under our framework, we present an efficient algorithm (E-FedSup) by sending the sub-model to clients in the broadcast stage for reducing communication costs and training overhead.
arXiv Detail & Related papers (2022-06-03T02:21:01Z)
- Efficient Split-Mix Federated Learning for On-Demand and In-Situ Customization [107.72786199113183]
Federated learning (FL) provides a distributed learning framework for multiple participants to learn collaboratively without sharing raw data.
In this paper, we propose a novel Split-Mix FL strategy for heterogeneous participants that, once training is done, provides in-situ customization of model sizes and robustness.
arXiv Detail & Related papers (2022-03-18T04:58:34Z)
- Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology in regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z)