SHeRL-FL: When Representation Learning Meets Split Learning in Hierarchical Federated Learning
- URL: http://arxiv.org/abs/2508.08339v1
- Date: Mon, 11 Aug 2025 04:13:56 GMT
- Title: SHeRL-FL: When Representation Learning Meets Split Learning in Hierarchical Federated Learning
- Authors: Dung T. Tran, Nguyen B. Ha, Van-Dinh Nguyen, Kok-Seng Wong,
- Abstract summary: Federated learning (FL) is a promising approach for addressing scalability and latency issues in large-scale networks.<n>Previous works combine split learning (SL) and hierarchical FL (HierFL) to reduce device-side computation, but this introduces training complexity due to coordination across tiers.<n>We propose SHeRL-FL, which integrates SL and hierarchical model aggregation and incorporates representation learning at intermediate layers.
- Score: 5.447833971100033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is a promising approach for addressing scalability and latency issues in large-scale networks by enabling collaborative model training without requiring the sharing of raw data. However, existing FL frameworks often overlook the computational heterogeneity of edge clients and the growing training burden on resource-limited devices. However, FL suffers from high communication costs and complex model aggregation, especially with large models. Previous works combine split learning (SL) and hierarchical FL (HierFL) to reduce device-side computation and improve scalability, but this introduces training complexity due to coordination across tiers. To address these issues, we propose SHeRL-FL, which integrates SL and hierarchical model aggregation and incorporates representation learning at intermediate layers. By allowing clients and edge servers to compute training objectives independently of the cloud, SHeRL-FL significantly reduces both coordination complexity and communication overhead. To evaluate the effectiveness and efficiency of SHeRL-FL, we performed experiments on image classification tasks using CIFAR-10, CIFAR-100, and HAM10000 with AlexNet, ResNet-18, and ResNet-50 in both IID and non-IID settings. In addition, we evaluate performance on image segmentation tasks using the ISIC-2018 dataset with a ResNet-50-based U-Net. Experimental results demonstrate that SHeRL-FL reduces data transmission by over 90\% compared to centralized FL and HierFL, and by 50\% compared to SplitFed, which is a hybrid of FL and SL, and further improves hierarchical split learning methods.
Related papers
- Federated Learning over Hierarchical Wireless Networks: Training Latency Minimization via Submodel Partitioning [15.311309249848739]
Hierarchical independent submodel training (HIST) is a new FL methodology that aims to address these issues in hierarchical cloud-edge-client networks.<n>We demonstrate how HIST can be augmented with over-the-air computation (AirComp) to further enhance the efficiency of the model aggregation over the edge cells.
arXiv Detail & Related papers (2023-10-27T04:42:59Z) - Adaptive Model Pruning and Personalization for Federated Learning over
Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider a FL framework with partial model pruning and personalization to overcome these challenges.
This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
arXiv Detail & Related papers (2023-09-04T21:10:45Z) - NeFL: Nested Model Scaling for Federated Learning with System Heterogeneous Clients [44.89061671579694]
Federated learning (FL) enables distributed training while preserving data privacy, but stragglers-slow or incapable clients-can significantly slow down the total training time and degrade performance.
We propose nested federated learning (NeFL), a framework that efficiently divides deep neural networks into submodels using both depthwise and widthwise scaling.
NeFL achieves performance gain, especially for the worst-case submodel compared to baseline approaches.
arXiv Detail & Related papers (2023-08-15T13:29:14Z) - Federated Split Learning with Only Positive Labels for
resource-constrained IoT environment [4.055662817794178]
Distributed collaborative machine learning (DCML) is a promising method in the Internet of Things (IoT) domain for training deep learning models.
We propose splitfed learning, known as splitfed learning (SFL), is the most suitable for efficient training and testing when devices have limited computational capabilities.
We show that SFPL outperforms SFL when resource-constrained IoT devices have only positive labeled data.
arXiv Detail & Related papers (2023-07-25T05:33:06Z) - Adaptive Federated Pruning in Hierarchical Wireless Networks [69.6417645730093]
Federated Learning (FL) is a privacy-preserving distributed learning framework where a server aggregates models updated by multiple devices without accessing their private datasets.
In this paper, we introduce model pruning for HFL in wireless networks to reduce the neural network scale.
We show that our proposed HFL with model pruning achieves similar learning accuracy compared with the HFL without model pruning and reduces about 50 percent communication cost.
arXiv Detail & Related papers (2023-05-15T22:04:49Z) - Vertical Federated Learning over Cloud-RAN: Convergence Analysis and
System Optimization [82.12796238714589]
We propose a novel cloud radio access network (Cloud-RAN) based vertical FL system to enable fast and accurate model aggregation.
We characterize the convergence behavior of the vertical FL algorithm considering both uplink and downlink transmissions.
We establish a system optimization framework by joint transceiver and fronthaul quantization design, for which successive convex approximation and alternate convex search based system optimization algorithms are developed.
arXiv Detail & Related papers (2023-05-04T09:26:03Z) - Efficient Parallel Split Learning over Resource-constrained Wireless
Edge Networks [44.37047471448793]
In this paper, we advocate the integration of edge computing paradigm and parallel split learning (PSL)
We propose an innovative PSL framework, namely, efficient parallel split learning (EPSL) to accelerate model training.
We show that the proposed EPSL framework significantly decreases the training latency needed to achieve a target accuracy.
arXiv Detail & Related papers (2023-03-26T16:09:48Z) - Hierarchical Personalized Federated Learning Over Massive Mobile Edge
Computing Networks [95.39148209543175]
We propose hierarchical PFL (HPFL), an algorithm for deploying PFL over massive MEC networks.
HPFL combines the objectives of training loss minimization and round latency minimization while jointly determining the optimal bandwidth allocation.
arXiv Detail & Related papers (2023-03-19T06:00:05Z) - Time-sensitive Learning for Heterogeneous Federated Edge Intelligence [52.83633954857744]
We investigate real-time machine learning in a federated edge intelligence (FEI) system.
FEI systems exhibit heterogenous communication and computational resource distribution.
We propose a time-sensitive federated learning (TS-FL) framework to minimize the overall run-time for collaboratively training a shared ML model.
arXiv Detail & Related papers (2023-01-26T08:13:22Z) - Predictive GAN-powered Multi-Objective Optimization for Hybrid Federated
Split Learning [56.125720497163684]
We propose a hybrid federated split learning framework in wireless networks.
We design a parallel computing scheme for model splitting without label sharing, and theoretically analyze the influence of the delayed gradient caused by the scheme on the convergence speed.
arXiv Detail & Related papers (2022-09-02T10:29:56Z) - Splitfed learning without client-side synchronization: Analyzing
client-side split network portion size to overall performance [4.689140226545214]
Federated Learning (FL), Split Learning (SL), and SplitFed Learning (SFL) are three recent developments in distributed machine learning.
This paper studies SFL without client-side model synchronization.
It provides only 1%-2% better accuracy than Multi-head Split Learning on the MNIST test set.
arXiv Detail & Related papers (2021-09-19T22:57:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.