Lean Clients, Full Accuracy: Hybrid Zeroth- and First-Order Split Federated Learning
- URL: http://arxiv.org/abs/2601.09076v1
- Date: Wed, 14 Jan 2026 02:17:49 GMT
- Title: Lean Clients, Full Accuracy: Hybrid Zeroth- and First-Order Split Federated Learning
- Authors: Zhoubin Kou, Zihan Chen, Jing Yang, Cong Shen
- Abstract summary: Split Federated Learning (SFL) enables collaborative training between resource-constrained edge devices and a compute-rich server. Communication overhead is a central issue in SFL and can be mitigated with auxiliary networks. HERON-SFL integrates zeroth-order (ZO) optimization for local client training while retaining first-order (FO) optimization on the server.
- Score: 13.865545923124055
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Split Federated Learning (SFL) enables collaborative training between resource-constrained edge devices and a compute-rich server. Communication overhead is a central issue in SFL and can be mitigated with auxiliary networks. Yet, the fundamental client-side computation challenge remains, as back-propagation requires substantial memory and computation costs, severely limiting the scale of models that edge devices can support. To enable more resource-efficient client computation and reduce the client-server communication, we propose HERON-SFL, a novel hybrid optimization framework that integrates zeroth-order (ZO) optimization for local client training while retaining first-order (FO) optimization on the server. With the assistance of auxiliary networks, ZO updates enable clients to approximate local gradients using perturbed forward-only evaluations per step, eliminating memory-intensive activation caching and avoiding explicit gradient computation in the traditional training process. Leveraging the low effective rank assumption, we theoretically prove that HERON-SFL's convergence rate is independent of model dimensionality, addressing a key scalability concern common to ZO algorithms. Empirically, on ResNet training and language model (LM) fine-tuning tasks, HERON-SFL matches benchmark accuracy while reducing client peak memory by up to 64% and client-side compute cost by up to 33% per step, substantially expanding the range of models that can be trained or adapted on resource-limited devices.
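The client-side mechanism is easiest to see in code. Below is a minimal sketch of a two-point (SPSA-style) zeroth-order update on a toy linear client sub-network with a squared-error auxiliary head; the function names, the perturbation scale `mu`, and the loss are illustrative stand-ins, not HERON-SFL's actual interface.

```python
# Sketch of a client-side zeroth-order (ZO) step in SFL-style training:
# the gradient is estimated from two perturbed forward-only passes, so no
# activations are cached for backpropagation. All names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def client_loss(w, x, y):
    """Toy stand-in for 'client sub-network + auxiliary network':
    linear predictions with a squared-error loss."""
    return 0.5 * np.mean((x @ w - y) ** 2)

def zo_gradient_estimate(w, x, y, mu=1e-3):
    """Two-point ZO estimate: g = (L(w + mu*u) - L(w - mu*u)) / (2*mu) * u
    for a random Gaussian direction u. Only forward passes are needed."""
    u = rng.standard_normal(w.shape)
    loss_plus = client_loss(w + mu * u, x, y)
    loss_minus = client_loss(w - mu * u, x, y)
    return (loss_plus - loss_minus) / (2.0 * mu) * u

# One illustrative ZO-SGD training loop on synthetic data.
d, n = 16, 64
w = rng.standard_normal(d) * 0.1
x = rng.standard_normal((n, d))
y = rng.standard_normal(n)
print("initial loss:", client_loss(w, x, y))
for step in range(200):
    w -= 0.02 * zo_gradient_estimate(w, x, y)
print("final loss:", client_loss(w, x, y))
```

Because each step needs only two forward evaluations, the client never stores intermediate activations for backpropagation, which is the source of the peak-memory savings the abstract reports.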
Related papers
- CollaPipe: Adaptive Segment-Optimized Pipeline Parallelism for Collaborative LLM Training in Heterogeneous Edge Networks [57.95170323315603]
We introduce CollaPipe, a distributed learning framework that integrates collaborative pipeline parallelism with federated aggregation to support self-evolving networks. In CollaPipe, the encoder part is adaptively partitioned into variable-sized segments and deployed across mobile devices for pipeline-parallel training, while the decoder is deployed on edge servers to handle generative tasks. To enhance training efficiency, we formulate a joint optimization problem that adaptively allocates model segments, micro-batches, bandwidth, and transmission power (a toy partitioning sketch follows this entry).
arXiv Detail & Related papers (2025-09-24T07:54:01Z)
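As a rough illustration of the segment idea above, the sketch below splits a stack of encoder layers into contiguous segments sized in proportion to device capacity. The partitioning rule is a guess for illustration only; CollaPipe's actual allocation jointly optimizes segments, micro-batches, bandwidth, and power.

```python
# Illustrative capacity-proportional partitioning of encoder layers into
# contiguous segments, one segment per device. Not CollaPipe's algorithm.
def partition_encoder(num_layers: int, capacities: list[float]) -> list[range]:
    total = sum(capacities)
    segments, start = [], 0
    for i, c in enumerate(capacities):
        remaining = len(capacities) - i - 1
        # Last device takes whatever is left; others take a proportional share,
        # clamped so every remaining device still gets at least one layer.
        size = round(num_layers * c / total) if remaining else num_layers - start
        size = max(1, min(size, num_layers - start - remaining))
        segments.append(range(start, start + size))
        start += size
    return segments

# e.g. a 12-layer encoder over three heterogeneous mobile devices
print(partition_encoder(12, [1.0, 2.0, 3.0]))
# -> [range(0, 2), range(2, 6), range(6, 12)]
```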
- TurboSVM-FL: Boosting Federated Learning through SVM Aggregation for Lazy Clients [40.687124234279274]
TurboSVM-FL is a novel federated aggregation strategy that poses no additional computation burden on the client side. We evaluate TurboSVM-FL on multiple datasets including FEMNIST, CelebA, and Shakespeare.
arXiv Detail & Related papers (2024-01-22T14:59:11Z)
- Client Orchestration and Cost-Efficient Joint Optimization for NOMA-Enabled Hierarchical Federated Learning [55.49099125128281]
We propose a non-orthogonal multiple access (NOMA) enabled HFL system under semi-synchronous cloud model aggregation.
We show that the proposed scheme outperforms the considered benchmarks regarding HFL performance improvement and total cost reduction.
arXiv Detail & Related papers (2023-11-03T13:34:44Z)
- Effectively Heterogeneous Federated Learning: A Pairing and Split Learning Based Approach [16.093068118849246]
This paper presents a novel split federated learning (SFL) framework that pairs clients with different computational resources.
A greedy algorithm is proposed that recasts the training-latency optimization as a graph edge selection problem.
Simulation results show the proposed method can significantly improve FL training speed and achieve high performance (a minimal pairing sketch follows this entry).
arXiv Detail & Related papers (2023-08-26T11:10:54Z)
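The sketch below illustrates the edge-selection view with a deliberately simple latency model: clients form a complete graph, each edge is weighted by the latency of that pair training together, and with per-pair latency inversely proportional to combined speed the greedy rule reduces to matching the fastest remaining client with the slowest. The latency model and greedy rule are illustrative, not the paper's exact formulation.

```python
# Hedged sketch of "client pairing as greedy graph edge selection":
# under a toy makespan objective, repeatedly select the edge joining the
# slowest and fastest unmatched clients.
def greedy_pairing(compute_speeds: dict[str, float]) -> list[tuple[str, str]]:
    order = sorted(compute_speeds, key=compute_speeds.get)  # slow -> fast
    pairs = []
    while len(order) >= 2:
        slow, fast = order.pop(0), order.pop(-1)
        pairs.append((slow, fast))  # edge (slow, fast) selected
    return pairs

print(greedy_pairing({"c1": 4.0, "c2": 1.0, "c3": 3.0, "c4": 2.0}))
# -> [('c2', 'c1'), ('c4', 'c3')]: each slow client is matched with a fast one
```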
- Adaptive Federated Pruning in Hierarchical Wireless Networks [69.6417645730093]
Federated Learning (FL) is a privacy-preserving distributed learning framework where a server aggregates models updated by multiple devices without accessing their private datasets.
In this paper, we introduce model pruning for HFL in wireless networks to reduce the neural network scale.
We show that our proposed HFL with model pruning achieves learning accuracy similar to HFL without pruning while reducing communication cost by about 50 percent (a minimal pruning sketch follows this entry).
arXiv Detail & Related papers (2023-05-15T22:04:49Z)
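For intuition, a minimal magnitude-pruning sketch follows. The paper adapts the pruning ratio to the wireless setting; the fixed ratio and names here are illustrative only.

```python
# Minimal magnitude-based pruning: zero out roughly the `ratio` fraction of
# weights with the smallest absolute value, shrinking what must be uploaded.
import numpy as np

def prune_by_magnitude(weights: np.ndarray, ratio: float) -> np.ndarray:
    k = int(weights.size * ratio)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude serves as the pruning threshold (ties may
    # prune slightly more than k entries; fine for a sketch).
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

rng = np.random.default_rng(1)
w = rng.standard_normal((4, 4))
w_pruned = prune_by_magnitude(w, ratio=0.5)
print("nonzero before/after:", np.count_nonzero(w), np.count_nonzero(w_pruned))
```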
- Efficient Parallel Split Learning over Resource-constrained Wireless Edge Networks [44.37047471448793]
In this paper, we advocate the integration of the edge computing paradigm and parallel split learning (PSL).
We propose an innovative PSL framework, namely efficient parallel split learning (EPSL), to accelerate model training.
We show that the proposed EPSL framework significantly decreases the training latency needed to achieve a target accuracy.
arXiv Detail & Related papers (2023-03-26T16:09:48Z)
- Communication and Storage Efficient Federated Split Learning [19.369076939064904]
Federated Split Learning preserves the parallel model training principle of FL.
However, the server has to maintain a separate model for every client, resulting in significant computation and storage requirements.
This paper proposes a communication and storage efficient federated and split learning strategy.
arXiv Detail & Related papers (2023-02-11T04:44:29Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Comfetch: Federated Learning of Large Networks on Constrained Clients via Sketching [28.990067638230254]
Federated learning (FL) is a popular paradigm for private and collaborative model training on the edge.
We propose a novel algorithm, Comfetch, which allows clients to train large networks using sketched representations of the global neural network (a minimal count-sketch example follows this entry).
arXiv Detail & Related papers (2021-09-17T04:48:42Z)
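As a hedged illustration, here is a minimal count-sketch round trip, one standard sketching primitive consistent with the title: weights are hashed into a smaller buffer with random signs, and an unbiased but noisy estimate is recovered on the other side. The hash construction and how Comfetch actually trains on sketched weights are not taken from the paper.

```python
# Count-sketch compress/recover of a weight vector. The recovered estimate
# is unbiased but noisy; sizes and hash choices are illustrative.
import numpy as np

rng = np.random.default_rng(2)
d, m = 1000, 500                      # original and sketched dimensions
bucket = rng.integers(0, m, size=d)   # hash: coordinate -> bucket
sign = rng.choice([-1.0, 1.0], size=d)

def sketch(w: np.ndarray) -> np.ndarray:
    s = np.zeros(m)
    np.add.at(s, bucket, sign * w)    # each coordinate hashed into a bucket
    return s

def unsketch(s: np.ndarray) -> np.ndarray:
    return sign * s[bucket]           # unbiased estimate of each coordinate

w = rng.standard_normal(d)
w_hat = unsketch(sketch(w))
print("relative error:", np.linalg.norm(w - w_hat) / np.linalg.norm(w))
```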
- Dynamic Attention-based Communication-Efficient Federated Learning [85.18941440826309]
Federated learning (FL) offers a solution to train a global machine learning model.
FL suffers performance degradation when client data distribution is non-IID.
We propose a new adaptive training algorithm, AdaFL, to combat this degradation (an attention-weighted aggregation sketch follows this entry).
arXiv Detail & Related papers (2021-08-12T14:18:05Z)
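As a hedged illustration of attention-style adaptive aggregation, the sketch below replaces uniform FedAvg weights with softmax weights derived from each client's distance to the mean model. The scoring rule is illustrative; AdaFL's actual attention mechanism differs in detail.

```python
# Attention-style aggregation: clients closer to the mean model receive
# larger softmax weights than outliers. Illustrative, not AdaFL's exact rule.
import numpy as np

def attention_aggregate(client_models: list[np.ndarray], temp: float = 1.0) -> np.ndarray:
    stacked = np.stack(client_models)          # (num_clients, dim)
    mean = stacked.mean(axis=0)
    dist = np.linalg.norm(stacked - mean, axis=1)
    scores = np.exp(-dist / temp)
    weights = scores / scores.sum()            # attention-style softmax weights
    return weights @ stacked                   # weighted global model

rng = np.random.default_rng(3)
clients = [rng.standard_normal(8) for _ in range(5)]
print(attention_aggregate(clients))
```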
- Joint Client Scheduling and Resource Allocation under Channel Uncertainty in Federated Learning [47.97586668316476]
Federated learning (FL) over wireless networks depends on the reliability of the client-server connectivity and clients' local computation capabilities.
In this article, we investigate the problem of client scheduling and resource block (RB) allocation to enhance the performance of model training using FL.
The proposed method reduces the training accuracy loss gap by up to 40.7% compared to state-of-the-art client scheduling and RB allocation methods.
arXiv Detail & Related papers (2021-06-12T15:18:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.