Ampere: Communication-Efficient and High-Accuracy Split Federated Learning
- URL: http://arxiv.org/abs/2507.07130v1
- Date: Tue, 08 Jul 2025 20:54:43 GMT
- Title: Ampere: Communication-Efficient and High-Accuracy Split Federated Learning
- Authors: Zihan Zhang, Leon Wong, Blesson Varghese
- Abstract summary: A Federated Learning (FL) system collaboratively trains neural networks across devices and a server but is limited by significant on-device computation costs. We propose Ampere, a novel collaborative training system that simultaneously minimizes on-device computation and device-server communication. A lightweight auxiliary network generation method decouples training between the device and server, reducing frequent intermediate exchanges to a single transfer.
- Score: 19.564340315424413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A Federated Learning (FL) system collaboratively trains neural networks across devices and a server but is limited by significant on-device computation costs. Split Federated Learning (SFL) systems mitigate this by offloading a block of layers of the network from the device to a server. However, in doing so, they introduce large communication overheads due to frequent exchanges of intermediate activations and gradients between devices and the server, and they reduce model accuracy for non-IID data. We propose Ampere, a novel collaborative training system that simultaneously minimizes on-device computation and device-server communication while improving model accuracy. Unlike SFL, which optimizes a global loss through iterative end-to-end training, Ampere develops unidirectional inter-block training to sequentially train the device and server blocks with a local loss, eliminating the transfer of gradients. A lightweight auxiliary network generation method decouples training between the device and server, reducing frequent intermediate exchanges to a single transfer, which significantly reduces the communication overhead. Ampere mitigates the impact of data heterogeneity by consolidating activations generated by the trained device block to train the server block, in contrast to SFL, which trains on device-specific, non-IID activations. Extensive experiments on multiple CNNs and transformers show that, compared to state-of-the-art SFL baseline systems, Ampere (i) improves model accuracy by up to 13.26% while reducing training time by up to 94.6%, (ii) reduces device-server communication overhead by up to 99.1% and on-device computation by up to 93.13%, and (iii) reduces the standard deviation of accuracy by 53.39% across various non-IID degrees, highlighting superior performance when faced with heterogeneous data.
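The abstract above outlines a three-phase flow: the device block is trained against a local loss through a lightweight auxiliary network, its activations are then transferred to the server in a single exchange, and the server block is trained on the consolidated activations from all devices. The following is a minimal sketch of that flow, assuming a PyTorch setting; the class names, layer shapes, auxiliary-head design, and optimizer settings are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of unidirectional inter-block training (assumed structure, not
# the paper's released code). Device block and auxiliary head are illustrative.
import torch
import torch.nn as nn

class DeviceBlock(nn.Module):          # early layers trained on-device
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
    def forward(self, x):
        return self.features(x)

class AuxiliaryHead(nn.Module):        # lightweight local-loss head (assumed form)
    def __init__(self, num_classes=10):
        super().__init__()
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(16, num_classes))
    def forward(self, h):
        return self.head(h)

class ServerBlock(nn.Module):          # remaining layers trained on the server
    def __init__(self, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                 nn.Linear(32, num_classes))
    def forward(self, h):
        return self.net(h)

def train_device_block(block, aux, loader, epochs=1):
    # Phase 1: train the device block against a local loss via the auxiliary
    # head; no gradients are ever exchanged with the server.
    opt = torch.optim.SGD(list(block.parameters()) + list(aux.parameters()), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(aux(block(x)), y).backward()
            opt.step()

def export_activations(block, loader):
    # Phase 2: a single transfer of intermediate activations (and labels)
    # from the trained, frozen device block to the server.
    block.eval()
    with torch.no_grad():
        return [(block(x), y) for x, y in loader]

def train_server_block(server, activation_sets, epochs=1):
    # Phase 3: the server consolidates activations from all devices and trains
    # the remaining layers, mitigating device-specific non-IID effects.
    opt = torch.optim.SGD(server.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    consolidated = [pair for acts in activation_sets for pair in acts]
    for _ in range(epochs):
        for h, y in consolidated:
            opt.zero_grad()
            loss_fn(server(h), y).backward()
            opt.step()
```

In this sketch the only device-to-server traffic is the one-off activation transfer in export_activations, which is the mechanism the abstract credits for the reduction in communication overhead relative to SFL's per-iteration activation and gradient exchanges.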
Related papers
- Federated Split Learning with Improved Communication and Storage Efficiency [9.277834710955766]
Federated learning (FL) is one of the popular distributed machine learning (ML) solutions but incurs significant communication and computation costs at edge devices.
This paper proposes a novel communication and storage efficient federated split learning method, CSE-FSL, which utilizes an auxiliary network to locally update the clients while keeping a single model at the server.
arXiv Detail & Related papers (2025-07-21T17:21:16Z)
- Resource Utilization Optimized Federated Learning [19.564340315424413]
Federated learning (FL) systems facilitate distributed machine learning across a server and multiple devices.
This paper introduces FedOptima, a resource-optimized FL system designed to simultaneously minimize both types of idle time.
arXiv Detail & Related papers (2025-03-10T20:23:39Z)
- Communication Efficient ConFederated Learning: An Event-Triggered SAGA Approach [67.27031215756121]
Federated learning (FL) is a machine learning paradigm that targets model training over various data sources without gathering the local data.
Standard FL, which employs a single server, can only support a limited number of users, leading to degraded learning capability.
In this work, we consider a multi-server FL framework, referred to as Confederated Learning (CFL), in order to accommodate a larger number of users.
arXiv Detail & Related papers (2024-02-28T03:27:10Z)
- Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting growing attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed, which enables edge devices to participate asynchronously in the training process by actively applying for tasks.
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
- Edge-assisted U-Shaped Split Federated Learning with Privacy-preserving for Internet of Things [4.68267059122563]
We present an innovative Edge-assisted U-Shaped Split Federated Learning (EUSFL) framework, which harnesses the high-performance capabilities of edge servers.
In this framework, we leverage Federated Learning (FL) to enable data holders to collaboratively train models without sharing their data.
We also propose a novel noise mechanism called LabelDP to ensure that data features and labels can securely resist reconstruction attacks.
arXiv Detail & Related papers (2023-11-08T05:14:41Z)
- Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider an FL framework with partial model pruning and personalization to overcome these challenges.
This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
arXiv Detail & Related papers (2023-09-04T21:10:45Z)
- Adaptive Federated Pruning in Hierarchical Wireless Networks [69.6417645730093]
Federated Learning (FL) is a privacy-preserving distributed learning framework where a server aggregates models updated by multiple devices without accessing their private datasets.
In this paper, we introduce model pruning for hierarchical federated learning (HFL) in wireless networks to reduce the neural network scale.
We show that our proposed HFL with model pruning achieves learning accuracy similar to HFL without model pruning while reducing communication cost by about 50 percent.
arXiv Detail & Related papers (2023-05-15T22:04:49Z)
- Efficient Parallel Split Learning over Resource-constrained Wireless Edge Networks [44.37047471448793]
In this paper, we advocate the integration of the edge computing paradigm and parallel split learning (PSL).
We propose an innovative PSL framework, namely, efficient parallel split learning (EPSL) to accelerate model training.
We show that the proposed EPSL framework significantly decreases the training latency needed to achieve a target accuracy.
arXiv Detail & Related papers (2023-03-26T16:09:48Z)
- Time-sensitive Learning for Heterogeneous Federated Edge Intelligence [52.83633954857744]
We investigate real-time machine learning in a federated edge intelligence (FEI) system.
FEI systems exhibit heterogeneous communication and computational resource distribution.
We propose a time-sensitive federated learning (TS-FL) framework to minimize the overall run-time for collaboratively training a shared ML model.
arXiv Detail & Related papers (2023-01-26T08:13:22Z)
- Federated Dynamic Sparse Training: Computing Less, Communicating Less, Yet Learning Better [88.28293442298015]
Federated learning (FL) enables distribution of machine learning workloads from the cloud to resource-limited edge devices.
We develop, implement, and experimentally validate a novel FL framework termed Federated Dynamic Sparse Training (FedDST).
FedDST is a dynamic process that extracts and trains sparse sub-networks from the target full network.
arXiv Detail & Related papers (2021-12-18T02:26:38Z)
- FedAdapt: Adaptive Offloading for IoT Devices in Federated Learning [2.5775113252104216]
Federated Learning (FL) on Internet-of-Things devices is necessitated by the large volumes of data they produce and growing concerns of data privacy.
This paper presents FedAdapt, an adaptive offloading FL framework to mitigate the aforementioned challenges.
arXiv Detail & Related papers (2021-07-09T07:29:55Z)