ZeroFL: Efficient On-Device Training for Federated Learning with Local
Sparsity
- URL: http://arxiv.org/abs/2208.02507v1
- Date: Thu, 4 Aug 2022 07:37:07 GMT
- Title: ZeroFL: Efficient On-Device Training for Federated Learning with Local
Sparsity
- Authors: Xinchi Qiu, Javier Fernandez-Marques, Pedro PB Gusmao, Yan Gao,
Titouan Parcollet, Nicholas Donald Lane
- Abstract summary: In Federated Learning (FL), nodes are orders of magnitude more constrained than traditional server-grade hardware.
We propose ZeroFL, a framework that relies on highly sparse operations to accelerate on-device training.
- Score: 15.908499928588297
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When the available hardware cannot meet the memory and compute requirements
to efficiently train high performing machine learning models, a compromise in
either the training quality or the model complexity is needed. In Federated
Learning (FL), nodes are orders of magnitude more constrained than traditional
server-grade hardware and are often battery powered, severely limiting the
sophistication of models that can be trained under this paradigm. While most
research has focused on designing better aggregation strategies to improve
convergence rates and on alleviating the communication costs of FL, fewer
efforts have been devoted to accelerating on-device training. This stage, which
repeats hundreds of times (i.e., every round) and can involve thousands of
devices, accounts for the majority of the time required to train federated
models and for the totality of the client-side energy consumption. In this
work, we present the first study on the unique aspects that arise when
introducing sparsity at training time in FL workloads. We then propose ZeroFL,
a framework that relies on highly sparse operations to accelerate on-device
training. Models trained with ZeroFL and 95% sparsity achieve up to 2.3% higher
accuracy compared to competitive baselines obtained from adapting a
state-of-the-art sparse training framework to the FL setting.
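As a rough illustration of what sparsity at training time can look like on a client, the sketch below applies a top-k magnitude mask to the weights during a single local SGD step. The model, mask rule, and hyperparameters are placeholder assumptions for intuition only; this is not ZeroFL's actual algorithm.
```python
# Illustrative sketch only: one local update with top-k magnitude weight
# sparsity, a simple stand-in for sparse on-device training (not ZeroFL's
# actual method; model, mask rule, and hyperparameters are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

def topk_mask(t: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a 0/1 mask keeping the largest-magnitude (1 - sparsity) fraction."""
    k = max(1, int(t.numel() * (1.0 - sparsity)))
    threshold = t.abs().flatten().kthvalue(t.numel() - k + 1).values
    return (t.abs() >= threshold).to(t.dtype)

def sparse_local_step(model: nn.Module, x, y, sparsity=0.95, lr=0.01) -> float:
    """One SGD step followed by re-applying the sparsity mask to every tensor."""
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad
            p *= topk_mask(p, sparsity)   # keep only the strongest 5% of entries
            p.grad = None
    return loss.item()

# Toy usage: one client step at 95% sparsity on random data.
model = nn.Linear(32, 10)
print(sparse_local_step(model, torch.randn(8, 32), torch.randint(0, 10, (8,))))
```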
Related papers
- Embracing Federated Learning: Enabling Weak Client Participation via Partial Model Training [21.89214794178211]
In Federated Learning (FL), clients may have weak devices that cannot train the full model, or even hold it in memory.
We propose EmbracingFL, a general FL framework that allows all available clients to join the distributed training.
Our empirical study shows that EmbracingFL consistently achieves high accuracy, as if all clients were strong, outperforming state-of-the-art width-reduction methods (a generic partial-training sketch follows this entry).
arXiv Detail & Related papers (2024-06-21T13:19:29Z)
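For the EmbracingFL entry above, here is a hedged sketch of the general partial-model idea: a weak client trains only the first few layers of the shared model and freezes the rest. The split point and helper name are illustrative assumptions, not EmbracingFL's actual design.
```python
# Hedged sketch: a weak client trains only a prefix of the shared model.
# The split point and helper name are illustrative assumptions.
import torch.nn as nn

def freeze_beyond_capacity(model: nn.Sequential, trainable_layers: int) -> None:
    """Freeze every layer past the client's capacity limit."""
    for i, layer in enumerate(model):
        for p in layer.parameters():
            p.requires_grad = i < trainable_layers

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, 10))
freeze_beyond_capacity(model, trainable_layers=2)  # weak client: only the first block trains
print([n for n, p in model.named_parameters() if p.requires_grad])  # ['0.weight', '0.bias']
```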
- Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting growing attention as a way to collaboratively train a machine learning model without transferring raw data.
FL typically relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed, which lets edge devices participate in training asynchronously by actively applying for tasks (a toy update-quantization sketch follows this entry).
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
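To make the quantization part of the title above concrete, here is a toy uniform 8-bit quantizer for a client's model update before upload. The per-tensor scaling and function names are assumptions for illustration, not TEASQ-Fed's exact scheme.
```python
# Hedged sketch: uniform 8-bit quantization of a model update before upload
# (generic compression for intuition; not TEASQ-Fed's exact scheme).
import numpy as np

def quantize_update(update: np.ndarray, bits: int = 8):
    """Map a float update to signed integers plus a per-tensor scale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = float(np.max(np.abs(update)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return np.round(update / scale).astype(np.int8), scale

def dequantize_update(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

delta = (np.random.randn(1000) * 0.01).astype(np.float32)   # toy model update
q, s = quantize_update(delta)
print(np.abs(delta - dequantize_update(q, s)).max())         # small quantization error
```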
- Speed Up Federated Learning in Heterogeneous Environment: A Dynamic Tiering Approach [5.504000607257414]
Federated learning (FL) enables collaborative model training while keeping the training data decentralized and private.
One significant impediment to training models with FL, especially large models, is the resource constraint of devices with heterogeneous computation and communication capacities, as well as varying task sizes.
We propose the Dynamic Tiering-based Federated Learning (DTFL) system, where slower clients dynamically offload part of the model to the server to alleviate resource constraints and speed up training (a split-style sketch follows this entry).
arXiv Detail & Related papers (2023-12-09T19:09:19Z)
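Here is a minimal split-style sketch of the offloading idea in the DTFL entry above: a slow client keeps only the first layers and ships the intermediate activations to the server, which runs the remainder. The cut point and interfaces are assumptions, not DTFL's actual protocol.
```python
# Hedged sketch: offloading the tail of a model to the server at a cut layer
# (split-training flavor; cut point and interfaces are illustrative assumptions).
import torch
import torch.nn as nn

full_model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                           nn.Linear(64, 64), nn.ReLU(),
                           nn.Linear(64, 10))
cut = 2                                   # a slow client keeps only the first `cut` layers
client_part, server_part = full_model[:cut], full_model[cut:]

x = torch.randn(8, 32)
smashed = client_part(x)                  # computed on the device
logits = server_part(smashed)             # server continues from the activations
print(logits.shape)                       # torch.Size([8, 10])
```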
- Semi-Federated Learning: Convergence Analysis and Optimization of A Hybrid Learning Framework [70.83511997272457]
We propose a semi-federated learning (SemiFL) paradigm to leverage both the base station (BS) and devices for a hybrid implementation of centralized learning (CL) and FL.
We propose a two-stage algorithm to solve the resulting intractable optimization problem, providing closed-form solutions for the beamformers.
arXiv Detail & Related papers (2023-10-04T03:32:39Z)
- Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider an FL framework with partial model pruning and personalization to overcome these challenges.
This framework splits the learning model into a global part, shared with all devices and pruned to learn data representations, and a personalized part fine-tuned for each specific device (a minimal aggregation sketch follows this entry).
arXiv Detail & Related papers (2023-09-04T21:10:45Z)
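For the pruning-and-personalization entry above, the sketch below shows only the aggregation side: the server averages just the globally shared tensors, while each client keeps its personalized head local. The key names and split are assumptions, not the paper's exact framework.
```python
# Hedged sketch: average only the shared "global" tensors across clients;
# personalized parts never leave the device (naming and split are assumptions).
import numpy as np

def aggregate_global_part(client_states: list, global_keys: set) -> dict:
    """FedAvg-style mean over the globally shared tensors only."""
    return {k: np.mean([state[k] for state in client_states], axis=0) for k in global_keys}

clients = [{"backbone.w": np.random.randn(4, 4),
            "head.w": np.random.randn(4, 2)} for _ in range(3)]
global_update = aggregate_global_part(clients, global_keys={"backbone.w"})
print(global_update["backbone.w"].shape)   # (4, 4); "head.w" stays client-local
```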
- TimelyFL: Heterogeneity-aware Asynchronous Federated Learning with Adaptive Partial Training [17.84692242938424]
TimelyFL is a heterogeneity-aware asynchronous Federated Learning framework with adaptive partial training.
We show that TimelyFL improves the participation rate by 21.13%, achieves a 1.28x-2.89x faster convergence rate, and provides a 6.25% increase in test accuracy.
arXiv Detail & Related papers (2023-04-14T06:26:08Z)
- Hierarchical Personalized Federated Learning Over Massive Mobile Edge Computing Networks [95.39148209543175]
We propose hierarchical personalized federated learning (HPFL), an algorithm for deploying PFL over massive MEC networks.
HPFL combines the objectives of training loss minimization and round latency minimization while jointly determining the optimal bandwidth allocation.
arXiv Detail & Related papers (2023-03-19T06:00:05Z)
- Conquering the Communication Constraints to Enable Large Pre-Trained Models in Federated Learning [18.12162136918301]
Federated learning (FL) has emerged as a promising paradigm for enabling the collaborative training of models without centralized access to the raw data on local devices.
Recent state-of-the-art pre-trained models are getting more capable but also have more parameters.
Can we find a solution to enable those strong and readily-available pre-trained models in FL to achieve excellent performance while simultaneously reducing the communication burden?
To answer this, we propose FedPEFT, which fine-tunes and communicates only a small portion of the model weights; specifically, we systematically evaluate the performance of FedPEFT across a variety of client stability, data distribution, and differential privacy settings (a minimal sketch follows this entry).
arXiv Detail & Related papers (2022-10-04T16:08:54Z)
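For the FedPEFT entry above, here is a hedged sketch of the parameter-efficient idea: only a small, named subset of parameters is left trainable and would be communicated each round. The selection rule here (biases only) is an assumption for illustration, not FedPEFT's exact recipe.
```python
# Hedged sketch: mark only a small named subset of parameters as trainable,
# so that only those tensors need to be trained and communicated each round.
# The selection rule is an illustrative assumption, not FedPEFT's exact recipe.
import torch.nn as nn

def mark_trainable(model: nn.Module, keep=("bias",)) -> list:
    """Freeze everything except parameters whose names contain a keep-substring."""
    shared = []
    for name, p in model.named_parameters():
        p.requires_grad = any(s in name for s in keep)
        if p.requires_grad:
            shared.append(name)
    return shared   # only these names would be uploaded to the server

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
print(mark_trainable(model))   # ['0.bias', '2.bias'] -- a tiny fraction of the model
```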
- Federated Dynamic Sparse Training: Computing Less, Communicating Less, Yet Learning Better [88.28293442298015]
Federated learning (FL) enables distribution of machine learning workloads from the cloud to resource-limited edge devices.
We develop, implement, and experimentally validate a novel FL framework termed Federated Dynamic Sparse Training (FedDST).
FedDST is a dynamic process that extracts and trains sparse sub-networks from the target full network (a mask-update sketch follows this entry).
arXiv Detail & Related papers (2021-12-18T02:26:38Z)
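A hedged sketch of a dynamic sparse mask update in the spirit of the FedDST entry above: prune the smallest-magnitude active weights and regrow the same number elsewhere, keeping overall sparsity fixed. The fractions and the random-growth rule are assumptions, not FedDST's exact procedure.
```python
# Hedged sketch: magnitude-prune + random-grow mask adjustment for dynamic
# sparse training (fractions and growth rule are illustrative assumptions).
import numpy as np

def adjust_mask(weights: np.ndarray, mask: np.ndarray, adjust_frac: float = 0.1) -> np.ndarray:
    """Drop the smallest-magnitude active weights and regrow as many elsewhere."""
    active = np.flatnonzero(mask)
    inactive = np.flatnonzero(mask == 0)
    n = min(int(len(active) * adjust_frac), len(inactive))
    drop = active[np.argsort(np.abs(weights.flat[active]))[:n]]      # prune
    grow = np.random.choice(inactive, size=n, replace=False)         # regrow
    new_mask = mask.copy()
    new_mask.flat[drop] = 0
    new_mask.flat[grow] = 1
    return new_mask

w = np.random.randn(64, 64)
m = (np.random.rand(64, 64) < 0.2).astype(np.float32)   # ~80% sparse sub-network
m = adjust_mask(w, m)                                    # same sparsity, new support
```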
- Efficient Device Scheduling with Multi-Job Federated Learning [64.21733164243781]
We propose a novel multi-job Federated Learning framework that enables parallel training of multiple jobs.
We propose a reinforcement learning-based method and a Bayesian optimization-based method to schedule devices for multiple jobs while minimizing the cost.
Our proposed approaches significantly outperform baseline approaches in terms of training time (up to 8.67 times faster) and accuracy (up to 44.6% higher); a toy greedy-scheduling sketch follows this entry.
arXiv Detail & Related papers (2021-12-11T08:05:11Z)
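As a toy baseline for the scheduling problem in the multi-job entry above, the sketch below greedily assigns each job the cheapest unused device given a per-pair cost. It is intuition only; the paper's actual schedulers are reinforcement-learning and Bayesian-optimization based.
```python
# Hedged sketch: a naive greedy device-to-job assignment under a per-pair cost.
# Intuition only; not the paper's RL- or Bayesian-optimization-based schedulers.
def greedy_schedule(cost: dict, jobs: list, devices: list) -> dict:
    """Assign each job the cheapest device that is still unused."""
    assignment, free = {}, set(devices)
    for job in jobs:
        best = min(free, key=lambda d: cost[(job, d)])
        assignment[job] = best
        free.remove(best)
    return assignment

cost = {("jobA", "dev1"): 3.0, ("jobA", "dev2"): 1.0,
        ("jobB", "dev1"): 2.0, ("jobB", "dev2"): 5.0}
print(greedy_schedule(cost, jobs=["jobA", "jobB"], devices=["dev1", "dev2"]))
# {'jobA': 'dev2', 'jobB': 'dev1'}
```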
- Evaluation and Optimization of Distributed Machine Learning Techniques for Internet of Things [34.544836653715244]
Federated learning (FL) and split learning (SL) are state-of-the-art distributed machine learning techniques.
FL and SL have recently been combined into splitfed learning (SFL) to leverage the benefits of both.
This work considers FL, SL, and SFL, and deploys them on Raspberry Pi devices to evaluate their performance.
arXiv Detail & Related papers (2021-03-03T23:55:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.