FLuID: Mitigating Stragglers in Federated Learning using Invariant Dropout
- URL: http://arxiv.org/abs/2307.02623v3
- Date: Tue, 26 Sep 2023 19:57:01 GMT
- Title: FLuID: Mitigating Stragglers in Federated Learning using Invariant Dropout
- Authors: Irene Wang, Prashant J. Nair, Divya Mahajan
- Abstract summary: Federated Learning allows machine learning models to train locally on individual mobile devices, synchronizing model updates via a shared server.
Because device performance varies, straggler devices with lower performance often dictate the overall training time in FL.
We introduce Invariant Dropout, a method that extracts a sub-model based on the weight update threshold.
We develop an adaptive training framework, Federated Learning using Invariant Dropout (FLuID).
- Score: 1.8262547855491458
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated Learning (FL) allows machine learning models to train locally on
individual mobile devices, synchronizing model updates via a shared server.
This approach safeguards user privacy; however, it also generates a
heterogeneous training environment due to the varying performance capabilities
across devices. As a result, straggler devices with lower performance often
dictate the overall training time in FL. In this work, we aim to alleviate this
performance bottleneck due to stragglers by dynamically balancing the training
load across the system. We introduce Invariant Dropout, a method that extracts
a sub-model based on the weight update threshold, thereby minimizing potential
impacts on accuracy. Building on this dropout technique, we develop an adaptive
training framework, Federated Learning using Invariant Dropout (FLuID). FLuID
offers a lightweight sub-model extraction to regulate computational intensity,
thereby reducing the load on straggler devices without affecting model quality.
Our method leverages neuron updates from non-straggler devices to construct a
tailored sub-model for each straggler based on client performance profiling.
Furthermore, FLuID can dynamically adapt to changes in stragglers as runtime
conditions shift. We evaluate FLuID using five real-world mobile clients. The
evaluations show that Invariant Dropout maintains baseline model efficiency
while alleviating the performance bottleneck of stragglers through a dynamic,
runtime approach.
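As a rough illustration of the sub-model extraction described above (not the authors' exact algorithm), the sketch below keeps only the neurons whose aggregated update magnitudes exceed a threshold chosen to match a straggler's capacity; the `capacity` fraction, assumed here to come from client performance profiling, is an illustrative parameter:

```python
import numpy as np

def invariant_dropout_mask(weight_update, capacity):
    """Select neurons to KEEP for a straggler's sub-model.

    Neurons whose recent updates fall below a threshold are treated as
    'invariant' and dropped; the threshold is chosen so that only a
    `capacity` fraction of neurons (the most actively changing ones)
    remains. `weight_update` holds per-neuron update magnitudes
    aggregated from non-straggler devices.
    """
    magnitudes = np.abs(weight_update)
    k = max(1, int(capacity * magnitudes.size))  # neurons the straggler can afford
    threshold = np.sort(magnitudes)[-k]          # k-th largest magnitude
    return magnitudes >= threshold               # True = keep, False = drop

# Example: a layer with 8 neurons and a straggler that can run half of them.
updates = np.array([0.02, 0.50, 0.01, 0.30, 0.04, 0.70, 0.03, 0.20])
mask = invariant_dropout_mask(updates, capacity=0.5)
print(mask)  # keeps the 4 neurons with the largest updates
```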
Related papers
- Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates [71.81037644563217]
Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning.
As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers.
We propose straggler-aware layer-wise federated learning (SALF) that leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion.
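A rough sketch of the layer-wise idea (an interpretation of the summary above, not SALF's actual interface): because backpropagation produces gradients for the deepest layers first, a straggler that runs out of time can still report those layers, and the server averages each layer over whichever clients reached it. The `client_grads` layout and layer names below are illustrative.

```python
import numpy as np

def layerwise_aggregate(global_model, client_grads, lr=0.1):
    """Update the global model one layer at a time.

    `client_grads` maps client id -> {layer_name: gradient}; stragglers
    may only include the last few layers (those backprop reaches first).
    Each layer is averaged over the clients that actually reported it.
    """
    for name, weights in global_model.items():
        contributions = [g[name] for g in client_grads.values() if name in g]
        if contributions:  # skip layers no client managed to compute
            global_model[name] = weights - lr * np.mean(contributions, axis=0)
    return global_model

# Example: client 1 is a straggler and only reports the output layer.
model = {"layer1": np.zeros(4), "layer2": np.zeros(2)}
grads = {
    0: {"layer1": np.ones(4), "layer2": np.ones(2)},
    1: {"layer2": 2 * np.ones(2)},  # straggler: deepest layer only
}
print(layerwise_aggregate(model, grads))
```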
arXiv Detail & Related papers (2024-03-27T09:14:36Z)
- Efficient Language Model Architectures for Differentially Private Federated Learning [21.280600854272716]
Cross-device federated learning (FL) is a technique that trains a model on data distributed across typically millions of edge devices without data leaving the devices.
In centralized training of language models, adaptive optimizers are preferred as they offer improved stability and performance.
We propose a scale-invariant Coupled Input Forget Gate (SI CIFG) recurrent network by modifying the sigmoid and tanh activations in the recurrent cell.
arXiv Detail & Related papers (2024-03-12T22:21:48Z)
- Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting more and more attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout the model training process.
We propose TEASQ-Fed, which lets edge devices asynchronously participate in the training process by actively applying for tasks.
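The summary does not spell out TEASQ-Fed's compression, but a generic top-k sparsification followed by uniform quantization, of the kind asynchronous FL schemes commonly apply before uploading an update, looks roughly like this (function and parameter names are assumptions):

```python
import numpy as np

def compress_update(update, k_frac=0.1, n_bits=8):
    """Top-k sparsification followed by uniform quantization.

    Keeps only the k largest-magnitude entries of an update, then
    quantizes the surviving values to n_bits levels so an edge device
    uploads far fewer bytes to the server.
    """
    flat = update.ravel()
    k = max(1, int(k_frac * flat.size))
    idx = np.argsort(np.abs(flat))[-k:]        # indices of the top-k entries
    values = flat[idx]

    scale = np.max(np.abs(values)) or 1.0      # avoid division by zero
    levels = 2 ** (n_bits - 1) - 1
    quantized = np.round(values / scale * levels).astype(np.int8)
    return idx, quantized, scale               # what actually gets uploaded

def decompress_update(shape, idx, quantized, scale, n_bits=8):
    """Server-side reconstruction of the sparse, quantized update."""
    levels = 2 ** (n_bits - 1) - 1
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = quantized.astype(np.float32) / levels * scale
    return flat.reshape(shape)

update = np.random.randn(4, 4)
idx, q, s = compress_update(update, k_frac=0.25)
approx = decompress_update(update.shape, idx, q, s)
```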
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
- Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider an FL framework with partial model pruning and personalization to overcome the challenges posed by heterogeneous edge devices.
This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
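A minimal sketch of that split, assuming magnitude-based pruning for the shared part (the paper's actual pruning criterion and interfaces may differ):

```python
import numpy as np

def prune_global_part(global_layers, keep_frac=0.5):
    """Magnitude-prune the shared representation layers before broadcast."""
    pruned = {}
    for name, w in global_layers.items():
        k = max(1, int(keep_frac * w.size))
        thresh = np.sort(np.abs(w).ravel())[-k]            # k-th largest magnitude
        pruned[name] = np.where(np.abs(w) >= thresh, w, 0.0)
    return pruned

# Each device receives the pruned global part and keeps a private head
# that is fine-tuned locally and never uploaded.
global_part = {"encoder": np.random.randn(8, 4)}
device_payload = {"global": prune_global_part(global_part, keep_frac=0.25),
                  "personal_head": np.zeros((4, 2))}       # stays on the device
```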
arXiv Detail & Related papers (2023-09-04T21:10:45Z)
- DynamicFL: Balancing Communication Dynamics and Client Manipulation for Federated Learning [6.9138560535971605]
Federated Learning (FL) aims to train a global model by exploiting the decentralized data across millions of edge devices.
Given the geo-distributed edge devices with highly dynamic networks in the wild, aggregating all the model updates from those participating devices will result in inevitable long-tail delays in FL.
We propose a novel FL framework, DynamicFL, by considering the communication dynamics and data quality across massive edge devices with a specially designed client manipulation strategy.
arXiv Detail & Related papers (2023-07-16T19:09:31Z)
- MetaNetwork: A Task-agnostic Network Parameters Generation Framework for Improving Device Model Generalization [65.02542875281233]
We propose a novel task-agnostic framework, named MetaNetwork, for generating adaptive device model parameters from cloud without on-device training.
The MetaGenerator is designed to learn a mapping function from samples to model parameters, and it can generate and deliver the adaptive parameters to the device based on samples uploaded from the device to the cloud.
The MetaStabilizer aims to reduce the oscillation of the MetaGenerator, accelerate the convergence and improve the model performance during both training and inference.
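One way such a sample-to-parameters mapping could look, reduced to a single linear generator (purely illustrative; the MetaGenerator in the paper is a learned network, and the names below are assumptions):

```python
import numpy as np

def meta_generator(sample_embeddings, W_gen, b_gen):
    """Toy hypernetwork-style generator: map the mean embedding of a
    device's uploaded samples to a flat parameter vector that the cloud
    ships back to that device. W_gen and b_gen are the generator's own
    (cloud-trained) weights."""
    device_summary = sample_embeddings.mean(axis=0)    # summarize the device's data
    return W_gen @ device_summary + b_gen              # adaptive device parameters

emb = np.random.randn(32, 16)                          # 32 samples, 16-dim embeddings
W_gen, b_gen = np.random.randn(100, 16), np.zeros(100) # 100 device-model parameters
device_params = meta_generator(emb, W_gen, b_gen)
```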
arXiv Detail & Related papers (2022-09-12T13:26:26Z)
- Reducing Impacts of System Heterogeneity in Federated Learning using Weight Update Magnitudes [0.0]
Federated learning enables machine learning models to train locally on each handheld device while only synchronizing their neuron updates with a server.
This results in the training time of federated learning tasks being dictated by a few low-performance straggler devices.
In this work, we aim to mitigate the performance bottleneck of federated learning by dynamically forming sub-models for stragglers based on their performance and accuracy feedback.
arXiv Detail & Related papers (2022-08-30T00:39:06Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Federated Dropout -- A Simple Approach for Enabling Federated Learning on Resource Constrained Devices [40.69663094185572]
Federated learning (FL) is a popular framework for training an AI model using distributed mobile data in a wireless network.
One main challenge confronting practical FL is that resource constrained devices struggle with the computation intensive task of updating a deep-neural network model.
To tackle the challenge, in this paper, a federated dropout (FedDrop) scheme is proposed building on the classic dropout scheme for random model pruning.
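A toy sketch of the random sub-model idea behind such a dropout scheme (the keep probability, row-wise masking, and merge step are illustrative assumptions, not FedDrop's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_submodel(layer, keep_prob=0.5):
    """Randomly keep a subset of output neurons (rows) of a weight matrix."""
    mask = rng.random(layer.shape[0]) < keep_prob
    return layer[mask], mask

def merge_submodel(layer, sub_layer, mask):
    """Write the locally trained rows back into the full global layer."""
    merged = layer.copy()
    merged[mask] = sub_layer
    return merged

full = np.ones((6, 3))
sub, mask = draw_submodel(full, keep_prob=0.5)  # device trains only `sub`
sub_trained = sub * 0.9                         # stand-in for local training
print(merge_submodel(full, sub_trained, mask))
```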
arXiv Detail & Related papers (2021-09-30T16:52:13Z)
- Adaptive Dynamic Pruning for Non-IID Federated Learning [3.8666113275834335]
Federated Learning (FL) has emerged as a new paradigm for training machine learning models without sacrificing data security and privacy.
We present an adaptive pruning scheme for edge devices in an FL system, which applies dataset-aware dynamic pruning for inference acceleration on Non-IID datasets.
arXiv Detail & Related papers (2021-06-13T05:27:43Z)
- Over-the-Air Federated Learning from Heterogeneous Data [107.05618009955094]
Federated learning (FL) is a framework for distributed learning of centralized models.
We develop a Convergent OTA FL (COTAF) algorithm which enhances the common local stochastic gradient descent (SGD) FL algorithm.
We numerically show that the precoding induced by COTAF notably improves the convergence rate and the accuracy of models trained via OTA FL.
arXiv Detail & Related papers (2020-09-27T08:28:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.