FLuID: Mitigating Stragglers in Federated Learning using Invariant Dropout
- URL: http://arxiv.org/abs/2307.02623v3
- Date: Tue, 26 Sep 2023 19:57:01 GMT
- Title: FLuID: Mitigating Stragglers in Federated Learning using Invariant Dropout
- Authors: Irene Wang, Prashant J. Nair, Divya Mahajan
- Abstract summary: Federated Learning allows machine learning models to train locally on individual mobile devices, synchronizing model updates via a shared server.
Because device performance varies, straggler devices with lower performance often dictate the overall training time in FL.
We introduce Invariant Dropout, a method that extracts a sub-model based on the weight update threshold.
We develop an adaptive training framework, Federated Learning using Invariant Dropout (FLuID).
- Score: 1.8262547855491458
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated Learning (FL) allows machine learning models to train locally on
individual mobile devices, synchronizing model updates via a shared server.
This approach safeguards user privacy; however, it also generates a
heterogeneous training environment due to the varying performance capabilities
across devices. As a result, straggler devices with lower performance often
dictate the overall training time in FL. In this work, we aim to alleviate this
performance bottleneck due to stragglers by dynamically balancing the training
load across the system. We introduce Invariant Dropout, a method that extracts
a sub-model based on the weight update threshold, thereby minimizing potential
impacts on accuracy. Building on this dropout technique, we develop an adaptive
training framework, Federated Learning using Invariant Dropout (FLuID). FLuID
offers a lightweight sub-model extraction to regulate computational intensity,
thereby reducing the load on straggler devices without affecting model quality.
Our method leverages neuron updates from non-straggler devices to construct a
tailored sub-model for each straggler based on client performance profiling.
Furthermore, FLuID can dynamically adapt to changes in stragglers as runtime
conditions shift. We evaluate FLuID using five real-world mobile clients. The
evaluations show that Invariant Dropout maintains baseline model efficiency
while alleviating the performance bottleneck of stragglers through a dynamic,
runtime approach.
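As a rough illustration of the sub-model extraction described above (not the authors' exact algorithm), the sketch below keeps only the neurons whose aggregated update magnitudes exceed a threshold chosen to match a straggler's capacity; the `capacity` fraction, assumed here to come from client performance profiling, is an illustrative parameter:

```python
import numpy as np

def invariant_dropout_mask(weight_update, capacity):
    """Select neurons to KEEP for a straggler's sub-model.

    Neurons whose recent updates fall below a threshold are treated as
    'invariant' and dropped; the threshold is chosen so that only a
    `capacity` fraction of neurons (the most actively changing ones)
    remains. `weight_update` holds per-neuron update magnitudes
    aggregated from non-straggler devices.
    """
    magnitudes = np.abs(weight_update)
    k = max(1, int(capacity * magnitudes.size))  # neurons the straggler can afford
    threshold = np.sort(magnitudes)[-k]          # k-th largest magnitude
    return magnitudes >= threshold               # True = keep, False = drop

# Example: a layer with 8 neurons and a straggler that can run half of them.
updates = np.array([0.02, 0.50, 0.01, 0.30, 0.04, 0.70, 0.03, 0.20])
mask = invariant_dropout_mask(updates, capacity=0.5)
print(mask)  # keeps the 4 neurons with the largest updates
```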
Related papers
- Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates [71.81037644563217]
Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning.
As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers.
We propose straggler-aware layer-wise federated learning (SALF) that leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion.
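A rough sketch of the layer-wise idea (an interpretation of the summary above, not SALF's actual interface): because backpropagation produces gradients for the deepest layers first, a straggler that runs out of time can still report those layers, and the server averages each layer over whichever clients reached it. The `client_grads` layout and layer names below are illustrative.

```python
import numpy as np

def layerwise_aggregate(global_model, client_grads, lr=0.1):
    """Update the global model one layer at a time.

    `client_grads` maps client id -> {layer_name: gradient}; stragglers
    may only include the last few layers (those backprop reaches first).
    Each layer is averaged over the clients that actually reported it.
    """
    for name, weights in global_model.items():
        contributions = [g[name] for g in client_grads.values() if name in g]
        if contributions:  # skip layers no client managed to compute
            global_model[name] = weights - lr * np.mean(contributions, axis=0)
    return global_model

# Example: client 1 is a straggler and only reports the output layer.
model = {"layer1": np.zeros(4), "layer2": np.zeros(2)}
grads = {
    0: {"layer1": np.ones(4), "layer2": np.ones(2)},
    1: {"layer2": 2 * np.ones(2)},  # straggler: deepest layer only
}
print(layerwise_aggregate(model, grads))
```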
arXiv Detail & Related papers (2024-03-27T09:14:36Z)
- Efficient Language Model Architectures for Differentially Private Federated Learning [21.280600854272716]
Cross-device federated learning (FL) is a technique that trains a model on data distributed across typically millions of edge devices without data leaving the devices.
In centralized training of language models, adaptive optimizers are preferred as they offer improved stability and performance.
We propose a scale-invariant Coupled Input Forget Gate (SI CIFG) recurrent network by modifying the sigmoid and tanh activations in the recurrent cell.
arXiv Detail & Related papers (2024-03-12T22:21:48Z)
- Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting more and more attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout the model training process.
We propose TEASQ-Fed, which lets edge devices asynchronously participate in the training process by actively applying for tasks.
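The summary does not spell out TEASQ-Fed's compression, but a generic top-k sparsification followed by uniform quantization, of the kind asynchronous FL schemes commonly apply before uploading an update, looks roughly like this (function and parameter names are assumptions):

```python
import numpy as np

def compress_update(update, k_frac=0.1, n_bits=8):
    """Top-k sparsification followed by uniform quantization.

    Keeps only the k largest-magnitude entries of an update, then
    quantizes the surviving values to n_bits levels so an edge device
    uploads far fewer bytes to the server.
    """
    flat = update.ravel()
    k = max(1, int(k_frac * flat.size))
    idx = np.argsort(np.abs(flat))[-k:]        # indices of the top-k entries
    values = flat[idx]

    scale = np.max(np.abs(values)) or 1.0      # avoid division by zero
    levels = 2 ** (n_bits - 1) - 1
    quantized = np.round(values / scale * levels).astype(np.int8)
    return idx, quantized, scale               # what actually gets uploaded

def decompress_update(shape, idx, quantized, scale, n_bits=8):
    """Server-side reconstruction of the sparse, quantized update."""
    levels = 2 ** (n_bits - 1) - 1
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = quantized.astype(np.float32) / levels * scale
    return flat.reshape(shape)

update = np.random.randn(4, 4)
idx, q, s = compress_update(update, k_frac=0.25)
approx = decompress_update(update.shape, idx, q, s)
```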
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
- Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider an FL framework with partial model pruning and personalization to overcome the challenges posed by heterogeneous edge devices.
This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
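A minimal sketch of that split, assuming magnitude-based pruning for the shared part (the paper's actual pruning criterion and interfaces may differ):

```python
import numpy as np

def prune_global_part(global_layers, keep_frac=0.5):
    """Magnitude-prune the shared representation layers before broadcast."""
    pruned = {}
    for name, w in global_layers.items():
        k = max(1, int(keep_frac * w.size))
        thresh = np.sort(np.abs(w).ravel())[-k]            # k-th largest magnitude
        pruned[name] = np.where(np.abs(w) >= thresh, w, 0.0)
    return pruned

# Each device receives the pruned global part and keeps a private head
# that is fine-tuned locally and never uploaded.
global_part = {"encoder": np.random.randn(8, 4)}
device_payload = {"global": prune_global_part(global_part, keep_frac=0.25),
                  "personal_head": np.zeros((4, 2))}       # stays on the device
```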
arXiv Detail & Related papers (2023-09-04T21:10:45Z)
- DynamicFL: Balancing Communication Dynamics and Client Manipulation for Federated Learning [6.9138560535971605]
Federated Learning (FL) aims to train a global model by exploiting the decentralized data across millions of edge devices.
Given the geo-distributed edge devices with highly dynamic networks in the wild, aggregating all the model updates from those participating devices will result in inevitable long-tail delays in FL.
We propose a novel FL framework, DynamicFL, by considering the communication dynamics and data quality across massive edge devices with a specially designed client manipulation strategy.
arXiv Detail & Related papers (2023-07-16T19:09:31Z)
- MetaNetwork: A Task-agnostic Network Parameters Generation Framework for Improving Device Model Generalization [65.02542875281233]
We propose a novel task-agnostic framework, named MetaNetwork, for generating adaptive device model parameters from cloud without on-device training.
The MetaGenerator is designed to learn a mapping function from samples to model parameters, and it can generate and deliver the adaptive parameters to the device based on samples uploaded from the device to the cloud.
The MetaStabilizer aims to reduce the oscillation of the MetaGenerator, accelerate the convergence and improve the model performance during both training and inference.
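One way such a sample-to-parameters mapping could look, reduced to a single linear generator (purely illustrative; the MetaGenerator in the paper is a learned network, and the names below are assumptions):

```python
import numpy as np

def meta_generator(sample_embeddings, W_gen, b_gen):
    """Toy hypernetwork-style generator: map the mean embedding of a
    device's uploaded samples to a flat parameter vector that the cloud
    ships back to that device. W_gen and b_gen are the generator's own
    (cloud-trained) weights."""
    device_summary = sample_embeddings.mean(axis=0)    # summarize the device's data
    return W_gen @ device_summary + b_gen              # adaptive device parameters

emb = np.random.randn(32, 16)                          # 32 samples, 16-dim embeddings
W_gen, b_gen = np.random.randn(100, 16), np.zeros(100) # 100 device-model parameters
device_params = meta_generator(emb, W_gen, b_gen)
```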
arXiv Detail & Related papers (2022-09-12T13:26:26Z)
- Reducing Impacts of System Heterogeneity in Federated Learning using Weight Update Magnitudes [0.0]
Federated learning enables machine learning models to train locally on each handheld device while only synchronizing their neuron updates with a server.
This results in the training time of federated learning tasks being dictated by a few low-performance straggler devices.
In this work, we aim to mitigate the performance bottleneck of federated learning by dynamically forming sub-models for stragglers based on their performance and accuracy feedback.
arXiv Detail & Related papers (2022-08-30T00:39:06Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Federated Dropout -- A Simple Approach for Enabling Federated Learning on Resource Constrained Devices [40.69663094185572]
Federated learning (FL) is a popular framework for training an AI model using distributed mobile data in a wireless network.
One main challenge confronting practical FL is that resource constrained devices struggle with the computation intensive task of updating a deep-neural network model.
To tackle the challenge, in this paper, a federated dropout (FedDrop) scheme is proposed building on the classic dropout scheme for random model pruning.
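A toy sketch of the random sub-model idea behind such a dropout scheme (the keep probability, row-wise masking, and merge step are illustrative assumptions, not FedDrop's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_submodel(layer, keep_prob=0.5):
    """Randomly keep a subset of output neurons (rows) of a weight matrix."""
    mask = rng.random(layer.shape[0]) < keep_prob
    return layer[mask], mask

def merge_submodel(layer, sub_layer, mask):
    """Write the locally trained rows back into the full global layer."""
    merged = layer.copy()
    merged[mask] = sub_layer
    return merged

full = np.ones((6, 3))
sub, mask = draw_submodel(full, keep_prob=0.5)  # device trains only `sub`
sub_trained = sub * 0.9                         # stand-in for local training
print(merge_submodel(full, sub_trained, mask))
```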
arXiv Detail & Related papers (2021-09-30T16:52:13Z)
- Adaptive Dynamic Pruning for Non-IID Federated Learning [3.8666113275834335]
Federated Learning (FL) has emerged as a new paradigm for training machine learning models without sacrificing data security and privacy.
We present an adaptive pruning scheme for edge devices in an FL system, which applies dataset-aware dynamic pruning for inference acceleration on Non-IID datasets.
arXiv Detail & Related papers (2021-06-13T05:27:43Z)
- Over-the-Air Federated Learning from Heterogeneous Data [107.05618009955094]
Federated learning (FL) is a framework for distributed learning of centralized models.
We develop a Convergent OTA FL (COTAF) algorithm which enhances the common local stochastic gradient descent (SGD) FL algorithm.
We numerically show that the precoding induced by COTAF notably improves the convergence rate and the accuracy of models trained via OTA FL.
arXiv Detail & Related papers (2020-09-27T08:28:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.