Fast Server Learning Rate Tuning for Coded Federated Dropout
- URL: http://arxiv.org/abs/2201.11036v1
- Date: Wed, 26 Jan 2022 16:19:04 GMT
- Title: Fast Server Learning Rate Tuning for Coded Federated Dropout
- Authors: Giacomo Verardo, Daniel Barreira, Marco Chiesa and Dejan Kostic
- Abstract summary: Federated Dropout (FD) is a technique that improves the communication efficiency of an FL session.
We leverage coding theory to enhance FD by allowing a different sub-model to be used at each client.
For the EMNIST dataset, our mechanism achieves 99.6% of the final accuracy of the no-dropout case.
- Score: 3.9653673778225946
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In cross-device Federated Learning (FL), clients with low computational power
train a common machine learning model by exchanging parameter updates instead of
potentially private data. Federated Dropout (FD) is a technique that improves
the communication efficiency of an FL session by selecting a subset of model
variables to be updated in each training round. However, FD produces
considerably lower accuracy and higher convergence time compared to standard
FL. In this paper, we leverage coding theory to enhance FD by allowing a
different sub-model to be used at each client. We also show that by carefully
tuning the server learning rate hyper-parameter, we can achieve higher training
speed and up to the same final accuracy as the no-dropout case. For the EMNIST
dataset, our mechanism achieves 99.6% of the final accuracy of the no-dropout
case while requiring 2.43x less bandwidth to reach this accuracy level.
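The abstract combines two mechanisms: Federated Dropout, where each client only trains a sub-model in a given round (with coding theory used to hand different clients different sub-models), and a server learning rate that scales the aggregated updates. Below is a minimal NumPy sketch of these two ideas under simplifying assumptions; the random per-client sub-model selection and the fake local training step are placeholders, not the authors' coded construction or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "model": a single dense layer stored as one weight matrix.
W_server = rng.normal(size=(64, 32))

def sample_submodel(num_units, keep_frac, rng):
    """Pick the units a client trains this round (Federated Dropout).
    The paper assigns a *different* sub-model to each client via coding
    theory; an independent random subset is used here only as a stand-in."""
    k = max(1, int(keep_frac * num_units))
    return np.sort(rng.choice(num_units, size=k, replace=False))

def local_training(W_sub, rng):
    """Placeholder for a client's local training; returns a fake update."""
    return 0.01 * rng.normal(size=W_sub.shape)

def federated_dropout_round(W, num_clients, keep_frac, server_lr, rng):
    """One round: each client trains its own sub-model; the server averages
    the updates coordinate-wise and applies them scaled by the server
    learning rate (the hyper-parameter the paper tunes)."""
    delta_sum = np.zeros_like(W)
    counts = np.zeros(W.shape[0])
    for _ in range(num_clients):
        rows = sample_submodel(W.shape[0], keep_frac, rng)
        delta_sum[rows, :] += local_training(W[rows, :], rng)
        counts[rows] += 1
    touched = counts > 0
    delta_avg = np.zeros_like(W)
    delta_avg[touched, :] = delta_sum[touched, :] / counts[touched][:, None]
    return W + server_lr * delta_avg

for _ in range(5):
    W_server = federated_dropout_round(W_server, num_clients=10,
                                       keep_frac=0.5, server_lr=1.0, rng=rng)
```

Tuning `server_lr` (rather than fixing it at 1.0, i.e. plain averaging) is the hyper-parameter the title refers to.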
Related papers
- SPD-CFL: Stepwise Parameter Dropout for Efficient Continual Federated Learning [18.917283498639442]
We propose the Stepwise Parameter Dropout for Continual Federated Learning (SPD-CFL) approach.
It allows users to specify the target level of performance and then attempts to find the most suitable dropout rate for the given FL model.
It achieves a 2.07% higher test AUC while reducing communication overhead by 29.53%.
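The summary only states that SPD-CFL takes a target performance level and searches for a suitable dropout rate. The sketch below is a guess at what such a stepwise search could look like; `evaluate`, the step size, and the AUC target are hypothetical placeholders, not details from the paper.

```python
def stepwise_dropout_search(evaluate, target_auc, step=0.1, max_rate=0.9):
    """Hypothetical stepwise search: raise the dropout rate as long as the
    evaluated FL model still meets the target performance (e.g. test AUC).
    `evaluate(rate)` stands in for training/evaluating the FL model at a
    given dropout rate; it is not part of the SPD-CFL paper."""
    best_rate = 0.0
    num_steps = int(round(max_rate / step))
    for i in range(1, num_steps + 1):
        rate = round(i * step, 10)   # avoid float drift across steps
        if evaluate(rate) >= target_auc:
            best_rate = rate         # still acceptable, keep increasing dropout
        else:
            break                    # performance fell below the target
    return best_rate

# Example with a made-up curve where AUC degrades as the dropout rate grows.
print(stepwise_dropout_search(lambda r: 0.95 - 0.3 * r, target_auc=0.78))
```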
arXiv Detail & Related papers (2024-05-15T14:50:46Z) - Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting more and more attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally exploits a parameter server and a large number of edge devices during the whole process of the model training.
We propose TEASQ-Fed, which lets edge devices asynchronously participate in the training process by actively applying for tasks.
arXiv Detail & Related papers (2023-12-23T07:47:07Z) - Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization [71.87335804334616]
Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data.
The training process of Large Language Models (LLMs) generally requires updating a significant number of parameters.
This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
arXiv Detail & Related papers (2023-10-23T16:37:59Z) - Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider an FL framework with partial model pruning and personalization to overcome these challenges.
This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
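The sentence above describes an architecture split rather than an algorithm, but it can still be made concrete: a shared, pruned global part broadcast to every device and a personalized part that never leaves the device. The toy parameter dictionary and the magnitude-based pruning below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def prune_by_magnitude(weights, keep_frac):
    """Keep only the largest-magnitude entries of the shared (global) part."""
    flat = np.abs(weights).ravel()
    k = max(1, int(keep_frac * flat.size))
    threshold = np.partition(flat, flat.size - k)[flat.size - k]
    return weights * (np.abs(weights) >= threshold)

rng = np.random.default_rng(0)
model = {
    "encoder": rng.normal(size=(32, 16)),  # global part: shared and pruned
    "head": rng.normal(size=(16, 4)),      # personalized part: stays on-device
}

# What the server broadcasts to all devices (pruned representation learner):
global_payload = {"encoder": prune_by_magnitude(model["encoder"], keep_frac=0.3)}

# What each device keeps and fine-tunes locally on its own data:
personal_part = {"head": model["head"]}
```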
arXiv Detail & Related papers (2023-09-04T21:10:45Z) - SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models [28.764782216513037]
Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning.
We propose a method called SLoRA, which overcomes the key limitations of LoRA in high heterogeneous data scenarios.
Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning.
arXiv Detail & Related papers (2023-08-12T10:33:57Z) - Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z) - Optimizing the Communication-Accuracy Trade-off in Federated Learning with Rate-Distortion Theory [1.5771347525430772]
A significant bottleneck in federated learning is the network communication cost of sending model updates from client devices to the central server.
Our method encodes quantized updates with an appropriate universal code, taking into account their empirical distribution.
Because quantization introduces error, we select quantization levels by optimizing for the desired trade-off in average total gradient and distortion.
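As a heavily simplified illustration of the trade-off being optimized: quantize the update to a small number of levels, then measure both the distortion that quantization introduces and the rate an ideal entropy code could achieve on the resulting symbols. The uniform quantizer and the empirical-entropy estimate below are placeholders, not the specific universal code used in the paper.

```python
import numpy as np

def quantize_update(update, num_levels):
    """Uniform scalar quantization of a model update to `num_levels` levels."""
    lo, hi = update.min(), update.max()
    step = (hi - lo) / (num_levels - 1)
    symbols = np.round((update - lo) / step).astype(int)
    return symbols, lo + symbols * step   # integer symbols, dequantized values

def empirical_bits_per_symbol(symbols):
    """Entropy of the empirical symbol distribution: the rate an ideal
    (universal) code would approach when coding these symbols."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
update = rng.normal(scale=0.01, size=10_000)   # stand-in for a model update

for levels in (4, 16, 64):
    symbols, dequantized = quantize_update(update, levels)
    distortion = float(np.mean((update - dequantized) ** 2))
    rate = empirical_bits_per_symbol(symbols)
    print(f"{levels:3d} levels: ~{rate:.2f} bits/parameter, MSE {distortion:.2e}")
```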
arXiv Detail & Related papers (2022-01-07T20:17:33Z) - Automatic Tuning of Federated Learning Hyper-Parameters from System Perspective [15.108050457914516]
Federated learning (FL) is a distributed model training paradigm that preserves clients' data privacy.
We propose FedTuning, an automatic FL hyper-parameter tuning algorithm tailored to applications' diverse system requirements for FL training.
FedTuning is lightweight and flexible, achieving an average of 41% improvement for different training preferences on time, computation, and communication.
arXiv Detail & Related papers (2021-10-06T20:43:25Z) - Over-the-Air Federated Learning from Heterogeneous Data [107.05618009955094]
Federated learning (FL) is a framework for distributed learning of centralized models.
We develop a Convergent OTA FL (COTAF) algorithm which enhances the common local stochastic gradient descent (SGD) FL algorithm.
We numerically show that the precoding induced by COTAF notably improves the convergence rate and the accuracy of models trained via OTA FL.
arXiv Detail & Related papers (2020-09-27T08:28:25Z) - Ensemble Distillation for Robust Model Fusion in Federated Learning [72.61259487233214]
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model.
In most of the current training schemes the central model is refined by averaging the parameters of the server model and the updated parameters from the client side.
We propose ensemble distillation for model fusion, i.e. training the central classifier through unlabeled data on the outputs of the models from the clients.
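The fusion step can be sketched as: collect the client models' predictions on unlabeled data, average them into soft targets, and train the central model against those targets. The tiny linear-classifier setup below is an illustrative stand-in, not the paper's implementation.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distill_server_model(client_weights, W_server, X_unlabeled, lr=0.5, steps=200):
    """Train the central (linear) classifier to match the averaged soft
    predictions of the client models on unlabeled data (ensemble distillation)."""
    # Soft targets: average of the clients' predicted class probabilities.
    targets = np.mean([softmax(X_unlabeled @ W) for W in client_weights], axis=0)
    for _ in range(steps):
        probs = softmax(X_unlabeled @ W_server)
        # Gradient of the cross-entropy between soft targets and server predictions.
        grad = X_unlabeled.T @ (probs - targets) / X_unlabeled.shape[0]
        W_server = W_server - lr * grad
    return W_server

rng = np.random.default_rng(0)
d, c = 20, 5                                            # features, classes
client_weights = [rng.normal(size=(d, c)) for _ in range(8)]
X_unlabeled = rng.normal(size=(512, d))
W_fused = distill_server_model(client_weights,
                               np.mean(client_weights, axis=0),  # start from the average
                               X_unlabeled)
```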
arXiv Detail & Related papers (2020-06-12T14:49:47Z) - Federated learning with hierarchical clustering of local updates to improve training on non-IID data [3.3517146652431378]
We show that learning a single joint model is often not optimal in the presence of certain types of non-IID data.
We present a modification to FL by introducing a hierarchical clustering step (FL+HC)
We show how FL+HC allows model training to converge in fewer communication rounds compared to FL without clustering.
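A minimal sketch of the clustering step, assuming each client's flattened model update is treated as a feature vector and clustered with SciPy's agglomerative (hierarchical) routines; the linkage method, metric, and number of clusters are illustrative choices, not necessarily those of the paper.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Flattened local updates from 12 clients, drawn from two synthetic non-IID groups.
updates = np.vstack([rng.normal(loc=+1.0, scale=0.1, size=(6, 50)),
                     rng.normal(loc=-1.0, scale=0.1, size=(6, 50))])

# Hierarchical clustering of the updates; each resulting cluster would then be
# trained as its own specialised model instead of one single joint model.
Z = linkage(updates, method="ward", metric="euclidean")
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)   # the two synthetic groups end up in two different clusters
```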
arXiv Detail & Related papers (2020-04-24T15:16:01Z)