Time-Correlated Sparsification for Communication-Efficient Federated
Learning
- URL: http://arxiv.org/abs/2101.08837v1
- Date: Thu, 21 Jan 2021 20:15:55 GMT
- Title: Time-Correlated Sparsification for Communication-Efficient Federated
Learning
- Authors: Emre Ozfatura and Kerem Ozfatura and Deniz Gunduz
- Abstract summary: Federated learning (FL) enables multiple clients to collaboratively train a shared model without disclosing their local datasets.
We introduce a novel time-correlated sparsification scheme, which seeks a certain correlation between the sparse representations used at consecutive iterations in FL.
We show that TCS can achieve centralized training accuracy with 100 times sparsification, and up to 2000 times reduction in the communication load when employed together with quantization.
- Score: 6.746400031322727
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) enables multiple clients to collaboratively train a
shared model without disclosing their local datasets. This is achieved by
exchanging local model updates with the help of a parameter server (PS).
However, as the size of the trained models grows, the communication load of
the iterative exchanges between the clients and the PS often becomes a
performance bottleneck. Sparse communication is often employed to reduce this
load, where only a small subset of the model updates is communicated from the
clients to the PS. In this paper, we introduce a novel time-correlated
sparsification (TCS) scheme, which builds upon the notion that a sparse
communication framework can be viewed as identifying the most significant
elements of the underlying model. Hence, TCS
seeks a certain correlation between the sparse representations used at
consecutive iterations in FL, so that the overhead due to encoding and
transmission of the sparse representation can be significantly reduced without
compromising the test accuracy. Through extensive simulations on the CIFAR-10
dataset, we show that TCS can achieve centralized training accuracy with 100
times sparsification, and up to 2000 times reduction in the communication load
when employed together with quantization.
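A minimal sketch of the idea, in Python with NumPy, is given below. It assumes a plain top-k sparsifier whose index mask is largely carried over from the previous round, so that only the small set of refreshed positions has to be encoded and transmitted each round; the function and parameter names (tcs_sparsify, refresh_frac) are hypothetical and this is not the authors' exact algorithm.

import numpy as np

def tcs_sparsify(update, prev_mask, k, refresh_frac=0.1):
    """Keep k entries of `update`, reusing most of the previous round's mask.

    Only about refresh_frac * k positions are newly selected each round, so
    the per-round index overhead is roughly refresh_frac * k instead of k.
    """
    d = update.size
    if prev_mask is None:
        # First round: ordinary top-k selection by magnitude.
        kept = np.argpartition(np.abs(update), -k)[-k:]
    else:
        n_new = max(1, int(refresh_frac * k))
        prev_idx = np.flatnonzero(prev_mask)
        # Carry over the strongest (k - n_new) positions of the previous mask.
        keep_old = prev_idx[np.argsort(np.abs(update[prev_idx]))[-(k - n_new):]]
        # Refresh n_new positions with the largest-magnitude entries outside it.
        outside = np.setdiff1d(np.arange(d), prev_idx, assume_unique=True)
        add_new = outside[np.argpartition(np.abs(update[outside]), -n_new)[-n_new:]]
        kept = np.concatenate([keep_old, add_new])
    mask = np.zeros(d, dtype=bool)
    mask[kept] = True
    return np.where(mask, update, 0.0), mask

In such a scheme the entries dropped by the mask would typically be accumulated locally and added to the next round's update (error feedback), a standard companion to sparsified communication.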
Related papers
- FedComLoc: Communication-Efficient Distributed Training of Sparse and Quantized Models [56.21666819468249]
Federated Learning (FL) has garnered increasing attention due to its unique characteristic of allowing heterogeneous clients to process their private data locally and interact with a central server.
We introduce FedComLoc, integrating practical and effective compression into Scaffnew to further enhance communication efficiency.
arXiv Detail & Related papers (2024-03-14T22:29:59Z)
- Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes [54.18186259484828]
In the Federated Learning (FL) paradigm, a parameter server (PS) concurrently communicates with distributed participating clients for model collection, update aggregation, and model distribution over multiple rounds.
We show strong evidence that variable-length coding is beneficial for compression in FL.
We present Fed-CVLC (Federated Learning Compression with Variable-Length Codes), which fine-tunes the code length in response to the dynamics of model updates.
arXiv Detail & Related papers (2024-02-06T07:25:21Z)
- Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation [10.541541376305245]
Federated Learning (FL) is a promising technique for the collaborative training of deep neural networks across multiple devices.
FL is hindered by excessive communication costs due to repeated server-client communication during training.
We propose FedCompress, a novel approach that combines dynamic weight clustering and server-side knowledge distillation.
arXiv Detail & Related papers (2024-01-25T14:49:15Z)
- Fundamental Limits of Communication Efficiency for Model Aggregation in Distributed Learning: A Rate-Distortion Approach [54.311495894129585]
We study the limit of communication cost of model aggregation in distributed learning from a rate-distortion perspective.
It is found that the communication gain by exploiting the correlation between worker nodes is significant for SignSGD.
arXiv Detail & Related papers (2022-06-28T13:10:40Z)
- On the Convergence Time of Federated Learning Over Wireless Networks Under Imperfect CSI [28.782485580296374]
We propose a training process that takes channel statistics as a bias to minimize the convergence time under imperfect CSI.
We also examine the trade-off between the number of clients involved in the training process and the model accuracy under different fading regimes.
arXiv Detail & Related papers (2021-04-01T08:30:45Z)
- Adaptive Quantization of Model Updates for Communication-Efficient Federated Learning [75.45968495410047]
Communication of model updates between client nodes and the central aggregating server is a major bottleneck in federated learning.
Gradient quantization is an effective way of reducing the number of bits required to communicate each model update (a generic sketch of this primitive appears after this list).
We propose an adaptive quantization strategy called AdaFL that aims to achieve communication efficiency as well as a low error floor.
arXiv Detail & Related papers (2021-02-08T19:14:21Z)
- CosSGD: Nonlinear Quantization for Communication-efficient Federated Learning [62.65937719264881]
Federated learning facilitates learning across clients without transferring local data on these clients to a central server.
We propose a nonlinear quantization scheme for compressed gradient descent that can be easily utilized in federated learning.
Our system significantly reduces the communication cost by up to three orders of magnitude, while maintaining convergence and accuracy of the training process.
arXiv Detail & Related papers (2020-12-15T12:20:28Z)
- Distributed Sparse SGD with Majority Voting [5.32836690371986]
We introduce a majority voting based sparse communication strategy for distributed learning.
We show that it is possible to achieve up to 4000x compression without any loss in the test accuracy.
arXiv Detail & Related papers (2020-11-12T17:06:36Z)
- Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism [56.78673028601739]
We propose a compression framework called Dynamic Communication Thresholding (DCT) for communication-efficient hybrid training.
DCT reduces communication by at least 100x and 20x during DP and MP, respectively.
It improves end-to-end training time for a state-of-the-art industrial recommender model by 37%, without any loss in performance.
arXiv Detail & Related papers (2020-10-18T01:44:42Z)
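Several of the papers listed above (Fed-CVLC, AdaFL, CosSGD) refine how model updates are quantized before transmission. For reference, the sketch below shows the generic stochastic uniform quantizer that such schemes typically build on, for example by adapting the bit width, using nonlinear levels, or varying the code length; it is not taken from any of the listed papers, and all names are illustrative.

import numpy as np

def stochastic_quantize(x, num_bits=4):
    """Quantize a tensor to 2**num_bits uniform levels with unbiased rounding."""
    levels = 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / levels if x_max > x_min else 1.0
    normalized = (x - x_min) / scale            # values in [0, levels]
    lower = np.floor(normalized)
    prob_up = normalized - lower                # round up with this probability
    q = lower + (np.random.rand(*x.shape) < prob_up)
    # The client sends q (num_bits per entry) plus the scalars x_min and scale.
    return q.astype(np.uint8), x_min, scale

def dequantize(q, x_min, scale):
    """Server-side reconstruction; unbiased, i.e. E[dequantize(q)] equals x."""
    return q.astype(np.float32) * scale + x_min

The abstract above reports further gains (up to a 2000-fold reduction in communication load) when a quantizer of this general kind is applied on top of sparsification.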