ResFed: Communication Efficient Federated Learning by Transmitting Deep
Compressed Residuals
- URL: http://arxiv.org/abs/2212.05602v1
- Date: Sun, 11 Dec 2022 20:34:52 GMT
- Title: ResFed: Communication Efficient Federated Learning by Transmitting Deep
Compressed Residuals
- Authors: Rui Song, Liguo Zhou, Lingjuan Lyu, Andreas Festag, Alois Knoll
- Abstract summary: Federated learning enables cooperative training among massively distributed clients by sharing their learned local model parameters.
We introduce a residual-based federated learning framework (ResFed), where residuals rather than model parameters are transmitted in communication networks for training.
By employing a common prediction rule, both locally and globally updated models are always fully recoverable by the clients and the server.
- Score: 24.13593410107805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning enables cooperative training among massively distributed
clients by sharing their learned local model parameters. However, with
increasing model size, deploying federated learning requires a large
communication bandwidth, which limits its deployment in wireless networks. To
address this bottleneck, we introduce a residual-based federated learning
framework (ResFed), where residuals rather than model parameters are
transmitted in communication networks for training. In particular, we integrate
two pairs of shared predictors for the model prediction in both
server-to-client and client-to-server communication. By employing a common
prediction rule, both locally and globally updated models are always fully
recoverable by the clients and the server. We highlight that the residuals only
indicate the quasi-update of a model within a single round and hence carry
denser information and have lower entropy than model weights and gradients.
Based on this property, we further conduct lossy
compression of the residuals by sparsification and quantization and encode them
for efficient communication. The experimental evaluation shows that ResFed
requires remarkably lower communication cost and achieves better accuracy than
standard federated learning by leveraging the less sensitive residuals.
For instance, to train a 4.08 MB CNN model on CIFAR-10 with 10 clients under a
non-independent and identically distributed (Non-IID) setting, our approach
achieves a compression ratio of over 700X in each communication round with
minimal impact on accuracy. To reach an accuracy of 70%, it saves around 99% of
the total communication volume, reducing it from 587.61 Mb to 6.79 Mb for
up-streaming and to 4.61 Mb for down-streaming on average across all clients.
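To make the residual-transmission idea concrete, below is a minimal sketch of a single client-to-server exchange. It is only an illustration under simplifying assumptions: it uses the last synchronized model as the shared predictor and combines top-k sparsification with uniform 8-bit quantization; the names (predict_model, compress_residual, decompress_residual) and hyperparameters (k_ratio, num_bits) are hypothetical and not taken from the paper, whose actual predictors, prediction rule, and codec may differ.

```python
import numpy as np

def predict_model(history):
    """Shared prediction rule: here simply the last synchronized model.
    (ResFed allows other shared predictors; this simple choice is an assumption.)"""
    return history[-1]

def compress_residual(residual, k_ratio=0.01, num_bits=8):
    """Lossy compression of a residual: keep the k largest-magnitude entries
    (top-k sparsification), then uniformly quantize them to num_bits bits."""
    flat = residual.ravel()
    k = max(1, int(k_ratio * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]            # indices of top-k entries
    values = flat[idx]
    scale = float(np.abs(values).max()) / (2 ** (num_bits - 1) - 1)
    scale = scale if scale > 0 else 1.0                      # avoid division by zero
    q_values = np.round(values / scale).astype(np.int8)      # uniform quantization
    return idx, q_values, scale, residual.shape

def decompress_residual(idx, q_values, scale, shape):
    """Rebuild a dense residual from the sparse, quantized payload."""
    flat = np.zeros(int(np.prod(shape)), dtype=np.float32)
    flat[idx] = q_values.astype(np.float32) * scale
    return flat.reshape(shape)

# One client-to-server exchange on a toy 1-D "model" of 1000 weights.
rng = np.random.default_rng(0)
history = [rng.normal(size=1000).astype(np.float32)]         # last synchronized model
local_update = history[-1] + 0.01 * rng.normal(size=1000).astype(np.float32)

prediction = predict_model(history)                          # identical on both sides
payload = compress_residual(local_update - prediction)       # client transmits this
recovered = prediction + decompress_residual(*payload)       # server reconstructs

print("max reconstruction error:", float(np.abs(recovered - local_update).max()))
```

Because client and server apply the same prediction rule to the same model history, adding the decompressed residual to the prediction recovers the updated model on the receiving side, up to the loss introduced by sparsification and quantization.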
Related papers
- Communication-Efficient Federated Learning with Adaptive Compression under Dynamic Bandwidth [6.300376113680886]
Federated learning can train models without directly providing local data to the server.
Recent work has improved the communication efficiency of federated learning mainly through model compression.
We demonstrate the performance of the AdapComFL algorithm and compare it with existing algorithms.
arXiv Detail & Related papers (2024-05-06T08:00:43Z) - Fed-CVLC: Compressing Federated Learning Communications with
Variable-Length Codes [54.18186259484828]
In the Federated Learning (FL) paradigm, a parameter server (PS) concurrently communicates with distributed participating clients for model collection, update aggregation, and model distribution over multiple rounds.
We show strong evidence that variable-length coding is beneficial for compression in FL.
We present Fed-CVLC (Federated Learning Compression with Variable-Length Codes), which fine-tunes the code length in response to the dynamics of model updates.
arXiv Detail & Related papers (2024-02-06T07:25:21Z) - Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating local training.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z) - SalientGrads: Sparse Models for Communication Efficient and Data Aware
Distributed Federated Training [1.0413504599164103]
Federated learning (FL) enables the training of a model leveraging decentralized data in client sites while preserving privacy by not collecting data.
One of the significant challenges of FL is limited computation and low communication bandwidth in resource limited edge client nodes.
We propose Salient Grads, which simplifies the process of sparse training by choosing a data aware subnetwork before training.
arXiv Detail & Related papers (2023-04-15T06:46:37Z) - FedCliP: Federated Learning with Client Pruning [3.796320380104124]
Federated learning (FL) is a newly emerging distributed learning paradigm.
One fundamental bottleneck in FL is the heavy communication overheads between the distributed clients and the central server.
We propose FedCliP, the first communication efficient FL training framework from a macro perspective.
arXiv Detail & Related papers (2023-01-17T09:15:37Z) - Acceleration of Federated Learning with Alleviated Forgetting in Local
Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z) - FedGCN: Convergence-Communication Tradeoffs in Federated Training of
Graph Convolutional Networks [14.824579000821272]
We introduce the Federated Graph Convolutional Network (FedGCN) algorithm, which uses federated learning to train GCN models for semi-supervised node classification.
Compared to prior methods that require extra communication among clients at each training round, FedGCN clients only communicate with the central server in one pre-training step.
Experimental results show that our FedGCN algorithm achieves better model accuracy with 51.7% faster convergence on average and at least 100X less communication compared to prior work.
arXiv Detail & Related papers (2022-01-28T21:39:16Z) - Comfetch: Federated Learning of Large Networks on Constrained Clients
via Sketching [28.990067638230254]
Federated learning (FL) is a popular paradigm for private and collaborative model training on the edge.
We propose a novel algorithm, Comfetch, which allows clients to train large networks using sketched representations of the global neural network.
arXiv Detail & Related papers (2021-09-17T04:48:42Z) - FedKD: Communication Efficient Federated Learning via Knowledge
Distillation [56.886414139084216]
Federated learning is widely used to learn intelligent models from decentralized data.
In federated learning, clients need to communicate their local model updates in each iteration of model learning.
We propose a communication efficient federated learning method based on knowledge distillation.
arXiv Detail & Related papers (2021-08-30T15:39:54Z) - A Bayesian Federated Learning Framework with Online Laplace
Approximation [144.7345013348257]
Federated learning allows multiple clients to collaboratively learn a globally shared model.
We propose a novel FL framework that uses online Laplace approximation to approximate posteriors on both the client and server side.
We achieve state-of-the-art results on several benchmarks, clearly demonstrating the advantages of the proposed method.
arXiv Detail & Related papers (2021-02-03T08:36:58Z) - Training Recommender Systems at Scale: Communication-Efficient Model and
Data Parallelism [56.78673028601739]
We propose a compression framework called Dynamic Communication Thresholding (DCT) for communication-efficient hybrid training.
DCT reduces communication by at least 100X and 20X during data parallelism (DP) and model parallelism (MP), respectively.
It improves end-to-end training time for a state-of-the-art industrial recommender model by 37%, without any loss in performance.
arXiv Detail & Related papers (2020-10-18T01:44:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.