Stochastic Controlled Averaging for Federated Learning with Communication Compression
- URL: http://arxiv.org/abs/2308.08165v2
- Date: Tue, 9 Apr 2024 06:47:55 GMT
- Title: Stochastic Controlled Averaging for Federated Learning with Communication Compression
- Authors: Xinmeng Huang, Ping Li, Xiaoyun Li
- Abstract summary: We propose two compressed FL algorithms, SCALLION and SCAFCOM, to support unbiased and biased compression.
Both proposed methods outperform existing compressed FL methods in terms of communication and computation complexities.
- Score: 36.09135695005069
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Communication compression, a technique aiming to reduce the information volume to be transmitted over the air, has gained great interest in Federated Learning (FL) for its potential to alleviate communication overhead. However, communication compression brings new challenges in FL due to the interplay between compression-incurred information distortion and inherent characteristics of FL such as partial participation and data heterogeneity. Despite recent developments, the potential of compressed FL approaches has not been fully exploited: existing approaches either cannot accommodate arbitrary data heterogeneity or partial participation, or require stringent conditions on compression. In this paper, we revisit the seminal stochastic controlled averaging method by proposing an equivalent but more efficient and simplified formulation with halved uplink communication costs. Building upon this implementation, we propose two compressed FL algorithms, SCALLION and SCAFCOM, which support unbiased and biased compression, respectively. Both methods outperform existing compressed FL methods in terms of communication and computation complexities. Moreover, SCALLION and SCAFCOM accommodate arbitrary data heterogeneity and do not make additional assumptions on compression errors. Experiments show that SCALLION and SCAFCOM match the performance of corresponding full-precision FL approaches with substantially reduced uplink communication, and outperform recent compressed FL methods under the same communication budget.
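To make the mechanism described in the abstract more concrete, here is a minimal Python sketch of a SCAFFOLD-style controlled-averaging client step paired with an unbiased rand-k compressor on the uplink. All names (client_update, rand_k_compress), the drift-correction rule, and the choice of compressor are illustrative assumptions; this is not the SCALLION/SCAFCOM algorithm as specified in the paper.

```python
import numpy as np

def rand_k_compress(v, k, rng):
    """Unbiased rand-k sparsification: keep k random coordinates, rescale by d/k so E[C(v)] = v."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

def client_update(x_server, c_server, c_local, grad_fn, lr, local_steps, k, rng):
    """One round of a SCAFFOLD-style client step with a compressed uplink message (illustrative only)."""
    x = x_server.copy()
    for _ in range(local_steps):
        g = grad_fn(x)                                # stochastic gradient on this client's data
        x -= lr * (g - c_local + c_server)            # drift-corrected local SGD step
    delta = (x_server - x) / (lr * local_steps)       # averaged local direction over the round
    c_local_new = c_local + delta - c_server          # refresh the local control variate
    msg = rand_k_compress(delta, k, rng)              # only one compressed vector goes uplink
    return msg, c_local_new
```

In this sketch the control variates absorb client drift caused by data heterogeneity, so a single compressed vector per round suffices on the uplink, which mirrors the "halved uplink communication" motivation stated in the abstract.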
Related papers
- Bandwidth-Aware and Overlap-Weighted Compression for Communication-Efficient Federated Learning [29.727339562140653]
Current data compression methods, such as sparsification in Federated Averaging (FedAvg), effectively enhance the communication efficiency of Federated Learning (FL).
These methods encounter challenges such as the straggler problem and diminished model performance due to heterogeneous bandwidth and non-IID data.
We introduce a bandwidth-aware compression framework for FL, aimed at improving communication efficiency while mitigating the problems associated with non-IID data.
arXiv Detail & Related papers (2024-08-27T02:28:27Z)
- Differential error feedback for communication-efficient decentralized learning [48.924131251745266]
We propose a new decentralized communication-efficient learning approach that blends differential quantization with error feedback.
We show that the resulting communication-efficient strategy is stable both in terms of mean-square error and average bit rate.
The results establish that, in the small step-size regime and with a finite number of bits, it is possible to attain the performance achievable in the absence of compression.
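For context on the error-feedback component mentioned above, the following is a minimal sketch of classic error feedback with a biased top-k compressor (not the differential scheme proposed in this paper); the memory buffer re-injects past compression error so that the bias does not accumulate.

```python
import numpy as np

def topk_compress(v, k):
    """Biased top-k sparsification: keep only the k largest-magnitude coordinates."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def ef_step(grad, memory, k):
    """Classic error feedback: compress gradient plus residual, carry the leftover error forward."""
    corrected = grad + memory            # re-inject the error accumulated in earlier rounds
    msg = topk_compress(corrected, k)    # the only vector actually transmitted
    memory = corrected - msg             # compression error kept locally for the next round
    return msg, memory
```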
arXiv Detail & Related papers (2024-06-26T15:11:26Z)
- FedComLoc: Communication-Efficient Distributed Training of Sparse and Quantized Models [56.21666819468249]
Federated Learning (FL) has garnered increasing attention due to its unique characteristic of allowing heterogeneous clients to process their private data locally and interact with a central server.
We introduce FedComLoc, integrating practical and effective compression into Scaffnew to further enhance communication efficiency.
arXiv Detail & Related papers (2024-03-14T22:29:59Z)
- Communication-Efficient Distributed Learning with Local Immediate Error Compensation [95.6828475028581]
We propose the Local Immediate Error Compensated SGD (LIEC-SGD) optimization algorithm.
LIEC-SGD is superior to previous works in either the convergence rate or the communication cost.
arXiv Detail & Related papers (2024-02-19T05:59:09Z)
- Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes [54.18186259484828]
In Federated Learning (FL) paradigm, a parameter server (PS) concurrently communicates with distributed participating clients for model collection, update aggregation, and model distribution over multiple rounds.
We show strong evidence that variable-length coding is beneficial for compression in FL.
We present Fed-CVLC (Federated Learning Compression with Variable-Length Codes), which fine-tunes the code length in response to the dynamics of model updates.
arXiv Detail & Related papers (2024-02-06T07:25:21Z)
- Unbiased Compression Saves Communication in Distributed Optimization: When and How Much? [22.701976655513043]
Communication compression can alleviate communication overhead by transmitting compressed gradients and model parameters.
It is unclear whether communication compression reduces the total communication cost.
We show that using independent unbiased compression can reduce the total communication cost by a factor of up to $\Theta(\sqrt{\min\{n, \kappa\}})$ when all local smoothness constants are constrained.
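As a quick numerical illustration of what "unbiased compression" means here, the sketch below implements a QSGD-style stochastic quantizer and checks empirically that its average output recovers the input, i.e. E[C(x)] = x. The quantizer and the parameter choices are illustrative assumptions, not the specific operators analyzed in the paper.

```python
import numpy as np

def stochastic_quantize(v, levels, rng):
    """QSGD-style unbiased quantizer: randomized rounding onto a grid of `levels` steps scaled by ||v||."""
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return v.copy()
    scaled = np.abs(v) / norm * levels                  # map |v_i| into [0, levels]
    lower = np.floor(scaled)
    prob_up = scaled - lower                            # round up with exactly this probability
    quantized = lower + (rng.random(v.shape) < prob_up)
    return np.sign(v) * quantized * norm / levels

rng = np.random.default_rng(0)
x = rng.standard_normal(100)
avg = np.mean([stochastic_quantize(x, levels=8, rng=rng) for _ in range(5000)], axis=0)
print(np.max(np.abs(avg - x)))   # close to zero: the compressor satisfies E[C(x)] = x
```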
arXiv Detail & Related papers (2023-05-25T17:51:23Z)
- Distributed Newton-Type Methods with Communication Compression and Bernoulli Aggregation [11.870393751095083]
We study communication compression and aggregation mechanisms for curvature information.
New 3PC mechanisms, such as adaptive thresholding and Bernoulli aggregation, require reduced communication and occasional Hessian computations.
For all our methods, we derive fast condition-number-independent local linear and/or superlinear convergence rates.
arXiv Detail & Related papers (2022-06-07T21:12:21Z)
- 1-Bit Compressive Sensing for Efficient Federated Learning Over the Air [32.14738452396869]
This paper develops and analyzes a communication-efficient scheme for federated learning (FL) over the air, which incorporates 1-bit compressive sensing (CS) into analog aggregation transmissions.
For scalable computing, we develop an efficient implementation that is suitable for large-scale networks.
Simulation results show that our proposed 1-bit CS based FL over the air achieves comparable performance to the ideal case.
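As a rough illustration of the 1-bit idea (not this paper's compressive-sensing and analog-aggregation scheme), the sketch below transmits only the sign of each coordinate plus a single scaling scalar per client and averages the rescaled messages at the server; the names one_bit_encode and server_aggregate are hypothetical.

```python
import numpy as np

def one_bit_encode(update):
    """1-bit compression: transmit only the sign of each coordinate plus one scaling scalar."""
    scale = np.mean(np.abs(update))      # a single float sent alongside the sign bits
    return np.sign(update), scale

def server_aggregate(messages):
    """Average the rescaled 1-bit messages from all clients."""
    return np.mean([signs * scale for signs, scale in messages], axis=0)

rng = np.random.default_rng(1)
client_updates = [rng.standard_normal(10) for _ in range(20)]     # hypothetical client updates
estimate = server_aggregate([one_bit_encode(u) for u in client_updates])
print(estimate)   # a coarse, 1-bit-per-coordinate estimate of the averaged update
```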
arXiv Detail & Related papers (2021-03-30T03:50:31Z)
- A Linearly Convergent Algorithm for Decentralized Optimization: Sending Less Bits for Free! [72.31332210635524]
Decentralized optimization methods enable on-device training of machine learning models without a central coordinator.
We propose a new randomized first-order method which tackles the communication bottleneck by applying randomized compression operators.
We prove that our method can solve the problems without any increase in the number of communications compared to the baseline.
arXiv Detail & Related papers (2020-11-03T13:35:53Z)