Clipped Uniform Quantizers for Communication-Efficient Federated Learning
- URL: http://arxiv.org/abs/2405.13365v1
- Date: Wed, 22 May 2024 05:48:25 GMT
- Title: Clipped Uniform Quantizers for Communication-Efficient Federated Learning
- Authors: Zavareh Bozorgasl, Hao Chen
- Abstract summary: This paper introduces an approach to employ clipped uniform quantization in federated learning settings.
By employing optimal clipping thresholds and adaptive quantization schemes, our method significantly curtails the bit requirements for model weight transmissions.
- Score: 3.38220960870904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces an approach to employ clipped uniform quantization in federated learning settings, aiming to enhance model efficiency by reducing communication overhead without compromising accuracy. By employing optimal clipping thresholds and adaptive quantization schemes, our method significantly curtails the bit requirements for model weight transmissions between clients and the server. We explore the implications of symmetric clipping and uniform quantization on model performance, highlighting the utility of stochastic quantization to mitigate quantization artifacts and improve model robustness. Through extensive simulations on the MNIST dataset, our results demonstrate that the proposed method achieves near full-precision performance while ensuring substantial communication savings. Specifically, our approach facilitates efficient weight averaging based on quantization errors, effectively balancing the trade-off between communication efficiency and model accuracy. The comparative analysis with conventional quantization methods further confirms the superiority of our technique.
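The pipeline described in the abstract (symmetric clipping to a threshold, uniform quantization of the clipped weights, and stochastic rounding to mitigate quantization artifacts) can be sketched in a few lines. The NumPy snippet below is a minimal illustration under assumed parameters, not the authors' implementation; the function names, the 4-bit default, and the inverse-error averaging rule are illustrative stand-ins for the paper's "weight averaging based on quantization errors".

```python
# Minimal sketch (not the paper's released code): symmetric clipping followed by
# b-bit uniform quantization with optional stochastic rounding. The clipping
# threshold `c` and the bit-width `bits` are illustrative parameters.
import numpy as np

def clipped_uniform_quantize(w, c, bits=4, stochastic=True, rng=None):
    """Quantize weights w onto 2**bits uniform levels spanning [-c, c]."""
    rng = np.random.default_rng() if rng is None else rng
    levels = 2 ** bits - 1                 # number of quantization steps
    step = 2.0 * c / levels                # uniform step size on [-c, c]
    clipped = np.clip(w, -c, c)            # symmetric clipping
    scaled = (clipped + c) / step          # map [-c, c] onto [0, levels]
    if stochastic:
        # Stochastic rounding: round up with probability equal to the
        # fractional part, so the quantizer is unbiased in expectation.
        low = np.floor(scaled)
        q = low + (rng.random(w.shape) < (scaled - low))
    else:
        q = np.round(scaled)               # deterministic nearest-level rounding
    return q * step - c                    # dequantized weights

def error_weighted_average(client_weights, client_errors, eps=1e-12):
    """Hypothetical server-side rule: average client weights with coefficients
    proportional to the inverse of each client's quantization error."""
    inv = np.array([1.0 / (e + eps) for e in client_errors])
    coeffs = inv / inv.sum()
    return sum(a * w for a, w in zip(coeffs, client_weights))
```

In a round, each client would transmit the integer codes together with the clipping threshold so the server can reconstruct the dequantized weights, and the mean-squared difference between the original and dequantized weights could serve as the per-client quantization error fed to the averaging rule.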
Related papers
- MetaAug: Meta-Data Augmentation for Post-Training Quantization [32.02377559968568]
Post-Training Quantization (PTQ) has received significant attention because it requires only a small set of calibration data to quantize a full-precision model.
We propose a novel meta-learning based approach to enhance the performance of post-training quantization.
arXiv Detail & Related papers (2024-07-20T02:18:51Z)
- FedAQ: Communication-Efficient Federated Edge Learning via Joint Uplink and Downlink Adaptive Quantization [11.673528138087244]
Federated learning (FL) is a powerful machine learning paradigm which leverages the data as well as the computational resources of clients, while protecting clients' data privacy.
Previous research has primarily focused on the uplink communication, employing either fixed-bit quantization or adaptive quantization methods.
In this work, we introduce a holistic approach by joint uplink and downlink adaptive quantization to reduce the communication overhead.
arXiv Detail & Related papers (2024-06-26T08:14:23Z)
- Truncated Non-Uniform Quantization for Distributed SGD [17.30572818507568]
We introduce a novel two-stage quantization strategy to enhance the communication efficiency of distributed Stochastic Gradient Descent (SGD).
The proposed method initially employs truncation to mitigate the impact of long-tail noise, followed by a non-uniform quantization of the post-truncation gradients based on their statistical characteristics.
Our proposed algorithm outperforms existing quantization schemes, striking a superior balance between communication efficiency and convergence performance.
arXiv Detail & Related papers (2024-02-02T05:59:48Z)
- Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models [109.06052781040916]
We introduce a technique to enhance the inference efficiency of parameter-shared language models.
We also propose a simple pre-training technique that leads to fully or partially shared models.
Results demonstrate the effectiveness of our methods on both autoregressive and autoencoding PLMs.
arXiv Detail & Related papers (2023-10-19T15:13:58Z)
- On-Chip Hardware-Aware Quantization for Mixed Precision Neural Networks [52.97107229149988]
We propose an On-Chip Hardware-Aware Quantization framework, performing hardware-aware mixed-precision quantization on deployed edge devices.
For efficiency metrics, we built an On-Chip Quantization Aware pipeline, which allows the quantization process to perceive the actual hardware efficiency of the quantization operator.
For accuracy metrics, we propose Mask-Guided Quantization Estimation technology to effectively estimate the accuracy impact of operators in the on-chip scenario.
arXiv Detail & Related papers (2023-09-05T04:39:34Z)
- Quantization Aware Factorization for Deep Neural Network Compression [20.04951101799232]
Tensor decomposition of convolutional and fully-connected layers is an effective way to reduce parameters and FLOPs in neural networks.
A conventional post-training quantization approach applied to networks with decomposed weights yields a drop in accuracy.
This motivated us to develop an algorithm that finds decomposed approximation directly with quantized factors.
arXiv Detail & Related papers (2023-08-08T21:38:02Z)
- PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models [52.09865918265002]
We propose a novel "quantize before fine-tuning" framework, PreQuant.
PreQuant is compatible with various quantization strategies, with outlier-aware fine-tuning incorporated to correct the induced quantization error.
We demonstrate the effectiveness of PreQuant on the GLUE benchmark using BERT, RoBERTa, and T5.
arXiv Detail & Related papers (2023-05-30T08:41:33Z)
- Quantized Adaptive Subgradient Algorithms and Their Applications [39.103587572626026]
We propose quantized composite mirror descent adaptive subgradient (QCMD adagrad) and quantized regularized dual average adaptive subgradient (QRDA adagrad) for distributed training.
A quantized gradient-based adaptive learning rate matrix is constructed to achieve a balance between communication costs, accuracy, and model sparsity.
arXiv Detail & Related papers (2022-08-11T04:04:03Z)
- Fundamental Limits of Communication Efficiency for Model Aggregation in Distributed Learning: A Rate-Distortion Approach [54.311495894129585]
We study the limit of communication cost of model aggregation in distributed learning from a rate-distortion perspective.
It is found that the communication gain by exploiting the correlation between worker nodes is significant for SignSGD.
arXiv Detail & Related papers (2022-06-28T13:10:40Z)
- ClusterQ: Semantic Feature Distribution Alignment for Data-Free Quantization [111.12063632743013]
We propose a new and effective data-free quantization method termed ClusterQ.
To obtain high inter-class separability of semantic features, we cluster and align the feature distribution statistics.
We also incorporate the intra-class variance to solve class-wise mode collapse.
arXiv Detail & Related papers (2022-04-30T06:58:56Z)
- Adaptive Quantization of Model Updates for Communication-Efficient Federated Learning [75.45968495410047]
Communication of model updates between client nodes and the central aggregating server is a major bottleneck in federated learning.
Gradient quantization is an effective way of reducing the number of bits required to communicate each model update.
We propose an adaptive quantization strategy called AdaFL that aims to achieve communication efficiency as well as a low error floor.
arXiv Detail & Related papers (2021-02-08T19:14:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.