Communication-Efficient Federated Learning via Clipped Uniform Quantization
- URL: http://arxiv.org/abs/2405.13365v2
- Date: Sat, 14 Dec 2024 09:43:24 GMT
- Title: Communication-Efficient Federated Learning via Clipped Uniform Quantization
- Authors: Zavareh Bozorgasl, Hao Chen
- Abstract summary: This paper presents a novel approach to enhance communication efficiency in federated learning through clipped uniform quantization.
By leveraging optimal clipping thresholds and client-specific adaptive quantization schemes, the proposed method significantly reduces bandwidth and memory requirements for model weight transmission between clients and the server.
In contrast to federated averaging, this design obviates the need to disclose client-specific data volumes to the server, thereby enhancing client privacy.
- Score: 3.38220960870904
- Abstract: This paper presents a novel approach to enhance communication efficiency in federated learning through clipped uniform quantization. By leveraging optimal clipping thresholds and client-specific adaptive quantization schemes, the proposed method significantly reduces bandwidth and memory requirements for model weight transmission between clients and the server while maintaining competitive accuracy. We investigate the effects of symmetric clipping and uniform quantization on model performance, emphasizing the role of stochastic quantization in mitigating artifacts and improving robustness. Extensive simulations demonstrate that the method achieves near-full-precision performance with substantial communication savings. Moreover, the proposed approach facilitates efficient weight averaging based on the inverse of the mean squared quantization errors, effectively balancing the trade-off between communication efficiency and model accuracy. Furthermore, in contrast to federated averaging, this design obviates the need to disclose client-specific data volumes to the server, thereby enhancing client privacy. Comparative analysis with conventional quantization methods further confirms the efficacy of the proposed scheme.
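To make the mechanism concrete, here is a minimal numpy sketch of symmetric clipped uniform quantization with stochastic rounding, together with inverse-MSE weight averaging. This is an illustration of the idea in the abstract, not the authors' implementation; the clipping threshold, bit width, and helper names such as `inverse_mse_average` are assumptions.

```python
import numpy as np

def clipped_uniform_quantize(w, clip, bits, stochastic=True, rng=None):
    """Symmetrically clip to [-clip, clip], then quantize onto
    2**bits - 1 uniform levels, optionally with stochastic rounding."""
    w = np.asarray(w, dtype=float)
    rng = rng or np.random.default_rng()
    levels = 2 ** bits - 1
    step = 2 * clip / levels                           # uniform step size
    scaled = (np.clip(w, -clip, clip) + clip) / step   # maps into [0, levels]
    if stochastic:
        # round up with probability equal to the fractional part,
        # so the quantizer is unbiased on the clipped value
        floor = np.floor(scaled)
        q = floor + (rng.random(w.shape) < (scaled - floor))
    else:
        q = np.round(scaled)
    return q * step - clip                             # dequantized weights

def inverse_mse_average(client_weights, client_mses):
    """Hypothetical server-side helper: average client models with
    coefficients proportional to 1 / (mean squared quantization error),
    so client data volumes never need to be disclosed."""
    inv = 1.0 / np.asarray(client_mses)
    coeffs = inv / inv.sum()
    return sum(c * w for c, w in zip(coeffs, client_weights))
```

Stochastic rounding keeps the quantizer unbiased on the clipped value, which is what mitigates the deterministic-rounding artifacts the abstract mentions.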
Related papers
- Over-the-Air Fair Federated Learning via Multi-Objective Optimization [52.295563400314094]
We propose an over-the-air fair federated learning algorithm (OTA-FFL) to train fair FL models.
Experiments demonstrate the superiority of OTA-FFL in achieving fairness and robust performance.
arXiv Detail & Related papers (2025-01-06T21:16:51Z)
- Saliency Assisted Quantization for Neural Networks [0.0]
This paper tackles the inherent black-box nature of deep learning models by providing real-time explanations during the training phase.
We employ established quantization techniques to address resource constraints.
To assess the effectiveness of our approach, we explore how quantization influences the interpretability and accuracy of Convolutional Neural Networks.
arXiv Detail & Related papers (2024-11-07T05:16:26Z)
- FedAQ: Communication-Efficient Federated Edge Learning via Joint Uplink and Downlink Adaptive Quantization [11.673528138087244]
Federated learning (FL) is a powerful machine learning paradigm that leverages both the data and the computational resources of clients while protecting their data privacy.
Previous research has primarily focused on the uplink communication, employing either fixed-bit quantization or adaptive quantization methods.
In this work, we introduce a holistic approach by joint uplink and downlink adaptive quantization to reduce the communication overhead.
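A rough sketch of the joint idea follows, assuming a generic uniform quantizer applied in both link directions; the bit widths and round structure are illustrative, not FedAQ's actual scheme.

```python
import numpy as np

def uniform_quantize(v, bits):
    """Plain b-bit symmetric uniform quantizer used on both directions."""
    scale = max(float(np.max(np.abs(v))), 1e-12)
    levels = 2 ** (bits - 1) - 1
    return np.round(v / scale * levels) / levels * scale

def fl_round(global_w, client_updates, up_bits=4, down_bits=8):
    # uplink: each client quantizes its update before sending
    q_up = [uniform_quantize(u, up_bits) for u in client_updates]
    new_w = global_w + np.mean(q_up, axis=0)
    # downlink: the server also quantizes the broadcast model
    return uniform_quantize(new_w, down_bits)
```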
arXiv Detail & Related papers (2024-06-26T08:14:23Z)
- Truncated Non-Uniform Quantization for Distributed SGD [17.30572818507568]
We introduce a novel two-stage quantization strategy to enhance the communication efficiency of distributed stochastic gradient descent (SGD).
The proposed method initially employs truncation to mitigate the impact of long-tail noise, followed by a non-uniform quantization of the post-truncation gradients based on their statistical characteristics.
Our proposed algorithm outperforms existing quantization schemes, striking a superior balance between communication efficiency and convergence performance.
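The two-stage pipeline might look like the following sketch, where truncation at a percentile threshold stands in for the paper's noise-driven truncation and quantile-based levels stand in for its statistics-driven non-uniform codebook; both choices are assumptions.

```python
import numpy as np

def truncated_nonuniform_quantize(g, trunc_pct=99.0, bits=3):
    """Stage 1: truncate long-tail values; Stage 2: non-uniform
    quantization with levels placed at empirical quantiles of the
    truncated gradients (denser where the probability mass is)."""
    g = np.asarray(g, dtype=float)
    t = np.percentile(np.abs(g), trunc_pct)
    g_t = np.clip(g, -t, t)                        # truncation
    n_levels = 2 ** bits
    edges = np.quantile(g_t, np.linspace(0, 1, n_levels + 1))
    centers = 0.5 * (edges[:-1] + edges[1:])
    idx = np.clip(np.searchsorted(edges, g_t, side="right") - 1,
                  0, n_levels - 1)
    return centers[idx]
```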
arXiv Detail & Related papers (2024-02-02T05:59:48Z)
- On-Chip Hardware-Aware Quantization for Mixed Precision Neural Networks [52.97107229149988]
We propose an On-Chip Hardware-Aware Quantization framework, performing hardware-aware mixed-precision quantization on deployed edge devices.
For efficiency metrics, we built an On-Chip Quantization Aware pipeline, which allows the quantization process to perceive the actual hardware efficiency of the quantization operator.
For accuracy metrics, we propose Mask-Guided Quantization Estimation technology to effectively estimate the accuracy impact of operators in the on-chip scenario.
arXiv Detail & Related papers (2023-09-05T04:39:34Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
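As a loose illustration of the density-modeling idea (not FedGMM's federated EM procedure), one can fit a GMM to a client's inputs and flag low-likelihood samples as novel; the data, component count, and cutoff below are placeholders.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
client_x = rng.normal(size=(500, 8))           # placeholder client data

# fit a mixture to the client's input distribution
gmm = GaussianMixture(n_components=3, covariance_type="full").fit(client_x)

log_lik = gmm.score_samples(client_x)          # per-sample log-likelihood
threshold = np.quantile(log_lik, 0.01)         # illustrative cutoff
is_novel = log_lik < threshold                 # uncertainty-based detection
```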
arXiv Detail & Related papers (2023-05-01T20:04:46Z)
- Quantized Adaptive Subgradient Algorithms and Their Applications [39.103587572626026]
We propose quantized composite mirror descent adaptive subgradient (QCMD adagrad) and quantized regularized dual average adaptive subgradient (QRDA adagrad) for distributed training.
A quantized gradient-based adaptive learning rate matrix is constructed to achieve a balance between communication costs, accuracy, and model sparsity.
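A compressed sketch of the pattern, assuming a crude sign-and-scale quantizer in place of the paper's actual codec: gradients are quantized before communication, and the adaptive learning-rate matrix is accumulated from the quantized values.

```python
import numpy as np

def sign_scale_quantize(g):
    # crude 1-bit-style quantizer (illustrative; not the paper's codec)
    return np.sign(g) * np.mean(np.abs(g))

def adagrad_step(w, q_grad, accum, lr=0.1, eps=1e-8):
    # adaptive learning-rate matrix built from quantized gradients
    accum = accum + q_grad ** 2
    return w - lr * q_grad / (np.sqrt(accum) + eps), accum
```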
arXiv Detail & Related papers (2022-08-11T04:04:03Z)
- Fundamental Limits of Communication Efficiency for Model Aggregation in Distributed Learning: A Rate-Distortion Approach [54.311495894129585]
We study the limit of communication cost of model aggregation in distributed learning from a rate-distortion perspective.
We find that exploiting the correlation between worker nodes yields a significant communication gain for SignSGD.
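For reference, the standard rate-distortion function that frames this kind of analysis (the paper's specific bounds are not reproduced here) is:

```latex
R(D) = \min_{p(\hat{w} \mid w) \,:\, \mathbb{E}[d(W, \hat{W})] \le D} I(W; \hat{W})
```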
arXiv Detail & Related papers (2022-06-28T13:10:40Z)
- ClusterQ: Semantic Feature Distribution Alignment for Data-Free Quantization [111.12063632743013]
We propose a new and effective data-free quantization method termed ClusterQ.
To obtain high inter-class separability of semantic features, we cluster and align the feature distribution statistics.
We also incorporate the intra-class variance to solve class-wise mode collapse.
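A hypothetical sketch of the alignment idea: cluster per-class feature statistics and penalize synthetic batches whose statistics drift from their class's centroid. The statistics, cluster count, and loss form are assumptions, not ClusterQ's exact formulation.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
class_means = rng.normal(size=(100, 64))   # placeholder per-class feature means
km = KMeans(n_clusters=10, n_init=10).fit(class_means)

def alignment_loss(batch_feats, class_id):
    # distance between a synthetic batch's mean feature and the
    # centroid of its class's cluster (intra-class variance terms,
    # which ClusterQ uses against mode collapse, are omitted here)
    c = km.cluster_centers_[km.labels_[class_id]]
    return float(np.sum((batch_feats.mean(axis=0) - c) ** 2))
```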
arXiv Detail & Related papers (2022-04-30T06:58:56Z)
- Optimizing the Communication-Accuracy Trade-off in Federated Learning with Rate-Distortion Theory [1.5771347525430772]
A significant bottleneck in federated learning is the network communication cost of sending model updates from client devices to the central server.
Our method encodes quantized updates with an appropriate universal code, taking into account their empirical distribution.
Because quantization introduces error, we select quantization levels by optimizing for the desired trade-off between average total bitrate and gradient distortion.
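One way to picture the selection step: quantize the update at several resolutions, estimate the coded rate from the empirical entropy of the quantization indices (the rate an ideal universal code approaches), and read off the rate-distortion pairs. The uniform binning below is an assumption for illustration.

```python
import numpy as np

def entropy_bits(symbols):
    """Empirical entropy in bits/symbol of the quantization indices."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def rate_distortion_sweep(update, level_choices=(4, 8, 16, 32)):
    # evaluate (levels, rate, distortion) for each candidate resolution
    results = []
    for n in level_choices:
        edges = np.linspace(update.min(), update.max(), n + 1)
        idx = np.clip(np.digitize(update, edges) - 1, 0, n - 1)
        centers = 0.5 * (edges[:-1] + edges[1:])
        distortion = float(np.mean((update - centers[idx]) ** 2))
        results.append((n, entropy_bits(idx), distortion))
    return results
```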
arXiv Detail & Related papers (2022-01-07T20:17:33Z)
- Adaptive Quantization of Model Updates for Communication-Efficient Federated Learning [75.45968495410047]
Communication of model updates between client nodes and the central aggregating server is a major bottleneck in federated learning.
Gradient quantization is an effective way of reducing the number of bits required to communicate each model update.
We propose an adaptive quantization strategy called AdaFL that aims to achieve communication efficiency as well as a low error floor.
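The strategy suggests a schedule of the following shape (an illustrative guess, not AdaFL's published rule): few bits per round early for cheap communication, more bits later so the quantization error floor keeps shrinking as training converges.

```python
def adaptive_bits(round_t, total_rounds, b_start=2, b_end=8):
    # linearly interpolate the per-round bit width (illustrative only)
    frac = round_t / max(total_rounds - 1, 1)
    return round(b_start + frac * (b_end - b_start))
```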
arXiv Detail & Related papers (2021-02-08T19:14:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.