Truncated Non-Uniform Quantization for Distributed SGD
- URL: http://arxiv.org/abs/2402.01160v1
- Date: Fri, 2 Feb 2024 05:59:48 GMT
- Title: Truncated Non-Uniform Quantization for Distributed SGD
- Authors: Guangfeng Yan, Tan Li, Yuanzhang Xiao, Congduan Li and Linqi Song
- Abstract summary: We introduce a novel two-stage quantization strategy to enhance the communication efficiency of distributed Stochastic Gradient Descent (SGD).
The proposed method initially employs truncation to mitigate the impact of long-tail noise, followed by a non-uniform quantization of the post-truncation gradients based on their statistical characteristics.
Our proposed algorithm outperforms existing quantization schemes, striking a superior balance between communication efficiency and convergence performance.
- Score: 17.30572818507568
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To address the communication bottleneck challenge in distributed learning,
our work introduces a novel two-stage quantization strategy designed to enhance
the communication efficiency of distributed Stochastic Gradient Descent (SGD).
The proposed method initially employs truncation to mitigate the impact of
long-tail noise, followed by a non-uniform quantization of the post-truncation
gradients based on their statistical characteristics. We provide a
comprehensive convergence analysis of the quantized distributed SGD,
establishing theoretical guarantees for its performance. Furthermore, by
minimizing the convergence error, we derive optimal closed-form solutions for
the truncation threshold and non-uniform quantization levels under given
communication constraints. Both theoretical insights and extensive experimental
evaluations demonstrate that our proposed algorithm outperforms existing
quantization schemes, striking a superior balance between communication
efficiency and convergence performance.
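The abstract stops short of implementation detail, so the following is only a minimal NumPy sketch of what such a two-stage truncate-then-quantize scheme might look like; the fixed truncation threshold, the quantile-based level design, and the function name `truncated_nonuniform_quantize` are illustrative assumptions, not the paper's closed-form solutions.

```python
import numpy as np

def truncated_nonuniform_quantize(grad, trunc_threshold, num_levels):
    """Stage 1: clip entries to [-trunc_threshold, trunc_threshold] to tame
    long-tail noise. Stage 2: map magnitudes onto non-uniform levels taken
    from the empirical quantiles of the clipped values (a stand-in for a
    statistics-driven level design)."""
    clipped = np.clip(grad, -trunc_threshold, trunc_threshold)

    mags = np.abs(clipped)
    # Non-uniform codebook: denser where the magnitude distribution is denser.
    levels = np.quantile(mags, np.linspace(0.0, 1.0, num_levels))

    # Encode each magnitude to its nearest level; keep the sign separately.
    idx = np.argmin(np.abs(mags[:, None] - levels[None, :]), axis=1)
    return np.sign(clipped) * levels[idx]

# Toy run: a heavy-tailed "gradient" quantized to 8 levels after truncation.
rng = np.random.default_rng(0)
g = rng.standard_t(df=3, size=10_000)   # long-tailed stochastic gradient noise
g_hat = truncated_nonuniform_quantize(g, trunc_threshold=3.0, num_levels=8)
print("quantization MSE:", float(np.mean((g - g_hat) ** 2)))
```

In the paper itself, the truncation threshold and quantization levels are instead derived in closed form by minimizing the convergence error under the given communication constraints.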
Related papers
- Communication-Efficient Federated Learning by Quantized Variance Reduction for Heterogeneous Wireless Edge Networks [55.467288506826755]
Federated learning (FL) has been recognized as a viable solution for local-privacy-aware collaborative model training in wireless edge networks.
Most existing communication-efficient FL algorithms fail to reduce the significant inter-device variance.
We propose a novel communication-efficient FL algorithm, named FedQVR, which relies on a sophisticated variance-reduced scheme.
arXiv Detail & Related papers (2025-01-20T04:26:21Z) - A Quantum Genetic Algorithm Framework for the MaxCut Problem [49.59986385400411]
The proposed method introduces a Quantum Genetic Algorithm (QGA) using a Grover-based evolutionary framework and divide-and-conquer principles.
On complete graphs, the proposed method consistently achieves the true optimal MaxCut values, outperforming the Semidefinite Programming (SDP) approach.
On Erdős-Rényi random graphs, the QGA demonstrates competitive performance, achieving median solutions within 92-96% of the SDP results.
arXiv Detail & Related papers (2025-01-02T05:06:16Z) - Q-VLM: Post-training Quantization for Large Vision-Language Models [73.19871905102545]
We propose a post-training quantization framework of large vision-language models (LVLMs) for efficient multi-modal inference.
We mine the cross-layer dependency that significantly influences discretization errors of the entire vision-language model, and embed this dependency into the optimal quantization strategy.
Experimental results demonstrate that our method compresses memory by 2.78x and increases generation speed by 1.44x on the 13B LLaVA model without performance degradation.
arXiv Detail & Related papers (2024-10-10T17:02:48Z) - Communication-Efficient Federated Learning via Clipped Uniform Quantization [3.38220960870904]
This paper presents a novel approach to enhance communication efficiency in federated learning through clipped uniform quantization.
By leveraging optimal clipping thresholds and client-specific adaptive quantization schemes, the proposed method significantly reduces bandwidth and memory requirements for model weight transmission between clients and the server.
In contrast to federated averaging, this design obviates the need to disclose client-specific data volumes to the server, thereby enhancing client privacy.
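(A minimal illustrative sketch of this clip-then-quantize idea appears after this list.)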
arXiv Detail & Related papers (2024-05-22T05:48:25Z) - Efficient Text-driven Motion Generation via Latent Consistency Training [21.348658259929053]
We propose a motion latent consistency training framework (MLCT) to solve nonlinear reverse diffusion trajectories.
By combining these enhancements, we achieve stable and consistent training in non-pixel modality and latent representation spaces.
arXiv Detail & Related papers (2024-05-05T02:11:57Z) - Quantization Avoids Saddle Points in Distributed Optimization [1.579622195923387]
Distributed nonconvex optimization underpins key functionalities of numerous distributed systems.
The aim of this paper is to prove that quantization can enable effective escape from saddle points and convergence to a second-order stationary point.
With an easily adjustable quantization granularity, the approach gives the user control to aggressively reduce communication overhead.
arXiv Detail & Related papers (2024-03-15T15:58:20Z) - Rethinking Clustered Federated Learning in NOMA Enhanced Wireless Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap that measures the degree of non-IID in the data distribution is presented.
Solutions to address the challenges posed by non-IID conditions are proposed based on the analysis of these properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z) - Improved Quantization Strategies for Managing Heavy-tailed Gradients in Distributed Learning [20.91559450517002]
It is observed that gradient distributions are heavy-tailed, with outliers significantly influencing the design of compression strategies.
Existing parameter quantization methods experience performance degradation when this heavy-tailed feature is ignored.
We introduce a novel compression scheme specifically engineered for heavy-tailed gradients, which effectively combines truncation with quantization.
arXiv Detail & Related papers (2024-02-02T06:14:31Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Near-Term Distributed Quantum Computation using Mean-Field Corrections and Auxiliary Qubits [77.04894470683776]
We propose near-term distributed quantum computing schemes that involve limited information transfer and conservative entanglement production.
We build upon these concepts to produce an approximate circuit-cutting technique for the fragmented pre-training of variational quantum algorithms.
arXiv Detail & Related papers (2023-09-11T18:00:00Z) - Wireless Quantized Federated Learning: A Joint Computation and Communication Design [36.35684767732552]
In this paper, we aim to minimize the total convergence time of FL, by quantizing the local model parameters prior to uplink transmission.
We jointly optimize the computing and communication resources and the number of quantization bits, in order to minimize the convergence time across all global rounds.
arXiv Detail & Related papers (2022-03-11T12:30:08Z)
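As a companion to the "Communication-Efficient Federated Learning via Clipped Uniform Quantization" entry above, here is a minimal NumPy sketch of a generic clip-then-uniformly-quantize encoder; the 3-sigma clipping rule, the 4-bit width, and the function name `clipped_uniform_quantize` are illustrative assumptions, not the cited paper's optimal thresholds or client-specific schemes.

```python
import numpy as np

def clipped_uniform_quantize(weights, clip_value, num_bits=8):
    """Clip model weights to [-clip_value, clip_value], then quantize them
    uniformly onto 2**num_bits levels. In an adaptive scheme, clip_value
    would be chosen per client from its own weight statistics."""
    levels = 2 ** num_bits - 1
    clipped = np.clip(weights, -clip_value, clip_value)
    step = 2 * clip_value / levels                   # uniform step size
    codes = np.round((clipped + clip_value) / step).astype(np.int64)
    dequantized = codes * step - clip_value          # server-side reconstruction
    return codes, dequantized

# Example: client-side clipping at 3 standard deviations of its weights.
rng = np.random.default_rng(1)
w = rng.normal(scale=0.05, size=1_000)
codes, w_hat = clipped_uniform_quantize(w, clip_value=3 * w.std(), num_bits=4)
print("quantization MSE:", float(np.mean((w - w_hat) ** 2)))
```

The integer codes (plus the scalar clip value) are what a client would transmit, which is where the bandwidth saving comes from relative to sending full-precision weights.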