Distance-aware Quantization
- URL: http://arxiv.org/abs/2108.06983v1
- Date: Mon, 16 Aug 2021 09:25:22 GMT
- Title: Distance-aware Quantization
- Authors: Dohyung Kim, Junghyup Lee, Bumsub Ham
- Abstract summary: Quantization methods use a rounding function to map full-precision values to the nearest quantized ones.
We introduce a novel quantizer, dubbed a distance-aware quantizer (DAQ), that mainly consists of a distance-aware soft rounding (DASR) and a temperature controller.
- Score: 30.06895253269116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the problem of network quantization, that is, reducing bit-widths
of weights and/or activations to lighten network architectures. Quantization
methods use a rounding function to map full-precision values to the nearest
quantized ones, but this operation is not differentiable. There are mainly two
approaches to training quantized networks with gradient-based optimizers.
First, a straight-through estimator (STE) replaces the zero derivative of the
rounding with that of an identity function, which causes a gradient mismatch
problem. Second, soft quantizers approximate the rounding with continuous
functions at training time, and exploit the rounding for quantization at test
time. This alleviates the gradient mismatch, but causes a quantizer gap
problem. We alleviate both problems in a unified framework. To this end, we
introduce a novel quantizer, dubbed a distance-aware quantizer (DAQ), that
mainly consists of a distance-aware soft rounding (DASR) and a temperature
controller. To alleviate the gradient mismatch problem, DASR approximates the
discrete rounding with the kernel soft argmax, which is based on our insight
that the quantization can be formulated as a distance-based assignment problem
between full-precision values and quantized ones. The controller adjusts the
temperature parameter in DASR adaptively according to the input, addressing the
quantizer gap problem. Experimental results on standard benchmarks show that
DAQ outperforms the state of the art significantly for various bit-widths
without bells and whistles.
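To make the distance-based view of rounding concrete, here is a minimal NumPy sketch (an illustration, not the paper's implementation): soft rounding is computed as a soft argmax over the distances between an input and the quantization levels, and a temperature controls how close the soft assignment is to hard rounding. The uniform level set, the plain exponential kernel, and the fixed temperature are assumptions made for this sketch; DAQ's kernel soft argmax and input-adaptive temperature controller are not reproduced here.

```python
import numpy as np

def dasr(x, num_bits=2, temperature=0.1):
    """Soft rounding as a soft argmax over distances to the quantization levels."""
    levels = np.arange(2 ** num_bits, dtype=np.float64)      # quantized values q_i
    dist = np.abs(x[..., None] - levels)                      # |x - q_i| for every level
    dist -= dist.min(axis=-1, keepdims=True)                  # shift for numerical stability
    weights = np.exp(-dist / temperature)                     # closer level -> larger weight
    weights /= weights.sum(axis=-1, keepdims=True)            # softmax over the levels
    return (weights * levels).sum(axis=-1)                    # distance-weighted soft assignment

def hard_round(x, num_bits=2):
    """Discrete rounding used at test time."""
    return np.clip(np.rint(x), 0, 2 ** num_bits - 1)

x = np.array([0.2, 0.9, 1.4, 2.7])
print(dasr(x, temperature=0.5))    # smooth, differentiable in x
print(dasr(x, temperature=0.01))   # low temperature: nearly identical to hard rounding
print(hard_round(x))               # [0. 1. 1. 3.]
```

As the temperature approaches zero the weighted average collapses onto the nearest level, which is exactly the trade-off between gradient smoothness and the quantizer gap that the temperature controller is meant to manage.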
Related papers
- Adaptive variational quantum dynamics simulations with compressed circuits and fewer measurements [4.2643127089535104]
We present an improved version of the adaptive variational quantum dynamics simulation (AVQDS) method, which we call AVQDS(T).
The algorithm adaptively adds layers of disjoint unitary gates to the ansatz circuit so as to keep the McLachlan distance, a measure of the accuracy of the variational dynamics, below a fixed threshold.
We also show a method based on eigenvalue truncation to solve the linear equations of motion for the variational parameters with enhanced noise resilience.
arXiv Detail & Related papers (2024-08-13T02:56:43Z)
- GRAPE optimization for open quantum systems with time-dependent decoherence rates driven by coherent and incoherent controls [77.34726150561087]
The GRadient Ascent Pulse Engineering (GRAPE) method is widely used for optimization in quantum control.
We apply the GRAPE method to optimize objective functionals for open quantum systems driven by both coherent and incoherent controls.
The efficiency of the algorithm is demonstrated through numerical simulations for the state-to-state transition problem.
arXiv Detail & Related papers (2023-07-17T13:37:18Z)
- Designing strong baselines for ternary neural network quantization through support and mass equalization [7.971065005161565]
Deep neural networks (DNNs) offer the highest performance in a wide range of computer vision applications, but this comes at a substantial computational cost.
This computational burden can be dramatically reduced by quantizing floating-point values to ternary values.
We show experimentally that our approach significantly improves the performance of ternary quantization across a variety of scenarios.
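For context only, the sketch below shows a generic threshold-based ternarizer (in the spirit of ternary weight networks); it is not the support- and mass-equalization approach of the cited paper, and the 0.7 threshold factor and single per-tensor scale are illustrative assumptions.

```python
import numpy as np

def ternarize(w, delta_factor=0.7):
    """Map full-precision weights to the three values {-alpha, 0, +alpha}."""
    delta = delta_factor * np.mean(np.abs(w))               # threshold: small weights become 0
    mask = np.abs(w) > delta                                 # entries that keep their sign
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0    # shared scale for nonzero entries
    return alpha * np.sign(w) * mask

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))
print(ternarize(w))   # only three distinct values appear: -alpha, 0, +alpha
```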
arXiv Detail & Related papers (2023-06-30T07:35:07Z)
- Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective [74.48124653728422]
Post-training quantization (PTQ) is widely regarded as one of the most practical and efficient compression methods.
We argue that oscillation is an overlooked problem in PTQ methods.
arXiv Detail & Related papers (2023-03-21T14:52:52Z)
- Green, Quantized Federated Learning over Wireless Networks: An Energy-Efficient Design [68.86220939532373]
The finite precision level is captured through the use of quantized neural networks (QNNs) that quantize weights and activations in fixed-precision format.
The proposed FL framework can reduce energy consumption until convergence by up to 70% compared to a baseline FL algorithm.
arXiv Detail & Related papers (2022-07-19T16:37:24Z)
- LG-LSQ: Learned Gradient Linear Symmetric Quantization [3.6816597150770387]
Deep neural networks with lower precision weights have advantages in terms of the cost of memory space and accelerator power.
The main challenge associated with the quantization algorithm is maintaining accuracy at low bit-widths.
We propose learned gradient linear symmetric quantization (LG-LSQ) as a method for quantizing weights and activation functions to low bit-widths.
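As a reference point for what a linear symmetric quantizer does, here is a minimal sketch with a fixed step size; in LG-LSQ the step size would be learned, and the paper's gradient scheme is not reproduced here.

```python
import numpy as np

def linear_symmetric_quantize(x, step, num_bits=4):
    """Quantize onto a signed symmetric integer grid, then rescale by the step size."""
    qmax = 2 ** (num_bits - 1) - 1                  # e.g. 7 for 4-bit signed values
    q = np.clip(np.rint(x / step), -qmax, qmax)     # integer code in [-qmax, qmax]
    return q * step                                  # value actually used in computation

x = np.array([-0.83, -0.06, 0.12, 0.74])
print(linear_symmetric_quantize(x, step=0.1))       # every output is a multiple of the 0.1 step
```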
arXiv Detail & Related papers (2022-02-18T03:38:12Z)
- Variational Quantum Algorithms for Trace Distance and Fidelity Estimation [7.247285982078057]
We introduce hybrid quantum-classical algorithms for two distance measures on near-term quantum devices.
First, we introduce the Variational Trace Distance Estimation (VTDE) algorithm.
Second, we introduce the Variational Fidelity Estimation (VFE) algorithm.
arXiv Detail & Related papers (2020-12-10T15:56:58Z)
- Characterizing the loss landscape of variational quantum circuits [77.34726150561087]
We introduce a way to compute the Hessian of the loss function of VQCs.
We show how this information can be interpreted and compared to classical neural networks.
arXiv Detail & Related papers (2020-08-06T17:48:12Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to eliminate floating-point computation.
Our AQD achieves performance comparable to or even better than its full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
- AUSN: Approximately Uniform Quantization by Adaptively Superimposing Non-uniform Distribution for Deep Neural Networks [0.7378164273177589]
Existing uniform and non-uniform quantization methods exhibit an inherent conflict between representation range and representation resolution.
We propose a novel quantization method to quantize weights and activations.
The key idea is to Approximate the Uniform quantization by Adaptively Superposing multiple Non-uniform quantized values, namely AUSN.
arXiv Detail & Related papers (2020-07-08T05:10:53Z)
- Differentially Quantized Gradient Methods [53.3186247068836]
We show that Differentially Quantized Gradient Descent (DQ-GD) attains a linear contraction factor of $\max\{\sigma_{\mathrm{GD}}, \rho_n 2^{-R}\}$.
No algorithm within a certain class can converge faster than $\max\{\sigma_{\mathrm{GD}}, 2^{-R}\}$.
arXiv Detail & Related papers (2020-02-06T20:40:53Z)