Distance-aware Quantization
- URL: http://arxiv.org/abs/2108.06983v1
- Date: Mon, 16 Aug 2021 09:25:22 GMT
- Title: Distance-aware Quantization
- Authors: Dohyung Kim, Junghyup Lee, Bumsub Ham
- Abstract summary: Quantization methods use a rounding function to map full-precision values to the nearest quantized ones.
We introduce a novel quantizer, dubbed a distance-aware quantizer (DAQ), that mainly consists of a distance-aware soft rounding (DASR) and a temperature controller.
- Score: 30.06895253269116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the problem of network quantization, that is, reducing bit-widths
of weights and/or activations to lighten network architectures. Quantization
methods use a rounding function to map full-precision values to the nearest
quantized ones, but this operation is not differentiable. There are mainly two
approaches to training quantized networks with gradient-based optimizers.
First, a straight-through estimator (STE) replaces the zero derivative of the
rounding with that of an identity function, which causes a gradient mismatch
problem. Second, soft quantizers approximate the rounding with continuous
functions at training time, and exploit the rounding for quantization at test
time. This alleviates the gradient mismatch, but causes a quantizer gap
problem. We alleviate both problems in a unified framework. To this end, we
introduce a novel quantizer, dubbed a distance-aware quantizer (DAQ), that
mainly consists of a distance-aware soft rounding (DASR) and a temperature
controller. To alleviate the gradient mismatch problem, DASR approximates the
discrete rounding with the kernel soft argmax, which is based on our insight
that the quantization can be formulated as a distance-based assignment problem
between full-precision values and quantized ones. The controller adjusts the
temperature parameter in DASR adaptively according to the input, addressing the
quantizer gap problem. Experimental results on standard benchmarks show that
DAQ outperforms the state of the art significantly for various bit-widths
without bells and whistles.
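To make the distance-based view of rounding concrete, here is a minimal NumPy sketch (an illustration, not the paper's implementation): soft rounding is computed as a soft argmax over the distances between an input and the quantization levels, and a temperature controls how close the soft assignment is to hard rounding. The uniform level set, the plain exponential kernel, and the fixed temperature are assumptions made for this sketch; DAQ's kernel soft argmax and input-adaptive temperature controller are not reproduced here.

```python
import numpy as np

def dasr(x, num_bits=2, temperature=0.1):
    """Soft rounding as a soft argmax over distances to the quantization levels."""
    levels = np.arange(2 ** num_bits, dtype=np.float64)      # quantized values q_i
    dist = np.abs(x[..., None] - levels)                      # |x - q_i| for every level
    dist -= dist.min(axis=-1, keepdims=True)                  # shift for numerical stability
    weights = np.exp(-dist / temperature)                     # closer level -> larger weight
    weights /= weights.sum(axis=-1, keepdims=True)            # softmax over the levels
    return (weights * levels).sum(axis=-1)                    # distance-weighted soft assignment

def hard_round(x, num_bits=2):
    """Discrete rounding used at test time."""
    return np.clip(np.rint(x), 0, 2 ** num_bits - 1)

x = np.array([0.2, 0.9, 1.4, 2.7])
print(dasr(x, temperature=0.5))    # smooth, differentiable in x
print(dasr(x, temperature=0.01))   # low temperature: nearly identical to hard rounding
print(hard_round(x))               # [0. 1. 1. 3.]
```

As the temperature approaches zero the weighted average collapses onto the nearest level, which is exactly the trade-off between gradient smoothness and the quantizer gap that the temperature controller is meant to manage.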
Related papers
- Adaptive variational quantum dynamics simulations with compressed circuits and fewer measurements [4.2643127089535104]
We present an improved version of the adaptive variational quantum dynamics simulation (AVQDS) method, which we call AVQDS(T).
The algorithm adaptively adds layers of disjoint unitary gates to the ansatz circuit so as to keep the McLachlan distance, a measure of the accuracy of the variational dynamics, below a fixed threshold.
We also show a method based on eigenvalue truncation to solve the linear equations of motion for the variational parameters with enhanced noise resilience.
arXiv Detail & Related papers (2024-08-13T02:56:43Z)
- GRAPE optimization for open quantum systems with time-dependent decoherence rates driven by coherent and incoherent controls [77.34726150561087]
The GRadient Ascent Pulse Engineering (GRAPE) method is widely used for optimization in quantum control.
We apply the GRAPE method to optimize objective functionals for open quantum systems driven by both coherent and incoherent controls.
The efficiency of the algorithm is demonstrated through numerical simulations for the state-to-state transition problem.
arXiv Detail & Related papers (2023-07-17T13:37:18Z)
- Designing strong baselines for ternary neural network quantization through support and mass equalization [7.971065005161565]
Deep neural networks (DNNs) offer the highest performance in a wide range of computer vision applications, but this comes at a substantial computational cost.
This computational burden can be dramatically reduced by quantizing floating-point values to ternary values.
We show experimentally that our approach significantly improves the performance of ternary quantization across a variety of scenarios.
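For context only, the sketch below shows a generic threshold-based ternarizer (in the spirit of ternary weight networks); it is not the support- and mass-equalization approach of the cited paper, and the 0.7 threshold factor and single per-tensor scale are illustrative assumptions.

```python
import numpy as np

def ternarize(w, delta_factor=0.7):
    """Map full-precision weights to the three values {-alpha, 0, +alpha}."""
    delta = delta_factor * np.mean(np.abs(w))               # threshold: small weights become 0
    mask = np.abs(w) > delta                                 # entries that keep their sign
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0    # shared scale for nonzero entries
    return alpha * np.sign(w) * mask

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))
print(ternarize(w))   # only three distinct values appear: -alpha, 0, +alpha
```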
arXiv Detail & Related papers (2023-06-30T07:35:07Z)
- Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective [74.48124653728422]
Post-training quantization (PTQ) is widely regarded as one of the most practical and efficient compression methods.
We argue that oscillation is an overlooked problem in PTQ methods.
arXiv Detail & Related papers (2023-03-21T14:52:52Z)
- Green, Quantized Federated Learning over Wireless Networks: An Energy-Efficient Design [68.86220939532373]
The finite precision level is captured through the use of quantized neural networks (QNNs) that quantize weights and activations in fixed-precision format.
The proposed FL framework can reduce energy consumption until convergence by up to 70% compared to a baseline FL algorithm.
arXiv Detail & Related papers (2022-07-19T16:37:24Z)
- LG-LSQ: Learned Gradient Linear Symmetric Quantization [3.6816597150770387]
Deep neural networks with lower precision weights have advantages in terms of the cost of memory space and accelerator power.
The main challenge associated with the quantization algorithm is maintaining accuracy at low bit-widths.
We propose learned gradient linear symmetric quantization (LG-LSQ) as a method for quantizing weights and activation functions to low bit-widths.
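As a reference point for what a linear symmetric quantizer does, here is a minimal sketch with a fixed step size; in LG-LSQ the step size would be learned, and the paper's gradient scheme is not reproduced here.

```python
import numpy as np

def linear_symmetric_quantize(x, step, num_bits=4):
    """Quantize onto a signed symmetric integer grid, then rescale by the step size."""
    qmax = 2 ** (num_bits - 1) - 1                  # e.g. 7 for 4-bit signed values
    q = np.clip(np.rint(x / step), -qmax, qmax)     # integer code in [-qmax, qmax]
    return q * step                                  # value actually used in computation

x = np.array([-0.83, -0.06, 0.12, 0.74])
print(linear_symmetric_quantize(x, step=0.1))       # every output is a multiple of the 0.1 step
```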
arXiv Detail & Related papers (2022-02-18T03:38:12Z)
- Variational Quantum Algorithms for Trace Distance and Fidelity Estimation [7.247285982078057]
We introduce hybrid quantum-classical algorithms for two distance measures on near-term quantum devices.
First, we introduce the Variational Trace Distance Estimation (VTDE) algorithm.
Second, we introduce the Variational Fidelity Estimation (VFE) algorithm.
arXiv Detail & Related papers (2020-12-10T15:56:58Z)
- Characterizing the loss landscape of variational quantum circuits [77.34726150561087]
We introduce a way to compute the Hessian of the loss function of VQCs.
We show how this information can be interpreted and compared to classical neural networks.
arXiv Detail & Related papers (2020-08-06T17:48:12Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to eliminate floating-point computation.
Our AQD achieves performance comparable to or even better than its full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
- AUSN: Approximately Uniform Quantization by Adaptively Superimposing Non-uniform Distribution for Deep Neural Networks [0.7378164273177589]
Existing uniform and non-uniform quantization methods exhibit an inherent conflict between representation range and representation resolution.
We propose a novel quantization method to quantize weights and activations.
The key idea is to Approximate the Uniform quantization by Adaptively Superposing multiple Non-uniform quantized values, namely AUSN.
arXiv Detail & Related papers (2020-07-08T05:10:53Z)
- Differentially Quantized Gradient Methods [53.3186247068836]
We show that Differentially Quantized Gradient Descent (DQ-GD) attains a linear contraction factor of $\max\{\sigma_{\mathrm{GD}}, \rho_n 2^{-R}\}$.
No algorithm within a certain class can converge faster than $\max\{\sigma_{\mathrm{GD}}, 2^{-R}\}$.
arXiv Detail & Related papers (2020-02-06T20:40:53Z)