Histogram-Equalized Quantization for logic-gated Residual Neural Networks
- URL: http://arxiv.org/abs/2501.04517v2
- Date: Thu, 09 Jan 2025 09:00:02 GMT
- Title: Histogram-Equalized Quantization for logic-gated Residual Neural Networks
- Authors: Van Thien Nguyen, William Guicquero, Gilles Sicard
- Abstract summary: Histogram-Equalized Quantization (HEQ) is an adaptive framework for linear symmetric quantization. HEQ automatically adapts the quantization thresholds using a unique step size optimization. Experiments on the STL-10 dataset even show that HEQ enables a proper training of our proposed logic-gated (OR, MUX) residual networks.
- Score: 2.7036595757881323
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adjusting the quantization according to the data or to the model loss seems mandatory to reach high accuracy in the context of quantized neural networks. This work presents Histogram-Equalized Quantization (HEQ), an adaptive framework for linear symmetric quantization. HEQ automatically adapts the quantization thresholds using a unique step size optimization. We empirically show that HEQ achieves state-of-the-art performance on CIFAR-10. Experiments on the STL-10 dataset even show that HEQ enables proper training of our proposed logic-gated (OR, MUX) residual networks, with higher accuracy at a lower hardware complexity than previous work.
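The abstract does not detail the HEQ procedure itself, so the sketch below only illustrates the general idea it names: a linear symmetric quantizer whose single step size is derived from the data distribution rather than from the raw maximum. The function names, the quantile-based coverage heuristic, and the 4-bit default are illustrative assumptions, not the authors' algorithm.

```python
# Minimal sketch (not the authors' exact HEQ method): pick the step size of a linear
# symmetric quantizer from the empirical distribution of the tensor, so the outermost
# level sits at a high quantile of |x| instead of at max(|x|).
import numpy as np

def histogram_based_step_size(x, num_bits=4, coverage=0.999):
    """Hypothetical helper: data-driven step size for a signed symmetric quantizer."""
    levels = 2 ** (num_bits - 1) - 1          # symmetric signed range [-levels, +levels]
    clip = np.quantile(np.abs(x), coverage)   # ignore the extreme tail of the histogram
    return clip / levels

def linear_symmetric_quantize(x, step, num_bits=4):
    levels = 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(x / step), -levels, levels)
    return q * step                           # dequantized values on a uniform grid

x = np.random.randn(10_000).astype(np.float32)
step = histogram_based_step_size(x, num_bits=4)
print(step, np.abs(x - linear_symmetric_quantize(x, step)).mean())
```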
Related papers
- Precision Neural Network Quantization via Learnable Adaptive Modules [27.323901068182234]
Quantization Aware Training (QAT) is a neural network quantization technique that compresses model size and improves operational efficiency.
We propose an effective learnable adaptive neural network quantization method, called Adaptive Step Size Quantization (ASQ).
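As a hedged illustration of the general "learnable step size" idea this entry refers to (in the spirit of LSQ rather than the specific ASQ modules), the step size can be stored as a trainable parameter and rounding handled with a straight-through estimator; nothing below is code from the paper.

```python
import torch

class LearnableStepQuant(torch.nn.Module):
    """Sketch: a fake-quantizer whose step size is learned jointly with the network."""
    def __init__(self, init_step=0.1, num_bits=4):
        super().__init__()
        self.step = torch.nn.Parameter(torch.tensor(float(init_step)))
        self.qmax = 2 ** (num_bits - 1) - 1

    def forward(self, x):
        s = self.step.abs() + 1e-8
        q = torch.clamp(x / s, -self.qmax, self.qmax)
        # Straight-through estimator: round in the forward pass, identity in the backward
        # pass, so gradients reach both the input and the step size.
        q = q + (q.round() - q).detach()
        return q * s

quant = LearnableStepQuant()
x = torch.randn(16, requires_grad=True)
quant(x).sum().backward()
print(quant.step.grad)  # the step size receives a training signal
```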
arXiv Detail & Related papers (2025-04-24T05:46:25Z) - Gradient-based Automatic Mixed Precision Quantization for Neural Networks On-Chip [0.9187138676564589]
We present High Granularity Quantization (HGQ), an innovative quantization-aware training method.
HGQ fine-tunes the per-weight and per-activation precision by making them optimizable through gradient descent.
This approach enables ultra-low latency and low power neural networks on hardware capable of performing arithmetic operations.
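A hedged sketch of how a precision can be made "optimizable through gradient descent": keep a continuous per-channel bitwidth and round it with a straight-through estimator so it still receives gradients. HGQ's actual per-weight granularity and resource-aware objective are more elaborate; all names and defaults below are assumptions.

```python
import torch

def round_ste(x):
    # Round in the forward pass, pass gradients through unchanged in the backward pass.
    return x + (x.round() - x).detach()

class LearnableBitwidthFakeQuant(torch.nn.Module):
    """Sketch: per-channel bitwidths trained by gradient descent (illustrative only)."""
    def __init__(self, num_channels, init_bits=6.0):
        super().__init__()
        self.bits = torch.nn.Parameter(torch.full((num_channels, 1), init_bits))

    def forward(self, w):                        # w: (num_channels, fan_in)
        b = round_ste(self.bits).clamp(min=2.0)  # integer bitwidth in the forward pass
        levels = 2 ** (b - 1) - 1
        step = w.abs().amax(dim=1, keepdim=True).detach() / levels
        q = round_ste(w / step)
        q = torch.minimum(torch.maximum(q, -levels), levels)
        return q * step                          # gradients reach both w and the bitwidths

fq = LearnableBitwidthFakeQuant(num_channels=3)
w = torch.randn(3, 5, requires_grad=True)
fq(w).sum().backward()
print(fq.bits.grad)                              # the bitwidths get a training signal
```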
arXiv Detail & Related papers (2024-05-01T17:18:46Z) - Quantized Approximately Orthogonal Recurrent Neural Networks [6.524758376347808]
We explore the quantization of the weight matrices in ORNNs, leading to Quantized approximately Orthogonal RNNs (QORNNs)
We propose and investigate two strategies to learn QORNN by combining quantization-aware training (QAT) and computation projections.
The most efficient models achieve results similar to state-of-the-art full-precision ORNN, LSTM and FastRNN on a variety of standard benchmarks, even with 4-bit quantization.
arXiv Detail & Related papers (2024-02-05T09:59:57Z) - Tensor Ring Optimized Quantum-Enhanced Tensor Neural Networks [32.76948546010625]
Quantum machine learning researchers often rely on incorporating Tensor Networks (TN) into Deep Neural Networks (DNN).
To address this issue, a multi-layer design of a Tensor Ring optimized variational Quantum learning classifier (Quan-TR) is proposed.
It is referred to as Tensor Ring optimized Quantum-enhanced tensor neural Networks (TR-QNet).
On quantum simulations, the proposed TR-QNet achieves promising accuracy of 94.5%, 86.16%, and 83.54% on the Iris, MNIST, and CIFAR-10 datasets, respectively.
arXiv Detail & Related papers (2023-10-02T18:07:10Z) - Designing strong baselines for ternary neural network quantization through support and mass equalization [7.971065005161565]
Deep neural networks (DNNs) offer the highest performance in a wide range of computer vision applications, but this comes at a substantial computational cost.
This computational burden can be dramatically reduced by quantizing floating point values to ternary values.
We show experimentally that our approach significantly improves the performance of ternary quantization across a variety of scenarios.
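For context, the standard ternary baseline such work builds on (a threshold defining the support and a scale matching the mass of the surviving weights) looks roughly as follows; the 0.7·mean threshold is the common Ternary Weight Networks heuristic, not the equalization rule proposed in this paper.

```python
import numpy as np

def ternarize(w: np.ndarray) -> np.ndarray:
    """Threshold-and-scale ternary quantizer: values in {-alpha, 0, +alpha}."""
    delta = 0.7 * np.abs(w).mean()                         # heuristic support threshold
    mask = np.abs(w) > delta                               # which weights stay non-zero
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0  # scale fitted to the kept mass
    return alpha * np.sign(w) * mask

w = np.random.randn(8).astype(np.float32)
print(w, ternarize(w))
```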
arXiv Detail & Related papers (2023-06-30T07:35:07Z) - Vertical Layering of Quantized Neural Networks for Heterogeneous Inference [57.42762335081385]
We study a new vertical-layered representation of neural network weights for encapsulating all quantized models into a single one.
We can theoretically achieve any precision network for on-demand service while only needing to train and maintain one model.
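A minimal sketch of the "one stored model, many precisions" idea, assuming the weights are kept as signed 8-bit integers and a lower-precision model is read out by keeping only their most-significant bits; this is illustrative, not the paper's exact vertical-layered encoding.

```python
import numpy as np

def slice_to_lower_precision(w_int8: np.ndarray, bits: int) -> np.ndarray:
    """Keep the top `bits` bits of signed 8-bit weights (coarser grid, same storage)."""
    shift = 8 - bits
    return (w_int8 >> shift) << shift   # arithmetic shift preserves the sign for int8

w8 = np.array([-100, -37, 3, 90], dtype=np.int8)
print(slice_to_lower_precision(w8, 4))  # a 4-bit view of the same stored weights
```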
arXiv Detail & Related papers (2022-12-10T15:57:38Z) - Theoretical Error Performance Analysis for Variational Quantum Circuit Based Functional Regression [83.79664725059877]
In this work, we put forth an end-to-end quantum neural network, namely, TTN-VQC, for dimensionality reduction and functional regression.
We also characterize the optimization properties of TTN-VQC by leveraging the Polyak-Lojasiewicz (PL) condition.
arXiv Detail & Related papers (2022-06-08T06:54:07Z) - A Statistical Framework for Low-bitwidth Training of Deep Neural Networks [70.77754244060384]
Fully quantized training (FQT) uses low-bitwidth hardware by quantizing the activations, weights, and gradients of a neural network model.
One major challenge with FQT is the lack of theoretical understanding, in particular of how gradient quantization impacts convergence properties.
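As a hedged illustration of what gradient quantization in FQT typically involves, the sketch below applies a per-tensor low-bitwidth quantizer with stochastic rounding, whose unbiasedness is the property convergence analyses usually lean on; it is not the statistical framework of this paper.

```python
import numpy as np

def quantize_gradient(g: np.ndarray, num_bits: int = 8, rng=None) -> np.ndarray:
    """Per-tensor symmetric gradient quantizer with stochastic (unbiased) rounding."""
    rng = np.random.default_rng() if rng is None else rng
    levels = 2 ** (num_bits - 1) - 1
    step = np.abs(g).max() / levels + 1e-12
    scaled = g / step
    # Stochastic rounding: round up with probability equal to the fractional part,
    # so the quantized gradient equals the full-precision gradient in expectation.
    rounded = np.floor(scaled + rng.random(g.shape))
    return np.clip(rounded, -levels, levels) * step

g = np.random.randn(5).astype(np.float32) * 1e-3
print(g, quantize_gradient(g, num_bits=4))
```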
arXiv Detail & Related papers (2020-10-27T13:57:33Z) - Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search [112.05977301976613]
We propose to combine Network Architecture Search methods with quantization to enjoy the merits of the two sides.
We first propose the joint training of architecture and quantization with a shared step size to acquire a large number of quantized models.
Then a bit-inheritance scheme is introduced to transfer the quantized models to the lower bit, which further reduces the time cost and improves the quantization accuracy.
arXiv Detail & Related papers (2020-10-09T03:52:16Z) - Propagating Asymptotic-Estimated Gradients for Low Bitwidth Quantized Neural Networks [31.168156284218746]
We propose a novel Asymptotic-Quantized Estimator (AQE) to estimate the gradient.
At the end of training, the weights and activations have been quantized to low-precision.
In the inference phase, we can use XNOR or SHIFT operations instead of convolution operations to accelerate the MINW-Net.
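The XNOR trick mentioned here is the standard one for {-1, +1} vectors: XNOR marks the positions where two signs agree and a popcount turns that count into a dot product. The bitfield encoding below is an illustrative assumption, not MINW-Net's actual kernel.

```python
def xnor_popcount_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two length-n {-1,+1} vectors packed as bitfields (bit=1 means +1)."""
    matches = ~(a_bits ^ b_bits) & ((1 << n) - 1)  # XNOR: 1 wherever the signs agree
    return 2 * bin(matches).count("1") - n         # agreements minus disagreements

# a = [+1, +1, -1, +1], b = [+1, -1, +1, +1] with the LSB holding element 0 -> dot = 0
print(xnor_popcount_dot(0b1011, 0b1101, 4))
```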
arXiv Detail & Related papers (2020-03-04T03:17:47Z) - Gradient $\ell_1$ Regularization for Quantization Robustness [70.39776106458858]
We derive a simple regularization scheme that improves robustness against post-training quantization.
By training quantization-ready networks, our approach enables storing a single set of weights that can be quantized on-demand to different bit-widths.
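A hedged sketch of the general recipe this entry describes: penalize the l1 norm of the loss gradient with respect to the weights (via double backpropagation) so the loss stays flat under the small weight perturbations that post-training quantization introduces. The helper name and the lambda value are assumptions, not the paper's code.

```python
import torch

def loss_with_grad_l1(model, loss_fn, x, y, lam=1e-4):
    """Task loss plus an l1 penalty on the gradient of the loss w.r.t. the weights."""
    loss = loss_fn(model(x), y)
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params, create_graph=True)  # keep graph for 2nd order
    penalty = sum(g.abs().sum() for g in grads)
    return loss + lam * penalty

model = torch.nn.Linear(10, 2)
x, y = torch.randn(4, 10), torch.randint(0, 2, (4,))
loss_with_grad_l1(model, torch.nn.functional.cross_entropy, x, y).backward()
```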
arXiv Detail & Related papers (2020-02-18T12:31:34Z) - Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to the industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.