Post-Training Quantization for Energy Efficient Realization of Deep
Neural Networks
- URL: http://arxiv.org/abs/2210.07906v1
- Date: Fri, 14 Oct 2022 15:43:57 GMT
- Title: Post-Training Quantization for Energy Efficient Realization of Deep
Neural Networks
- Authors: Cecilia Latotzke, Batuhan Balim, and Tobias Gemmeke
- Abstract summary: The biggest challenge for the deployment of Deep Neural Networks (DNNs) close to the generated data on edge devices is their size, i.e., memory footprint and computational complexity.
We propose a post-training quantization flow without the need for retraining.
We surpass the state of the art at 6 bit by 2.2% Top-1 accuracy on ImageNet.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The biggest challenge for the deployment of Deep Neural Networks (DNNs) close
to the generated data on edge devices is their size, i.e., memory footprint and
computational complexity. Both are significantly reduced with quantization.
With the resulting lower word-length, the energy efficiency of DNNs increases
proportionally. However, lower word-length typically causes accuracy
degradation. To counteract this effect, the quantized DNN is retrained.
Unfortunately, training costs up to 5000x more energy than the inference of the
quantized DNN. To address this issue, we propose a post-training quantization
flow without the need for retraining. For this, we investigated different
quantization options. Furthermore, our analysis systematically assesses the
impact of reduced word-lengths of weights and activations revealing a clear
trend for the choice of word-length. Both aspects have not been systematically
investigated so far. Our results are independent of the depth of the DNNs and
apply to uniform quantization, allowing fast quantization of a given
pre-trained DNN. At 6 bit, we surpass the state of the art by 2.2% Top-1
accuracy on ImageNet. Without retraining, our quantization to 8 bit surpasses
floating-point accuracy.
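The core operation in a post-training flow like the one described is mapping pre-trained float weights and activations to a reduced word-length without any retraining. The sketch below shows generic uniform symmetric quantization with max-abs calibration; the calibration choice and per-tensor scaling are assumptions for illustration, not the paper's exact flow.

```python
import numpy as np

def quantize_uniform(x, num_bits):
    """Uniform symmetric quantization of a tensor to a given word-length.

    Minimal sketch: one max-abs scale per tensor (an assumption, not the
    paper's exact calibration), round-to-nearest, then dequantize.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bit, 31 for 6 bit
    scale = np.max(np.abs(x)) / qmax        # per-tensor scale
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale                        # "fake-quantized" float values

# Quantizing a random weight tensor to 8 and 6 bit shows the trend the
# abstract describes: shorter word-lengths give larger quantization error.
w = np.random.randn(64, 64).astype(np.float32)
w8 = quantize_uniform(w, 8)
w6 = quantize_uniform(w, 6)
err8 = np.mean((w - w8) ** 2)
err6 = np.mean((w - w6) ** 2)
```

Because the mapping is uniform and needs only a calibration pass over the tensor, it can be applied to any given pre-trained DNN in a single shot.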
Related papers
- Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation [3.4606942690643336]
We introduce a novel approach for DNN quantization that uses a redundant representation of DNN's output.
We demonstrate that this mapping can reduce quantization error.
Our approach can be applied to other tasks, including segmentation, object detection, and key-points prediction.
arXiv Detail & Related papers (2024-05-22T21:59:46Z)
- Designing strong baselines for ternary neural network quantization through support and mass equalization [7.971065005161565]
Deep neural networks (DNNs) offer the highest performance in a wide range of applications in computer vision.
This computational burden can be dramatically reduced by quantizing floating point values to ternary values.
We show experimentally that our approach significantly improves the performance of ternary quantization across a variety of scenarios.
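As a generic illustration of what ternary quantization means (a baseline sketch using the classic threshold-and-scale heuristic, not the specific equalization method of the paper above): each float weight is mapped to one of three values {-alpha, 0, +alpha}.

```python
import numpy as np

def ternarize(w, delta_frac=0.7):
    """Map float weights to {-alpha, 0, +alpha}.

    The threshold delta = delta_frac * mean(|w|) and the closed-form alpha
    follow the common ternary-weight-network heuristic; delta_frac = 0.7 is
    an assumed default for illustration.
    """
    delta = delta_frac * np.mean(np.abs(w))      # values below delta become 0
    mask = np.abs(w) > delta                     # which weights stay nonzero
    alpha = np.mean(np.abs(w[mask])) if mask.any() else 0.0
    return alpha * np.sign(w) * mask

w = np.random.randn(128, 128)
wt = ternarize(w)
levels = np.unique(wt)                           # at most 3 distinct values
```

With only three levels, multiplications in a layer reduce to sign flips and additions, which is where the dramatic reduction in computational burden comes from.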
arXiv Detail & Related papers (2023-06-30T07:35:07Z)
- QEBVerif: Quantization Error Bound Verification of Neural Networks [6.327780998441913]
Quantization is widely regarded as a promising technique for deploying deep neural networks (DNNs) on edge devices.
Existing verification methods focus on either individual neural networks (DNNs or QNNs) or quantization error bound for partial quantization.
We propose a quantization error bound verification method, named QEBVerif, where both weights and activation tensors are quantized.
arXiv Detail & Related papers (2022-12-06T06:34:38Z)
- OMPQ: Orthogonal Mixed Precision Quantization [64.59700856607017]
Mixed precision quantization takes advantage of hardware's multiple bit-width arithmetic operations to unleash the full potential of network quantization.
We propose to optimize a proxy metric, the concept of network orthogonality, which is highly correlated with the loss of the integer programming.
This approach reduces the search time and required data amount by orders of magnitude, with little compromise on quantization accuracy.
arXiv Detail & Related papers (2021-09-16T10:59:33Z)
- Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update [49.948082497688404]
Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts.
One promising approach to reduce the energy costs is representing DNNs with low-precision numbers.
We jointly design a low-precision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam.
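The key fit between the two ingredients can be sketched generically: in a logarithmic number system a weight is stored as (sign, log2|w|), so a *multiplicative* update becomes a cheap *addition* in the log domain. The update rule below is a simplified illustration of this idea, not the paper's LNS-Madam algorithm.

```python
import numpy as np

# LNS representation: store sign and base-2 exponent instead of the value.
w = np.array([0.5, -2.0, 0.25])
sign = np.sign(w)
log_w = np.log2(np.abs(w))

lr = 0.1
grad = np.array([1.0, -1.0, 1.0])     # hypothetical gradient, for illustration
# Multiplicative update w <- w * 2^(-lr * sign(w) * grad):
# in the log domain this is just a subtraction on the exponent.
log_w = log_w - lr * sign * grad
w_new = sign * 2.0 ** log_w
```

No multiplier is needed in the update path, which is one reason LNS-style training can cut energy costs.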
arXiv Detail & Related papers (2021-06-26T00:32:17Z)
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training [68.63354877166756]
ActNN is a memory-efficient training framework that stores randomly quantized activations for backpropagation.
ActNN reduces the memory footprint of the activation by 12x, and it enables training with a 6.6x to 14x larger batch size.
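The "randomly quantized" part refers to stochastic rounding: rounding up or down with probability proportional to the fractional part, which keeps the stored activations unbiased. The sketch below shows generic 2-bit stochastic quantization of an activation tensor; ActNN's actual per-group scheme is more elaborate, and the function names here are illustrative.

```python
import numpy as np

def stochastic_quantize(x, num_bits=2, rng=np.random.default_rng(0)):
    """Stochastically round activations to num_bits (a generic sketch)."""
    levels = 2 ** num_bits - 1                 # 3 quantization steps for 2 bit
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / levels if hi > lo else 1.0
    t = (x - lo) / scale                       # map to [0, levels]
    q = np.floor(t + rng.random(x.shape))      # stochastic rounding: unbiased
    q = np.clip(q, 0, levels)
    return q.astype(np.uint8), lo, scale       # 2-bit codes + reconstruction info

def dequantize(q, lo, scale):
    return lo + q * scale

x = np.random.randn(4, 8).astype(np.float32)
q, lo, scale = stochastic_quantize(x, num_bits=2)
x_hat = dequantize(q, lo, scale)               # approximation used in backprop
```

Storing 2-bit codes instead of 32-bit floats is where the roughly 12x activation-memory reduction comes from; the error of each stored value is bounded by one quantization step.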
arXiv Detail & Related papers (2021-04-29T05:50:54Z)
- Filter Pre-Pruning for Improved Fine-tuning of Quantized Deep Neural Networks [0.0]
We propose a new pruning method called Pruning for Quantization (PfQ) which removes the filters that disturb the fine-tuning of the DNN.
Experiments using well-known models and datasets confirmed that the proposed method achieves higher performance with a similar model size.
arXiv Detail & Related papers (2020-11-13T04:12:54Z)
- Subtensor Quantization for Mobilenets [5.735035463793008]
Quantization for deep neural networks (DNNs) has enabled developers to deploy models with less memory and more efficient low-power inference.
In this paper, we analyzed several root causes of quantization loss and proposed alternatives that do not rely on per-channel or training-aware approaches.
We evaluate the image classification task on the ImageNet dataset, and our post-training quantized 8-bit inference Top-1 accuracy is within 0.7% of the floating-point version.
arXiv Detail & Related papers (2020-11-04T15:41:47Z)
- FATNN: Fast and Accurate Ternary Neural Networks [89.07796377047619]
Ternary Neural Networks (TNNs) have received much attention due to being potentially orders of magnitude faster in inference, as well as more power efficient, than full-precision counterparts.
In this work, we show that, under some mild constraints, computational complexity of the ternary inner product can be reduced by a factor of 2.
We elaborately design an implementation-dependent ternary quantization algorithm to mitigate the performance gap.
arXiv Detail & Related papers (2020-08-12T04:26:18Z)
- Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors.
This leads to high energy savings from both low-voltage operation as well as low-precision quantization.
arXiv Detail & Related papers (2020-06-24T18:23:10Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.