DAQ: Distribution-Aware Quantization for Deep Image Super-Resolution Networks
- URL: http://arxiv.org/abs/2012.11230v1
- Date: Mon, 21 Dec 2020 10:19:42 GMT
- Title: DAQ: Distribution-Aware Quantization for Deep Image Super-Resolution Networks
- Authors: Cheeun Hong, Heewon Kim, Junghun Oh, Kyoung Mu Lee
- Abstract summary: Quantizing deep convolutional neural networks for image super-resolution substantially reduces their computational costs.
Existing works either suffer a severe performance drop at ultra-low precision (4 bits or lower) or require a heavy fine-tuning process to recover the performance.
We propose a novel distribution-aware quantization scheme (DAQ) that facilitates accurate training-free quantization at ultra-low precision.
- Score: 49.191062785007006
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quantizing deep convolutional neural networks for image super-resolution substantially reduces their computational costs. However, existing works either suffer a severe performance drop at ultra-low precision (4 bits or lower) or require a heavy fine-tuning process to recover the performance. We attribute this vulnerability to low precision to two statistical observations about feature map values. First, the distribution of feature map values varies significantly across channels and across input images. Second, feature maps contain outliers that can dominate the quantization error. Based on these observations, we propose a novel distribution-aware quantization scheme (DAQ) that facilitates accurate training-free quantization at ultra-low precision. A simple function in DAQ determines the dynamic range of feature maps and weights with low computational burden. Furthermore, our method enables mixed-precision quantization by computing the relative sensitivity of each channel, without any training process involved. Quantization-aware training can nonetheless be applied for additional performance gains. Our method outperforms recent training-free and even training-based quantization methods when applied to state-of-the-art image super-resolution networks at ultra-low precision.
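The two ideas in the abstract, a dynamic range derived from simple per-channel statistics and a training-free, sensitivity-driven bit allocation, can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the clipping factor k, the std-based sensitivity proxy, and all function names are hypothetical choices made for this example (PyTorch is used only for convenience).

```python
# Sketch only: per-channel, distribution-aware quantization of a feature map.
import torch


def daq_like_quantize(x: torch.Tensor, bits: int = 4, k: float = 3.0) -> torch.Tensor:
    """Quantize a feature map of shape (N, C, H, W) channel by channel.

    Each channel's dynamic range is estimated from its own mean and standard
    deviation, so channels with very different distributions do not share one
    clipping range, and outliers beyond k standard deviations are clipped
    instead of stretching the quantization grid.
    """
    n, c, h, w = x.shape
    xc = x.transpose(0, 1).reshape(c, -1)            # (C, N*H*W)
    mu = xc.mean(dim=1, keepdim=True)                # per-channel mean
    sigma = xc.std(dim=1, keepdim=True) + 1e-8       # per-channel std
    lo, hi = mu - k * sigma, mu + k * sigma          # distribution-aware range
    step = (hi - lo) / (2 ** bits - 1)               # uniform step inside the range
    clipped = torch.max(torch.min(xc, hi), lo)       # clip outliers
    q = torch.round((clipped - lo) / step) * step + lo
    return q.reshape(c, n, h, w).transpose(0, 1)


def allocate_bits(x: torch.Tensor, base_bits: int = 4) -> torch.Tensor:
    """Toy training-free mixed-precision rule: give one extra bit to channels
    whose spread (a crude stand-in for "relative sensitivity") is above the
    median spread."""
    spread = x.transpose(0, 1).reshape(x.shape[1], -1).std(dim=1)
    return (spread > spread.median()).int() + base_bits


if __name__ == "__main__":
    # Channels with deliberately different scales, mimicking the per-channel
    # distribution mismatch described in the abstract.
    feat = torch.randn(2, 8, 16, 16) * torch.linspace(0.1, 5.0, 8).view(1, 8, 1, 1)
    q4 = daq_like_quantize(feat, bits=4)
    print("mean abs error:", (feat - q4).abs().mean().item())
    print("per-channel bits:", allocate_bits(feat).tolist())
```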
Related papers
- PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution [87.89013794655207]
Diffusion-based image super-resolution (SR) models have shown superior performance at the cost of multiple denoising steps.
We propose PassionSR, a novel post-training quantization approach with adaptive scale for one-step diffusion (OSD) image SR.
Our PassionSR achieves significant advantages over recent leading low-bit quantization methods for image SR.
arXiv Detail & Related papers (2024-11-26T04:49:42Z)
- Gradient-based Automatic Mixed Precision Quantization for Neural Networks On-Chip [0.9187138676564589]
We present High Granularity Quantization (HGQ), an innovative quantization-aware training method.
HGQ fine-tunes the per-weight and per-activation precision by making them optimizable through gradient descent.
This approach enables ultra-low latency and low power neural networks on hardware capable of performing arithmetic operations.
arXiv Detail & Related papers (2024-05-01T17:18:46Z)
- Overcoming Distribution Mismatch in Quantizing Image Super-Resolution Networks [53.23803932357899]
Quantization leads to accuracy loss in image super-resolution (SR) networks.
Existing works address this distribution mismatch problem by dynamically adapting quantization ranges during test time.
We propose a new quantization-aware training scheme that effectively Overcomes the Distribution Mismatch problem in SR networks.
arXiv Detail & Related papers (2023-07-25T08:50:01Z)
- Automatic Network Adaptation for Ultra-Low Uniform-Precision Quantization [6.1664476076961146]
Uniform-precision neural network quantization has gained popularity since it simplifies the densely packed arithmetic units used for high computing capability.
However, it ignores the heterogeneous sensitivity to quantization errors across layers, resulting in sub-optimal inference accuracy.
This work proposes a novel neural architecture search called neural channel expansion that adjusts the network structure to alleviate accuracy degradation from ultra-low uniform-precision quantization.
arXiv Detail & Related papers (2022-12-21T09:41:25Z)
- Neural Networks with Quantization Constraints [111.42313650830248]
We present a constrained learning approach to quantization training.
We show that the resulting problem is strongly dual and does away with gradient estimations.
We demonstrate that the proposed approach exhibits competitive performance in image classification tasks.
arXiv Detail & Related papers (2022-10-27T17:12:48Z)
- Post-training Quantization for Neural Networks with Provable Guarantees [9.58246628652846]
We modify a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism.
We prove that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights. (A minimal sketch of such a greedy path-following quantizer appears after this list.)
arXiv Detail & Related papers (2022-01-26T18:47:38Z)
- Quantized Proximal Averaging Network for Analysis Sparse Coding [23.080395291046408]
We unfold an iterative algorithm into a trainable network that facilitates learning sparsity prior to quantization.
We demonstrate applications to compressed image recovery and magnetic resonance image reconstruction.
arXiv Detail & Related papers (2021-05-13T12:05:35Z)
- Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks [73.29587731448345]
This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations.
First, to obtain low bit-width weights, most existing methods derive the quantized weights by performing quantization on the full-precision network weights.
Second, to obtain low bit-width activations, existing works consider all channels equally.
arXiv Detail & Related papers (2020-12-26T15:21:18Z)
- Fully Quantized Image Super-Resolution Networks [81.75002888152159]
We propose a Fully Quantized image Super-Resolution framework (FQSR) to jointly optimize efficiency and accuracy.
We apply our quantization scheme on multiple mainstream super-resolution architectures, including SRResNet, SRGAN and EDSR.
With low-bit quantization, our FQSR achieves performance on par with the full-precision counterparts on five benchmark datasets.
arXiv Detail & Related papers (2020-11-29T03:53:49Z)
- Accelerating Neural Network Inference by Overflow Aware Quantization [16.673051600608535]
The inherently heavy computation of deep neural networks prevents their widespread application.
We propose an overflow aware quantization method by designing trainable adaptive fixed-point representation.
With the proposed method, we are able to fully utilize the computing power to minimize the quantization loss and obtain optimized inference performance.
arXiv Detail & Related papers (2020-05-27T11:56:22Z)
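As referenced in the GPFQ entry above, the following is a minimal sketch of a greedy path-following quantizer for a single neuron: weights are quantized one coordinate at a time so that the quantized pre-activations track the full-precision ones on calibration data. The uniform alphabet, its scaling, the data shapes, and the names (`gpfq_like_quantize`, `round_to_alphabet`) are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch only: greedy, path-following post-training quantization of one neuron,
# assuming a simple uniform alphabet scaled to the weight range.
import numpy as np


def round_to_alphabet(value: float, alphabet: np.ndarray) -> float:
    """Round a scalar to the nearest element of the quantization alphabet."""
    return float(alphabet[np.abs(alphabet - value).argmin()])


def gpfq_like_quantize(w: np.ndarray, X: np.ndarray, bits: int = 4) -> np.ndarray:
    """Greedily quantize a neuron's weights w (shape [d]) so that X @ q tracks
    X @ w, where X (shape [m, d]) holds m calibration inputs."""
    d = w.shape[0]
    # Illustrative uniform alphabet covering the weight range (an assumption).
    delta = np.max(np.abs(w)) / (2 ** (bits - 1))
    alphabet = delta * np.arange(-(2 ** (bits - 1)), 2 ** (bits - 1) + 1)

    q = np.zeros_like(w, dtype=float)
    u = np.zeros(X.shape[0])               # running residual of the pre-activation error
    for t in range(d):
        xt = X[:, t]
        # Scalar that best absorbs the accumulated residual along direction xt.
        target = xt @ (u + w[t] * xt) / (xt @ xt + 1e-12)
        q[t] = round_to_alphabet(target, alphabet)
        u = u + w[t] * xt - q[t] * xt      # commit the choice, update the residual
    return q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((256, 64))     # hypothetical calibration data
    w = 0.1 * rng.standard_normal(64)
    q = gpfq_like_quantize(w, X, bits=4)
    rel_err = np.linalg.norm(X @ w - X @ q) / np.linalg.norm(X @ w)
    print(f"relative pre-activation error: {rel_err:.4f}")
```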