PAMS: Quantized Super-Resolution via Parameterized Max Scale
- URL: http://arxiv.org/abs/2011.04212v1
- Date: Mon, 9 Nov 2020 06:16:05 GMT
- Title: PAMS: Quantized Super-Resolution via Parameterized Max Scale
- Authors: Huixia Li, Chenqian Yan, Shaohui Lin, Xiawu Zheng, Yuchao Li, Baochang
Zhang, Fan Yang, Rongrong Ji
- Abstract summary: Deep convolutional neural networks (DCNNs) have shown dominant performance in the task of super-resolution (SR)
We propose a new quantization scheme termed PArameterized Max Scale (PAMS), which applies the trainable truncated parameter to explore the upper bound of the quantization range adaptively.
Experiments demonstrate that the proposed PAMS scheme can well compress and accelerate the existing SR models such as EDSR and RDN.
- Score: 84.55675222525608
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep convolutional neural networks (DCNNs) have shown dominant performance in
the task of super-resolution (SR). However, their heavy memory cost and
computation overhead significantly restrict their practical deployments on
resource-limited devices, which mainly arise from the floating-point storage
and operations between weights and activations. Although previous endeavors
mainly resort to fixed-point operations, quantizing both weights and
activations with fixed coding lengths may cause significant performance drop,
especially on low bits. Specifically, most state-of-the-art SR models without
batch normalization have a large dynamic quantization range, which also serves
as another cause of performance drop. To address these two issues, we propose a
new quantization scheme termed PArameterized Max Scale (PAMS), which applies
the trainable truncated parameter to explore the upper bound of the
quantization range adaptively. Finally, a structured knowledge transfer (SKT)
loss is introduced to fine-tune the quantized network. Extensive experiments
demonstrate that the proposed PAMS scheme can well compress and accelerate the
existing SR models such as EDSR and RDN. Notably, 8-bit PAMS-EDSR improves PSNR
on Set5 benchmark from 32.095dB to 32.124dB with 2.42$\times$ compression
ratio, which achieves a new state-of-the-art.
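To make the core idea concrete, the PyTorch fragment below is a minimal sketch, not the authors' released code, of an activation quantizer whose clipping range is bounded by a trainable parameter `alpha` standing in for the parameterized max scale; the bit-width, initialization, and straight-through rounding details are illustrative assumptions.

```python
import torch
import torch.nn as nn


class PAMSLikeQuantizer(nn.Module):
    """Sketch of a learnable-max-scale quantizer (illustrative, not the official PAMS code).

    A trainable clipping parameter `alpha` sets the upper bound of a symmetric
    quantization range [-alpha, alpha]; values are clamped to that range and
    uniformly quantized to `bits` bits, with a straight-through estimator so
    gradients flow through the rounding step.
    """

    def __init__(self, bits: int = 8, init_alpha: float = 10.0):
        super().__init__()
        self.bits = bits
        self.alpha = nn.Parameter(torch.tensor(init_alpha))  # trainable max scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        alpha = torch.abs(self.alpha)                 # keep the bound positive
        levels = 2 ** (self.bits - 1) - 1             # e.g. 127 for 8 bits
        x_clipped = torch.maximum(torch.minimum(x, alpha), -alpha)
        scale = alpha / levels
        x_q = torch.round(x_clipped / scale) * scale
        # Straight-through estimator: quantized values forward, identity gradient backward.
        return x_clipped + (x_q - x_clipped).detach()


# Example: quantization-aware fine-tuning would place such a quantizer on the
# activations of the SR backbone and learn `alpha` jointly with the weights.
quant = PAMSLikeQuantizer(bits=8)
y = quant(torch.randn(1, 64, 32, 32) * 5.0)
```

The structured knowledge transfer (SKT) loss that the abstract introduces to fine-tune the quantized network is orthogonal to this quantizer and is not shown.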
Related papers
- 2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution [83.09117439860607]
Low-bit quantization has become widespread for compressing image super-resolution (SR) models for edge deployment.
It is notorious that low-bit quantization degrades the accuracy of SR models compared to their full-precision (FP) counterparts.
We present a dual-stage low-bit post-training quantization (PTQ) method for image super-resolution, namely 2DQuant, which achieves efficient and accurate SR under low-bit quantization.
arXiv Detail & Related papers (2024-06-10T06:06:11Z)
- SqueezeLLM: Dense-and-Sparse Quantization [80.32162537942138]
The main bottleneck for generative inference with LLMs is memory bandwidth, rather than compute, for single-batch inference.
We introduce SqueezeLLM, a post-training quantization framework that enables lossless compression to ultra-low precisions of up to 3-bit.
Our framework incorporates two novel ideas: (i) sensitivity-based non-uniform quantization, which searches for the optimal bit precision assignment based on second-order information; and (ii) the Dense-and-Sparse decomposition that stores outliers and sensitive weight values in an efficient sparse format.
arXiv Detail & Related papers (2023-06-13T08:57:54Z)
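The Dense-and-Sparse decomposition described above can be pictured with the NumPy sketch below: the largest-magnitude weights are pulled out into a sparse full-precision matrix, and the dense remainder is uniformly quantized to a few bits. The outlier fraction, bit-width, and uniform grid are illustrative assumptions; SqueezeLLM's sensitivity-based non-uniform quantization is not reproduced here.

```python
import numpy as np


def dense_and_sparse_split(w: np.ndarray, outlier_frac: float = 0.005, bits: int = 3):
    """Toy dense-and-sparse decomposition (illustrative, not SqueezeLLM's algorithm)."""
    # Keep the top `outlier_frac` weights by magnitude in a sparse full-precision part.
    k = max(1, int(outlier_frac * w.size))
    threshold = np.partition(np.abs(w).ravel(), -k)[-k]
    outliers = np.abs(w) >= threshold
    sparse_part = np.where(outliers, w, 0.0)

    # Uniformly quantize the dense remainder to `bits` bits and dequantize it back.
    dense_part = np.where(outliers, 0.0, w)
    levels = 2 ** bits - 1
    w_min, w_max = dense_part.min(), dense_part.max()
    scale = (w_max - w_min) / levels
    dense_deq = np.round((dense_part - w_min) / scale) * scale + w_min
    dense_deq = np.where(outliers, 0.0, dense_deq)  # outlier slots live only in the sparse part

    return dense_deq, sparse_part


# The approximate weight matrix is simply the sum of the two parts.
w = np.random.randn(512, 512).astype(np.float32)
dense_deq, sparse_part = dense_and_sparse_split(w)
w_hat = dense_deq + sparse_part
```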
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights on-the-fly by a small amount proportional to their magnitude.
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
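The shrinkage step summarized above can be sketched as follows, as a toy Python version under assumptions rather than the paper's exact rule: instead of zeroing weights below the target sparsity threshold, each iteration scales them down by a small amount proportional to their own magnitude, so they stay trainable and can recover later.

```python
import numpy as np


def soft_shrink_step(w: np.ndarray, sparsity: float = 0.5, shrink: float = 0.1) -> np.ndarray:
    """One soft-shrinkage iteration (toy version, not the paper's exact schedule)."""
    threshold = np.percentile(np.abs(w), sparsity * 100)  # magnitude cut-off for "unimportant"
    small = np.abs(w) < threshold
    w = w.copy()
    w[small] -= shrink * w[small]  # shrink toward zero, proportional to the weight's magnitude
    return w


# Applied once per training iteration, unimportant weights decay gradually rather than
# being hard-pruned, which is the property the summary above emphasizes.
w = np.random.randn(64, 64).astype(np.float32)
for _ in range(50):
    w = soft_shrink_step(w)
```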
- CoNLoCNN: Exploiting Correlation and Non-Uniform Quantization for Energy-Efficient Low-precision Deep Convolutional Neural Networks [13.520972975766313]
We propose a framework to enable energy-efficient low-precision deep convolutional neural network inference by exploiting non-uniform quantization of weights.
We also propose a novel data representation format, Encoded Low-Precision Binary Signed Digit, to compress the bit-width of weights.
arXiv Detail & Related papers (2022-07-31T01:34:56Z)
- Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks [82.18396309806577]
We propose a novel activation quantizer, referred to as Dynamic Dual Trainable Bounds (DDTB).
Our DDTB exhibits significant performance improvements in ultra-low precision.
For example, our DDTB achieves a 0.70dB PSNR increase on the Urban100 benchmark when quantizing EDSR to 2-bit and scaling up output images to x4.
arXiv Detail & Related papers (2022-03-08T04:26:18Z)
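The entry above names the key ingredient, trainable lower and upper clipping bounds for activations; the PyTorch fragment below sketches that general idea. It is an assumption-laden illustration: the bit-width, initialization, and asymmetric uniform grid are chosen for illustration, and the dynamic, input-adaptive part of DDTB is not shown.

```python
import torch
import torch.nn as nn


class DualBoundQuantizer(nn.Module):
    """Activation quantizer with trainable lower/upper bounds (sketch, not official DDTB)."""

    def __init__(self, bits: int = 2, init_lower: float = -1.0, init_upper: float = 1.0):
        super().__init__()
        self.bits = bits
        self.lower = nn.Parameter(torch.tensor(init_lower))  # trainable lower clipping bound
        self.upper = nn.Parameter(torch.tensor(init_upper))  # trainable upper clipping bound

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        levels = 2 ** self.bits - 1
        x_clipped = torch.minimum(torch.maximum(x, self.lower), self.upper)
        scale = (self.upper - self.lower) / levels
        x_q = torch.round((x_clipped - self.lower) / scale) * scale + self.lower
        # Straight-through estimator for the rounding step.
        return x_clipped + (x_q - x_clipped).detach()
```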
- Fully Quantized Image Super-Resolution Networks [81.75002888152159]
We propose a Fully Quantized image Super-Resolution framework (FQSR) to jointly optimize efficiency and accuracy.
We apply our quantization scheme on multiple mainstream super-resolution architectures, including SRResNet, SRGAN and EDSR.
Our FQSR with low-bit quantization achieves performance on par with its full-precision counterparts on five benchmark datasets.
arXiv Detail & Related papers (2020-11-29T03:53:49Z)
- Differentiable Joint Pruning and Quantization for Hardware Efficiency [16.11027058505213]
DJPQ incorporates variational information bottleneck based structured pruning and mixed-bit precision quantization into a single differentiable loss function.
We show that DJPQ significantly reduces the number of Bit-Operations (BOPs) for several networks while maintaining the top-1 accuracy of original floating-point models.
arXiv Detail & Related papers (2020-07-20T20:45:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.