Distribution-Flexible Subset Quantization for Post-Quantizing
Super-Resolution Networks
- URL: http://arxiv.org/abs/2305.05888v2
- Date: Fri, 12 May 2023 04:43:47 GMT
- Title: Distribution-Flexible Subset Quantization for Post-Quantizing
Super-Resolution Networks
- Authors: Yunshan Zhong, Mingbao Lin, Jingjing Xie, Yuxin Zhang, Fei Chao,
Rongrong Ji
- Abstract summary: This paper introduces Distribution-Flexible Subset Quantization (DFSQ), a post-training quantization method for super-resolution networks.
DFSQ conducts channel-wise normalization of the activations and applies distribution-flexible subset quantization (SQ).
It achieves comparable performance to full-precision counterparts on 6- and 8-bit quantization, and incurs only a 0.1 dB PSNR drop on 4-bit quantization.
- Score: 68.83451203841624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces Distribution-Flexible Subset Quantization (DFSQ), a
post-training quantization method for super-resolution networks. Our motivation
for developing DFSQ is based on the distinctive activation distributions of
current super-resolution models, which exhibit significant variance across
samples and channels. To address this issue, DFSQ conducts channel-wise
normalization of the activations and applies distribution-flexible subset
quantization (SQ), wherein the quantization points are selected from a
universal set consisting of multi-word additive log-scale values. To expedite
the selection of quantization points in SQ, we propose a fast quantization
points selection strategy that uses K-means clustering to select the
quantization points closest to the centroids. Compared to the common iterative
exhaustive search algorithm, our strategy avoids the enumeration of all
possible combinations in the universal set, reducing the time complexity from
exponential to linear. Consequently, the constraint of time costs on the size
of the universal set is greatly relaxed. Extensive evaluations of various
super-resolution models show that DFSQ effectively retains performance even
without fine-tuning. For example, when quantizing EDSRx2 on the Urban
benchmark, DFSQ achieves comparable performance to full-precision counterparts
on 6- and 8-bit quantization, and incurs only a 0.1 dB PSNR drop on 4-bit
quantization. Code is at https://github.com/zysxmu/DFSQ.
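To make the recipe in the abstract concrete, here is a minimal NumPy sketch of the pipeline it describes: channel-wise normalization of activations, a universal set built from multi-word additive log-scale (power-of-two) values, and a fast point-selection step that runs K-means and snaps each centroid to its nearest universal-set value. This is not the authors' implementation (see the repository linked above); the normalization statistics, the universal-set construction, and the names build_universal_set, select_points, and dfsq_quantize are illustrative assumptions.

```python
import numpy as np
from itertools import combinations_with_replacement


def build_universal_set(num_words=2, exponents=range(-6, 1)):
    """Candidate levels: sums of `num_words` power-of-two "words" (zero included)."""
    words = [2.0 ** e for e in exponents] + [0.0]
    levels = {sum(c) for c in combinations_with_replacement(words, num_words)}
    return np.array(sorted(levels))


def select_points(values, universal_set, num_points=16, iters=20):
    """Run 1-D K-means on `values`, then snap each centroid to its nearest
    universal-set value; the snapping cost is linear in the set size."""
    centroids = np.quantile(values, np.linspace(0.0, 1.0, num_points))  # init
    for _ in range(iters):
        assign = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        for k in range(num_points):
            members = values[assign == k]
            if members.size:
                centroids[k] = members.mean()
    nearest = np.argmin(np.abs(centroids[:, None] - universal_set[None, :]), axis=1)
    return np.unique(universal_set[nearest])


def dfsq_quantize(acts, bits=4):
    """Channel-wise normalize (N, C) activations, quantize magnitudes to the
    selected subset, keep signs, then map back to the original scale."""
    mean = acts.mean(axis=0, keepdims=True)
    std = acts.std(axis=0, keepdims=True) + 1e-8
    normed = (acts - mean) / std
    points = select_points(np.abs(normed).ravel(), build_universal_set(), 2 ** bits)
    idx = np.argmin(np.abs(np.abs(normed)[..., None] - points), axis=-1)
    return np.sign(normed) * points[idx] * std + mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.normal(size=(1024, 8)) * rng.uniform(0.1, 5.0, size=(1, 8))
    print("mean abs quantization error:", np.abs(acts - dfsq_quantize(acts)).mean())
```

Because each centroid is matched independently against the universal set, the selection cost grows linearly with the set size rather than with the number of possible point subsets, which is the complexity argument made in the abstract.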
Related papers
- EQ-Net: Elastic Quantization Neural Networks [15.289359357583079]
Elastic Quantization Neural Networks (EQ-Net) aims to train a robust weight-sharing quantization supernet.
We propose an elastic quantization space (including elastic bit-width, granularity, and symmetry) to adapt to various mainstream quantization formats.
We incorporate genetic algorithms and the proposed Conditional Quantization-Aware Accuracy Predictor (CQAP) as an estimator to quickly search for mixed-precision quantized neural networks within the supernet.
arXiv Detail & Related papers (2023-08-15T08:57:03Z) - QFT: Post-training quantization via fast joint finetuning of all degrees
of freedom [1.1744028458220428]
We rethink quantized network parameterization in a hardware-aware fashion, towards a unified analysis of all quantization degrees of freedom (DoF).
Our simple and extendable single-step method, dubbed quantization-aware finetuning (QFT), achieves 4-bit weight quantization results on par with the state of the art.
arXiv Detail & Related papers (2022-12-05T22:38:58Z) - End-to-end resource analysis for quantum interior point methods and portfolio optimization [63.4863637315163]
We provide a complete quantum circuit-level description of the algorithm from problem input to problem output.
We report the number of logical qubits and the quantity/depth of non-Clifford T-gates needed to run the algorithm.
arXiv Detail & Related papers (2022-11-22T18:54:48Z) - Green, Quantized Federated Learning over Wireless Networks: An
Energy-Efficient Design [68.86220939532373]
The finite precision level is captured through the use of quantized neural networks (QNNs) that quantize weights and activations in fixed-precision format.
The proposed FL framework can reduce energy consumption until convergence by up to 70% compared to a baseline FL algorithm.
arXiv Detail & Related papers (2022-07-19T16:37:24Z) - Learning Quantile Functions without Quantile Crossing for
Distribution-free Time Series Forecasting [12.269597033369557]
We propose the Incremental (Spline) Quantile Functions I(S)QF, a flexible and efficient distribution-free quantile estimation framework.
We also provide a generalization error analysis of our proposed approaches under the sequence-to-sequence setting.
arXiv Detail & Related papers (2021-11-12T06:54:48Z) - Cluster-Promoting Quantization with Bit-Drop for Minimizing Network
Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
arXiv Detail & Related papers (2021-09-05T15:15:07Z) - Q-Match: Iterative Shape Matching via Quantum Annealing [64.74942589569596]
Finding shape correspondences can be formulated as an NP-hard quadratic assignment problem (QAP).
This paper proposes Q-Match, a new iterative quantum method for QAPs inspired by the alpha-expansion algorithm.
Q-Match can be applied for shape matching problems iteratively, on a subset of well-chosen correspondences, allowing us to scale to real-world problems.
arXiv Detail & Related papers (2021-05-06T17:59:38Z) - Searching for Low-Bit Weights in Quantized Neural Networks [129.8319019563356]
Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators.
We propose to regard the discrete weights in an arbitrary quantized neural network as searchable variables, and utilize a differentiable method to search for them accurately.
arXiv Detail & Related papers (2020-09-18T09:13:26Z) - Post-Training Piecewise Linear Quantization for Deep Neural Networks [13.717228230596167]
Quantization plays an important role in the energy-efficient deployment of deep neural networks on resource-limited devices.
We propose a piecewise linear quantization scheme to enable accurate approximation for tensor values that have bell-shaped distributions with long tails.
Compared to state-of-the-art post-training quantization methods, our proposed method achieves superior performance on image classification, semantic segmentation, and object detection with minor overhead; a rough sketch of the piecewise linear idea follows this list.
arXiv Detail & Related papers (2020-01-31T23:47:00Z)
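The last entry above describes a piecewise linear post-training quantization scheme for bell-shaped, long-tailed value distributions. Below is a minimal sketch of that general idea, not the paper's algorithm: it assumes a single symmetric breakpoint at a fixed fraction of the range, spends one bit on the region indicator, and uses the illustrative helper names uniform_quantize and piecewise_linear_quantize; the paper's actual breakpoint optimization and overhead handling are not reproduced.

```python
import numpy as np


def uniform_quantize(x, lo, hi, bits):
    """Uniformly quantize x onto 2**bits levels spanning [lo, hi]."""
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels
    q = np.clip(np.round((x - lo) / scale), 0, levels)
    return q * scale + lo


def piecewise_linear_quantize(w, bits=4, breakpoint_frac=0.25):
    """Split the symmetric range [-m, m] at +/- breakpoint: the dense central
    piece and the sparse tails each get their own uniform grid, so the
    bell-shaped center is covered with a finer step size."""
    m = np.abs(w).max()
    bp = breakpoint_frac * m
    center = np.abs(w) <= bp
    out = np.empty_like(w)
    # one bit marks the piece, so each piece keeps bits - 1 for its grid
    out[center] = uniform_quantize(w[center], -bp, bp, bits - 1)
    out[~center] = np.sign(w[~center]) * uniform_quantize(
        np.abs(w[~center]), bp, m, bits - 1)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.05, size=10_000)  # bell-shaped values with tails
    m = np.abs(w).max()
    print("uniform   MSE:", np.mean((w - uniform_quantize(w, -m, m, 4)) ** 2))
    print("piecewise MSE:", np.mean((w - piecewise_linear_quantize(w, 4)) ** 2))
```

With the same overall bit budget, the piecewise grid typically yields a lower reconstruction error on such distributions because most of its levels land where most of the values are.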
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.