CondiQuant: Condition Number Based Low-Bit Quantization for Image Super-Resolution
- URL: http://arxiv.org/abs/2502.15478v1
- Date: Fri, 21 Feb 2025 14:04:30 GMT
- Title: CondiQuant: Condition Number Based Low-Bit Quantization for Image Super-Resolution
- Authors: Kai Liu, Dehui Wang, Zhiteng Li, Zheng Chen, Yong Guo, Wenbo Li, Linghe Kong, Yulun Zhang
- Abstract summary: We propose CondiQuant, a condition number based low-bit post-training quantization method for image super-resolution. We show that CondiQuant outperforms existing state-of-the-art post-training quantization methods in accuracy without computational overhead.
- Score: 59.91470739501034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low-bit model quantization for image super-resolution (SR) is a longstanding task that is renowned for its surprising compression and acceleration ability. However, accuracy degradation is inevitable when compressing the full-precision (FP) model to ultra-low bit widths (2~4 bits). Experimentally, we observe that the degradation of quantization is mainly attributed to the quantization of activations rather than model weights. In numerical analysis, the condition number of the weights measures how much the output value can change for a small change in the input argument, inherently reflecting the quantization error. Therefore, we propose CondiQuant, a condition number based low-bit post-training quantization method for image super-resolution. Specifically, we formulate the quantization error as the condition number of the weight matrices. By decoupling the representation ability and the quantization sensitivity, we design an efficient proximal gradient descent algorithm to iteratively minimize the condition number while keeping the output unchanged. With comprehensive experiments, we demonstrate that CondiQuant outperforms existing state-of-the-art post-training quantization methods in accuracy without computational overhead and achieves the theoretically optimal compression ratio in model parameters. Our code and model are released at https://github.com/Kai-Liu001/CondiQuant.
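As a quick, hedged illustration of the reasoning in the abstract (not the authors' implementation), the NumPy sketch below checks the standard numerical-analysis bound that ties the condition number κ(W) = σ_max / σ_min of a weight matrix to how strongly a small input perturbation, such as activation quantization error, can be amplified at the output. The 64x64 layer shape and the perturbation magnitude are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 64x64 fully-connected weight matrix standing in for an SR layer.
W = rng.normal(size=(64, 64))
x = rng.normal(size=64)

# Condition number kappa(W) = sigma_max / sigma_min (2-norm by default).
kappa = np.linalg.cond(W)

# Model activation quantization as a small perturbation dx of the input.
dx = 1e-3 * rng.normal(size=64)
y, y_pert = W @ x, W @ (x + dx)

rel_in = np.linalg.norm(dx) / np.linalg.norm(x)
rel_out = np.linalg.norm(y_pert - y) / np.linalg.norm(y)

# Classical bound: ||dy|| / ||y|| <= kappa(W) * ||dx|| / ||x||.
print(f"kappa(W)              = {kappa:.2f}")
print(f"relative input error  = {rel_in:.2e}")
print(f"relative output error = {rel_out:.2e}")
print(f"kappa(W) * input err  = {kappa * rel_in:.2e}  (upper bound)")
assert rel_out <= kappa * rel_in
```

Lowering κ(W) tightens this bound, which is the motivation the abstract gives for minimizing the condition number of the weights while keeping the layer output unchanged.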
Related papers
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models [64.84734437930362]
Large Language Models (LLMs) suffer severe performance degradation when facing extremely low-bit (sub 2-bit) quantization.
We propose an extremely low-bit PTQ method called PTQ1.61, which enables weight quantization to 1.61-bit for the first time.
Experiments indicate our PTQ1.61 achieves state-of-the-art performance in extremely low-bit quantization.
arXiv Detail & Related papers (2025-02-18T08:04:58Z) - DilateQuant: Accurate and Efficient Diffusion Quantization via Weight Dilation [3.78219736760145]
Quantization of diffusion models is a promising way to compress and accelerate models.
Existing methods cannot maintain both accuracy and efficiency simultaneously for low-bit quantization.
We propose DilateQuant, a novel quantization framework for diffusion models that offers comparable accuracy and high efficiency.
arXiv Detail & Related papers (2024-09-22T04:21:29Z) - ISQuant: apply squant to the real deployment [0.0]
We analyze why the combination of quantization and dequantization is used to train the model.
We propose ISQuant as a solution for deploying 8-bit models.
arXiv Detail & Related papers (2024-07-05T15:10:05Z) - 2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution [83.09117439860607]
Low-bit quantization has become widespread for compressing image super-resolution (SR) models for edge deployment.
It is notorious that low-bit quantization degrades the accuracy of SR models compared to their full-precision (FP) counterparts.
We present a dual-stage low-bit post-training quantization (PTQ) method for image super-resolution, namely 2DQuant, which achieves efficient and accurate SR under low-bit quantization.
arXiv Detail & Related papers (2024-06-10T06:06:11Z) - MixQuant: Mixed Precision Quantization with a Bit-width Optimization Search [7.564770908909927]
Quantization is a technique for creating efficient Deep Neural Networks (DNNs).
We propose MixQuant, a search algorithm that finds the optimal custom quantization bit-width for each layer weight based on roundoff error (a toy per-layer selection is sketched after this list).
We show that combining MixQuant with BRECQ, a state-of-the-art quantization method, yields better quantized model accuracy than BRECQ alone.
arXiv Detail & Related papers (2023-09-29T15:49:54Z) - NUPES : Non-Uniform Post-Training Quantization via Power Exponent Search [7.971065005161565]
Quantization is a technique to convert floating point representations to low bit-width fixed point representations.
We show how to learn new quantized weights over the entire quantized space.
We show the ability of the method to achieve state-of-the-art compression rates in both data-free and data-driven configurations.
arXiv Detail & Related papers (2023-08-10T14:19:58Z) - One Model for All Quantization: A Quantized Network Supporting Hot-Swap Bit-Width Adjustment [36.75157407486302]
We propose a method to train a single quantized model that supports diverse bit-widths.
We use wavelet decomposition and reconstruction to increase the diversity of weights.
Our method can achieve accuracy comparable to dedicated models trained at the same precision.
arXiv Detail & Related papers (2021-05-04T08:10:50Z) - Differentiable Model Compression via Pseudo Quantization Noise [99.89011673907814]
We propose to add independent pseudo quantization noise to model parameters during training to approximate the effect of a quantization operator (see the sketch after this list).
We experimentally verify that our method outperforms state-of-the-art quantization techniques on several benchmarks and architectures for image classification, language modeling, and audio source separation.
arXiv Detail & Related papers (2021-04-20T14:14:03Z) - Q-ASR: Integer-only Zero-shot Quantization for Efficient Speech Recognition [65.7040645560855]
We propose Q-ASR, an integer-only, zero-shot quantization scheme for ASR models.
We show negligible WER change as compared to the full-precision baseline models.
Q-ASR exhibits a large compression rate of more than 4x with small WER degradation.
arXiv Detail & Related papers (2021-03-31T06:05:40Z) - DAQ: Distribution-Aware Quantization for Deep Image Super-Resolution Networks [49.191062785007006]
Quantizing deep convolutional neural networks for image super-resolution substantially reduces their computational costs.
Existing works either suffer a severe performance drop at ultra-low precision (4 bits or lower) or require a heavy fine-tuning process to recover the performance.
We propose a novel distribution-aware quantization scheme (DAQ) which facilitates accurate training-free quantization in ultra-low precision.
arXiv Detail & Related papers (2020-12-21T10:19:42Z)
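Several entries above revolve around a few common building blocks: ISQuant analyzes the quantize-dequantize ("fake quantization") operator used during training, the pseudo-quantization-noise paper replaces its non-differentiable rounding with additive noise, and MixQuant picks a per-layer bit-width from the resulting roundoff error. The sketch below (referenced in those entries) illustrates these three ideas with plain NumPy and a symmetric uniform quantizer; the function names, the tolerance, and the example layer shapes are hypothetical, and none of this is code from the papers themselves.

```python
import numpy as np

rng = np.random.default_rng(0)


def fake_quant(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform quantize-dequantize ("fake quantization"):
    round to a signed integer grid, then map back to floating point."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.clip(np.round(w / scale), -qmax, qmax) * scale


def pseudo_quant_noise(w: np.ndarray, bits: int) -> np.ndarray:
    """Differentiable surrogate in the spirit of pseudo quantization noise:
    replace rounding with additive uniform noise of one quantization step."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return w + rng.uniform(-0.5, 0.5, size=w.shape) * scale


def choose_bit_widths(layers, candidates=(2, 3, 4, 8), tol=1e-2):
    """Toy per-layer bit-width selection (loosely MixQuant-flavoured): pick
    the smallest candidate whose mean roundoff error stays below `tol`."""
    config = {}
    for name, w in layers.items():
        errs = {b: np.abs(fake_quant(w, b) - w).mean() for b in candidates}
        config[name] = next((b for b in candidates if errs[b] < tol), max(candidates))
    return config


# Hypothetical layers with different weight spreads.
layers = {
    "conv1": rng.normal(scale=0.02, size=(64, 64)),
    "conv2": rng.normal(scale=0.2, size=(64, 64)),
    "head": rng.normal(scale=1.0, size=(64, 64)),
}

for bits in (8, 4, 2):
    w = layers["conv2"]
    print(f"{bits}-bit  round err={np.abs(fake_quant(w, bits) - w).mean():.4f}  "
          f"noise err={np.abs(pseudo_quant_noise(w, bits) - w).mean():.4f}")
print("per-layer bit-widths:", choose_bit_widths(layers))
```

Real post-training quantization pipelines additionally calibrate scales per channel and quantize activations; this sketch only touches weights and is meant as an orientation aid for the entries above.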
This list is automatically generated from the titles and abstracts of the papers on this site.