RepQ-ViT: Scale Reparameterization for Post-Training Quantization of
Vision Transformers
- URL: http://arxiv.org/abs/2212.08254v2
- Date: Mon, 7 Aug 2023 03:00:41 GMT
- Title: RepQ-ViT: Scale Reparameterization for Post-Training Quantization of
Vision Transformers
- Authors: Zhikai Li, Junrui Xiao, Lianwei Yang, and Qingyi Gu
- Abstract summary: We propose RepQ-ViT, a novel PTQ framework for vision transformers (ViTs).
RepQ-ViT decouples the quantization and inference processes.
It can outperform existing strong baselines and encouragingly improve the accuracy of 4-bit PTQ of ViTs to a usable level.
- Score: 2.114921680609289
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Post-training quantization (PTQ), which only requires a tiny dataset for
calibration without end-to-end retraining, is a light and practical model
compression technique. Recently, several PTQ schemes for vision transformers
(ViTs) have been presented; unfortunately, they typically suffer from
non-trivial accuracy degradation, especially in low-bit cases. In this paper,
we propose RepQ-ViT, a novel PTQ framework for ViTs based on quantization scale
reparameterization, to address the above issues. RepQ-ViT decouples the
quantization and inference processes, where the former employs complex
quantizers and the latter employs scale-reparameterized simplified quantizers.
This ensures both accurate quantization and efficient inference, which
distinguishes it from existing approaches that sacrifice quantization
performance to meet the target hardware. More specifically, we focus on two
components with extreme distributions: post-LayerNorm activations with severe
inter-channel variation and post-Softmax activations with power-law features,
and initially apply channel-wise quantization and log$\sqrt{2}$ quantization,
respectively. Then, we reparameterize the scales to hardware-friendly
layer-wise quantization and log2 quantization for inference, at only a slight
cost in accuracy or computation. Extensive experiments are conducted on
multiple vision tasks with different model variants, proving that RepQ-ViT,
without hyperparameters and expensive reconstruction procedures, can outperform
existing strong baselines and encouragingly improve the accuracy of 4-bit PTQ
of ViTs to a usable level. Code is available at
https://github.com/zkkli/RepQ-ViT.
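To make the decoupling idea above concrete, the following sketch (an illustration under stated assumptions, not the released RepQ-ViT implementation; the helper names, toy shapes, and the choice of the shared scale are made up for the example) shows the core identity behind reparameterizing channel-wise quantization of post-LayerNorm activations into layer-wise quantization: the per-channel scale ratios and zero-point shifts are pulled out of the quantizer, where in a real network they would be absorbed by the LayerNorm affine parameters and the following linear layer's weights and bias.

```python
# Hedged sketch, NOT the authors' code: channel-wise quantization calibrated per
# channel is rewritten as layer-wise quantization of a rescaled/shifted activation.
import torch


def fake_quant(x, scale, zero_point, n_bits=8):
    """Uniform affine quantize-dequantize."""
    qmin, qmax = 0, 2 ** n_bits - 1
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale


torch.manual_seed(0)
B, C = 16, 8
# Toy post-LayerNorm activation with severe inter-channel variation (ranges differ ~100x).
x = torch.randn(B, C) * torch.logspace(-1, 1, C)

# 1) Calibration: channel-wise scales and zero points (accurate but hardware-unfriendly).
x_min = torch.minimum(x.min(dim=0).values, torch.zeros(C))
x_max = torch.maximum(x.max(dim=0).values, torch.zeros(C))
s_c = (x_max - x_min) / 255.0
z_c = torch.round(-x_min / s_c)

# 2) Reparameterization: one shared layer-wise scale/zero point plus per-channel
#    corrections. Using the mean as the shared scale is an assumption of this sketch.
s_l = s_c.mean()
z_l = torch.round(z_c.mean())
r1 = s_c / s_l          # scale ratios: absorbable by LayerNorm affine / next layer's weights
r2 = z_c - z_l          # zero-point shifts: absorbable by LayerNorm beta / next layer's bias

# Channel-wise quantization of x ...
y_channel_wise = fake_quant(x, s_c, z_c)

# ... equals layer-wise quantization of the rescaled-and-shifted activation, with the
# correction undone afterwards (in a network, "undone" by the next layer's parameters).
x_rep = x / r1 + s_l * r2
y_layer_wise = (fake_quant(x_rep, s_l, z_l) - s_l * r2) * r1

print(torch.allclose(y_channel_wise, y_layer_wise, atol=1e-5))  # True
```

The same decoupling applies to the post-Softmax branch; below is an equally hedged sketch of the two log-based quantizers the abstract names, log√2 at calibration and log2 at inference (the exact scale conversion between them is given in the paper and repository and is not reproduced here).

```python
# Hedged sketch of the two quantizers for post-Softmax attention values in (0, 1]:
# log sqrt(2) offers finer resolution for the power-law distribution, while log2
# is hardware-friendly because dequantization reduces to a bit-shift.
import torch

def log_sqrt2_fake_quant(x, n_bits=4, eps=1e-12):
    q = torch.clamp(torch.round(-2.0 * torch.log2(x.clamp_min(eps))), 0, 2 ** n_bits - 1)
    return 2.0 ** (-q / 2.0)   # dequantize to powers of sqrt(2)

def log2_fake_quant(x, n_bits=4, eps=1e-12):
    q = torch.clamp(torch.round(-torch.log2(x.clamp_min(eps))), 0, 2 ** n_bits - 1)
    return 2.0 ** (-q)         # dequantize to powers of 2

attn = torch.softmax(torch.randn(4, 8), dim=-1)
print(log_sqrt2_fake_quant(attn), log2_fake_quant(attn), sep="\n")
```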
Related papers
- AIQViT: Architecture-Informed Post-Training Quantization for Vision Transformers [42.535119270045605]
Post-training quantization (PTQ) has emerged as a promising solution for reducing the storage and computational cost of vision transformers (ViTs).
This paper proposes an innovative PTQ method tailored for ViTs, termed AIQViT (Architecture-Informed Post-training Quantization for ViTs).
arXiv Detail & Related papers (2025-02-07T03:04:50Z)
- PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution [95.98801201266099]
Diffusion-based image super-resolution (SR) models have shown superior performance at the cost of multiple denoising steps.
We propose a novel post-training quantization approach with adaptive scale in one-step diffusion (OSD) image SR, PassionSR.
Our PassionSR achieves significant advantages over recent leading low-bit quantization methods for image SR.
arXiv Detail & Related papers (2024-11-26T04:49:42Z)
- AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer [54.713778961605115]
Vision Transformer (ViT) has become one of the most prevailing fundamental backbone networks in the computer vision community.
We propose a novel non-uniform quantizer, dubbed the Adaptive Logarithm (AdaLog) quantizer.
arXiv Detail & Related papers (2024-07-17T18:38:48Z)
- Towards Accurate Post-Training Quantization of Vision Transformers via Error Reduction [48.740630807085566]
Post-training quantization (PTQ) for vision transformers (ViTs) has received increasing attention from both academic and industrial communities.
Current methods fail to account for the complex interactions between quantized weights and activations, resulting in significant quantization errors and suboptimal performance.
This paper presents ERQ, an innovative two-step PTQ method specifically crafted to reduce quantization errors arising from activation and weight quantization sequentially.
arXiv Detail & Related papers (2024-07-09T12:06:03Z)
- ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers [7.155242379236052]
Quantization of Vision Transformers (ViTs) has emerged as a promising solution for reducing their storage and computational costs.
Existing methods still suffer from significant accuracy loss at low bit-widths.
ADFQ-ViT provides significant improvements over various baselines in image classification, object detection, and instance segmentation tasks at 4-bit.
arXiv Detail & Related papers (2024-07-03T02:41:59Z)
- RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization [8.827794405944637]
Post-training quantization (PTQ) is a promising solution for compressing large transformer models.
Existing PTQ methods typically exhibit non-trivial performance loss.
We propose RepQuant, a novel PTQ framework with quantization-inference decoupling paradigm.
arXiv Detail & Related papers (2024-02-08T12:35:41Z)
- MPTQ-ViT: Mixed-Precision Post-Training Quantization for Vision Transformer [7.041718444626999]
We propose a mixed-precision post-training quantization framework for vision transformers (MPTQ-ViT).
Our experiments on ViT, DeiT, and Swin demonstrate significant accuracy improvements compared with SOTA on the ImageNet dataset.
arXiv Detail & Related papers (2024-01-26T14:25:15Z)
- I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization [49.17407185195788]
We introduce I&S-ViT, a novel method that regulates the PTQ of ViTs in an inclusive and stable fashion.
I&S-ViT elevates the performance of 3-bit ViT-B by an impressive 50.68%.
arXiv Detail & Related papers (2023-11-16T13:07:47Z)
- Towards Accurate Post-Training Quantization for Vision Transformer [48.779346466374406]
Existing post-training quantization methods still cause severe performance drops.
APQ-ViT surpasses the existing post-training quantization methods by convincing margins.
arXiv Detail & Related papers (2023-03-25T03:05:26Z)
- NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers [53.85087932591237]
NoisyQuant is a quantizer-agnostic enhancement for the post-training activation quantization performance of vision transformers.
Building on the theoretical insight, NoisyQuant achieves the first success in actively altering the heavy-tailed activation distribution.
NoisyQuant largely improves the post-training quantization performance of vision transformers with minimal computation overhead.
arXiv Detail & Related papers (2022-11-29T10:02:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.