RepQ-ViT: Scale Reparameterization for Post-Training Quantization of
Vision Transformers
- URL: http://arxiv.org/abs/2212.08254v2
- Date: Mon, 7 Aug 2023 03:00:41 GMT
- Title: RepQ-ViT: Scale Reparameterization for Post-Training Quantization of
Vision Transformers
- Authors: Zhikai Li, Junrui Xiao, Lianwei Yang, and Qingyi Gu
- Abstract summary: We propose RepQ-ViT, a novel PTQ framework for vision transformers (ViTs).
RepQ-ViT decouples the quantization and inference processes.
It can outperform existing strong baselines and encouragingly improve the accuracy of 4-bit PTQ of ViTs to a usable level.
- Score: 2.114921680609289
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Post-training quantization (PTQ), which only requires a tiny dataset for
calibration without end-to-end retraining, is a light and practical model
compression technique. Recently, several PTQ schemes for vision transformers
(ViTs) have been presented; unfortunately, they typically suffer from
non-trivial accuracy degradation, especially in low-bit cases. In this paper,
we propose RepQ-ViT, a novel PTQ framework for ViTs based on quantization scale
reparameterization, to address the above issues. RepQ-ViT decouples the
quantization and inference processes, where the former employs complex
quantizers and the latter employs scale-reparameterized simplified quantizers.
This ensures both accurate quantization and efficient inference, which
distinguishes it from existing approaches that sacrifice quantization
performance to meet the target hardware. More specifically, we focus on two
components with extreme distributions: post-LayerNorm activations with severe
inter-channel variation and post-Softmax activations with power-law features,
and initially apply channel-wise quantization and log$\sqrt{2}$ quantization,
respectively. Then, we reparameterize the scales to hardware-friendly
layer-wise quantization and log2 quantization for inference, at only a slight
cost in accuracy or computation. Extensive experiments are conducted on
multiple vision tasks with different model variants, proving that RepQ-ViT,
without hyperparameters and expensive reconstruction procedures, can outperform
existing strong baselines and encouragingly improve the accuracy of 4-bit PTQ
of ViTs to a usable level. Code is available at
https://github.com/zkkli/RepQ-ViT.
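To make the two reparameterizations above concrete, below is a minimal NumPy sketch of the underlying algebra only; it is not the authors' implementation (see the linked repository for that), and the choice of the layer-wise scale/zero-point as channel-wise means, as well as all variable names, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- 1. Post-LayerNorm: channel-wise -> layer-wise scale reparameterization ---
# Calibration yields a per-channel scale s and zero-point z for the post-LayerNorm
# activation x = gamma * x_norm + beta. To use a single layer-wise scale s_tilde at
# inference, the per-channel variation r = s / s_tilde is folded into the LayerNorm
# affine parameters and the next linear layer, leaving the network output unchanged.
C, D = 8, 16                                   # embedding channels, next-layer width
gamma, beta = rng.normal(size=C), rng.normal(size=C)
W, b = rng.normal(size=(D, C)), rng.normal(size=D)
s = rng.uniform(0.5, 2.0, size=C)              # per-channel scales (from calibration)
z = rng.integers(-4, 5, size=C).astype(float)  # per-channel zero-points

s_tilde, z_tilde = s.mean(), z.mean()          # unified layer-wise scale / zero-point
r = s / s_tilde                                # per-channel variation factor

gamma_rep = gamma / r                          # adjusted LayerNorm weight
beta_rep = beta / r + s_tilde * (z - z_tilde)  # adjusted LayerNorm bias
W_rep = W * r                                  # column i of W scaled by r_i
b_rep = b - W_rep @ (s_tilde * (z - z_tilde))  # compensate the zero-point shift

x_norm = rng.normal(size=C)                    # normalized activation
x = gamma * x_norm + beta                      # original post-LayerNorm output
x_rep = gamma_rep * x_norm + beta_rep          # reparameterized activation, now
                                               # quantizable with (s_tilde, z_tilde)
assert np.allclose(W @ x + b, W_rep @ x_rep + b_rep)   # exact before rounding

# --- 2. Post-Softmax: log-sqrt(2) -> log2 quantizer reparameterization ---
# Post-Softmax values in (0, 1] follow a power-law, so calibration uses a
# log-sqrt(2) quantizer: q = round(-log_{sqrt(2)} x) = round(-2 * log2 x).
# For inference, 2**(-q/2) is rewritten as 2**(-ceil(q/2)) * sqrt(2)**(q % 2),
# i.e. a log2 bit shift plus a sqrt(2) correction applied only to odd codes.
bits = 4
a = rng.uniform(1e-3, 1.0, size=6)                        # toy attention scores
q = np.clip(np.round(-2 * np.log2(a)), 0, 2**bits - 1)    # log-sqrt(2) codes

deq_logsqrt2 = 2.0 ** (-q / 2)                            # direct dequantization
deq_log2 = 2.0 ** (-np.ceil(q / 2)) * np.sqrt(2.0) ** (q % 2)
assert np.allclose(deq_logsqrt2, deq_log2)
print("scale reparameterization checks passed")
```

Both assertions hold exactly because the reparameterization only moves the per-channel factors into the LayerNorm affine parameters and the next layer's weights (for post-LayerNorm activations), or rewrites the dequantization as a power of two with a sqrt(2) correction on odd codes (for post-Softmax activations); the inference-time quantizers then need only a single layer-wise scale or log2 shifts, which is what makes them hardware-friendly.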
Related papers
- PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution [87.89013794655207]
Diffusion-based image super-resolution (SR) models have shown superior performance at the cost of multiple denoising steps.
We propose a novel post-training quantization approach with adaptive scale in one-step diffusion (OSD) image SR, PassionSR.
Our PassionSR achieves significant advantages over recent leading low-bit quantization methods for image SR.
arXiv Detail & Related papers (2024-11-26T04:49:42Z)
- DopQ-ViT: Towards Distribution-Friendly and Outlier-Aware Post-Training Quantization for Vision Transformers [2.0862654518798034]
We propose a Distribution-Friendly and Outlier-Aware Post-training Quantization method for Vision Transformers.
DopQ-ViT analyzes the inefficiencies of current quantizers and introduces a distribution-friendly Tan Quantizer called TanQ.
DopQ-ViT has been extensively validated and significantly improves the performance of quantized models.
arXiv Detail & Related papers (2024-08-06T16:40:04Z)
- AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer [54.713778961605115]
Vision Transformer (ViT) has become one of the most prevailing fundamental backbone networks in the computer vision community.
We propose a novel non-uniform quantizer, dubbed the Adaptive Logarithm (AdaLog) quantizer.
arXiv Detail & Related papers (2024-07-17T18:38:48Z)
- ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers [7.155242379236052]
Quantization of Vision Transformers (ViTs) has emerged as a promising solution to mitigate their high computational and memory costs.
Existing methods still suffer from significant accuracy loss at low bit-widths.
ADFQ-ViT provides significant improvements over various baselines in image classification, object detection, and instance segmentation tasks at 4-bit.
arXiv Detail & Related papers (2024-07-03T02:41:59Z)
- RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization [8.827794405944637]
Post-training quantization (PTQ) is a promising solution for compressing large transformer models.
Existing PTQ methods typically exhibit non-trivial performance loss.
We propose RepQuant, a novel PTQ framework with quantization-inference decoupling paradigm.
arXiv Detail & Related papers (2024-02-08T12:35:41Z)
- MPTQ-ViT: Mixed-Precision Post-Training Quantization for Vision Transformer [7.041718444626999]
We propose a mixed-precision post-training quantization framework for vision transformers (MPTQ-ViT).
Our experiments on ViT, DeiT, and Swin demonstrate significant accuracy improvements compared with SOTA on the ImageNet dataset.
arXiv Detail & Related papers (2024-01-26T14:25:15Z)
- I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization [49.17407185195788]
We introduce I&S-ViT, a novel method that regulates the PTQ of ViTs in an inclusive and stable fashion.
I&S-ViT elevates the performance of 3-bit ViT-B by an impressive 50.68%.
arXiv Detail & Related papers (2023-11-16T13:07:47Z)
- PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models [52.09865918265002]
We propose a novel "quantize before fine-tuning" framework, PreQuant.
PreQuant is compatible with various quantization strategies, with outlier-aware fine-tuning incorporated to correct the induced quantization error.
We demonstrate the effectiveness of PreQuant on the GLUE benchmark using BERT, RoBERTa, and T5.
arXiv Detail & Related papers (2023-05-30T08:41:33Z)
- Towards Accurate Post-Training Quantization for Vision Transformer [48.779346466374406]
Existing post-training quantization methods still cause severe performance drops.
APQ-ViT surpasses the existing post-training quantization methods by convincing margins.
arXiv Detail & Related papers (2023-03-25T03:05:26Z)
- NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers [53.85087932591237]
NoisyQuant is a quantizer-agnostic enhancement for the post-training activation quantization performance of vision transformers.
Building on a theoretical insight, NoisyQuant achieves the first success in actively altering the heavy-tailed activation distribution.
NoisyQuant largely improves the post-training quantization performance of vision transformers with minimal computation overhead.
arXiv Detail & Related papers (2022-11-29T10:02:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.