Fugu-MT Paper Translation (Abstract): ScalePredictor: Instance-aware Scale Learning for Accurate Quantization of Vision Transformers

Paper overview: ScalePredictor: Instance-aware Scale Learning for Accurate Quantization of Vision Transformers

arxiv url: http://arxiv.org/abs/2606.21947v1
Date: Sat, 20 Jun 2026 08:33:55 GMT
Status: Translation complete
Last updated in system: 2026-06-26 21:51:55.989995
Title: ScalePredictor: Instance-aware Scale Learning for Accurate Quantization of Vision Transformers
Title (reference translation): ScalePredictor: Instance-Aware Scale Learning for Accurate Quantization of Vision Transformers
Authors: Changjun Li, Runqing Jiang, Lian Xu, Ye Zhang, Qingyong Hu, Yulan Guo
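Abstract summary: Post-Training Quantization (PTQ) offers an attractive solution by compressing models using a small calibration set with minimal training overhead. Most existing PTQ works adopt a static quantization paradigm that is applied uniformly to all instances. This paper proposes ScalePredictor, a dynamic quantization framework for accurate and efficient quantization scale learning of ViTs.
Reference score (attention score computed independently by this site): 60.09590321091875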
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Vision Transformers have achieved remarkable success in many fields, yet their deployment on edge devices remains challenging due to their substantial computational demands. Post-Training Quantization (PTQ) offers an attractive solution by compressing models using a small calibration set with minimal training overhead. However, most existing PTQ works adopt a static quantization paradigm that is uniformly applied to all instances. Given the substantial diversity of natural images, the activation distributions vary significantly across samples, making these methods inherently suboptimal. In this paper, we propose ScalePredictor, a dynamic quantization framework for accurate and efficient quantization scale learning of ViTs. We first reveal a hidden correlation between the distribution range of shallow-layer activations and the optimal scales of deeper layers. Based on this, we develop a scale learning mechanism that integrates an efficient range extraction approach to capture robust range statistics at the shallow stage, which are then fed into a Taylor-motivated polynomial scale projection module to generate all quantization scales simultaneously. With the efficiency of polynomial approximation, ScalePredictor introduces insignificant computational overhead while avoiding costly just-in-time calibration. Extensive experiments on ImageNet demonstrate that ScalePredictor consistently outperforms prior PTQ methods, achieving a more favorable accuracy-efficiency trade-off. Code and additional results are shown in the supplementary materials.
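Abstract (reference translation): Vision Transformers have achieved remarkable success in many fields, but deploying them on edge devices remains difficult because of their substantial computational demands. Post-Training Quantization (PTQ) offers an attractive solution, compressing a model with a small calibration set and minimal training overhead. However, most existing PTQ work adopts a static quantization paradigm that is applied uniformly to every instance. Because natural images are highly diverse, activation distributions vary greatly from sample to sample, which makes these methods inherently suboptimal. This paper proposes ScalePredictor, a dynamic quantization framework for accurate and efficient quantization scale learning of ViTs. We first reveal a hidden correlation between the distribution range of shallow-layer activations and the optimal scales of deeper layers. Building on this, we develop a scale learning mechanism that integrates an efficient range extraction approach to capture robust range statistics at the shallow stage; these statistics are fed into a Taylor-motivated polynomial scale projection module that generates all quantization scales simultaneously. Thanks to the efficiency of polynomial approximation, ScalePredictor introduces only insignificant computational overhead while avoiding costly just-in-time calibration. Extensive experiments on ImageNet show that ScalePredictor consistently outperforms prior PTQ methods and achieves a more favorable accuracy-efficiency trade-off. Code and additional results are provided in the supplementary materials.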
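
The pipeline described in the abstract (extract a range statistic from a shallow layer, then map it through a polynomial to the quantization scales of all layers at once) can be illustrated with a small sketch. This is not the authors' implementation: the trimmed-range statistic, the use of one scalar polynomial per layer, the coefficient values, and the function names `robust_range`, `predict_scales`, and `quantize` are all assumptions made here for illustration.

```python
# Illustrative sketch only. The range statistic, the polynomial form, and
# all names below are hypothetical; they are not taken from the paper.

def robust_range(activations, trim=0.01):
    """Trimmed range of a flat list of shallow-layer activations.

    A fraction `trim` of the values is discarded at each end so that a few
    outliers do not dominate the statistic.
    """
    values = sorted(activations)
    k = int(len(values) * trim)
    low, high = values[k], values[len(values) - 1 - k]
    return high - low


def predict_scales(r, coeffs):
    """Evaluate one polynomial per layer at the range statistic r.

    coeffs[l] = [c0, c1, c2, ...] encodes scale_l = c0 + c1*r + c2*r**2 + ...
    Horner's rule keeps the cost at one multiply-add per coefficient.
    """
    scales = []
    for layer_coeffs in coeffs:
        s = 0.0
        for c in reversed(layer_coeffs):
            s = s * r + c
        scales.append(max(s, 1e-8))  # keep every scale strictly positive
    return scales


def quantize(x, scale, bits=8):
    """Symmetric uniform quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1
    out = []
    for v in x:
        q = max(-qmax - 1, min(qmax, round(v / scale)))
        out.append(q * scale)
    return out


# Per-instance usage: the scales depend on this input's shallow activations.
shallow = [-1.0, -0.5, 0.0, 0.5, 1.0]
r = robust_range(shallow, trim=0.0)
coeffs = [[0.0, 0.5], [0.1, 0.0, 0.25]]  # made-up coefficients for 2 layers
scales = predict_scales(r, coeffs)
dequantized = quantize([0.3, 2.6, -200.0], scales[0])
```

In a static PTQ scheme, `scales` would be fixed once from the calibration set; in this sketch they are recomputed for each input from a single shallow-layer statistic, which is the instance-aware behavior the abstract describes.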

Related papers