Fugu-MT Paper Translation (Abstract): Selective Coupling of Decoupled Informative Regions: Masked Attention Alignment for Data-Free Quantization of Vision Transformers

Paper overview: Selective Coupling of Decoupled Informative Regions: Masked Attention Alignment for Data-Free Quantization of Vision Transformers

arxiv url: http://arxiv.org/abs/2606.04373v2
Date: Fri, 05 Jun 2026 08:17:28 GMT
Status: translation complete
Last updated in system: 2026-06-08 14:33:29.306551
Title: Selective Coupling of Decoupled Informative Regions: Masked Attention Alignment for Data-Free Quantization of Vision Transformers
Authors: Biao Qian, Yang Wang, Yong Wu, Jungong Han
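Abstract summary: Data-Free Quantization (DFQ) addresses data security concerns by synthesizing samples, without accessing real data. Previous DFQ methods for Vision Transformers (ViTs) often suffer from a distribution mismatch between synthetic samples and the input distribution expected by the quantized model Q. This work proposes MaskAQ, a novel Masked Attention Alignment approach for data-free quantization.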
Reference score (independently computed attention measure): 56.376795859825705
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Data-Free Quantization (DFQ) addresses data security concerns by synthesizing samples, without accessing real data. It has garnered increasing attention in the context of Vision Transformers (ViTs), owing to the superiority of the self-attention mechanism over classical convolutional operations. However, previous DFQ methods for ViTs often suffer from a distribution mismatch between synthetic samples and the input distribution expected by quantized models Q, resulting in suboptimal performance. In this paper, we propose a novel Masked Attention Alignment approach for Data-Free Quantization of ViTs, named MaskAQ, revealing that: 1) the semantics in the self-attention mechanism are predominantly localized to a sparse subset of patches, called informative regions; 2) the informative regions dominate the mutual information between synthetic samples and Q's outputs. To this end, we incorporate differential entropy maximization over the patch similarity of synthetic samples, to decouple informative regions from the noisy background. To couple with varied Q, the informative regions are selected to align full-precision models with Q via a masked attention alignment objective, thus yielding high-quality synthetic samples. Furthermore, a periodic sample refreshing strategy is introduced to endow MaskAQ with the capacity to continually adapt to the evolving state of Q throughout the training process, preserving desirable mutual information with synthetic samples. Extensive experiments verify the merits of MaskAQ over state-of-the-art approaches across multiple backbones and downstream tasks. Our code is available at https://github.com/hfutqian/MaskAQ.
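The abstract describes aligning the full-precision model with Q only on the selected informative regions, but it gives no formula. The following is a minimal sketch of what such a masked alignment term could look like, not the authors' implementation. It assumes that each model yields one attention score per patch, that the informative regions are the top-k patches under the full-precision scores, and that the gap is measured by squared error; the names `top_k_mask` and `masked_alignment_loss` are hypothetical.

```python
from typing import List, Sequence


def top_k_mask(scores: Sequence[float], k: int) -> List[int]:
    """Return a 0/1 mask marking the k highest-scoring patches."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of patches")
    # Patch indices sorted by descending attention score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(scores))]


def masked_alignment_loss(
    fp_attn: Sequence[float], q_attn: Sequence[float], k: int
) -> float:
    """Mean squared gap between two attention vectors on selected patches only.

    Assumption: the mask comes from the full-precision scores (fp_attn);
    the paper may select regions differently.
    """
    if len(fp_attn) != len(q_attn):
        raise ValueError("attention vectors must have the same length")
    mask = top_k_mask(fp_attn, k)
    # Patches outside the mask contribute nothing to the loss.
    squared_gaps = [m * (a - b) ** 2 for m, a, b in zip(mask, fp_attn, q_attn)]
    return sum(squared_gaps) / k


# Toy example: four patches, two of which are treated as informative.
fp_scores = [0.5, 0.1, 0.3, 0.1]   # hypothetical full-precision attention
q_scores = [0.4, 0.2, 0.3, 0.1]    # hypothetical quantized-model attention
loss = masked_alignment_loss(fp_scores, q_scores, k=2)
```

In this toy case the mask keeps patches 0 and 2, so the disagreement on patch 1 is ignored and only the gap on patch 0 contributes to the loss.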
