Fugu-MT paper translation (abstract): VAQF: Fully Automatic Software-hardware Co-design Framework for Low-bit Vision Transformer

Paper overview: VAQF: Fully Automatic Software-hardware Co-design Framework for Low-bit Vision Transformer

arxiv url: http://arxiv.org/abs/2201.06618v1
Date: Mon, 17 Jan 2022 20:27:52 GMT
Status: translation complete
Last updated in system: 2022-01-19 14:24:22.031886
Title: VAQF: Fully Automatic Software-hardware Co-design Framework for Low-bit Vision Transformer
Title (reference translation): VAQF: Fully Automatic Software-hardware Co-design Framework for Low-bit Vision Transformer
Authors: Mengshu Sun, Haoyu Ma, Guoliang Kang, Yifan Jiang, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang
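Abstract summary: We propose VAQF, a framework that builds inference accelerators on FPGA platforms for quantized Vision Transformers (ViTs). Given the model structure and the desired frame rate, VAQF automatically outputs the quantization precision required for activations. This is the first time quantization has been incorporated into ViT acceleration on FPGAs.
Reference score (independently computed attention metric): 121.85581713299918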
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Transformer architectures with attention mechanisms have achieved success in Natural Language Processing (NLP), and Vision Transformers (ViTs) have recently extended the application domains to various vision tasks. While achieving high performance, ViTs suffer from large model size and high computational complexity, which hinder their deployment on edge devices. To achieve high throughput on hardware while preserving model accuracy, we propose VAQF, a framework that builds inference accelerators on FPGA platforms for quantized ViTs with binary weights and low-precision activations. Given the model structure and the desired frame rate, VAQF automatically outputs the required quantization precision for activations as well as the optimized parameter settings of the accelerator that fulfill the hardware requirements. The implementations are developed with Vivado High-Level Synthesis (HLS) on the Xilinx ZCU102 FPGA board, and the evaluation results with the DeiT-base model indicate that a frame rate requirement of 24 frames per second (FPS) is satisfied with 8-bit activation quantization, and a target of 30 FPS is met with 6-bit activation quantization. To the best of our knowledge, this is the first time quantization has been incorporated into ViT acceleration on FPGAs with the help of a fully automatic framework that, given the target frame rate, guides the quantization strategy on the software side and the accelerator implementations on the hardware side. The compilation time cost is very small compared with quantization training, and the generated accelerators show the capability of achieving real-time execution for state-of-the-art ViT models on FPGAs.
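The frame-rate-driven choice of activation precision described in the abstract can be illustrated with a minimal sketch. This is not VAQF's actual algorithm, which the abstract does not spell out. The function name `select_activation_bits` and the table `REPORTED_FPS` are hypothetical. The table reuses only the two operating points reported above (24 FPS with 8-bit and 30 FPS with 6-bit activations) and treats them as lower bounds on achievable throughput. Preferring the highest feasible precision is also an assumption, motivated by the stated goal of preserving accuracy.

```python
from typing import Dict, Optional

# Operating points reported in the abstract for DeiT-base on the ZCU102,
# keyed by activation bit-width. Treated here as lower bounds on the
# achievable frame rate (an assumption made for illustration).
REPORTED_FPS: Dict[int, float] = {8: 24.0, 6: 30.0}


def select_activation_bits(
    target_fps: float,
    fps_by_bits: Dict[int, float] = REPORTED_FPS,
) -> Optional[int]:
    """Return the highest activation precision that meets the target frame rate.

    Returns None when no listed precision reaches the target.
    """
    # Keep only the bit-widths whose frame rate satisfies the requirement.
    feasible = [bits for bits, fps in fps_by_bits.items() if fps >= target_fps]
    # Prefer the highest feasible precision (hypothetical selection policy).
    return max(feasible) if feasible else None
```

With the reported figures, a 24 FPS target selects 8-bit activations and a 30 FPS target selects 6-bit activations; a target above 30 FPS returns `None` because the abstract reports no operating point that covers it.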
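Abstract (reference translation): Transformer architectures with attention mechanisms have succeeded in Natural Language Processing (NLP), and Vision Transformers (ViTs) have extended the application domains to various vision tasks. While achieving high performance, ViTs suffer from large model size and high computational complexity, which hinder their deployment on edge devices. To achieve high throughput on hardware while maintaining model accuracy, we propose VAQF, which builds inference accelerators on FPGA platforms for quantized ViTs with binary weights. Given the model structure and the desired frame rate, VAQF automatically outputs the quantization precision required for activations, in addition to the optimized parameter settings of an accelerator that meets the hardware requirements. The implementations are developed with Vivado High-Level Synthesis (HLS) on the Xilinx ZCU102 FPGA board, and the evaluation results with the DeiT-base model show that a frame rate requirement of 24 frames per second (FPS) is satisfied with 8-bit activation quantization and a 30 FPS target is met with 6-bit activation quantization. To the best of our knowledge, this is the first time quantization has been incorporated into ViT acceleration on FPGAs with the help of a fully automatic framework that guides the quantization strategy on the software side and the accelerator implementation on the hardware side for a target frame rate. The compilation time cost is very small compared with quantization training, and the generated accelerators show the ability to achieve real-time execution of state-of-the-art ViT models on FPGAs.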
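To make "binary weights and low-precision activations" concrete, here is a minimal sketch of two generic quantizers. The abstract does not specify VAQF's quantization scheme, so everything here is an assumption made for illustration: the function names, the mean-absolute-value scale for binarized weights, and the symmetric signed uniform quantizer for activations.

```python
from typing import List, Tuple


def binarize_weights(weights: List[float]) -> Tuple[List[int], float]:
    """Map weights to {-1, +1} plus one shared scale factor.

    The scale is the mean absolute weight (an assumed, commonly used choice).
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, scale


def quantize_activations(acts: List[float], bits: int, max_val: float) -> List[int]:
    """Symmetric uniform quantization of activations to signed `bits`-bit integers.

    Values are clipped to [-max_val, max_val] before rounding.
    """
    levels = (1 << (bits - 1)) - 1  # e.g. 127 for 8 bits, 31 for 6 bits
    quantized = []
    for a in acts:
        clipped = min(max(a, -max_val), max_val)
        quantized.append(round(clipped * levels / max_val))
    return quantized
```

Lowering `bits` from 8 to 6 shrinks the integer range from ±127 to ±31, which is the kind of precision reduction the abstract trades for a higher frame rate.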

Related papers