Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision
Transformer with Mixed-Scheme Quantization
- URL: http://arxiv.org/abs/2208.05163v1
- Date: Wed, 10 Aug 2022 05:54:46 GMT
- Title: Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision
Transformer with Mixed-Scheme Quantization
- Authors: Zhengang Li, Mengshu Sun, Alec Lu, Haoyu Ma, Geng Yuan, Yanyue Xie,
Hao Tang, Yanyu Li, Miriam Leeser, Zhangyang Wang, Xue Lin, Zhenman Fang
- Abstract summary: Vision transformers (ViTs) are emerging with significantly improved accuracy in computer vision tasks.
This work proposes an FPGA-aware automatic ViT acceleration framework based on the proposed mixed-scheme quantization.
- Score: 78.18328503396057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision transformers (ViTs) are emerging with significantly improved accuracy
in computer vision tasks. However, their complex architecture and enormous
computation/storage demands impose an urgent need for new hardware accelerator
design methodologies. This work proposes an FPGA-aware automatic ViT acceleration
framework based on the proposed mixed-scheme quantization. To the best of our
knowledge, this is the first FPGA-based ViT acceleration framework exploring
model quantization. Compared with state-of-the-art ViT quantization work
(algorithmic approach only without hardware acceleration), our quantization
achieves 0.47% to 1.36% higher Top-1 accuracy under the same bit-width.
Compared with the 32-bit floating-point baseline FPGA accelerator, our
accelerator achieves around a 5.6x improvement in frame rate (i.e., 56.8 FPS
vs. 10.0 FPS) with a 0.71% accuracy drop on the ImageNet dataset for DeiT-base.
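The abstract does not spell out which quantization schemes are mixed or how layers are split between them, so the following is only a minimal illustrative sketch: it assumes the mix pairs uniform fixed-point quantization (DSP-friendly) with power-of-two levels (shift-only, LUT-friendly) and splits output channels between the two schemes at a fixed ratio. The bit-width, ratio, and assignment heuristic are assumptions for illustration, not the paper's FPGA-resource-aware algorithm.

```python
# Minimal sketch of mixed-scheme weight quantization (illustrative assumptions:
# fixed-point + power-of-two schemes, split across output channels).
import numpy as np

def quantize_fixed_point(w, bits=4):
    """Symmetric uniform (fixed-point) quantization to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    w_absmax = float(np.max(np.abs(w)))
    scale = w_absmax / qmax if w_absmax > 0 else 1.0
    return np.round(w / scale).clip(-qmax, qmax) * scale

def quantize_power_of_two(w, bits=4):
    """Round magnitudes to the nearest power of two (multiplier-free: shifts on FPGA)."""
    sign = np.sign(w)
    mag = np.where(np.abs(w) > 0, np.abs(w), 1e-12)
    exp = np.clip(np.round(np.log2(mag)), -(2 ** (bits - 1)), 0)
    q = sign * (2.0 ** exp)
    return np.where(np.abs(w) > 1e-8, q, 0.0)   # keep exact zeros at zero

def mixed_scheme_quantize(weight, bits=4, p2_ratio=0.5):
    """Assign a fraction of output channels to each scheme (ratio is illustrative)."""
    out = np.empty_like(weight)
    n_p2 = int(weight.shape[0] * p2_ratio)
    out[:n_p2] = quantize_power_of_two(weight[:n_p2], bits)
    out[n_p2:] = quantize_fixed_point(weight[n_p2:], bits)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=(8, 16))     # toy linear-layer weight matrix
    wq = mixed_scheme_quantize(w, bits=4, p2_ratio=0.5)
    print("quantization MSE:", float(np.mean((w - wq) ** 2)))
```

The appeal of such a mix on an FPGA is that power-of-two channels replace multiplications with shifts (LUT logic), while fixed-point channels keep using DSP blocks, so both resource types contribute to throughput.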
Related papers
- Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers [56.37495946212932]
Vision transformers (ViTs) have demonstrated superior accuracy for computer vision tasks compared to convolutional neural networks (CNNs).
This work proposes Quasar-ViT, a hardware-oriented quantization-aware architecture search framework for ViTs.
arXiv Detail & Related papers (2024-07-25T16:35:46Z)
- An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT [5.141764719319689]
We propose an FPGA-based accelerator for EfficientViT to advance the hardware efficiency frontier of ViTs.
Specifically, we design a reconfigurable architecture to efficiently support various operation types, including lightweight convolutions and attention.
Experimental results show that our accelerator achieves up to 780.2 GOPS in throughput and 105.1 GOPS/W in energy efficiency at 200 MHz.
arXiv Detail & Related papers (2024-03-29T15:20:33Z)
- TurboViT: Generating Fast Vision Transformers via Generative Architecture Search [74.24393546346974]
Vision transformers have shown unprecedented levels of performance in tackling various visual perception tasks in recent years.
There has been significant recent research on the design of efficient vision transformer architectures.
In this study, we explore the generation of fast vision transformer architecture designs via generative architecture search.
arXiv Detail & Related papers (2023-08-22T13:08:29Z)
- Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking [69.89887818921825]
HiT is a new family of efficient tracking models that can run at high speed on different devices.
HiT achieves 64.6% AUC on the LaSOT benchmark, surpassing all previous efficient trackers.
arXiv Detail & Related papers (2023-08-14T02:51:34Z)
- Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts [60.1586169973792]
M$^3$ViT is the latest multi-task ViT model that introduces mixture-of-experts (MoE).
MoE achieves better accuracy and over 80% computation reduction, but leaves challenges for efficient deployment on FPGA.
Our work, dubbed Edge-MoE, solves these challenges and introduces the first end-to-end FPGA accelerator for multi-task ViT with a collection of architectural innovations.
arXiv Detail & Related papers (2023-05-30T02:24:03Z)
- HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers [35.92244135055901]
HeatViT is an image-adaptive token pruning framework for vision transformers (ViTs) on embedded FPGAs.
HeatViT can achieve 0.7% to 8.9% higher accuracy compared to existing ViT pruning studies.
HeatViT can achieve more than 28.4% computation reduction for various widely used ViTs.
arXiv Detail & Related papers (2022-11-15T13:00:43Z)
- ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design [42.46121663652989]
Vision Transformers (ViTs) have achieved state-of-the-art performance on various vision tasks.
However, ViTs' self-attention module is still arguably a major bottleneck.
We propose a dedicated algorithm and accelerator co-design framework dubbed ViTCoD for accelerating ViTs.
arXiv Detail & Related papers (2022-10-18T04:07:23Z)
- VAQF: Fully Automatic Software-hardware Co-design Framework for Low-bit Vision Transformer [121.85581713299918]
We propose VAQF, a framework that builds inference accelerators on FPGA platforms for quantized Vision Transformers (ViTs).
Given the model structure and the desired frame rate, VAQF automatically outputs the required quantization precision for activations (an illustrative sketch of this precision-search idea follows this list).
This is the first time quantization has been incorporated into ViT acceleration on FPGAs.
arXiv Detail & Related papers (2022-01-17T20:27:52Z)
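To make the VAQF entry above concrete, here is a small, hypothetical sketch of the idea of picking an activation precision from a target frame rate. The performance model, DSP packing factors, candidate bit-widths, and workload numbers below are invented placeholders for illustration, not VAQF's actual compiler flow or performance model.

```python
# Hypothetical precision search: return the highest activation bit-width whose
# *estimated* FPGA frame rate meets the target FPS (all numbers are invented).

def estimated_fps(layer_macs, act_bits, clock_hz=200e6, dsp_lanes=1024):
    """Toy performance estimate: lower precision packs more MACs per DSP lane."""
    packing = max(1, 16 // act_bits)            # assumed packing factor
    cycles = sum(layer_macs) / (dsp_lanes * packing)
    return clock_hz / cycles

def choose_activation_precision(layer_macs, target_fps, candidates=(8, 6, 4, 3)):
    """Scan candidates from most to least precise; keep the first that is fast enough."""
    for bits in candidates:
        if estimated_fps(layer_macs, bits) >= target_fps:
            return bits
    return None                                 # target unreachable at these precisions

if __name__ == "__main__":
    layer_macs = [3.5e8] * 48                   # invented per-layer MAC counts
    print(choose_activation_precision(layer_macs, target_fps=56.8))  # prints 3 here
```

A real flow would also fold in accuracy constraints, on-chip buffering, and per-layer differences; the point of the sketch is only the shape of the search.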