TransAxx: Efficient Transformers with Approximate Computing
- URL: http://arxiv.org/abs/2402.07545v1
- Date: Mon, 12 Feb 2024 10:16:05 GMT
- Title: TransAxx: Efficient Transformers with Approximate Computing
- Authors: Dimitrios Danopoulos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel
- Abstract summary: Vision Transformer (ViT) models have proven highly competitive and have become a popular alternative to Convolutional Neural Networks (CNNs).
We propose TransAxx, a framework built on the popular PyTorch library that provides fast, native support for approximate arithmetic.
Our approach uses a Monte Carlo Tree Search (MCTS) algorithm to efficiently search the space of possible configurations.
- Score: 4.347898144642257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision Transformer (ViT) models, built on the transformer
architecture, have proven highly competitive and have become a popular
alternative to Convolutional Neural Networks (CNNs). However, the high
computational requirements of these models limit their practical applicability
especially on low-power devices. The current state of the art employs
approximate multipliers to address the steep compute demands of DNN
accelerators, but no prior research has explored their use on ViT models. In
this work we
propose TransAxx, a framework built on the popular PyTorch library that
provides fast, native support for approximate arithmetic, enabling seamless
evaluation of the impact of approximate computing on DNNs such as ViT models.
Using TransAxx we
analyze the sensitivity of ImageNet transformer models to approximate
multiplication and perform approximate-aware finetuning to regain
accuracy. Furthermore, we propose a methodology to generate approximate
accelerators for ViT models. Our approach uses a Monte Carlo Tree Search (MCTS)
algorithm to efficiently search the space of possible configurations using a
hardware-driven hand-crafted policy. Our evaluation demonstrates the efficacy
of our methodology in achieving favorable trade-offs between accuracy and
power, yielding substantial power savings without compromising performance.
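To make the framework's core idea concrete, here is a minimal sketch of how approximate-multiplier emulation can be expressed in plain PyTorch: an approximate signed 8x8-bit multiplier is tabulated once into a lookup table (LUT), which is then indexed inside an integer linear layer. The truncation-based multiplier, the function names, and the layer itself are illustrative assumptions, not TransAxx's actual API.

```python
import torch

# Minimal sketch (not TransAxx's API): emulate an approximate signed 8x8-bit
# multiplier in PyTorch by tabulating it into a 256x256 lookup table, then
# indexing the table inside an int8-range linear layer.

def truncated_mul(a, b):
    # Toy approximate multiplier: zero the 4 LSBs of each operand first.
    return ((a >> 4) << 4) * ((b >> 4) << 4)

def build_lut(approx_mul):
    # Tabulate the multiplier over all pairs of signed 8-bit operands.
    a = torch.arange(-128, 128).view(-1, 1)   # (256, 1)
    b = torch.arange(-128, 128).view(1, -1)   # (1, 256)
    return approx_mul(a, b)                   # (256, 256) table of products

LUT = build_lut(truncated_mul)

def approx_linear(x_q, w_q):
    # x_q: (batch, in) int8-range activations; w_q: (out, in) int8-range weights.
    # out[b, o] = sum_i LUT[x_q[b, i] + 128, w_q[o, i] + 128]
    prod = LUT[x_q.long().unsqueeze(1) + 128,   # (batch, 1, in)
               w_q.long().unsqueeze(0) + 128]   # (1, out, in) -> broadcast
    return prod.sum(dim=-1)                     # integer accumulation per output

x_q = torch.randint(-128, 128, (4, 64))
w_q = torch.randint(-128, 128, (16, 64))
print(approx_linear(x_q, w_q).shape)  # torch.Size([4, 16])
```

A dequantization scale and the approximate-aware finetuning step the abstract mentions (retraining through this emulated layer to recover lost accuracy) would sit on top of a kernel like this.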
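The per-layer search can likewise be sketched in miniature: each tree level of an MCTS assigns one multiplier from a small palette to one ViT layer, and a rollout's reward trades estimated power savings against accuracy loss. The palette, the `evaluate` reward model, and the uniform-random rollout below are placeholders; the abstract specifies a hardware-driven hand-crafted rollout policy, which is not reproduced here.

```python
import math
import random

# Hypothetical sketch of layer-wise MCTS over approximate-multiplier choices.
MULTIPLIERS = ["exact", "approx_low", "approx_high"]  # illustrative palette
NUM_LAYERS = 12                                       # e.g., ViT encoder blocks

def evaluate(config):
    # Placeholder reward: power savings minus an assumed accuracy penalty
    # that grows superlinearly as more aggressive multipliers are used.
    saving = sum(MULTIPLIERS.index(m) for m in config) / (2 * len(config))
    return saving - 0.8 * saving ** 2

class Node:
    def __init__(self, config):
        self.config = config   # multipliers chosen for the first len(config) layers
        self.children = {}     # action -> Node
        self.visits = 0
        self.value = 0.0

def uct_child(node, c=1.4):
    # Pick the child maximizing the UCT score (exploitation + exploration).
    return max(node.children.values(),
               key=lambda ch: ch.value / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(config):
    # Complete the partial assignment with a uniform-random policy
    # (a stand-in for the hardware-driven hand-crafted policy).
    while len(config) < NUM_LAYERS:
        config = config + [random.choice(MULTIPLIERS)]
    return evaluate(config)

def mcts(iterations=2000):
    root = Node([])
    for _ in range(iterations):
        node, path = root, [root]
        # Selection: descend while the node is fully expanded.
        while len(node.config) < NUM_LAYERS and len(node.children) == len(MULTIPLIERS):
            node = uct_child(node)
            path.append(node)
        # Expansion: add one untried child, if the node is not terminal.
        if len(node.config) < NUM_LAYERS:
            action = random.choice([m for m in MULTIPLIERS if m not in node.children])
            node.children[action] = node = Node(node.config + [action])
            path.append(node)
        # Simulation and backpropagation.
        reward = rollout(node.config)
        for n in path:
            n.visits += 1
            n.value += reward
    # Extract the most-visited path as the final configuration.
    node, config = root, []
    while node.children:
        node = max(node.children.values(), key=lambda ch: ch.visits)
        config = node.config
    return config + ["exact"] * (NUM_LAYERS - len(config))  # pad unexplored tail

print(mcts())
```

In a full pipeline each candidate configuration would be scored with LUT-based emulation like the sketch above rather than a closed-form placeholder, which is where fast approximate-arithmetic support pays off.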
Related papers
- Causal Transformer for Fusion and Pose Estimation in Deep Visual Inertial Odometry [1.2289361708127877]
We propose a causal visual-inertial fusion transformer (VIFT) for pose estimation in deep visual-inertial odometry.
The proposed method is end-to-end trainable and requires only a monocular camera and IMU during inference.
arXiv Detail & Related papers (2024-09-13T12:21:25Z) - TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z) - Exploring the Performance and Efficiency of Transformer Models for NLP
on Mobile Devices [3.809702129519641]
New deep neural network (DNN) architectures and approaches are emerging every few years, driving the field's advancement.
Transformers are a relatively new model family that has achieved new levels of accuracy across AI tasks, but poses significant computational challenges.
This work takes a step towards bridging this gap by examining the current state of Transformers' on-device execution.
arXiv Detail & Related papers (2023-06-20T10:15:01Z) - Full Stack Optimization of Transformer Inference: a Survey [58.55475772110702]
Transformer models achieve superior accuracy across a wide range of applications.
The amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate.
There has been an increased focus on making Transformer models more efficient.
arXiv Detail & Related papers (2023-02-27T18:18:13Z) - Efficient Vision Transformers via Fine-Grained Manifold Distillation [96.50513363752836]
Vision transformer architectures have shown extraordinary performance on many computer vision tasks.
Although network performance is boosted, transformers often require more computational resources.
We propose to excavate useful information from the teacher transformer through the relationship between images and the divided patches.
arXiv Detail & Related papers (2021-07-03T08:28:34Z) - Visformer: The Vision-friendly Transformer [105.52122194322592]
We propose a new architecture named Visformer, which is abbreviated from 'Vision-friendly Transformer'.
With the same computational complexity, Visformer outperforms both the Transformer-based and convolution-based models in terms of ImageNet classification accuracy.
arXiv Detail & Related papers (2021-04-26T13:13:03Z) - Visual Saliency Transformer [127.33678448761599]
We develop a novel unified model based on a pure transformer, Visual Saliency Transformer (VST), for both RGB and RGB-D salient object detection (SOD).
It takes image patches as inputs and leverages the transformer to propagate global contexts among image patches.
Experimental results show that our model outperforms existing state-of-the-art results on both RGB and RGB-D SOD benchmark datasets.
arXiv Detail & Related papers (2021-04-25T08:24:06Z) - TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking [74.82415271960315]
We propose a solution named TransMOT to efficiently model the spatial and temporal interactions among objects in a video.
TransMOT is not only more computationally efficient than the traditional Transformer, but it also achieves better tracking accuracy.
The proposed method is evaluated on multiple benchmark datasets including MOT15, MOT16, MOT17, and MOT20.
arXiv Detail & Related papers (2021-04-01T01:49:05Z) - Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset [37.619200507404145]
We explore Transformer Transducer (T-T) models for first-pass decoding with low latency and fast speed on a large-scale dataset.
We combine the idea of Transformer-XL and chunk-wise streaming processing to design a streamable Transformer Transducer model.
We demonstrate that T-T outperforms the hybrid model, RNN Transducer (RNN-T), and streamable Transformer attention-based encoder-decoder model in the streaming scenario.
arXiv Detail & Related papers (2020-10-22T03:01:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.