Faster Attention Is What You Need: A Fast Self-Attention Neural Network
Backbone Architecture for the Edge via Double-Condensing Attention Condensers
- URL: http://arxiv.org/abs/2208.06980v1
- Date: Mon, 15 Aug 2022 02:47:33 GMT
- Title: Faster Attention Is What You Need: A Fast Self-Attention Neural Network
Backbone Architecture for the Edge via Double-Condensing Attention Condensers
- Authors: Alexander Wong, Mohammad Javad Shafiee, Saad Abbasi, Saeejith Nair,
and Mahmoud Famouri
- Abstract summary: We introduce a new faster attention condenser design called double-condensing attention condensers.
The resulting backbone (which we name AttendNeXt) achieves significantly higher inference throughput on an embedded ARM processor.
These promising results demonstrate that exploring different efficient architecture designs and self-attention mechanisms can lead to interesting new building blocks for TinyML applications.
- Score: 71.40595908386477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the growing adoption of deep learning for on-device TinyML applications,
there has been an ever-increasing demand for more efficient neural network
backbones optimized for the edge. Recently, the introduction of attention
condenser networks has resulted in low-footprint, highly efficient,
self-attention neural networks that strike a strong balance between accuracy
and speed. In this study, we introduce a new faster attention condenser design
called double-condensing attention condensers that enable more condensed
feature embedding. We further employ a machine-driven design exploration
strategy that imposes best practices design constraints for greater efficiency
and robustness to produce the macro-micro architecture constructs of the
backbone. The resulting backbone (which we name AttendNeXt) achieves
significantly higher inference throughput on an embedded ARM processor when
compared to several other state-of-the-art efficient backbones (>10X faster
than FB-Net C at higher accuracy and speed) while having a small model size
(>1.47X smaller than OFA-62 at higher speed and similar accuracy) and strong
accuracy (1.1% higher top-1 accuracy than MobileViT XS on ImageNet at higher
speed). These promising results demonstrate that exploring different efficient
architecture designs and self-attention mechanisms can lead to interesting new
building blocks for TinyML applications.
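The paper does not include code, but the general pattern behind attention condensers is to condense features to a lower-dimensional embedding, compute self-attention values there, then expand and use them to selectively gate the input; "double-condensing" applies two successive condensation stages for an even more condensed embedding. A minimal, parameter-free sketch of that pattern on a 1-D feature sequence (all function names hypothetical; the real modules use learned embedding and expansion layers):

```python
import math

def max_pool(x, stride=2):
    """Condense: downsample a 1-D feature sequence by max-pooling."""
    return [max(x[i:i + stride]) for i in range(0, len(x), stride)]

def upsample(x, factor, length):
    """Expand: nearest-neighbour upsampling back to the input resolution."""
    out = [v for v in x for _ in range(factor)]
    return out[:length]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def double_condensing_attention_condenser(x):
    """Toy double-condensing attention condenser (illustrative only).

    Two successive condensation stages produce a highly condensed
    embedding; the expanded attention values then gate the input
    features via selective attention.
    """
    c1 = max_pool(x)                          # first condensation
    c2 = max_pool(c1)                         # second condensation ("double")
    attn = [sigmoid(v) for v in c2]           # self-attention values
    gate = upsample(attn, factor=4, length=len(x))  # expand to input size
    return [f * g for f, g in zip(x, gate)]   # gate the input features
```

For example, an 8-element input is condensed to a 2-element embedding, and each attention value then gates a 4-element span of the input. In the actual backbone these stages are learned convolutional layers discovered by the machine-driven design exploration; the sketch only shows the dataflow.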
Related papers
- LowFormer: Hardware Efficient Design for Convolutional Transformer Backbones [10.435069781620957]
Research on efficient vision backbones is converging on models that mix convolutions and transformer blocks.
We analyze common modules and architectural design choices for backbones not in terms of MACs, but rather in actual throughput and latency.
We combine both macro and micro design to create a new family of hardware-efficient backbone networks called LowFormer.
arXiv Detail & Related papers (2024-09-05T12:18:32Z)
- Systematic Architectural Design of Scale Transformed Attention Condenser DNNs via Multi-Scale Class Representational Response Similarity Analysis [93.0013343535411]
We propose a novel type of analysis called Multi-Scale Class Representational Response Similarity Analysis (ClassRepSim).
We show that adding STAC modules to ResNet style architectures can result in up to a 1.6% increase in top-1 accuracy.
Results from ClassRepSim analysis can be used to select an effective parameterization of the STAC module resulting in competitive performance.
arXiv Detail & Related papers (2023-06-16T18:29:26Z)
- Fast GraspNeXt: A Fast Self-Attention Neural Network Architecture for Multi-task Learning in Computer Vision Tasks for Robotic Grasping on the Edge [80.88063189896718]
High architectural and computational complexity can result in poor suitability for deployment on embedded devices.
Fast GraspNeXt is a fast self-attention neural network architecture tailored for embedded multi-task learning in computer vision tasks for robotic grasping.
arXiv Detail & Related papers (2023-04-21T18:07:14Z)
- AttendSeg: A Tiny Attention Condenser Neural Network for Semantic Segmentation on the Edge [71.80459780697956]
We introduce AttendSeg, a low-precision, highly compact deep neural network tailored for on-device semantic segmentation.
AttendSeg possesses a self-attention network architecture comprising lightweight attention condensers for improved spatial-channel selective attention.
arXiv Detail & Related papers (2021-04-29T19:19:04Z)
- FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks.
Current networks often have a large number of parameters and incur heavy computation costs.
Our proposed FastFlowNet works in the well-known coarse-to-fine manner with the following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z)
- AttendNets: Tiny Deep Image Recognition Neural Networks for the Edge via Visual Attention Condensers [81.17461895644003]
We introduce AttendNets, low-precision, highly compact deep neural networks tailored for on-device image recognition.
AttendNets possess deep self-attention architectures based on visual attention condensers.
Results show AttendNets have significantly lower architectural and computational complexity when compared to several deep neural networks.
arXiv Detail & Related papers (2020-09-30T01:53:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.