Related papers: Combining Aggregated Attention and Transformer Architecture for Accurate and Efficient Performance of Spiking Neural Networks

Combining Aggregated Attention and Transformer Architecture for Accurate and Efficient Performance of Spiking Neural Networks

URL: http://arxiv.org/abs/2412.13553v1
Date: Wed, 18 Dec 2024 07:07:38 GMT
Title: Combining Aggregated Attention and Transformer Architecture for Accurate and Efficient Performance of Spiking Neural Networks
Authors: Hangming Zhang, Alexander Sboev, Roman Rybka, Qiang Yu,
Abstract summary: Spiking Neural Networks have attracted significant attention in recent years due to their distinctive low-power characteristics.<n>Transformers models, known for their powerful self-attention mechanisms and parallel processing capabilities, have demonstrated exceptional performance across various domains.<n>Despite the significant advantages of both SNNs and Transformers, directly combining the low-power benefits of SNNs with the high performance of Transformers remains challenging.
Score: 44.145870290310356
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Spiking Neural Networks have attracted significant attention in recent years due to their distinctive low-power characteristics. Meanwhile, Transformer models, known for their powerful self-attention mechanisms and parallel processing capabilities, have demonstrated exceptional performance across various domains, including natural language processing and computer vision. Despite the significant advantages of both SNNs and Transformers, directly combining the low-power benefits of SNNs with the high performance of Transformers remains challenging. Specifically, while the sparse computing mode of SNNs contributes to reduced energy consumption, traditional attention mechanisms depend on dense matrix computations and complex softmax operations. This reliance poses significant challenges for effective execution in low-power scenarios. Given the tremendous success of Transformers in deep learning, it is a necessary step to explore the integration of SNNs and Transformers to harness the strengths of both. In this paper, we propose a novel model architecture, Spike Aggregation Transformer (SAFormer), that integrates the low-power characteristics of SNNs with the high-performance advantages of Transformer models. The core contribution of SAFormer lies in the design of the Spike Aggregated Self-Attention (SASA) mechanism, which significantly simplifies the computation process by calculating attention weights using only the spike matrices query and key, thereby effectively reducing energy consumption. Additionally, we introduce a Depthwise Convolution Module (DWC) to enhance the feature extraction capabilities, further improving overall accuracy. We evaluated and demonstrated that SAFormer outperforms state-of-the-art SNNs in both accuracy and energy consumption, highlighting its significant advantages in low-power and high-performance computing.

Related papers

BHViT: Binarized Hybrid Vision Transformer [53.38894971164072]
Model binarization has made significant progress in enabling real-time and energy-efficient computation for convolutional neural networks (CNN) We propose BHViT, a binarization-friendly hybrid ViT architecture and its full binarization model with the guidance of three important observations. Our proposed algorithm achieves SOTA performance among binary ViT methods.
arXiv Detail & Related papers (2025-03-04T08:35:01Z)
Spiking Transformer:Introducing Accurate Addition-Only Spiking Self-Attention for Transformer [15.93436166506258]
Spiking Neural Networks have emerged as a promising energy-efficient alternative to traditional Artificial Neural Networks. This paper introduces Accurate Addition-Only Spiking Self-Attention (A$2$OS$2$A)
arXiv Detail & Related papers (2025-02-28T22:23:29Z)
Binary Event-Driven Spiking Transformer [36.815359983551986]
Transformer-based Spiking Neural Networks (SNNs) introduce a novel event-driven self-attention paradigm. We propose the Binary Event-Driven Spiking Transformer, i.e. BESTformer. BESTformer suffers from a severe performance drop from its full-precision counterpart due to the limited representation capability of binarization.
arXiv Detail & Related papers (2025-01-10T12:00:11Z)
SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks [22.665939536001797]
We propose a novel spiking self-attention mechanism named Dual Spike Self-Attention (DSSA) with a reasonable scaling method. Based on DSSA, we propose a novel spiking Vision Transformer architecture called SpikingResformer. We show that SpikingResformer achieves higher accuracy with fewer parameters and lower energy consumption than other spiking Vision Transformer counterparts.
arXiv Detail & Related papers (2024-03-21T11:16:42Z)
LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization [48.41286573672824]
Spiking Neural Networks (SNNs) mimic the information-processing mechanisms of the human brain and are highly energy-efficient. We propose a new approach named LitE-SNN that incorporates both spatial and temporal compression into the automated network design process.
arXiv Detail & Related papers (2024-01-26T05:23:11Z)
Understanding Self-attention Mechanism via Dynamical System Perspective [58.024376086269015]
Self-attention mechanism (SAM) is widely used in various fields of artificial intelligence. We show that intrinsic stiffness phenomenon (SP) in the high-precision solution of ordinary differential equations (ODEs) also widely exists in high-performance neural networks (NN) We show that the SAM is also a stiffness-aware step size adaptor that can enhance the model's representational ability to measure intrinsic SP.
arXiv Detail & Related papers (2023-08-19T08:17:41Z)
AutoST: Training-free Neural Architecture Search for Spiking Transformers [14.791412391584064]
Spiking Transformers achieve both the energy efficiency of Spiking Neural Networks (SNNs) and the high capacity of Transformers. Existing Spiking Transformer architectures exhibit a notable architectural gap, resulting in suboptimal performance. We introduce AutoST, a training-free NAS method for Spiking Transformers, to rapidly identify high-performance Spiking Transformer architectures.
arXiv Detail & Related papers (2023-07-01T10:19:52Z)
RWKV: Reinventing RNNs for the Transformer Era [54.716108899349614]
We propose a novel model architecture that combines the efficient parallelizable training of transformers with the efficient inference of RNNs. We scale our models as large as 14 billion parameters, by far the largest dense RNN ever trained, and find RWKV performs on par with similarly sized Transformers.
arXiv Detail & Related papers (2023-05-22T13:57:41Z)
The Hardware Impact of Quantization and Pruning for Weights in Spiking Neural Networks [0.368986335765876]
quantization and pruning of parameters can both compress the model size, reduce memory footprints, and facilitate low-latency execution. We study various combinations of pruning and quantization in isolation, cumulatively, and simultaneously to a state-of-the-art SNN targeting gesture recognition. We show that this state-of-the-art model is amenable to aggressive parameter quantization, not suffering from any loss in accuracy down to ternary weights.
arXiv Detail & Related papers (2023-02-08T16:25:20Z)
Spikformer: When Spiking Neural Network Meets Transformer [102.91330530210037]
We consider two biologically plausible structures, the Spiking Neural Network (SNN) and the self-attention mechanism. We propose a novel Spiking Self Attention (SSA) as well as a powerful framework, named Spiking Transformer (Spikformer)
arXiv Detail & Related papers (2022-09-29T14:16:49Z)
Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks. Transformers that need to incorporate contextual information to extract features dynamically are neglected. We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z)
Efficient pre-training objectives for Transformers [84.64393460397471]
We study several efficient pre-training objectives for Transformers-based models. We prove that eliminating the MASK token and considering the whole output during the loss are essential choices to improve performance.
arXiv Detail & Related papers (2021-04-20T00:09:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.