Combining Aggregated Attention and Transformer Architecture for Accurate and Efficient Performance of Spiking Neural Networks
- URL: http://arxiv.org/abs/2412.13553v1
- Date: Wed, 18 Dec 2024 07:07:38 GMT
- Title: Combining Aggregated Attention and Transformer Architecture for Accurate and Efficient Performance of Spiking Neural Networks
- Authors: Hangming Zhang, Alexander Sboev, Roman Rybka, Qiang Yu
- Abstract summary: Spiking Neural Networks have attracted significant attention in recent years due to their distinctive low-power characteristics.
Transformer models, known for their powerful self-attention mechanisms and parallel processing capabilities, have demonstrated exceptional performance across various domains.
Despite the significant advantages of both SNNs and Transformers, directly combining the low-power benefits of SNNs with the high performance of Transformers remains challenging.
- Score: 44.145870290310356
- License:
- Abstract: Spiking Neural Networks (SNNs) have attracted significant attention in recent years due to their distinctive low-power characteristics. Meanwhile, Transformer models, known for their powerful self-attention mechanisms and parallel processing capabilities, have demonstrated exceptional performance across various domains, including natural language processing and computer vision. Despite the significant advantages of both SNNs and Transformers, directly combining the low-power benefits of SNNs with the high performance of Transformers remains challenging. Specifically, while the sparse computing mode of SNNs contributes to reduced energy consumption, traditional attention mechanisms depend on dense matrix computations and complex softmax operations. This reliance poses significant challenges for effective execution in low-power scenarios. Given the tremendous success of Transformers in deep learning, exploring the integration of SNNs and Transformers to harness the strengths of both is a necessary step. In this paper, we propose a novel model architecture, the Spike Aggregation Transformer (SAFormer), that integrates the low-power characteristics of SNNs with the high-performance advantages of Transformer models. The core contribution of SAFormer lies in the design of the Spike Aggregated Self-Attention (SASA) mechanism, which significantly simplifies the computation process by calculating attention weights using only the query and key spike matrices, thereby effectively reducing energy consumption. Additionally, we introduce a Depthwise Convolution Module (DWC) to enhance feature extraction, further improving overall accuracy. Our evaluation demonstrates that SAFormer outperforms state-of-the-art SNNs in both accuracy and energy consumption, highlighting its significant advantages in low-power and high-performance computing.
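The abstract describes SASA as computing attention weights from the query and key spike matrices alone, without a softmax, and adds a Depthwise Convolution Module (DWC) for feature extraction. The PyTorch sketch below illustrates one way such a softmax-free, spike-driven attention block and a depthwise convolution block could look. It is a minimal sketch under assumed design choices: the class names, the Heaviside spike function, the scaling factor, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


def spike_fn(x: torch.Tensor) -> torch.Tensor:
    # Toy Heaviside spike function; a real SNN would use a surrogate gradient
    # and unroll the dynamics over time steps.
    return (x > 0).float()


class SpikeAggregatedAttention(nn.Module):
    """Softmax-free attention whose weights come only from binary Q/K spikes (illustrative)."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        assert dim % heads == 0
        self.heads = heads
        self.q_proj = nn.Linear(dim, dim, bias=False)
        self.k_proj = nn.Linear(dim, dim, bias=False)
        self.v_proj = nn.Linear(dim, dim, bias=False)
        self.out = nn.Linear(dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        h, dh = self.heads, d // self.heads
        # Project the input and binarize the projections into spike matrices.
        q = spike_fn(self.q_proj(x)).view(b, n, h, dh).transpose(1, 2)
        k = spike_fn(self.k_proj(x)).view(b, n, h, dh).transpose(1, 2)
        v = spike_fn(self.v_proj(x)).view(b, n, h, dh).transpose(1, 2)
        # Attention weights use only the Q and K spike matrices: Q @ K^T counts
        # co-active channels, so no softmax or dense exponentials are needed.
        attn = torch.matmul(q, k.transpose(-2, -1)) / dh
        out = torch.matmul(attn, v).transpose(1, 2).reshape(b, n, d)
        return self.out(out)


class DepthwiseConvModule(nn.Module):
    """Illustrative depthwise-convolution (DWC) block for local feature extraction."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, kernel_size,
                            padding=kernel_size // 2, groups=channels, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        return spike_fn(self.bn(self.dw(x)))
```

Because the query and key are binary, their product is an integer count of co-active channels, i.e. accumulate-only arithmetic rather than the dense floating-point softmax the abstract identifies as the obstacle to low-power execution.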
Related papers
- Binary Event-Driven Spiking Transformer [36.815359983551986]
Transformer-based Spiking Neural Networks (SNNs) introduce a novel event-driven self-attention paradigm.
We propose the Binary Event-Driven Spiking Transformer, i.e., BESTformer.
However, BESTformer suffers from a severe performance drop relative to its full-precision counterpart due to the limited representational capability of binarization.
arXiv Detail & Related papers (2025-01-10T12:00:11Z)
- Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval.
A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed.
The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
- SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks [22.665939536001797]
We propose a novel spiking self-attention mechanism named Dual Spike Self-Attention (DSSA) with a reasonable scaling method.
Based on DSSA, we propose a novel spiking Vision Transformer architecture called SpikingResformer.
We show that SpikingResformer achieves higher accuracy with fewer parameters and lower energy consumption than other spiking Vision Transformer counterparts.
arXiv Detail & Related papers (2024-03-21T11:16:42Z)
- Understanding Self-attention Mechanism via Dynamical System Perspective [58.024376086269015]
Self-attention mechanism (SAM) is widely used in various fields of artificial intelligence.
We show that the intrinsic stiffness phenomenon (SP) found in high-precision solutions of ordinary differential equations (ODEs) also widely exists in high-performance neural networks (NNs).
We show that the SAM is also a stiffness-aware step size adaptor that can enhance the model's representational ability to measure intrinsic SP.
arXiv Detail & Related papers (2023-08-19T08:17:41Z)
- AutoST: Training-free Neural Architecture Search for Spiking Transformers [14.791412391584064]
Spiking Transformers achieve both the energy efficiency of Spiking Neural Networks (SNNs) and the high capacity of Transformers.
Existing Spiking Transformer architectures exhibit a notable architectural gap, resulting in suboptimal performance.
We introduce AutoST, a training-free NAS method for Spiking Transformers, to rapidly identify high-performance Spiking Transformer architectures.
arXiv Detail & Related papers (2023-07-01T10:19:52Z)
- Auto-Spikformer: Spikformer Architecture Search [22.332981906087785]
Self-attention mechanisms have been integrated into Spiking Neural Networks (SNNs).
Recent advancements in SNN architecture, such as Spikformer, have demonstrated promising outcomes.
We propose Auto-Spikformer, a one-shot Transformer Architecture Search (TAS) method, which automates the quest for an optimized Spikformer architecture.
arXiv Detail & Related papers (2023-06-01T15:35:26Z)
- RWKV: Reinventing RNNs for the Transformer Era [54.716108899349614]
We propose a novel model architecture that combines the efficient parallelizable training of transformers with the efficient inference of RNNs.
We scale our models as large as 14 billion parameters, by far the largest dense RNN ever trained, and find RWKV performs on par with similarly sized Transformers.
arXiv Detail & Related papers (2023-05-22T13:57:41Z)
- The Hardware Impact of Quantization and Pruning for Weights in Spiking Neural Networks [0.368986335765876]
Quantization and pruning of parameters can both compress model size, reduce memory footprint, and facilitate low-latency execution.
We study various combinations of pruning and quantization in isolation, cumulatively, and simultaneously to a state-of-the-art SNN targeting gesture recognition.
We show that this state-of-the-art model is amenable to aggressive parameter quantization, not suffering from any loss in accuracy down to ternary weights.
arXiv Detail & Related papers (2023-02-08T16:25:20Z)
- Spikformer: When Spiking Neural Network Meets Transformer [102.91330530210037]
We consider two biologically plausible structures, the Spiking Neural Network (SNN) and the self-attention mechanism.
We propose a novel Spiking Self-Attention (SSA) mechanism as well as a powerful framework, named Spiking Transformer (Spikformer).
arXiv Detail & Related papers (2022-09-29T14:16:49Z)
- Efficient pre-training objectives for Transformers [84.64393460397471]
We study several efficient pre-training objectives for Transformers-based models.
We prove that eliminating the MASK token and considering the whole output during loss computation are essential choices to improve performance.
arXiv Detail & Related papers (2021-04-20T00:09:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.