Spikformer: When Spiking Neural Network Meets Transformer
- URL: http://arxiv.org/abs/2209.15425v1
- Date: Thu, 29 Sep 2022 14:16:49 GMT
- Title: Spikformer: When Spiking Neural Network Meets Transformer
- Authors: Zhaokun Zhou, Yuesheng Zhu, Chao He, Yaowei Wang, Shuicheng Yan,
Yonghong Tian, Li Yuan
- Abstract summary: We consider two biologically plausible structures, the Spiking Neural Network (SNN) and the self-attention mechanism.
We propose a novel Spiking Self Attention (SSA) as well as a powerful framework named Spiking Transformer (Spikformer).
- Score: 102.91330530210037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider two biologically plausible structures, the Spiking Neural Network
(SNN) and the self-attention mechanism. The former offers an energy-efficient
and event-driven paradigm for deep learning, while the latter has the ability
to capture feature dependencies, enabling Transformers to achieve good
performance. It is intuitively promising to explore the marriage between them.
In this paper, we consider leveraging both self-attention capability and
biological properties of SNNs, and propose a novel Spiking Self Attention (SSA)
as well as a powerful framework, named Spiking Transformer (Spikformer). The
SSA mechanism in Spikformer models sparse visual features using
spike-form Query, Key, and Value without softmax. Since its computation is
sparse and avoids multiplication, SSA is efficient and has low computational
energy consumption. It is shown that Spikformer with SSA can outperform the
state-of-the-art SNN-like frameworks in image classification on both
neuromorphic and static datasets. Spikformer (66.3M parameters), comparable in
size to SEW-ResNet-152 (60.2M, 69.26%), achieves 74.81% top-1 accuracy on
ImageNet using 4 time steps, which is the state of the art among directly
trained SNN models.
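The abstract above gives the core SSA recipe: project the input spikes into spike-form Query, Key, and Value, multiply them directly, and rescale by a constant instead of applying softmax. Below is a minimal single-head, single-time-step PyTorch sketch of that idea; the class and function names, the Heaviside spike function with a sigmoid surrogate gradient (standing in for the paper's LIF neurons), and the omission of batch normalization, multi-head splitting, and the temporal dimension are simplifying assumptions for illustration, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


class SpikeFn(torch.autograd.Function):
    """Heaviside spike generation with a sigmoid surrogate gradient.

    Stands in for the LIF spiking neurons used in the paper; this is an
    assumption made only to keep the sketch self-contained.
    """

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return (x > 0).float()

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(4.0 * x)
        return grad_out * 4.0 * s * (1.0 - s)


def spike(x):
    return SpikeFn.apply(x)


class SpikingSelfAttentionSketch(nn.Module):
    """Softmax-free attention over spike-form Q, K, V (one head, one time step)."""

    def __init__(self, dim, scale=0.125):
        super().__init__()
        self.scale = scale                      # fixed rescaling replaces softmax
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        self.to_out = nn.Linear(dim, dim, bias=False)

    def forward(self, x):
        # x: (batch, tokens, dim) binary spike input for a single time step
        q = spike(self.to_q(x))                 # spike-form Query
        k = spike(self.to_k(x))                 # spike-form Key
        v = spike(self.to_v(x))                 # spike-form Value
        attn = q @ k.transpose(-2, -1)          # spike-spike product, no softmax
        out = (attn @ v) * self.scale           # rescale instead of normalizing
        return spike(self.to_out(out))          # spike output for the next block


# usage: 2 sequences of 16 tokens, 64 channels of 0/1 spikes
x = (torch.rand(2, 16, 64) > 0.8).float()
y = SpikingSelfAttentionSketch(dim=64)(x)
print(y.shape)  # torch.Size([2, 16, 64])
```

Because q, k, and v are binary, the products q @ k^T and attn @ v reduce to additions in principle, which is the source of the energy argument in the abstract; on standard GPU hardware this sketch still executes as dense floating-point matrix multiplications.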
Related papers
- SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks [22.665939536001797]
We propose a novel spiking self-attention mechanism named Dual Spike Self-Attention (DSSA) with a reasonable scaling method.
Based on DSSA, we propose a novel spiking Vision Transformer architecture called SpikingResformer.
We show that SpikingResformer achieves higher accuracy with fewer parameters and lower energy consumption than other spiking Vision Transformer counterparts.
arXiv Detail & Related papers (2024-03-21T11:16:42Z)
- Spikformer V2: Join the High Accuracy Club on ImageNet with an SNN Ticket [81.9471033944819]
Spiking Neural Networks (SNNs) face the challenge of limited performance.
The self-attention mechanism, which is the cornerstone of the high-performance Transformer, is absent in existing SNNs.
We propose a novel Spiking Self-Attention (SSA) and Spiking Transformer (Spikformer).
arXiv Detail & Related papers (2024-01-04T01:33:33Z)
- SpikingJelly: An open-source machine learning infrastructure platform for spike-based intelligence [51.6943465041708]
Spiking neural networks (SNNs) aim to realize brain-inspired intelligence on neuromorphic chips with high energy efficiency.
We contribute a full-stack toolkit for pre-processing neuromorphic datasets, building deep SNNs, optimizing their parameters, and deploying SNNs on neuromorphic chips.
arXiv Detail & Related papers (2023-10-25T13:15:17Z)
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without huge computational overhead.
We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
- Attention-free Spikformer: Mixing Spike Sequences with Simple Linear Transforms [16.54314950692779]
Spikformer integrates self-attention capability and the biological properties of Spiking Neural Networks (SNNs).
It introduces a Spiking Self-Attention (SSA) module to mix sparse visual features using spike-form Query, Key, and Value.
We conduct extensive experiments on image classification using both neuromorphic and static datasets.
arXiv Detail & Related papers (2023-08-02T11:41:54Z)
- Systematic Architectural Design of Scale Transformed Attention Condenser DNNs via Multi-Scale Class Representational Response Similarity Analysis [93.0013343535411]
We propose a novel type of analysis called Multi-Scale Class Representational Response Similarity Analysis (ClassRepSim).
We show that adding STAC modules to ResNet-style architectures can result in up to a 1.6% increase in top-1 accuracy.
Results from ClassRepSim analysis can be used to select an effective parameterization of the STAC module resulting in competitive performance.
arXiv Detail & Related papers (2023-06-16T18:29:26Z)
- Auto-Spikformer: Spikformer Architecture Search [22.332981906087785]
Self-attention mechanisms have been integrated into Spiking Neural Networks (SNNs).
Recent advancements in SNN architecture, such as Spikformer, have demonstrated promising outcomes.
We propose Auto-Spikformer, a one-shot Transformer Architecture Search (TAS) method, which automates the quest for an optimized Spikformer architecture.
arXiv Detail & Related papers (2023-06-01T15:35:26Z)
- Spikingformer: Spike-driven Residual Learning for Transformer-based Spiking Neural Network [19.932683405796126]
Spiking neural networks (SNNs) offer a promising energy-efficient alternative to artificial neural networks.
SNNs suffer from non-spike computations caused by the structure of their residual connections.
We develop Spikingformer, a pure transformer-based spiking neural network.
arXiv Detail & Related papers (2023-04-24T09:44:24Z)
- Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.