Spiking Vision Transformer with Saccadic Attention
- URL: http://arxiv.org/abs/2502.12677v1
- Date: Tue, 18 Feb 2025 09:32:29 GMT
- Title: Spiking Vision Transformer with Saccadic Attention
- Authors: Shuai Wang, Malu Zhang, Dehao Zhang, Ammar Belatreche, Yichen Xiao, Yu Liang, Yimeng Shan, Qian Sun, Enqi Zhang, Yang Yang,
- Abstract summary: Spiking Neural Networks (SNNs) and Vision Transformers (NNTs) hold potential for achieving both energy efficiency and high performance.<n>We first analyze why SNN-based ViTs suffer from limited performance and identify a mismatch between vanilla selfattention mechanism and temporal spike trains.<n>We introduce an innovative Saccadic Spike Self-Attention (SSSA) method to address these issues.<n>Building on the SSSA mechanism, we develop a SNN-based Vision Transformer (SNN-ViT)
- Score: 14.447083381772375
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The combination of Spiking Neural Networks (SNNs) and Vision Transformers (ViTs) holds potential for achieving both energy efficiency and high performance, particularly suitable for edge vision applications. However, a significant performance gap still exists between SNN-based ViTs and their ANN counterparts. Here, we first analyze why SNN-based ViTs suffer from limited performance and identify a mismatch between the vanilla self-attention mechanism and spatio-temporal spike trains. This mismatch results in degraded spatial relevance and limited temporal interactions. To address these issues, we draw inspiration from biological saccadic attention mechanisms and introduce an innovative Saccadic Spike Self-Attention (SSSA) method. Specifically, in the spatial domain, SSSA employs a novel spike distribution-based method to effectively assess the relevance between Query and Key pairs in SNN-based ViTs. Temporally, SSSA employs a saccadic interaction module that dynamically focuses on selected visual areas at each timestep and significantly enhances whole scene understanding through temporal interactions. Building on the SSSA mechanism, we develop a SNN-based Vision Transformer (SNN-ViT). Extensive experiments across various visual tasks demonstrate that SNN-ViT achieves state-of-the-art performance with linear computational complexity. The effectiveness and efficiency of the SNN-ViT highlight its potential for power-critical edge vision applications.
Related papers
- Spiking Meets Attention: Efficient Remote Sensing Image Super-Resolution with Attention Spiking Neural Networks [57.17129753411926]
Spiking neural networks (SNNs) are emerging as a promising alternative to traditional artificial neural networks (ANNs)
We propose SpikeSR, which achieves state-of-the-art performance across various remote sensing benchmarks such as AID, DOTA, and DIOR.
arXiv Detail & Related papers (2025-03-06T09:06:06Z) - DRiVE: Dynamic Recognition in VEhicles using snnTorch [0.0]
Spiking Neural Networks (SNNs) mimic biological brain activity, processing data efficiently through an event-driven design.
This study combines SNNs with PyTorch's adaptable framework, snnTorch, to test their potential for image-based tasks.
We introduce DRiVE, a vehicle detection model that uses spiking neuron dynamics to classify images, achieving 94.8% accuracy and a near-perfect 0.99 AUC score.
arXiv Detail & Related papers (2025-02-04T11:01:13Z) - Efficient and Effective Time-Series Forecasting with Spiking Neural Networks [47.371024581669516]
Spiking neural networks (SNNs) provide a unique pathway for capturing the intricacies of temporal data.
Applying SNNs to time-series forecasting is challenging due to difficulties in effective temporal alignment, complexities in encoding processes, and the absence of standardized guidelines for model selection.
We propose a framework for SNNs in time-series forecasting tasks, leveraging the efficiency of spiking neurons in processing temporal information.
arXiv Detail & Related papers (2024-02-02T16:23:50Z) - Fully Spiking Actor Network with Intra-layer Connections for
Reinforcement Learning [51.386945803485084]
We focus on the task where the agent needs to learn multi-dimensional deterministic policies to control.
Most existing spike-based RL methods take the firing rate as the output of SNNs, and convert it to represent continuous action space (i.e., the deterministic policy) through a fully-connected layer.
To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects.
arXiv Detail & Related papers (2024-01-09T07:31:34Z) - Temporal Contrastive Learning for Spiking Neural Networks [23.963069990569714]
Biologically inspired neural networks (SNNs) have garnered considerable attention due to their low-energy consumption and better-temporal information processing capabilities.
We propose a novel method to obtain SNNs with low latency and high performance by incorporating contrastive supervision with temporal domain information.
arXiv Detail & Related papers (2023-05-23T10:31:46Z) - Spikformer: When Spiking Neural Network Meets Transformer [102.91330530210037]
We consider two biologically plausible structures, the Spiking Neural Network (SNN) and the self-attention mechanism.
We propose a novel Spiking Self Attention (SSA) as well as a powerful framework, named Spiking Transformer (Spikformer)
arXiv Detail & Related papers (2022-09-29T14:16:49Z) - A Spatial-channel-temporal-fused Attention for Spiking Neural Networks [7.759491656618468]
Spiking neural networks (SNNs) mimic computational strategies, and exhibit substantial capabilities in processing information.
We propose a new spatial-channel-temporal-fused attention (SCTFA) module that can guide SNNs to efficiently capture underlying target regions.
arXiv Detail & Related papers (2022-09-22T07:45:55Z) - On the Intrinsic Structures of Spiking Neural Networks [66.57589494713515]
Recent years have emerged a surge of interest in SNNs owing to their remarkable potential to handle time-dependent and event-driven data.
There has been a dearth of comprehensive studies examining the impact of intrinsic structures within spiking computations.
This work delves deep into the intrinsic structures of SNNs, by elucidating their influence on the expressivity of SNNs.
arXiv Detail & Related papers (2022-06-21T09:42:30Z) - TCJA-SNN: Temporal-Channel Joint Attention for Spiking Neural Networks [22.965024490694525]
Spiking Neural Networks (SNNs) are attracting widespread interest due to their biological plausibility, energy efficiency and powerful-temporal information representation ability.
We present a Temporal-Channel Joint Attention mechanism for SNNs, referred to as TCJA-SNN.
The proposed TCJA-SNN framework can effectively assess the significance of spike sequence from both spatial and temporal dimensions.
arXiv Detail & Related papers (2022-06-21T08:16:08Z) - Training High-Performance Low-Latency Spiking Neural Networks by
Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z) - Spiking Neural Networks for Visual Place Recognition via Weighted
Neuronal Assignments [24.754429120321365]
Spiking neural networks (SNNs) offer both compelling potential advantages, including energy efficiency and low latencies.
One promising area for high performance SNNs is template matching and image recognition.
This research introduces the first high performance SNN for the Visual Place Recognition (VPR) task.
arXiv Detail & Related papers (2021-09-14T05:40:40Z) - A journey in ESN and LSTM visualisations on a language task [77.34726150561087]
We trained ESNs and LSTMs on a Cross-Situationnal Learning (CSL) task.
The results are of three kinds: performance comparison, internal dynamics analyses and visualization of latent space.
arXiv Detail & Related papers (2020-12-03T08:32:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.