Spiking Transformer with Spatial-Temporal Attention
- URL: http://arxiv.org/abs/2409.19764v1
- Date: Sun, 29 Sep 2024 20:29:39 GMT
- Title: Spiking Transformer with Spatial-Temporal Attention
- Authors: Donghyun Lee, Yuhang Li, Youngeun Kim, Shiting Xiao, Priyadarshini Panda,
- Abstract summary: Spiking Neural Networks (SNNs) present a compelling and energy-efficient alternative to traditional Artificial Neural Networks (ANNs)
We introduce Spiking Transformer with Spatial-Temporal Attention (STAtten), a simple and straightforward architecture designed to integrate spatial and temporal information in self-attention.
We first verify our spatial-temporal attention mechanism's ability to capture long-term temporal dependencies using sequential datasets.
- Score: 26.7175155847563
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Spiking Neural Networks (SNNs) present a compelling and energy-efficient alternative to traditional Artificial Neural Networks (ANNs) due to their sparse binary activation. Leveraging the success of the transformer architecture, the spiking transformer architecture is explored to scale up dataset size and performance. However, existing works only consider the spatial self-attention in spiking transformer, neglecting the inherent temporal context across the timesteps. In this work, we introduce Spiking Transformer with Spatial-Temporal Attention (STAtten), a simple and straightforward architecture designed to integrate spatial and temporal information in self-attention with negligible additional computational load. The STAtten divides the temporal or token index and calculates the self-attention in a cross-manner to effectively incorporate spatial-temporal information. We first verify our spatial-temporal attention mechanism's ability to capture long-term temporal dependencies using sequential datasets. Moreover, we validate our approach through extensive experiments on varied datasets, including CIFAR10/100, ImageNet, CIFAR10-DVS, and N-Caltech101. Notably, our cross-attention mechanism achieves an accuracy of 78.39 % on the ImageNet dataset.
Related papers
- Navigating Spatio-Temporal Heterogeneity: A Graph Transformer Approach for Traffic Forecasting [13.309018047313801]
Traffic forecasting has emerged as a crucial research area in the development of smart cities.
Recent advancements in network modeling for most-temporal correlations are starting to see diminishing returns in performance.
To tackle these challenges, we introduce the Spatio-Temporal Graph Transformer (STGormer)
We design two straightforward yet effective spatial encoding methods based on the structure and integrate time position into the vanilla transformer to capture-temporal traffic patterns.
arXiv Detail & Related papers (2024-08-20T13:18:21Z) - Temporal Reversed Training for Spiking Neural Networks with Generalized Spatio-Temporal Representation [3.5624857747396814]
Spi neural networks (SNNs) have received widespread attention as an ultra-low energy computing paradigm.
Recent studies have focused on improving the feature extraction capability of SNNs, but they suffer from inefficient and suboptimal performance.
We.
propose a simple yet effective temporal reversed training (TRT) method to optimize the temporal performance of SNNs.
arXiv Detail & Related papers (2024-08-17T06:23:38Z) - Enhancing Adaptive History Reserving by Spiking Convolutional Block
Attention Module in Recurrent Neural Networks [21.509659756334802]
Spiking neural networks (SNNs) serve as one type of efficient model to processtemporal-temporal patterns in time series.
In this paper, we develop a recurrent spiking neural network (RSNN) model embedded with an advanced spiking convolutional attention module (SCBAM) component.
It invokes the history information in spatial and temporal channels adaptively through SCBAM which brings the advantages of efficient memory calling history and redundancy elimination.
arXiv Detail & Related papers (2024-01-08T08:05:34Z) - Disentangling Spatial and Temporal Learning for Efficient Image-to-Video
Transfer Learning [59.26623999209235]
We present DiST, which disentangles the learning of spatial and temporal aspects of videos.
The disentangled learning in DiST is highly efficient because it avoids the back-propagation of massive pre-trained parameters.
Extensive experiments on five benchmarks show that DiST delivers better performance than existing state-of-the-art methods by convincing gaps.
arXiv Detail & Related papers (2023-09-14T17:58:33Z) - Spatiotemporal convolutional network for time-series prediction and
causal inference [21.895413699349966]
A neural network computing framework, i.N.N., was developed to efficiently and accurately render a multistep-ahead prediction of a time series.
The framework has great potential in practical applications in artificial intelligence (AI) or machine learning fields.
arXiv Detail & Related papers (2021-07-03T06:20:43Z) - Adaptive Latent Space Tuning for Non-Stationary Distributions [62.997667081978825]
We present a method for adaptive tuning of the low-dimensional latent space of deep encoder-decoder style CNNs.
We demonstrate our approach for predicting the properties of a time-varying charged particle beam in a particle accelerator.
arXiv Detail & Related papers (2021-05-08T03:50:45Z) - Temporal Memory Relation Network for Workflow Recognition from Surgical
Video [53.20825496640025]
We propose a novel end-to-end temporal memory relation network (TMNet) for relating long-range and multi-scale temporal patterns.
We have extensively validated our approach on two benchmark surgical video datasets.
arXiv Detail & Related papers (2021-03-30T13:20:26Z) - Deep Cellular Recurrent Network for Efficient Analysis of Time-Series
Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while utilizing substantially less trainable parameters when compared to comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z) - DS-Net: Dynamic Spatiotemporal Network for Video Salient Object
Detection [78.04869214450963]
We propose a novel dynamic temporal-temporal network (DSNet) for more effective fusion of temporal and spatial information.
We show that the proposed method achieves superior performance than state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z) - Multivariate Time Series Classification Using Spiking Neural Networks [7.273181759304122]
Spiking neural network has drawn attention as it enables low power consumption.
We present an encoding scheme to convert time series into sparse spatial temporal spike patterns.
A training algorithm to classify spatial temporal patterns is also proposed.
arXiv Detail & Related papers (2020-07-07T15:24:01Z) - A Spatial-Temporal Attentive Network with Spatial Continuity for
Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC)
First, spatial-temporal attention mechanism is presented to explore the most useful and important information.
Second, we conduct a joint feature sequence based on the sequence and instant state information to make the generative trajectories keep spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.