SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models
- URL: http://arxiv.org/abs/2408.14909v1
- Date: Tue, 27 Aug 2024 09:35:49 GMT
- Title: SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models
- Authors: Shuaijie Shen, Chao Wang, Renzhuo Huang, Yan Zhong, Qinghai Guo, Zhichao Lu, Jianguo Zhang, Luziwei Leng,
- Abstract summary: We develop spiking state space models (SpikingSSMs) for long sequence learning.
Inspired by dendritic neuron structure, we hierarchically integrate neuronal dynamics with the original SSM block.
We propose a light-weight surrogate dynamic network which accurately predicts the after-reset membrane potential and compatible to learnable thresholds.
- Score: 19.04709216497077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Known as low energy consumption networks, spiking neural networks (SNNs) have gained a lot of attention within the past decades. While SNNs are increasing competitive with artificial neural networks (ANNs) for vision tasks, they are rarely used for long sequence tasks, despite their intrinsic temporal dynamics. In this work, we develop spiking state space models (SpikingSSMs) for long sequence learning by leveraging on the sequence learning abilities of state space models (SSMs). Inspired by dendritic neuron structure, we hierarchically integrate neuronal dynamics with the original SSM block, meanwhile realizing sparse synaptic computation. Furthermore, to solve the conflict of event-driven neuronal dynamics with parallel computing, we propose a light-weight surrogate dynamic network which accurately predicts the after-reset membrane potential and compatible to learnable thresholds, enabling orders of acceleration in training speed compared with conventional iterative methods. On the long range arena benchmark task, SpikingSSM achieves competitive performance to state-of-the-art SSMs meanwhile realizing on average 90\% of network sparsity. On language modeling, our network significantly surpasses existing spiking large language models (spikingLLMs) on the WikiText-103 dataset with only a third of the model size, demonstrating its potential as backbone architecture for low computation cost LLMs.
Related papers
- Scalable Mechanistic Neural Networks [52.28945097811129]
We propose an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences.
By reformulating the original Mechanistic Neural Network (MNN) we reduce the computational time and space complexities from cubic and quadratic with respect to the sequence length, respectively, to linear.
Extensive experiments demonstrate that S-MNN matches the original MNN in precision while substantially reducing computational resources.
arXiv Detail & Related papers (2024-10-08T14:27:28Z) - SPikE-SSM: A Sparse, Precise, and Efficient Spiking State Space Model for Long Sequences Learning [21.37966285950504]
Spiking neural networks (SNNs) provide an energy-efficient solution by utilizing the spike-based and sparse nature of biological systems.
Recent emergence of state space models (SSMs) offer superior computational efficiency and modeling capability.
We propose a sparse, precise and efficient spiking SSM framework, termed SPikE-SSM.
arXiv Detail & Related papers (2024-10-07T12:20:38Z) - P-SpikeSSM: Harnessing Probabilistic Spiking State Space Models for Long-Range Dependency Tasks [1.9775291915550175]
Spiking neural networks (SNNs) are posited as a computationally efficient and biologically plausible alternative to conventional neural architectures.
We develop a scalable probabilistic spiking learning framework for long-range dependency tasks.
Our models attain state-of-the-art performance among SNN models across diverse long-range dependency tasks.
arXiv Detail & Related papers (2024-06-05T04:23:11Z) - Fully Spiking Actor Network with Intra-layer Connections for
Reinforcement Learning [51.386945803485084]
We focus on the task where the agent needs to learn multi-dimensional deterministic policies to control.
Most existing spike-based RL methods take the firing rate as the output of SNNs, and convert it to represent continuous action space (i.e., the deterministic policy) through a fully-connected layer.
To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects.
arXiv Detail & Related papers (2024-01-09T07:31:34Z) - Learning Long Sequences in Spiking Neural Networks [0.0]
Spiking neural networks (SNNs) take inspiration from the brain to enable energy-efficient computations.
Recent interest in efficient alternatives to Transformers has given rise to state-of-the-art recurrent architectures named state space models (SSMs)
arXiv Detail & Related papers (2023-12-14T13:30:27Z) - SpikingJelly: An open-source machine learning infrastructure platform
for spike-based intelligence [51.6943465041708]
Spiking neural networks (SNNs) aim to realize brain-inspired intelligence on neuromorphic chips with high energy efficiency.
We contribute a full-stack toolkit for pre-processing neuromorphic datasets, building deep SNNs, optimizing their parameters, and deploying SNNs on neuromorphic chips.
arXiv Detail & Related papers (2023-10-25T13:15:17Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - Finite Meta-Dynamic Neurons in Spiking Neural Networks for
Spatio-temporal Learning [13.037452551907657]
Spiking Neural Networks (SNNs) have incorporated more biologically-plausible structures and learning principles.
We propose Meta-Dynamic Neurons (MDNs) to improve SNNs for a better network generalization during-temporal learning.
The MDNs generated from a spatial (MNIST) and a temporal (TIts) datasets first and then extended to various other different-temporal tasks.
arXiv Detail & Related papers (2020-10-07T03:49:28Z) - Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z) - Exploiting Neuron and Synapse Filter Dynamics in Spatial Temporal
Learning of Deep Spiking Neural Network [7.503685643036081]
A bio-plausible SNN model with spatial-temporal property is a complex dynamic system.
We formulate SNN as a network of infinite impulse response (IIR) filters with neuron nonlinearity.
We propose a training algorithm that is capable to learn spatial-temporal patterns by searching for the optimal synapse filter kernels and weights.
arXiv Detail & Related papers (2020-02-19T01:27:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.