SPikE-SSM: A Sparse, Precise, and Efficient Spiking State Space Model for Long Sequences Learning
- URL: http://arxiv.org/abs/2410.17268v1
- Date: Mon, 07 Oct 2024 12:20:38 GMT
- Title: SPikE-SSM: A Sparse, Precise, and Efficient Spiking State Space Model for Long Sequences Learning
- Authors: Yan Zhong, Ruoyu Zhao, Chao Wang, Qinghai Guo, Jianguo Zhang, Zhichao Lu, Luziwei Leng,
- Abstract summary: Spiking neural networks (SNNs) provide an energy-efficient solution by utilizing the spike-based and sparse nature of biological systems.
Recent emergence of state space models (SSMs) offer superior computational efficiency and modeling capability.
We propose a sparse, precise and efficient spiking SSM framework, termed SPikE-SSM.
- Score: 21.37966285950504
- License:
- Abstract: Spiking neural networks (SNNs) provide an energy-efficient solution by utilizing the spike-based and sparse nature of biological systems. Since the advent of Transformers, SNNs have struggled to compete with artificial networks on long sequential tasks, until the recent emergence of state space models (SSMs), which offer superior computational efficiency and modeling capability. However, applying the highly capable SSMs to SNNs for long sequences learning poses three major challenges: (1) The membrane potential is determined by the past spiking history of the neuron, leading to reduced efficiency for sequence modeling in parallel computing scenarios. (2) Complex dynamics of biological spiking neurons are crucial for functionality but challenging to simulate and exploit effectively in large networks. (3) It is arduous to maintain high sparsity while achieving high accuracy for spiking neurons without resorting to dense computing, as utilized in artificial neuron-based SSMs. To address them, we propose a sparse, precise and efficient spiking SSM framework, termed SPikE-SSM. For (1), we propose a boundary compression strategy (PMBC) to accelerate the inference of the spiking neuron model, enabling parallel processing for long sequence learning. For (2), we propose a novel and concise neuron model incorporating reset-refractory mechanism to leverage the inherent temporal dimension for dynamic computing with biological interpretability. For (3), we hierarchically integrate the proposed neuron model to the original SSM block, and enhance the dynamics of SPikE-SSM by incorporating trainable thresholds and refractory magnitudes to balance accuracy and sparsity. Extensive experiments verify the effectiveness and robustness of SPikE-SSM on the long range arena benchmarks and large language dataset WikiText-103, showing the potential of dynamic spiking neurons in efficient long sequence learning.
Related papers
- Scalable Mechanistic Neural Networks [52.28945097811129]
We propose an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences.
By reformulating the original Mechanistic Neural Network (MNN) we reduce the computational time and space complexities from cubic and quadratic with respect to the sequence length, respectively, to linear.
Extensive experiments demonstrate that S-MNN matches the original MNN in precision while substantially reducing computational resources.
arXiv Detail & Related papers (2024-10-08T14:27:28Z) - Time-independent Spiking Neuron via Membrane Potential Estimation for Efficient Spiking Neural Networks [4.142699381024752]
computational inefficiency of spiking neural networks (SNNs) is primarily due to the sequential updates of membrane potential.
We propose Membrane Potential Estimation Parallel Spiking Neurons (MPE-PSN), a parallel computation method for spiking neurons.
Our approach exhibits promise for enhancing computational efficiency, particularly under conditions of elevated neuron density.
arXiv Detail & Related papers (2024-09-08T05:14:22Z) - SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models [19.04709216497077]
We develop spiking state space models (SpikingSSMs) for long sequence learning.
Inspired by dendritic neuron structure, we hierarchically integrate neuronal dynamics with the original SSM block.
We propose a light-weight surrogate dynamic network which accurately predicts the after-reset membrane potential and compatible to learnable thresholds.
arXiv Detail & Related papers (2024-08-27T09:35:49Z) - Enhancing learning in spiking neural networks through neuronal heterogeneity and neuromodulatory signaling [52.06722364186432]
We propose a biologically-informed framework for enhancing artificial neural networks (ANNs)
Our proposed dual-framework approach highlights the potential of spiking neural networks (SNNs) for emulating diverse spiking behaviors.
We outline how the proposed approach integrates brain-inspired compartmental models and task-driven SNNs, bioinspiration and complexity.
arXiv Detail & Related papers (2024-07-05T14:11:28Z) - P-SpikeSSM: Harnessing Probabilistic Spiking State Space Models for Long-Range Dependency Tasks [1.9775291915550175]
Spiking neural networks (SNNs) are posited as a computationally efficient and biologically plausible alternative to conventional neural architectures.
We develop a scalable probabilistic spiking learning framework for long-range dependency tasks.
Our models attain state-of-the-art performance among SNN models across diverse long-range dependency tasks.
arXiv Detail & Related papers (2024-06-05T04:23:11Z) - Autaptic Synaptic Circuit Enhances Spatio-temporal Predictive Learning of Spiking Neural Networks [23.613277062707844]
Spiking Neural Networks (SNNs) emulate the integrated-fire-leak mechanism found in biological neurons.
Existing SNNs predominantly rely on the Integrate-and-Fire Leaky (LIF) model.
This paper proposes a novel S-patioTemporal Circuit (STC) model.
arXiv Detail & Related papers (2024-06-01T11:17:27Z) - Single Neuromorphic Memristor closely Emulates Multiple Synaptic
Mechanisms for Energy Efficient Neural Networks [71.79257685917058]
We demonstrate memristive nano-devices based on SrTiO3 that inherently emulate all these synaptic functions.
These memristors operate in a non-filamentary, low conductance regime, which enables stable and energy efficient operation.
arXiv Detail & Related papers (2024-02-26T15:01:54Z) - SpikingJelly: An open-source machine learning infrastructure platform
for spike-based intelligence [51.6943465041708]
Spiking neural networks (SNNs) aim to realize brain-inspired intelligence on neuromorphic chips with high energy efficiency.
We contribute a full-stack toolkit for pre-processing neuromorphic datasets, building deep SNNs, optimizing their parameters, and deploying SNNs on neuromorphic chips.
arXiv Detail & Related papers (2023-10-25T13:15:17Z) - Unleashing the Potential of Spiking Neural Networks for Sequential
Modeling with Contextual Embedding [32.25788551849627]
Brain-inspired spiking neural networks (SNNs) have struggled to match their biological counterpart in modeling long-term temporal relationships.
This paper presents a novel Contextual Embedding Leaky Integrate-and-Fire (CE-LIF) spiking neuron model.
arXiv Detail & Related papers (2023-08-29T09:33:10Z) - Long Short-term Memory with Two-Compartment Spiking Neuron [64.02161577259426]
We propose a novel biologically inspired Long Short-Term Memory Leaky Integrate-and-Fire spiking neuron model, dubbed LSTM-LIF.
Our experimental results, on a diverse range of temporal classification tasks, demonstrate superior temporal classification capability, rapid training convergence, strong network generalizability, and high energy efficiency of the proposed LSTM-LIF model.
This work, therefore, opens up a myriad of opportunities for resolving challenging temporal processing tasks on emerging neuromorphic computing machines.
arXiv Detail & Related papers (2023-07-14T08:51:03Z) - The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks [64.08042492426992]
We introduce the Expressive Memory (ELM) neuron model, a biologically inspired model of a cortical neuron.
Our ELM neuron can accurately match the aforementioned input-output relationship with under ten thousand trainable parameters.
We evaluate it on various tasks with demanding temporal structures, including the Long Range Arena (LRA) datasets.
arXiv Detail & Related papers (2023-06-14T13:34:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.