One Timestep is All You Need: Training Spiking Neural Networks with
Ultra Low Latency
- URL: http://arxiv.org/abs/2110.05929v1
- Date: Fri, 1 Oct 2021 22:54:59 GMT
- Title: One Timestep is All You Need: Training Spiking Neural Networks with
Ultra Low Latency
- Authors: Sayeed Shafayet Chowdhury, Nitin Rathi and Kaushik Roy
- Abstract summary: Spiking Neural Networks (SNNs) are energy efficient alternatives to commonly used deep neural networks (DNNs).
High inference latency is a significant hindrance to the edge deployment of deep SNNs.
We propose an Iterative Initialization and Retraining method for SNNs (IIR-SNN) to perform single shot inference in the temporal axis.
- Score: 8.590196535871343
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spiking Neural Networks (SNNs) are energy efficient alternatives to commonly
used deep neural networks (DNNs). Through event-driven information processing,
SNNs can reduce the expensive compute requirements of DNNs considerably, while
achieving comparable performance. However, high inference latency is a
significant hindrance to the edge deployment of deep SNNs. Computation over
multiple timesteps not only increases latency and the overall energy budget due
to the higher number of operations, but also incurs the memory access overhead
of fetching membrane potentials, both of which lessen the energy benefits of SNNs.
To overcome this bottleneck and leverage the full potential of SNNs, we propose
an Iterative Initialization and Retraining method for SNNs (IIR-SNN) to perform
single shot inference in the temporal axis. The method starts with an SNN
trained with T timesteps (T>1). Then at each stage of latency reduction, the
network trained at previous stage with higher timestep is utilized as
initialization for subsequent training with lower timestep. This acts as a
compression method, as the network is gradually shrunk in the temporal domain.
In this paper, we use direct input encoding and choose T=5, since, as per the
literature, it is the minimum latency required to achieve satisfactory
performance on ImageNet. The proposed scheme allows us to obtain SNNs with up
to unit latency, requiring a single forward pass during inference. We achieve
top-1 accuracy of 93.05%, 70.15% and 67.71% on CIFAR-10, CIFAR-100 and
ImageNet, respectively using VGG16, with just 1 timestep. In addition, IIR-SNNs
perform inference with 5-2500X reduced latency compared to other
state-of-the-art SNNs, maintaining comparable or even better accuracy.
Furthermore, in comparison with standard DNNs, the proposed IIR-SNNs provide
25-33X higher energy efficiency, while being comparable to them in
classification performance.
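The staged temporal-compression procedure described in the abstract can be summarized in a short sketch. The following is a minimal illustration, not the authors' code: the train_snn routine is a placeholder standing in for their surrogate-gradient training loop, and the epochs_per_stage parameter is hypothetical.

```python
# Minimal sketch of the Iterative Initialization and Retraining (IIR) schedule:
# the network trained at a higher timestep initializes training at the next
# lower timestep, gradually shrinking the SNN in the temporal domain to T=1.
# train_snn is a placeholder (assumption) for the actual SNN training loop.

def train_snn(weights, timesteps, epochs):
    """Placeholder: fine-tune an SNN unrolled for `timesteps` steps,
    starting from `weights`, and return the updated weights."""
    # ... surrogate-gradient / backprop-through-time training would go here ...
    return weights

def iir_schedule(initial_weights, start_T=5, end_T=1, epochs_per_stage=30):
    """Stage-wise latency reduction from start_T down to end_T timesteps."""
    weights = train_snn(initial_weights, timesteps=start_T, epochs=epochs_per_stage)
    for T in range(start_T - 1, end_T - 1, -1):
        # Previous-stage weights act as the initialization for the lower-timestep
        # network, i.e. the model is compressed along the temporal axis.
        weights = train_snn(weights, timesteps=T, epochs=epochs_per_stage)
    return weights  # final network performs single-timestep (T=1) inference
```

At T=1, inference amounts to a single forward pass, with no membrane potentials carried across timesteps.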
Related papers
- Towards Low-latency Event-based Visual Recognition with Hybrid Step-wise Distillation Spiking Neural Networks [50.32980443749865]
Spiking neural networks (SNNs) have garnered significant attention for their low power consumption and high biological plausibility.
Current SNNs struggle to balance accuracy and latency on neuromorphic datasets.
We propose the Hybrid Step-wise Distillation (HSD) method, tailored for neuromorphic datasets.
arXiv Detail & Related papers (2024-09-19T06:52:34Z)
- SEENN: Towards Temporal Spiking Early-Exit Neural Networks [26.405775809170308]
Spiking Neural Networks (SNNs) have recently become more popular as a biologically plausible substitute for traditional Artificial Neural Networks (ANNs).
We study a fine-grained adjustment of the number of timesteps in SNNs.
By dynamically adjusting the number of timesteps, our SEENN achieves a remarkable reduction in the average number of timesteps during inference.
arXiv Detail & Related papers (2023-04-02T15:57:09Z)
- SNN2ANN: A Fast and Memory-Efficient Training Framework for Spiking Neural Networks [117.56823277328803]
Spiking neural networks are efficient computation models for low-power environments.
We propose a SNN-to-ANN (SNN2ANN) framework to train the SNN in a fast and memory-efficient way.
Experiment results show that our SNN2ANN-based models perform well on the benchmark datasets.
arXiv Detail & Related papers (2022-06-19T16:52:56Z)
- Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which achieves high performance with low latency.
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Optimized Potential Initialization for Low-latency Spiking Neural Networks [21.688402090967497]
Spiking Neural Networks (SNNs) have attracted great attention due to their distinctive properties of low power consumption, biological plausibility, and adversarial robustness.
The most effective way to train deep SNNs is through ANN-to-SNN conversion, which has yielded the best performance on deep network structures and large-scale datasets.
In this paper, we aim to achieve high-performance converted SNNs with extremely low latency (fewer than 32 time-steps).
arXiv Detail & Related papers (2022-02-03T07:15:43Z)
- Can Deep Neural Networks be Converted to Ultra Low-Latency Spiking Neural Networks? [3.2108350580418166]
Spiking neural networks (SNNs) operate via binary spikes distributed over time.
SOTA training strategies for SNNs involve conversion from a non-spiking deep neural network (DNN).
We propose a new training algorithm that accurately captures these distributions, minimizing the error between the DNN and converted SNN.
arXiv Detail & Related papers (2021-12-22T18:47:45Z)
- Spatio-Temporal Pruning and Quantization for Low-latency Spiking Neural Networks [6.011954485684313]
Spiking Neural Networks (SNNs) are a promising alternative to traditional deep learning methods.
However, a major drawback of SNNs is high inference latency.
In this paper, we propose spatial and temporal pruning of SNNs.
arXiv Detail & Related papers (2021-04-26T12:50:58Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- You Only Spike Once: Improving Energy-Efficient Neuromorphic Inference to ANN-Level Accuracy [51.861168222799186]
Spiking Neural Networks (SNNs) are a type of neuromorphic, or brain-inspired network.
SNNs are sparse, accessing very few weights, and typically only use addition operations instead of the more power-intensive multiply-and-accumulate operations.
In this work, we aim to overcome the limitations of TTFS-encoded neuromorphic systems.
arXiv Detail & Related papers (2020-06-03T15:55:53Z)
- T2FSNN: Deep Spiking Neural Networks with Time-to-first-spike Coding [26.654533157221973]
This paper introduces the concept of time-to-first-spike coding into deep SNNs using the kernel-based dynamic threshold and dendrite to overcome the drawback.
According to our results, the proposed methods can reduce inference latency and the number of spikes to 22% and less than 1%, respectively, compared to those of burst coding.
arXiv Detail & Related papers (2020-03-26T04:39:12Z)