Temporal Efficient Training of Spiking Neural Network via Gradient
Re-weighting
- URL: http://arxiv.org/abs/2202.11946v1
- Date: Thu, 24 Feb 2022 08:02:37 GMT
- Title: Temporal Efficient Training of Spiking Neural Network via Gradient
Re-weighting
- Authors: Shikuang Deng, Yuhang Li, Shanghang Zhang, Shi Gu
- Abstract summary: Brain-inspired spiking neuron networks (SNNs) have attracted widespread research interest because of their event-driven and energy-efficient characteristics.
The current direct training approach with surrogate gradients (SG) yields SNNs with poor generalizability.
We introduce the temporal efficient training (TET) approach to compensate for the loss of momentum in gradient descent with SG.
- Score: 29.685909045226847
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, brain-inspired spiking neuron networks (SNNs) have attracted
widespread research interest because of their event-driven and energy-efficient
characteristics. Still, it is difficult to efficiently train deep SNNs due to
the non-differentiability of their activation function, which rules out the
gradient descent approaches typically used for traditional artificial neural
networks (ANNs). Although the adoption of surrogate gradients (SG) formally
allows for the back-propagation of losses, the discrete spiking mechanism
makes the loss landscape of SNNs differ from that of ANNs, so surrogate
gradient methods fail to achieve accuracy comparable to that of ANNs. In
this paper, we first analyze why the current direct training approach with
surrogate gradient results in SNNs with poor generalizability. Then we
introduce the temporal efficient training (TET) approach to compensate for the
loss of momentum in the gradient descent with SG so that the training process
can converge into flatter minima with better generalizability. Meanwhile, we
demonstrate that TET improves the temporal scalability of SNN and induces a
temporal inheritable training for acceleration. Our method consistently
outperforms the SOTA on all reported mainstream datasets, including
CIFAR-10/100 and ImageNet. Remarkably, on DVS-CIFAR10, we obtained 83% top-1
accuracy, an improvement of over 10% compared to the existing state of the art.
Code is available at https://github.com/Gus-Lab/temporal_efficient_training.
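For readers unfamiliar with the surrogate gradient idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a Heaviside spike function in the forward pass and a rectangular window as the stand-in derivative in the backward pass; the function names, the `threshold` and `width` parameters, and the toy values are all illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np

def spike_forward(v, threshold=1.0):
    """Heaviside step: emit a spike (1.0) where the membrane potential reaches the threshold."""
    return (v >= threshold).astype(np.float64)

def spike_surrogate_grad(v, threshold=1.0, width=1.0):
    """Rectangular surrogate for d(spike)/dv (an assumed choice):
    1/width inside a window of the given width centered on the threshold, 0 elsewhere."""
    return (np.abs(v - threshold) < width / 2).astype(np.float64) / width

# Toy membrane potentials and an upstream gradient dL/d(spike) (hypothetical values).
v = np.array([0.2, 0.9, 1.1, 2.0])
upstream = np.ones_like(v)

spikes = spike_forward(v)
# Chain rule, with the true (zero almost everywhere) derivative replaced by the surrogate.
grad_v = upstream * spike_surrogate_grad(v)

print(spikes)  # spikes only where v >= threshold
print(grad_v)  # nonzero only for potentials close to the threshold
```

The true derivative of the step function is zero almost everywhere, so no gradient would reach earlier layers; the surrogate passes gradient only to neurons whose potential is near the threshold. Other surrogate shapes (triangular, sigmoid-derived) are common as well, and this sketch does not indicate which one the paper uses.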
Related papers
- Directly Training Temporal Spiking Neural Network with Sparse Surrogate Gradient [8.516243389583702]
Brain-inspired Spiking Neural Networks (SNNs) have attracted much attention due to their event-based computing and energy-efficient features.
We propose Masked Surrogate Gradients (MSGs) to balance the effectiveness of training and the sparseness of the gradient, thereby improving the generalization ability of SNNs.
arXiv Detail & Related papers (2024-06-28T04:21:32Z)
- FTBC: Forward Temporal Bias Correction for Optimizing ANN-SNN Conversion [16.9748086865693]
Spiking Neural Networks (SNNs) offer a promising avenue for energy-efficient computing compared with Artificial Neural Networks (ANNs).
In this work, we introduce a lightweight Forward Temporal Bias Correction (FTBC) technique, aimed at enhancing conversion accuracy without computational overhead.
We further propose an algorithm for finding the temporal bias only in the forward pass, thus eliminating the computational burden of backpropagation.
arXiv Detail & Related papers (2024-03-27T09:25:20Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems.
PINNs are trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs in order to improve the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- SPIDE: A Purely Spike-based Method for Training Feedback Spiking Neural Networks [56.35403810762512]
Spiking neural networks (SNNs) with event-based computation are promising brain-inspired models for energy-efficient applications on neuromorphic hardware.
We study spike-based implicit differentiation on the equilibrium state (SPIDE) that extends the recently proposed training method.
arXiv Detail & Related papers (2023-02-01T04:22:59Z)
- Exact Gradient Computation for Spiking Neural Networks Through Forward Propagation [39.33537954568678]
Spiking neural networks (SNN) have emerged as alternatives to traditional neural networks.
We propose a novel training algorithm, called forward propagation (FP), that computes exact gradients for SNNs.
arXiv Detail & Related papers (2022-10-18T20:28:21Z)
- Online Training Through Time for Spiking Neural Networks [66.7744060103562]
Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models.
Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency.
We propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning.
arXiv Detail & Related papers (2022-10-09T07:47:56Z)
- Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
- HIRE-SNN: Harnessing the Inherent Robustness of Energy-Efficient Deep Spiking Neural Networks by Training with Crafted Input Noise [13.904091056365765]
We present an SNN training algorithm that uses crafted input noise and incurs no additional training time.
Compared to standard trained direct input SNNs, our trained models yield improved classification accuracy of up to 13.7%.
Our models also outperform inherently robust SNNs trained on rate-coded inputs with improved or similar classification performance on attack-generated images.
arXiv Detail & Related papers (2021-10-06T16:48:48Z)
- Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability [94.4070247697549]
Full-batch gradient descent on neural network training objectives operates in a regime we call the Edge of Stability.
In this regime, the maximum eigenvalue of the training loss Hessian hovers just above the numerical value $2 / \text{(step size)}$, and the training loss behaves non-monotonically over short timescales, yet consistently decreases over long timescales.
arXiv Detail & Related papers (2021-02-26T22:08:19Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.