Exact Gradient Computation for Spiking Neural Networks Through Forward
Propagation
- URL: http://arxiv.org/abs/2210.15415v1
- Date: Tue, 18 Oct 2022 20:28:21 GMT
- Title: Exact Gradient Computation for Spiking Neural Networks Through Forward
Propagation
- Authors: Jane H. Lee, Saeid Haghighatshoar, Amin Karbasi
- Abstract summary: Spiking neural networks (SNN) have emerged as alternatives to traditional neural networks.
We propose a novel training algorithm, called \emph{forward propagation} (FP), that computes exact gradients for SNN.
- Score: 39.33537954568678
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Spiking neural networks (SNN) have recently emerged as alternatives to
traditional neural networks, owing to energy efficiency benefits and capacity
to better capture biological neuronal mechanisms. However, the classic
backpropagation algorithm for training traditional networks has been
notoriously difficult to apply to SNN due to the hard-thresholding and
discontinuities at spike times. Therefore, a large majority of prior work
believes exact gradients for SNN w.r.t. their weights do not exist and has
focused on approximation methods to produce surrogate gradients. In this paper,
(1) by applying the implicit function theorem to SNN at the discrete spike
times, we prove that, albeit being non-differentiable in time, SNNs have
well-defined gradients w.r.t. their weights, and (2) we propose a novel
training algorithm, called \emph{forward propagation} (FP), that computes exact
gradients for SNN. FP exploits the causality structure between the spikes and
allows us to parallelize computation forward in time. It can be used with other
algorithms that simulate the forward pass, and it also provides insights on why
other related algorithms such as Hebbian learning and also recently-proposed
surrogate gradient methods may perform well.
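To make the implicit-function-theorem argument concrete, the following is a minimal numerical sketch, not the paper's FP algorithm: for a single integrate-and-fire neuron driven by one weighted input spike, the spike time t* is defined implicitly by V(t*; w) = theta, so the implicit function theorem gives dt*/dw = -(dV/dw)/(dV/dt) at the threshold crossing, provided dV/dt is nonzero there. The neuron model, its closed-form membrane potential, and all parameter values below are illustrative assumptions.

```python
# Illustrative sketch (not the paper's FP algorithm): implicit function theorem
# applied to a single spike time of a simple integrate-and-fire neuron.
# Model choice and constants are assumptions for illustration only.
import numpy as np

tau_s = 5.0    # synaptic time constant (ms), assumed
theta = 1.0    # firing threshold, assumed

def V(t, w):
    """Membrane potential of a non-leaky neuron driven by one input spike
    at t = 0 through an exponential synapse with weight w (assumed model)."""
    return w * (1.0 - np.exp(-t / tau_s))

def spike_time(w):
    """Spike time t* defined implicitly by V(t*, w) = theta (requires w > theta)."""
    return -tau_s * np.log(1.0 - theta / w)

def dspike_dw_ift(w):
    """Exact spike-time gradient via the implicit function theorem:
    dt*/dw = -(dV/dw) / (dV/dt), both partials evaluated at t = t*."""
    t_star = spike_time(w)
    dV_dw = 1.0 - np.exp(-t_star / tau_s)          # partial of V w.r.t. weight
    dV_dt = w * np.exp(-t_star / tau_s) / tau_s    # partial of V w.r.t. time
    return -dV_dw / dV_dt

w, eps = 1.5, 1e-6
fd = (spike_time(w + eps) - spike_time(w - eps)) / (2 * eps)  # finite difference
print(f"t*       = {spike_time(w):.6f} ms")
print(f"IFT grad = {dspike_dw_ift(w):.6f}")
print(f"FD grad  = {fd:.6f}")   # should agree closely with the IFT value
```

The point of the sketch is only that, although the spike output itself is a hard threshold, the spike time varies smoothly with the weight; this is the well-defined differentiability the abstract refers to, which the paper's FP algorithm extends to networks by exploiting the causal ordering of spikes.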
Related papers
- Directly Training Temporal Spiking Neural Network with Sparse Surrogate Gradient [8.516243389583702]
Brain-inspired Spiking Neural Networks (SNNs) have attracted much attention due to their event-based computing and energy-efficient features.
We propose Masked Surrogate Gradients (MSGs) to balance the effectiveness of training and the sparseness of the gradient, thereby improving the generalization ability of SNNs.
arXiv Detail & Related papers (2024-06-28T04:21:32Z)
- Speed Limits for Deep Learning [67.69149326107103]
Recent advances in thermodynamics allow bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network.
We provide analytical expressions for these speed limits for linear and linearizable neural networks.
Remarkably, given some plausible scaling assumptions on the NTK spectra and the spectral decomposition of the labels, learning is optimal in a scaling sense.
arXiv Detail & Related papers (2023-07-27T06:59:46Z)
- Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and
Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Temporal Efficient Training of Spiking Neural Network via Gradient
Re-weighting [29.685909045226847]
Brain-inspired spiking neural networks (SNNs) have attracted widespread research interest because of their event-driven and energy-efficient characteristics.
The current direct training approach with surrogate gradients (SG) results in SNNs with poor generalizability.
We introduce the temporal efficient training (TET) approach to compensate for the loss of momentum in gradient descent with SG.
arXiv Detail & Related papers (2022-02-24T08:02:37Z)
- BioGrad: Biologically Plausible Gradient-Based Learning for Spiking
Neural Networks [0.0]
Spiking neural networks (SNN) are delivering energy-efficient, massively parallel, and low-latency solutions to AI problems.
To harness these computational benefits, SNN need to be trained by learning algorithms that adhere to brain-inspired neuromorphic principles.
We propose a biologically plausible gradient-based learning algorithm for SNN that is functionally equivalent to backprop.
arXiv Detail & Related papers (2021-10-27T00:07:25Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on
the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- T2FSNN: Deep Spiking Neural Networks with Time-to-first-spike Coding [26.654533157221973]
This paper introduces the concept of time-to-first-spike coding into deep SNNs using the kernel-based dynamic threshold and dendrite to overcome the drawback.
According to our results, the proposed methods can reduce inference latency and the number of spikes to 22% and less than 1%, respectively, compared to those of burst coding.
arXiv Detail & Related papers (2020-03-26T04:39:12Z)
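As a brief illustration of the coding scheme named in the last entry, time-to-first-spike coding maps larger input intensities to earlier spike times so that each input emits at most one spike. The linear latency rule in the sketch below is a common illustrative assumption, not the kernel-based dynamic-threshold scheme proposed in T2FSNN.

```python
# Minimal sketch of time-to-first-spike (TTFS) coding: stronger inputs spike
# earlier, and each input emits at most one spike. The linear latency mapping
# is an assumption for illustration, not the T2FSNN scheme.
import numpy as np

def ttfs_encode(x, t_max=10.0):
    """Map intensities x in [0, 1] to first-spike times in [0, t_max];
    zero-intensity inputs never spike (encoded as np.inf)."""
    x = np.clip(np.asarray(x, dtype=float), 0.0, 1.0)
    return np.where(x > 0.0, (1.0 - x) * t_max, np.inf)

pixels = np.array([0.0, 0.2, 0.9, 1.0])
print(ttfs_encode(pixels))   # approximately [inf, 8.0, 1.0, 0.0]
```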