Navigating Local Minima in Quantized Spiking Neural Networks
- URL: http://arxiv.org/abs/2202.07221v1
- Date: Tue, 15 Feb 2022 06:42:25 GMT
- Title: Navigating Local Minima in Quantized Spiking Neural Networks
- Authors: Jason K. Eshraghian, Corey Lammie, Mostafa Rahimi Azghadi, Wei D. Lu
- Abstract summary: Spiking and Quantized Neural Networks (NNs) are becoming exceedingly important for hyper-efficient implementations of Deep Learning (DL) algorithms.
These networks face challenges when trained using error backpropagation, due to the absence of gradient signals when applying hard thresholds.
This paper presents a systematic evaluation of a cosine-annealed LR schedule coupled with weight-independent adaptive moment estimation.
- Score: 3.1351527202068445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spiking and Quantized Neural Networks (NNs) are becoming exceedingly
important for hyper-efficient implementations of Deep Learning (DL) algorithms.
However, these networks face challenges when trained using error
backpropagation, due to the absence of gradient signals when applying hard
thresholds. The broadly accepted trick to overcoming this is through the use of
biased gradient estimators: surrogate gradients which approximate thresholding
in Spiking Neural Networks (SNNs), and Straight-Through Estimators (STEs),
which completely bypass thresholding in Quantized Neural Networks (QNNs). While
noisy gradient feedback has enabled reasonable performance on simple supervised
learning tasks, it is thought that such noise increases the difficulty of
finding optima in loss landscapes, especially during the later stages of
optimization. By periodically boosting the Learning Rate (LR) during training,
we expect the network can navigate unexplored solution spaces that would
otherwise be difficult to reach due to local minima, barriers, or flat
surfaces. This paper presents a systematic evaluation of a cosine-annealed LR
schedule coupled with weight-independent adaptive moment estimation as applied
to Quantized SNNs (QSNNs). We provide a rigorous empirical evaluation of this
technique on high precision and 4-bit quantized SNNs across three datasets,
demonstrating (close to) state-of-the-art performance on the more complex
datasets. Our source code is available at this link:
https://github.com/jeshraghian/QSNNs.
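The abstract describes two ingredients: a biased gradient estimator (a surrogate gradient, or an STE in the quantized case) to backpropagate through the hard spiking threshold, and an Adam-style optimizer paired with a cosine-annealed LR schedule whose warm restarts periodically boost the LR. The PyTorch sketch below is a minimal illustration of those ideas under stated assumptions, not the authors' implementation (which is at the GitHub link above): the fast-sigmoid surrogate, the single-layer leaky integrate-and-fire model, and every hyperparameter (slope, beta, threshold, T_0, learning rate, batch shapes) are placeholders.

```python
# Minimal sketch (assumptions labeled): surrogate-gradient spiking plus Adam with
# cosine-annealed warm restarts, which periodically boost the learning rate.
import torch
import torch.nn as nn


class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass; fast-sigmoid surrogate gradient in the
    backward pass (one common choice; the paper's exact surrogate may differ)."""

    @staticmethod
    def forward(ctx, membrane_potential, slope=25.0):
        ctx.save_for_backward(membrane_potential)
        ctx.slope = slope
        return (membrane_potential > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (membrane_potential,) = ctx.saved_tensors
        # d(spike)/d(U) approximated by the derivative of a fast sigmoid
        surrogate = 1.0 / (ctx.slope * membrane_potential.abs() + 1.0) ** 2
        return grad_output * surrogate, None


class TinySNN(nn.Module):
    """Single leaky integrate-and-fire layer unrolled over T time steps (illustrative only)."""

    def __init__(self, n_in=784, n_out=10, beta=0.9, threshold=1.0):
        super().__init__()
        self.fc = nn.Linear(n_in, n_out)
        self.beta, self.threshold = beta, threshold

    def forward(self, x, num_steps=25):
        mem = torch.zeros(x.size(0), self.fc.out_features, device=x.device)
        spikes = []
        for _ in range(num_steps):
            mem = self.beta * mem + self.fc(x)              # leaky integration
            spk = SurrogateSpike.apply(mem - self.threshold)
            mem = mem - spk * self.threshold                # reset by subtraction
            spikes.append(spk)
        return torch.stack(spikes).sum(0)                   # spike counts as logits


model = TinySNN()
optimizer = torch.optim.Adam(model.parameters(), lr=2e-3, betas=(0.9, 0.999))
# Warm restarts periodically return the LR to its maximum; T_0 is a placeholder.
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=10)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(30):
    x = torch.rand(64, 784)               # stand-in for a real batch
    y = torch.randint(0, 10, (64,))
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()                      # advances the cosine/restart schedule
```

The 4-bit weight quantization evaluated in the paper would sit on top of this, e.g. by fake-quantizing the weights with a straight-through estimator during training; the sketch omits that step for brevity.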
Related papers
- Parallel Hyperparameter Optimization Of Spiking Neural Network [0.5371337604556311]
Spiking Neural Networks (SNNs) are based on a more biologically inspired approach than conventional artificial neural networks.
We tackle the signal loss issue of SNNs, which gives rise to what we call silent networks.
By defining an early stopping criterion, we were able to instantiate larger and more flexible search spaces.
arXiv Detail & Related papers (2024-03-01T11:11:59Z) - Low Latency of object detection for spiking neural network [3.404826786562694]
Spiking Neural Networks are well-suited for edge AI applications due to their binary spike nature.
In this paper, we focus on generating highly accurate and low-latency SNNs specifically for object detection.
arXiv Detail & Related papers (2023-09-27T10:26:19Z) - Speed Limits for Deep Learning [67.69149326107103]
Recent advances in thermodynamics allow bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network.
We provide analytical expressions for these speed limits for linear and linearizable neural networks.
Remarkably, under plausible scaling assumptions on the NTK spectra and the spectral decomposition of the labels, learning is optimal in a scaling sense.
arXiv Detail & Related papers (2023-07-27T06:59:46Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Hoyer regularizer is all you need for ultra low-latency spiking neural
networks [4.243356707599485]
Spiking Neural Networks (SNNs) have emerged as an attractive spatio-temporal computing paradigm for a wide range of low-power vision tasks.
We present a training framework (from scratch) for one-time-step SNNs that uses a novel variant of the recently proposed Hoyer regularizer.
Our approach outperforms existing spiking, binary, and adder neural networks in terms of the accuracy-FLOPs trade-off for complex image recognition tasks.
arXiv Detail & Related papers (2022-12-20T11:16:06Z) - Online Training Through Time for Spiking Neural Networks [66.7744060103562]
Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models.
Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency.
We propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning.
arXiv Detail & Related papers (2022-10-09T07:47:56Z) - Ultra-low Latency Adaptive Local Binary Spiking Neural Network with
Accuracy Loss Estimator [4.554628904670269]
We propose an ultra-low latency adaptive local binary spiking neural network (ALBSNN) with accuracy loss estimators.
Experimental results show that this method can reduce storage space by more than 20% without losing network accuracy.
arXiv Detail & Related papers (2022-07-31T09:03:57Z) - Local Critic Training for Model-Parallel Learning of Deep Neural
Networks [94.69202357137452]
We propose a novel model-parallel learning method, called local critic training.
We show that the proposed approach successfully decouples the update process of the layer groups for both convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
We also show that trained networks by the proposed method can be used for structural optimization.
arXiv Detail & Related papers (2021-02-03T09:30:45Z) - Selfish Sparse RNN Training [13.165729746380816]
We propose an approach to train sparse RNNs with a fixed parameter count in one single run, without compromising performance.
We achieve state-of-the-art sparse training results on the Penn TreeBank and Wikitext-2 datasets.
arXiv Detail & Related papers (2021-01-22T10:45:40Z) - Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z) - Rectified Linear Postsynaptic Potential Function for Backpropagation in
Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)