Navigating Local Minima in Quantized Spiking Neural Networks
- URL: http://arxiv.org/abs/2202.07221v1
- Date: Tue, 15 Feb 2022 06:42:25 GMT
- Title: Navigating Local Minima in Quantized Spiking Neural Networks
- Authors: Jason K. Eshraghian, Corey Lammie, Mostafa Rahimi Azghadi, Wei D. Lu
- Abstract summary: Spiking and Quantized Neural Networks (NNs) are becoming exceedingly important for hyper-efficient implementations of Deep Learning (DL) algorithms.
These networks face challenges when trained using error backpropagation, due to the absence of gradient signals when applying hard thresholds.
This paper presents a systematic evaluation of a cosine-annealed LR schedule coupled with weight-independent adaptive moment estimation.
- Score: 3.1351527202068445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spiking and Quantized Neural Networks (NNs) are becoming exceedingly
important for hyper-efficient implementations of Deep Learning (DL) algorithms.
However, these networks face challenges when trained using error
backpropagation, due to the absence of gradient signals when applying hard
thresholds. The broadly accepted trick to overcoming this is through the use of
biased gradient estimators: surrogate gradients which approximate thresholding
in Spiking Neural Networks (SNNs), and Straight-Through Estimators (STEs),
which completely bypass thresholding in Quantized Neural Networks (QNNs). While
noisy gradient feedback has enabled reasonable performance on simple supervised
learning tasks, it is thought that such noise increases the difficulty of
finding optima in loss landscapes, especially during the later stages of
optimization. By periodically boosting the Learning Rate (LR) during training,
we expect the network can navigate unexplored solution spaces that would
otherwise be difficult to reach due to local minima, barriers, or flat
surfaces. This paper presents a systematic evaluation of a cosine-annealed LR
schedule coupled with weight-independent adaptive moment estimation as applied
to Quantized SNNs (QSNNs). We provide a rigorous empirical evaluation of this
technique on high precision and 4-bit quantized SNNs across three datasets,
demonstrating (close to) state-of-the-art performance on the more complex
datasets. Our source code is available at this link:
https://github.com/jeshraghian/QSNNs.
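The abstract describes two ingredients: a biased gradient estimator (a surrogate gradient, or an STE in the quantized case) to backpropagate through the hard spiking threshold, and an Adam-style optimizer paired with a cosine-annealed LR schedule whose warm restarts periodically boost the LR. The PyTorch sketch below is a minimal illustration of those ideas under stated assumptions, not the authors' implementation (which is at the GitHub link above): the fast-sigmoid surrogate, the single-layer leaky integrate-and-fire model, and every hyperparameter (slope, beta, threshold, T_0, learning rate, batch shapes) are placeholders.

```python
# Minimal sketch (assumptions labeled): surrogate-gradient spiking plus Adam with
# cosine-annealed warm restarts, which periodically boost the learning rate.
import torch
import torch.nn as nn


class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass; fast-sigmoid surrogate gradient in the
    backward pass (one common choice; the paper's exact surrogate may differ)."""

    @staticmethod
    def forward(ctx, membrane_potential, slope=25.0):
        ctx.save_for_backward(membrane_potential)
        ctx.slope = slope
        return (membrane_potential > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (membrane_potential,) = ctx.saved_tensors
        # d(spike)/d(U) approximated by the derivative of a fast sigmoid
        surrogate = 1.0 / (ctx.slope * membrane_potential.abs() + 1.0) ** 2
        return grad_output * surrogate, None


class TinySNN(nn.Module):
    """Single leaky integrate-and-fire layer unrolled over T time steps (illustrative only)."""

    def __init__(self, n_in=784, n_out=10, beta=0.9, threshold=1.0):
        super().__init__()
        self.fc = nn.Linear(n_in, n_out)
        self.beta, self.threshold = beta, threshold

    def forward(self, x, num_steps=25):
        mem = torch.zeros(x.size(0), self.fc.out_features, device=x.device)
        spikes = []
        for _ in range(num_steps):
            mem = self.beta * mem + self.fc(x)              # leaky integration
            spk = SurrogateSpike.apply(mem - self.threshold)
            mem = mem - spk * self.threshold                # reset by subtraction
            spikes.append(spk)
        return torch.stack(spikes).sum(0)                   # spike counts as logits


model = TinySNN()
optimizer = torch.optim.Adam(model.parameters(), lr=2e-3, betas=(0.9, 0.999))
# Warm restarts periodically return the LR to its maximum; T_0 is a placeholder.
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=10)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(30):
    x = torch.rand(64, 784)               # stand-in for a real batch
    y = torch.randint(0, 10, (64,))
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()                      # advances the cosine/restart schedule
```

The 4-bit weight quantization evaluated in the paper would sit on top of this, e.g. by fake-quantizing the weights with a straight-through estimator during training; the sketch omits that step for brevity.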
Related papers
- Parallel Hyperparameter Optimization Of Spiking Neural Network [0.5371337604556311]
Spiking Neural Networks (SNNs) are based on a more biologically inspired approach than conventional artificial neural networks.
We tackle the signal loss issue of SNNs, which gives rise to what we call silent networks.
By defining an early stopping criterion, we were able to instantiate larger and more flexible search spaces.
arXiv Detail & Related papers (2024-03-01T11:11:59Z) - Low Latency of object detection for spiking neural network [3.404826786562694]
Spiking Neural Networks are well-suited for edge AI applications due to their binary spike nature.
In this paper, we focus on generating highly accurate and low-latency SNNs specifically for object detection.
arXiv Detail & Related papers (2023-09-27T10:26:19Z) - Speed Limits for Deep Learning [67.69149326107103]
Recent advances in thermodynamics allow bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network.
We provide analytical expressions for these speed limits for linear and linearizable neural networks.
Remarkably, under plausible scaling assumptions on the NTK spectra and the spectral decomposition of the labels, learning is optimal in a scaling sense.
arXiv Detail & Related papers (2023-07-27T06:59:46Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Hoyer regularizer is all you need for ultra low-latency spiking neural
networks [4.243356707599485]
Spiking Neural Networks (SNNs) have emerged as an attractive spatio-temporal computing paradigm for a wide range of low-power vision tasks.
We present a training framework (from scratch) for one-time-step SNNs that uses a novel variant of the recently proposed Hoyer regularizer.
Our approach outperforms existing spiking, binary, and adder neural networks in terms of the accuracy-FLOPs trade-off for complex image recognition tasks.
arXiv Detail & Related papers (2022-12-20T11:16:06Z) - Online Training Through Time for Spiking Neural Networks [66.7744060103562]
Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models.
Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency.
We propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning.
arXiv Detail & Related papers (2022-10-09T07:47:56Z) - Ultra-low Latency Adaptive Local Binary Spiking Neural Network with
Accuracy Loss Estimator [4.554628904670269]
We propose an ultra-low latency adaptive local binary spiking neural network (ALBSNN) with accuracy loss estimators.
Experimental results show that this method can reduce storage space by more than 20% without losing network accuracy.
arXiv Detail & Related papers (2022-07-31T09:03:57Z) - Local Critic Training for Model-Parallel Learning of Deep Neural
Networks [94.69202357137452]
We propose a novel model-parallel learning method, called local critic training.
We show that the proposed approach successfully decouples the update process of the layer groups for both convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
We also show that trained networks by the proposed method can be used for structural optimization.
arXiv Detail & Related papers (2021-02-03T09:30:45Z) - Selfish Sparse RNN Training [13.165729746380816]
We propose an approach to train sparse RNNs with a fixed parameter count in one single run, without compromising performance.
We achieve state-of-the-art sparse training results on the Penn TreeBank and Wikitext-2 datasets.
arXiv Detail & Related papers (2021-01-22T10:45:40Z) - Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z) - Rectified Linear Postsynaptic Potential Function for Backpropagation in
Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)