Hessian Aware Quantization of Spiking Neural Networks
- URL: http://arxiv.org/abs/2104.14117v2
- Date: Mon, 23 Aug 2021 18:08:01 GMT
- Title: Hessian Aware Quantization of Spiking Neural Networks
- Authors: Hin Wai Lui and Emre Neftci
- Abstract summary: Neuromorphic architectures allow massively parallel computation with variable and local bit-precisions.
Current gradient-based methods of SNN training use a complex neuron model with multiple state variables.
We present a simplified neuron model that reduces the number of state variables by 4-fold while still being compatible with gradient-based training.
- Score: 1.90365714903665
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: To achieve the low latency, high throughput, and energy efficiency benefits
of Spiking Neural Networks (SNNs), reducing the memory and compute requirements
when running on neuromorphic hardware is an important step. Neuromorphic
architectures allow massively parallel computation with variable and local
bit-precisions. However, how different bit-precisions should be allocated to
different layers or connections of the network is not trivial. In this work, we
demonstrate how a layer-wise Hessian trace analysis can measure the sensitivity
of the loss to any perturbation of the layer's weights, and this can be used to
guide the allocation of a layer-specific bit-precision when quantizing an SNN.
In addition, current gradient-based methods of SNN training use a complex
neuron model with multiple state variables, which is not ideal for compute and
memory efficiency. To address this challenge, we present a simplified neuron
model that reduces the number of state variables by 4-fold while still being
compatible with gradient-based training. We find that the impact on model
accuracy of using a layer-wise bit-precision correlates well with that
layer's Hessian trace. The accuracy of the optimal quantized network only
dropped by 0.2%, yet the network size was reduced by 58%. This reduces memory
usage and allows fixed-point arithmetic with simpler digital circuits to be
used, increasing the overall throughput and energy efficiency.
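The layer-wise Hessian trace that guides bit allocation does not require forming the full Hessian; it can be estimated from Hessian-vector products alone, for example with Hutchinson's randomized trace estimator. The sketch below is illustrative only (the toy matrix and function names are not from the paper; in practice the Hessian-vector product would come from automatic differentiation over a layer's weights):

```python
import numpy as np

def hutchinson_trace(hvp, dim, num_samples=1000, rng=None):
    """Estimate tr(H) using only Hessian-vector products.

    E[v^T H v] = tr(H) when v has i.i.d. Rademacher (+/-1) entries,
    so averaging v^T H v over random probes converges to the trace.
    """
    rng = np.random.default_rng(rng)
    est = 0.0
    for _ in range(num_samples):
        v = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        est += v @ hvp(v)                      # v^T H v without forming H
    return est / num_samples

# Toy symmetric "layer Hessian"; a real SNN layer would supply hvp via autodiff.
H = np.array([[2.0, 0.5],
              [0.5, 1.0]])
trace_est = hutchinson_trace(lambda v: H @ v, dim=2, num_samples=2000, rng=0)
```

Layers with a larger estimated trace are more sensitive to weight perturbation, so under this scheme they would receive more bits during quantization.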
Related papers
- Deep Multi-Threshold Spiking-UNet for Image Processing [51.88730892920031]
This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture.
To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy.
Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves comparable performance to its non-spiking counterpart.
arXiv Detail & Related papers (2023-07-20T16:00:19Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- Low Precision Quantization-aware Training in Spiking Neural Networks with Differentiable Quantization Function [0.5046831208137847]
This work aims to bridge the gap between recent progress in quantized neural networks and spiking neural networks.
It presents an extensive study on the performance of the quantization function, represented as a linear combination of sigmoid functions.
The presented quantization function demonstrates the state-of-the-art performance on four popular benchmarks.
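As a rough illustration of such a quantization function (this is a generic soft-staircase form, not necessarily the exact parameterization used in that paper), a sum of shifted sigmoids approximates rounding to discrete levels while remaining differentiable:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_quantize(x, levels=4, temperature=0.1):
    """Differentiable staircase: a linear combination of shifted sigmoids
    approximates rounding x to the integer levels 0..levels-1.
    Lower temperature -> sharper steps, closer to hard quantization."""
    thresholds = np.arange(levels - 1) + 0.5  # step centers at 0.5, 1.5, 2.5
    return sum(sigmoid((x - t) / temperature) for t in thresholds)

q = soft_quantize(np.linspace(0.0, 3.0, 7))
```

Because each step is a sigmoid, gradients flow through the quantizer during training, and annealing the temperature toward zero recovers a hard staircase.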
arXiv Detail & Related papers (2023-05-30T09:42:05Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Ultra-low Latency Adaptive Local Binary Spiking Neural Network with Accuracy Loss Estimator [4.554628904670269]
We propose an ultra-low latency adaptive local binary spiking neural network (ALBSNN) with accuracy loss estimators.
Experimental results show that this method can reduce storage space by more than 20% without losing network accuracy.
arXiv Detail & Related papers (2022-07-31T09:03:57Z)
- Converting Artificial Neural Networks to Spiking Neural Networks via Parameter Calibration [21.117214351356765]
Spiking Neural Networks (SNNs) are recognized as next-generation neural networks.
In this work, we argue that simply copying and pasting the weights of an ANN into an SNN inevitably results in activation mismatch.
We propose a set of layer-wise parameter calibration algorithms, which adjusts the parameters to minimize the activation mismatch.
arXiv Detail & Related papers (2022-05-06T18:22:09Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- $S^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks [41.54155265996312]
Shift neural networks reduce complexity by removing expensive multiplication operations and quantizing continuous weights into low-bit discrete values.
Our proposed training method pushes the boundaries of shift neural networks and shows that 3-bit shift networks outperform their full-precision counterparts in top-1 accuracy on ImageNet.
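For intuition, the core idea of a shift network is to constrain each weight to a signed power of two, so a multiply becomes a bit shift. The rounding rule and range below are illustrative assumptions only; $S^3$ itself learns sign, sparsity, and shift as separate reparametrized components rather than rounding after the fact:

```python
import numpy as np

def to_shift_weight(w, max_shift=3):
    """Map a real weight to sign * 2^(-s) with s in 0..max_shift.

    Multiplication by the result is then a sign flip plus a right shift
    by s bits in fixed-point hardware."""
    sign = np.sign(w)
    mag = np.clip(np.abs(w), 2.0 ** -max_shift, 1.0)  # keep s in range
    s = np.round(-np.log2(mag))                       # nearest power of two
    return sign * 2.0 ** (-s)
```

Rounding in log2 space picks the nearest representable shift value, which is why 0.3 maps to 0.25 (a 2-bit shift) rather than 0.5.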
arXiv Detail & Related papers (2021-07-07T19:33:02Z)
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training [68.63354877166756]
ActNN is a memory-efficient training framework that stores randomly quantized activations for backpropagation.
ActNN reduces the memory footprint of the activation by 12x, and it enables training with a 6.6x to 14x larger batch size.
arXiv Detail & Related papers (2021-04-29T05:50:54Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance still falls short of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z) - Mixed-Precision Quantized Neural Network with Progressively Decreasing
Bitwidth For Image Classification and Object Detection [21.48875255723581]
A mixed-precision quantized neural network with progressively decreasing bitwidth is proposed to improve the trade-off between accuracy and compression.
Experiments on typical network architectures and benchmark datasets demonstrate that the proposed method could achieve better or comparable results.
arXiv Detail & Related papers (2019-12-29T14:11:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences.