Related papers: Low Precision Quantization-aware Training in Spiking Neural Networks with Differentiable Quantization Function

Low Precision Quantization-aware Training in Spiking Neural Networks with Differentiable Quantization Function

URL: http://arxiv.org/abs/2305.19295v1
Date: Tue, 30 May 2023 09:42:05 GMT
Title: Low Precision Quantization-aware Training in Spiking Neural Networks with Differentiable Quantization Function
Authors: Ayan Shymyrbay, Mohammed E. Fouda, and Ahmed Eltawil
Abstract summary: This work aims to bridge the gap between recent progress in quantized neural networks and spiking neural networks. It presents an extensive study on the performance of the quantization function, represented as a linear combination of sigmoid functions. The presented quantization function demonstrates the state-of-the-art performance on four popular benchmarks.
Score: 0.5046831208137847
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep neural networks have been proven to be highly effective tools in various domains, yet their computational and memory costs restrict them from being widely deployed on portable devices. The recent rapid increase of edge computing devices has led to an active search for techniques to address the above-mentioned limitations of machine learning frameworks. The quantization of artificial neural networks (ANNs), which converts the full-precision synaptic weights into low-bit versions, emerged as one of the solutions. At the same time, spiking neural networks (SNNs) have become an attractive alternative to conventional ANNs due to their temporal information processing capability, energy efficiency, and high biological plausibility. Despite being driven by the same motivation, the simultaneous utilization of both concepts has yet to be thoroughly studied. Therefore, this work aims to bridge the gap between recent progress in quantized neural networks and SNNs. It presents an extensive study on the performance of the quantization function, represented as a linear combination of sigmoid functions, exploited in low-bit weight quantization in SNNs. The presented quantization function demonstrates the state-of-the-art performance on four popular benchmarks, CIFAR10-DVS, DVS128 Gesture, N-Caltech101, and N-MNIST, for binary networks (64.05\%, 95.45\%, 68.71\%, and 99.43\% respectively) with small accuracy drops and up to 31$\times$ memory savings, which outperforms existing methods.

Related papers

Adaptive Neural Quantum States: A Recurrent Neural Network Perspective [0.7234862895932991]
We show an Adaptive scheme to optimize Neural-network quantum states (NQS)<n>NQS are powerful neural-network ans"atzes that have emerged as promising tools for studying quantum many-body physics.
arXiv Detail & Related papers (2025-07-24T18:00:03Z)
Efficient Deployment of Spiking Neural Networks on SpiNNaker2 for DVS Gesture Recognition Using Neuromorphic Intermediate Representation [2.649410674489787]
Spiking Neural Networks (SNNs) are highly energy-efficient during inference. Their ability to process event-driven inputs, such as data from dynamic vision sensors (DVS), further enhances their applicability to edge computing tasks. We present the first benchmark for the DVS gesture recognition task using SNNs optimized for the many-core neuromorphic chip SpiNNaker2.
arXiv Detail & Related papers (2025-04-09T10:09:29Z)
The Hardware Impact of Quantization and Pruning for Weights in Spiking Neural Networks [0.368986335765876]
quantization and pruning of parameters can both compress the model size, reduce memory footprints, and facilitate low-latency execution. We study various combinations of pruning and quantization in isolation, cumulatively, and simultaneously to a state-of-the-art SNN targeting gesture recognition. We show that this state-of-the-art model is amenable to aggressive parameter quantization, not suffering from any loss in accuracy down to ternary weights.
arXiv Detail & Related papers (2023-02-08T16:25:20Z)
Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency. We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
Low-bit Shift Network for End-to-End Spoken Language Understanding [7.851607739211987]
We propose the use of power-of-two quantization, which quantizes continuous parameters into low-bit power-of-two values. This reduces computational complexity by removing expensive multiplication operations and with the use of low-bit weights.
arXiv Detail & Related papers (2022-07-15T14:34:22Z)
Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks [1.131071436917293]
Quantizing parameters and operations to lower bit-precision offers substantial memory and energy savings for neural network inference. This paper proposes a new quantization approach for mixed precision convolutional neural networks (CNNs) targeting edge-computing.
arXiv Detail & Related papers (2022-06-15T18:11:37Z)
A Resource-efficient Spiking Neural Network Accelerator Supporting Emerging Neural Encoding [6.047137174639418]
Spiking neural networks (SNNs) recently gained momentum due to their low-power multiplication-free computing. SNNs require very long spike trains (up to 1000) to reach an accuracy similar to their artificial neural network (ANN) counterparts for large models. We present a novel hardware architecture that can efficiently support SNN with emerging neural encoding.
arXiv Detail & Related papers (2022-06-06T10:56:25Z)
Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers [67.688697838109]
This paper presents a novel method to train quantized RNNLMs from scratch using alternating direction methods of multipliers (ADMM) Experiments on two tasks suggest the proposed ADMM quantization achieved a model size compression factor of up to 31 times over the full precision baseline RNNLMs.
arXiv Detail & Related papers (2021-11-29T09:30:06Z)
Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update [49.948082497688404]
Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts. One promising approach to reduce the energy costs is representing DNNs with low-precision numbers. We jointly design a lowprecision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam.
arXiv Detail & Related papers (2021-06-26T00:32:17Z)
Stochastic Markov Gradient Descent and Training Low-Bit Neural Networks [77.34726150561087]
We introduce Gradient Markov Descent (SMGD), a discrete optimization method applicable to training quantized neural networks. We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
arXiv Detail & Related papers (2020-08-25T15:48:15Z)
Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency. We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
A Spike in Performance: Training Hybrid-Spiking Neural Networks with Quantized Activation Functions [6.574517227976925]
Spiking Neural Network (SNN) is a promising approach to energy-efficient computing. We show how to maintain state-of-the-art accuracy when converting a non-spiking network into an SNN.
arXiv Detail & Related papers (2020-02-10T05:24:27Z)
Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters. Most of existing methods aim to enhance performance of QNNs especially binary neural networks by exploiting more effective training techniques. We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.