Optimal Spiking Brain Compression: Improving One-Shot Post-Training Pruning and Quantization for Spiking Neural Networks
- URL: http://arxiv.org/abs/2506.03996v1
- Date: Wed, 04 Jun 2025 14:23:05 GMT
- Title: Optimal Spiking Brain Compression: Improving One-Shot Post-Training Pruning and Quantization for Spiking Neural Networks
- Authors: Lianfeng Shi, Ao Li, Benjamin Ward-Cherrier
- Abstract summary: Spiking Neural Networks (SNNs) have emerged as a new generation of energy-efficient neural networks suitable for implementation on neuromorphic hardware. Weight pruning and quantization have recently been explored to improve SNNs' efficiency. We propose a new one-shot post-training pruning/quantization framework, Optimal Spiking Brain Compression (OSBC), for SNNs.
- Score: 2.3222699639842244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spiking Neural Networks (SNNs) have emerged as a new generation of energy-efficient neural networks suitable for implementation on neuromorphic hardware. As neuromorphic hardware has limited memory and computing resources, weight pruning and quantization have recently been explored to improve SNNs' efficiency. State-of-the-art SNN pruning/quantization methods employ multiple compression and training iterations, increasing the cost for pre-trained or very large SNNs. In this paper, we propose a new one-shot post-training pruning/quantization framework, Optimal Spiking Brain Compression (OSBC), that adapts the Optimal Brain Compression (OBC) method of [Frantar, Singh, and Alistarh, 2023] for SNNs. Rather than minimizing the loss on neuron input current as OBC does, OSBC achieves more efficient and accurate SNN compression in one pass by minimizing the loss on spiking neuron membrane potential with a small sample dataset. Our experiments on neuromorphic datasets (N-MNIST, CIFAR10-DVS, DVS128-Gesture) demonstrate that OSBC can achieve 97% sparsity through pruning with 1.41%, 10.20%, and 1.74% accuracy loss, or 4-bit symmetric quantization with 0.17%, 1.54%, and 7.71% accuracy loss, respectively. Code will be available on GitHub.
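For orientation, the compression objective described in the abstract can be read as a layer-wise reconstruction problem. The notation below (W for the original weights, \widehat{W} for the pruned/quantized weights, X for the small sample dataset, V_t for the membrane potential at time step t, T for the number of time steps) is ours, inferred from the abstract; the paper's exact formulation may differ:
```latex
% OBC-style layer-wise objective: reconstruct the neuron input current W X
% on a small calibration set X, subject to \widehat{W} being sparse or quantized
\min_{\widehat{W}} \; \bigl\| W X - \widehat{W} X \bigr\|_F^2

% OSBC-style objective as described in the abstract: reconstruct the spiking
% neuron membrane potential V_t accumulated over T time steps instead
\min_{\widehat{W}} \; \sum_{t=1}^{T} \bigl\| V_t\!\left(W; X\right) - V_t\!\left(\widehat{W}; X\right) \bigr\|_2^2
```
Under this reading, OSBC keeps OBC's one-shot, layer-wise machinery but measures reconstruction error where the spiking neuron actually integrates information, which is consistent with the abstract's claim of more accurate compression in a single pass.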
Related papers
- Reducing Storage of Pretrained Neural Networks by Rate-Constrained Quantization and Entropy Coding [56.066799081747845]
The ever-growing size of neural networks poses serious challenges for resource-constrained devices. We propose a novel post-training compression framework that combines rate-aware quantization with entropy coding. Our method allows for very fast decoding and is compatible with arbitrary quantization grids.
arXiv Detail & Related papers (2025-05-24T15:52:49Z)
- Decoding finger velocity from cortical spike trains with recurrent spiking neural networks [6.404492073110551]
Invasive brain-machine interfaces (BMIs) can significantly improve the quality of life of motor-impaired patients.
BMIs must meet strict latency and energy constraints while providing reliable decoding performance.
We trained RSNNs to decode finger velocity from cortical spike trains of two macaque monkeys.
arXiv Detail & Related papers (2024-09-03T10:15:33Z)
- A Hybrid Neural Coding Approach for Pattern Recognition with Spiking Neural Networks [53.31941519245432]
Brain-inspired spiking neural networks (SNNs) have demonstrated promising capabilities in solving pattern recognition tasks.
These SNNs are grounded on homogeneous neurons that utilize a uniform neural coding for information representation.
In this study, we argue that SNN architectures should be holistically designed to incorporate heterogeneous coding schemes.
arXiv Detail & Related papers (2023-05-26T02:52:12Z)
- Adaptive Sparse Structure Development with Pruning and Regeneration for Spiking Neural Networks [6.760855795263126]
Spiking Neural Networks (SNNs) have the natural advantage of drawing on the sparse structural plasticity of brain development to alleviate the energy problems of deep neural networks.
This paper proposes a novel method for the adaptive structural development of SNNs, introducing dendritic spine plasticity-based synaptic constraints, neuronal pruning, and synaptic regeneration.
arXiv Detail & Related papers (2022-11-22T12:23:30Z)
- Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
- Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on FPGAs validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z)
- Spiking Neural Networks with Improved Inherent Recurrence Dynamics for Sequential Learning [6.417011237981518]
Spiking neural networks (SNNs) with leaky integrate-and-fire (LIF) neurons can be operated in an event-driven manner (a minimal LIF update sketch is given after this list).
We show that SNNs can be trained for sequential tasks and propose modifications to a network of LIF neurons.
We then develop a training scheme to train the proposed SNNs with improved inherent recurrence dynamics.
arXiv Detail & Related papers (2021-09-04T17:13:28Z)
- Pruning of Deep Spiking Neural Networks through Gradient Rewiring [41.64961999525415]
Spiking Neural Networks (SNNs) have attracted great attention due to their biological plausibility and high energy efficiency on neuromorphic chips.
Most existing methods directly apply pruning approaches designed for artificial neural networks (ANNs) to SNNs, which ignores the differences between ANNs and SNNs.
We propose gradient rewiring (Grad R), a joint learning algorithm of connectivity and weights for SNNs that enables us to seamlessly optimize network structure without retraining.
arXiv Detail & Related papers (2021-05-11T10:05:53Z)
- Optimal Conversion of Conventional Artificial Neural Networks to Spiking Neural Networks [0.0]
Spiking neural networks (SNNs) are biology-inspired artificial neural networks (ANNs).
We propose a novel strategic pipeline that transfers the weights to the target SNN by combining threshold balance and soft-reset mechanisms.
Our method is promising for implementation on embedded platforms with limited energy and memory that offer better support for SNNs.
arXiv Detail & Related papers (2021-02-28T12:04:22Z)
- You Only Spike Once: Improving Energy-Efficient Neuromorphic Inference to ANN-Level Accuracy [51.861168222799186]
Spiking Neural Networks (SNNs) are a type of neuromorphic, or brain-inspired network.
SNNs are sparse, accessing very few weights, and typically only use addition operations instead of the more power-intensive multiply-and-accumulate operations.
In this work, we aim to overcome the limitations of TTFS-encoded neuromorphic systems.
arXiv Detail & Related papers (2020-06-03T15:55:53Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
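Several entries above mention leaky integrate-and-fire (LIF) neurons and the observation that binary spike inputs reduce weighted sums to plain additions. The sketch below shows one generic discrete-time LIF layer update; it is illustrative only (the function name, constants, and the particular leak and reset rules are our assumptions and are not taken from any of the papers listed):
```python
import numpy as np

def lif_layer_step(v, spikes_in, W, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """One discrete-time step of a generic leaky integrate-and-fire (LIF) layer.

    Because the inputs are binary spikes, the input current W @ spikes_in
    reduces to summing the weight columns of the neurons that fired, i.e.
    additions rather than full multiply-and-accumulate operations.
    """
    current = W @ spikes_in                  # same as W[:, spikes_in == 1].sum(axis=1) for binary spikes
    v = v + (current - (v - v_reset)) / tau  # leaky integration of the membrane potential
    spikes_out = (v >= v_threshold).astype(float)   # fire where the threshold is crossed
    v = np.where(spikes_out == 1.0, v_reset, v)     # hard reset of the neurons that fired
    return v, spikes_out

# Toy usage: 4 output neurons driven by 6 binary inputs.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))
v = np.zeros(4)
spikes_in = (rng.random(6) < 0.3).astype(float)
v, spikes_out = lif_layer_step(v, spikes_in, W)
```
The membrane potential v updated here is the quantity that, per the abstract above, OSBC uses as its reconstruction target when pruning or quantizing the weights W.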