Towards Lossless ANN-SNN Conversion under Ultra-Low Latency with Dual-Phase Optimization
- URL: http://arxiv.org/abs/2205.07473v3
- Date: Tue, 19 Mar 2024 17:25:14 GMT
- Title: Towards Lossless ANN-SNN Conversion under Ultra-Low Latency with Dual-Phase Optimization
- Authors: Ziming Wang, Shuang Lian, Yuhao Zhang, Xiaoxin Cui, Rui Yan, Huajin Tang,
- Abstract summary: Spiking neural networks (SNNs) operating with asynchronous discrete events show higher energy efficiency with sparse computation.
A popular approach for implementing deep SNNs is ANN-SNN conversion combining both efficient training of ANNs and efficient inference of SNNs.
In this paper, we first identify that such performance degradation stems from the misrepresentation of the negative or overflow residual membrane potential in SNNs.
Inspired by this, we decompose the conversion error into three parts: quantization error, clipping error, and residual membrane potential representation error.
- Score: 30.098268054714048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spiking neural networks (SNNs) operating with asynchronous discrete events show higher energy efficiency with sparse computation. A popular approach for implementing deep SNNs is ANN-SNN conversion combining both efficient training of ANNs and efficient inference of SNNs. However, the accuracy loss is usually non-negligible, especially under a few time steps, which restricts the applications of SNN on latency-sensitive edge devices greatly. In this paper, we first identify that such performance degradation stems from the misrepresentation of the negative or overflow residual membrane potential in SNNs. Inspired by this, we decompose the conversion error into three parts: quantization error, clipping error, and residual membrane potential representation error. With such insights, we propose a two-stage conversion algorithm to minimize those errors respectively. Besides, We show each stage achieves significant performance gains in a complementary manner. By evaluating on challenging datasets including CIFAR-10, CIFAR- 100 and ImageNet, the proposed method demonstrates the state-of-the-art performance in terms of accuracy, latency and energy preservation. Furthermore, our method is evaluated using a more challenging object detection task, revealing notable gains in regression performance under ultra-low latency when compared to existing spike-based detection algorithms. Codes are available at https://github.com/Windere/snn-cvt-dual-phase.
Related papers
- Low Latency of object detection for spikng neural network [3.404826786562694]
Spiking Neural Networks are well-suited for edge AI applications due to their binary spike nature.
In this paper, we focus on generating highly accurate and low-latency SNNs specifically for object detection.
arXiv Detail & Related papers (2023-09-27T10:26:19Z) - Deep Multi-Threshold Spiking-UNet for Image Processing [51.88730892920031]
This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture.
To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy.
Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves comparable performance to its non-spiking counterpart.
arXiv Detail & Related papers (2023-07-20T16:00:19Z) - Fast-SNN: Fast Spiking Neural Network by Converting Quantized ANN [38.18008827711246]
Spiking neural networks (SNNs) have shown advantages in computation and energy efficiency.
It remains a challenge to train deep SNNs due to the discrete spike function.
This paper proposes Fast-SNN that achieves high performance with low latency.
arXiv Detail & Related papers (2023-05-31T14:04:41Z) - Bridging the Gap between ANNs and SNNs by Calibrating Offset Spikes [19.85338979292052]
Spiking Neural Networks (SNNs) have attracted great attention due to their distinctive characteristics of low power consumption and temporal information processing.
ANN-SNN conversion, as the most commonly used training method for applying SNNs, can ensure that converted SNNs achieve comparable performance to ANNs on large-scale datasets.
In this paper, instead of evaluating different conversion errors and then eliminating these errors, we define an offset spike to measure the degree of deviation between actual and desired SNN firing rates.
arXiv Detail & Related papers (2023-02-21T14:10:56Z) - Reducing ANN-SNN Conversion Error through Residual Membrane Potential [19.85338979292052]
Spiking Neural Networks (SNNs) have received extensive academic attention due to the unique properties of low power consumption and high-speed computing on neuromorphic chips.
In this paper, we make a detailed analysis of unevenness error and divide it into four categories.
We propose an optimization strategy based on residual membrane potential to reduce unevenness error.
arXiv Detail & Related papers (2023-02-04T04:44:31Z) - Training High-Performance Low-Latency Spiking Neural Networks by
Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z) - AEGNN: Asynchronous Event-based Graph Neural Networks [54.528926463775946]
Event-based Graph Neural Networks generalize standard GNNs to process events as "evolving"-temporal graphs.
AEGNNs are easily trained on synchronous inputs and can be converted to efficient, "asynchronous" networks at test time.
arXiv Detail & Related papers (2022-03-31T16:21:12Z) - Can Deep Neural Networks be Converted to Ultra Low-Latency Spiking
Neural Networks? [3.2108350580418166]
Spiking neural networks (SNNs) operate via binary spikes distributed over time.
SOTA training strategies for SNNs involve conversion from a non-spiking deep neural network (DNN)
We propose a new training algorithm that accurately captures these distributions, minimizing the error between the DNN and converted SNN.
arXiv Detail & Related papers (2021-12-22T18:47:45Z) - FATNN: Fast and Accurate Ternary Neural Networks [89.07796377047619]
Ternary Neural Networks (TNNs) have received much attention due to being potentially orders of magnitude faster in inference, as well as more power efficient, than full-precision counterparts.
In this work, we show that, under some mild constraints, computational complexity of the ternary inner product can be reduced by a factor of 2.
We elaborately design an implementation-dependent ternary quantization algorithm to mitigate the performance gap.
arXiv Detail & Related papers (2020-08-12T04:26:18Z) - AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z) - Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most of existing methods aim to enhance performance of QNNs especially binary neural networks by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.