A Closer Look at Knowledge Distillation in Spiking Neural Network Training
- URL: http://arxiv.org/abs/2511.06902v2
- Date: Fri, 14 Nov 2025 07:58:32 GMT
- Title: A Closer Look at Knowledge Distillation in Spiking Neural Network Training
- Authors: Xu Liu, Na Xia, Jinxing Zhou, Jingyuan Xu, Dan Guo
- Abstract summary: Spiking Neural Networks (SNNs) have become popular due to their excellent energy efficiency, yet they remain challenging to train effectively. Recent works improve this by introducing knowledge distillation (KD) techniques, with pre-trained artificial neural networks (ANNs) used as teachers and the target SNNs as students. This is commonly accomplished through a straightforward element-wise alignment of intermediate features and prediction logits from ANNs and SNNs, often neglecting the intrinsic differences between their architectures.
- Score: 28.455940908397945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spiking Neural Networks (SNNs) have become popular due to their excellent energy efficiency, yet they face challenges in effective model training. Recent works improve this by introducing knowledge distillation (KD) techniques, with pre-trained artificial neural networks (ANNs) used as teachers and the target SNNs as students. This is commonly accomplished through a straightforward element-wise alignment of intermediate features and prediction logits from ANNs and SNNs, often neglecting the intrinsic differences between their architectures. Specifically, an ANN's outputs exhibit a continuous distribution, whereas an SNN's outputs are characterized by sparsity and discreteness. To mitigate this issue, we introduce two innovative KD strategies. First, we propose Saliency-scaled Activation Map Distillation (SAMD), which aligns the spike activation map of the student SNN with the class-aware activation map of the teacher ANN. Rather than performing KD directly on the raw features of the ANN and SNN, our SAMD directs the student to learn from saliency activation maps that exhibit greater semantic and distribution consistency. Additionally, we propose Noise-smoothed Logits Distillation (NLD), which utilizes Gaussian noise to smooth the sparse logits of the student SNN, facilitating alignment with the continuous logits of the teacher ANN. Extensive experiments on multiple datasets demonstrate the effectiveness of our methods. Code is available at https://github.com/SinoLeu/CKDSNN.git.
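The two losses described in the abstract lend themselves to a compact illustration. The following is a minimal pure-Python sketch, not the authors' implementation: the CAM-style teacher map, the min-max normalization, the temperature and noise values, and the function names (`samd_loss`, `nld_loss`) are all assumptions for illustration; the actual formulations are in the linked repository.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete probability distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def normalize(x, eps=1e-12):
    """Min-max normalize a map to [0, 1] so teacher and student maps
    share a common scale before alignment."""
    lo, hi = min(x), max(x)
    return [(v - lo) / (hi - lo + eps) for v in x]

def class_activation_map(feature_maps, class_weights):
    """CAM-style class-aware teacher map: weight each feature channel by
    the classifier weight of the target class and sum over channels.
    feature_maps: C channels, each a flattened H*W list; class_weights: C."""
    cam = [0.0] * len(feature_maps[0])
    for w, fmap in zip(class_weights, feature_maps):
        for i, v in enumerate(fmap):
            cam[i] += w * v
    return cam

def samd_loss(student_spike_map, teacher_cam):
    """Saliency-scaled activation map distillation (sketch): MSE between
    the student's time-summed spike activation map and the teacher's
    normalized class-aware activation map."""
    s = normalize(student_spike_map)
    t = normalize(teacher_cam)
    return sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)

def nld_loss(student_logits, teacher_logits, sigma=0.1, temperature=4.0, seed=0):
    """Noise-smoothed logits distillation (sketch): Gaussian noise smooths
    the sparse, discrete SNN logits before KL-aligning them with the
    teacher ANN's continuous logits."""
    rng = random.Random(seed)
    smoothed = [z + rng.gauss(0.0, sigma) for z in student_logits]
    return kl_divergence(softmax(teacher_logits, temperature),
                         softmax(smoothed, temperature))
```

For a 4-class toy example, `nld_loss([0.0, 3.0, 0.0, 1.0], [0.2, 2.7, -0.5, 1.1])` yields a small positive KL term that shrinks as the smoothed student distribution approaches the teacher's.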
Related papers
- A Self-Ensemble Inspired Approach for Effective Training of Binary-Weight Spiking Neural Networks [66.80058515743468]
Training Spiking Neural Networks (SNNs) and Binary Neural Networks (BNNs) is challenging because of the non-differentiable spike generation function. We present a novel perspective on the dynamics of SNNs and their close connection to BNNs through an analysis of the backpropagation process. Specifically, we leverage a structure of multiple shortcuts and a knowledge distillation-based training technique to improve the training of (binary-weight) SNNs.
arXiv Detail & Related papers (2025-08-18T04:11:06Z) - Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control [59.65431931190187]
Spiking Neural Networks (SNNs) offer low-latency and energy-efficient decision making on neuromorphic hardware. Most algorithms for continuous control, however, are designed for Artificial Neural Networks (ANNs). We show that this mismatch destabilizes SNN training and degrades performance. We propose a novel proxy target framework to bridge the gap between discrete SNNs and continuous-control algorithms.
arXiv Detail & Related papers (2025-05-30T03:08:03Z) - Bridge the Gap between SNN and ANN for Image Restoration [7.487270862599671]
Currently, neural networks based on the SNN (Spiking Neural Network) framework are beginning to make their mark in the field of image restoration. We propose a novel distillation technique, called asymmetric framework (ANN-SNN) distillation, in which the teacher is an ANN and the student is an SNN. Specifically, we leverage the intermediate features (feature maps) learned by the ANN as hints to guide the training process of the SNN.
arXiv Detail & Related papers (2025-04-02T14:12:06Z) - Efficient ANN-Guided Distillation: Aligning Rate-based Features of Spiking Neural Networks through Hybrid Block-wise Replacement [3.4776100606469096]
Spiking Neural Networks (SNNs) have garnered considerable attention as a potential alternative to Artificial Neural Networks (ANNs). Recent studies have highlighted SNNs' potential on large-scale datasets.
arXiv Detail & Related papers (2025-03-20T09:04:38Z) - Self-Distillation Learning Based on Temporal-Spatial Consistency for Spiking Neural Networks [3.7748662901422807]
Spiking neural networks (SNNs) have attracted considerable attention for their event-driven, low-power characteristics and high biological interpretability.
Recent research has improved the performance of the SNN model with a pre-trained teacher model.
In this paper, we explore cost-effective self-distillation learning of SNNs to circumvent these concerns.
arXiv Detail & Related papers (2024-06-12T04:30:40Z) - Joint A-SNN: Joint Training of Artificial and Spiking Neural Networks via Self-Distillation and Weight Factorization [12.1610509770913]
Spiking Neural Networks (SNNs) mimic the spiking nature of brain neurons.
We propose a joint training framework of ANN and SNN, in which the ANN can guide the SNN's optimization.
Our method consistently outperforms many other state-of-the-art training methods.
arXiv Detail & Related papers (2023-05-03T13:12:17Z) - LaSNN: Layer-wise ANN-to-SNN Distillation for Effective and Efficient Training in Deep Spiking Neural Networks [7.0691139514420005]
Spiking Neural Networks (SNNs) are biologically realistic and promising for low-power applications because of their event-driven mechanism.
A conversion scheme is proposed to obtain competitive accuracy by mapping trained ANNs' parameters to SNNs with the same structures.
A novel SNN training framework is proposed, namely layer-wise ANN-to-SNN knowledge distillation (LaSNN)
arXiv Detail & Related papers (2023-04-17T03:49:35Z) - A Hybrid ANN-SNN Architecture for Low-Power and Low-Latency Visual Perception [27.144985031646932]
Spiking Neural Networks (SNNs) are a class of bio-inspired neural networks that promise to bring low-power and low-latency inference to edge devices.
We show for the task of event-based 2D and 3D human pose estimation that our method consumes 88% less power with only a 4% decrease in performance compared to its fully ANN counterparts.
arXiv Detail & Related papers (2023-03-24T17:38:45Z) - SNN2ANN: A Fast and Memory-Efficient Training Framework for Spiking Neural Networks [117.56823277328803]
Spiking neural networks are efficient computation models for low-power environments.
We propose a SNN-to-ANN (SNN2ANN) framework to train the SNN in a fast and memory-efficient way.
Experiment results show that our SNN2ANN-based models perform well on the benchmark datasets.
arXiv Detail & Related papers (2022-06-19T16:52:56Z) - Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z) - Kernel Based Progressive Distillation for Adder Neural Networks [71.731127378807]
Adder Neural Networks (ANNs) which only contain additions bring us a new way of developing deep neural networks with low energy consumption.
There is an accuracy drop when replacing all convolution filters by adder filters.
We present a novel method for further improving the performance of ANNs without increasing the trainable parameters.
arXiv Detail & Related papers (2020-09-28T03:29:19Z) - Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.