Stabilizing Direct Training of Spiking Neural Networks: Membrane Potential Initialization and Threshold-robust Surrogate Gradient
- URL: http://arxiv.org/abs/2511.08708v1
- Date: Thu, 13 Nov 2025 01:03:08 GMT
- Title: Stabilizing Direct Training of Spiking Neural Networks: Membrane Potential Initialization and Threshold-robust Surrogate Gradient
- Authors: Hyunho Kook, Byeongho Yu, Jeong Min Oh, Eunhyeok Park
- Abstract summary: Spiking Neural Networks (SNNs) have demonstrated high-quality outputs even at early timesteps. In this paper, we present two key innovations: MP-Init (Membrane Potential Initialization) and TrSG (Threshold-robust Surrogate Gradient).
- Score: 11.229584148105113
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in the direct training of Spiking Neural Networks (SNNs) have demonstrated high-quality outputs even at early timesteps, paving the way for novel energy-efficient AI paradigms. However, the inherent non-linearity and temporal dependencies in SNNs introduce persistent challenges, such as temporal covariate shift (TCS) and unstable gradient flow with learnable neuron thresholds. In this paper, we present two key innovations: MP-Init (Membrane Potential Initialization) and TrSG (Threshold-robust Surrogate Gradient). MP-Init addresses TCS by aligning the initial membrane potential with its stationary distribution, while TrSG stabilizes gradient flow with respect to threshold voltage during training. Extensive experiments validate our approach, achieving state-of-the-art accuracy on both static and dynamic image datasets. The code is available at: https://github.com/kookhh0827/SNN-MP-Init-TRSG
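As an illustration, here is a minimal PyTorch sketch of how the two ideas could be wired into a leaky integrate-and-fire (LIF) neuron. The warmup-based stationary-distribution estimate and the threshold-scaled triangular surrogate below are illustrative stand-ins, not the formulas from the paper; `TrSGSpike`, `lif_forward`, and all hyperparameters are hypothetical names invented for this sketch.

```python
import torch

class TrSGSpike(torch.autograd.Function):
    """Heaviside spike whose surrogate gradient is normalized by the
    (learnable) threshold, so the gradient scale stays stable as v_th
    moves during training. Illustrative stand-in, not the exact TrSG."""

    @staticmethod
    def forward(ctx, v, v_th):
        ctx.save_for_backward(v, v_th)
        return (v >= v_th).float()

    @staticmethod
    def backward(ctx, grad_out):
        v, v_th = ctx.saved_tensors
        # Triangular surrogate whose width and height scale with v_th.
        sg = torch.clamp(1.0 - (v - v_th).abs() / v_th, min=0.0) / v_th
        return grad_out * sg, -(grad_out * sg).sum()

def lif_forward(x_seq, v_th, tau=2.0, v0=None):
    """LIF layer over a (T, B, D) current sequence; v0 carries the MP-Init
    idea of starting the membrane potential near its stationary
    distribution instead of at zero."""
    v = torch.zeros_like(x_seq[0]) if v0 is None else v0
    spikes = []
    for x_t in x_seq:
        v = v / tau + x_t                 # leak and integrate
        s = TrSGSpike.apply(v, v_th)
        v = v - s * v_th                  # soft reset
        spikes.append(s)
    return torch.stack(spikes), v

torch.manual_seed(0)
v_th = torch.tensor(1.0, requires_grad=True)
# MP-Init proxy: take the potential left by a no-grad warmup run as a
# cheap draw from the stationary distribution (a hypothetical stand-in
# for the paper's initialization).
with torch.no_grad():
    _, v_star = lif_forward(torch.randn(32, 4, 16), v_th)
out, _ = lif_forward(torch.randn(8, 4, 16), v_th, v0=v_star)
out.sum().backward()
print(v_th.grad)
```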
Related papers
- SpikingGamma: Surrogate-Gradient Free and Temporally Precise Online Training of Spiking Neural Networks with Smoothed Delays [1.5166105038254163]
Spiking Neural Networks (SNNs) promise energy-efficient, low-latency AI through sparse, event-driven computation. Yet, training SNNs under fine temporal discretization remains a major challenge, hindering both low-latency responsiveness and the mapping of software-trained SNNs to efficient hardware. We show that the SpikingGamma model supports direct error backpropagation without surrogate gradients, can learn fine temporal patterns with minimal spiking in an online manner, and scales feedforward SNNs to complex tasks and benchmarks with competitive accuracy.
arXiv Detail & Related papers (2026-02-02T11:35:16Z)
- DS-ATGO: Dual-Stage Synergistic Learning via Forward Adaptive Threshold and Backward Gradient Optimization for Spiking Neural Networks [18.86237064365729]
Brain-inspired spiking neural networks (SNNs) are recognized as a promising avenue for achieving efficient, low-energy neuromorphic computing. We propose a novel dual-stage synergistic learning algorithm that achieves forward adaptive thresholding and backward dynamic SG. Experimental results demonstrate that our method achieves significant performance improvements.
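A minimal sketch of what a forward adaptive threshold can look like, using a generic homeostatic rule that pushes each neuron's firing rate toward a target; `adapt_threshold` and all constants are hypothetical stand-ins, not DS-ATGO's actual update.

```python
import torch

def adapt_threshold(v_th, spikes, target_rate=0.2, lr=0.01):
    """Generic forward threshold adaptation (homeostasis): raise a neuron's
    threshold when it fires above a target rate and lower it otherwise.
    A hypothetical stand-in, not DS-ATGO's actual rule."""
    rate = spikes.mean(dim=(0, 1))            # per-neuron rate over T and B
    return v_th + lr * (rate - target_rate)

v_th = torch.ones(16)
spikes = (torch.rand(8, 4, 16) < 0.3).float()   # dummy (T, B, D) spike trains
print(adapt_threshold(v_th, spikes)[:4])
```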
arXiv Detail & Related papers (2025-11-17T06:54:21Z)
- A Self-Ensemble Inspired Approach for Effective Training of Binary-Weight Spiking Neural Networks [66.80058515743468]
Training Spiking Neural Networks (SNNs) and Binary Neural Networks (BNNs) is challenging because of the non-differentiable spike generation function. We present a novel perspective on the dynamics of SNNs and their close connection to BNNs through an analysis of the backpropagation process. Specifically, we leverage a structure of multiple shortcuts and a knowledge distillation-based training technique to improve the training of (binary-weight) SNNs.
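Two ingredients named in the summary are standard and easy to sketch: sign binarization with a straight-through estimator, and a Hinton-style distillation loss. The sketch below shows only these generic pieces, not the paper's specific shortcut structure.

```python
import torch
import torch.nn.functional as F

def binarize(w):
    """Sign binarization with a straight-through estimator: the forward pass
    uses sign(w), the backward pass lets gradients flow to the latent w."""
    return w + (torch.sign(w) - w).detach()

def distill_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Standard knowledge-distillation objective (softened teacher targets
    plus cross-entropy); the paper combines a technique of this kind with
    multiple shortcut connections."""
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * T * T
    return alpha * kd + (1 - alpha) * F.cross_entropy(student_logits, labels)

w = torch.randn(10, 5, requires_grad=True)
x, labels = torch.randn(4, 5), torch.randint(0, 10, (4,))
loss = distill_loss(x @ binarize(w).t(), torch.randn(4, 10), labels)
loss.backward()
```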
arXiv Detail & Related papers (2025-08-18T04:11:06Z)
- Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control [59.65431931190187]
Spiking Neural Networks (SNNs) offer low-latency and energy-efficient decision making on neuromorphic hardware. Most algorithms for continuous control, however, are designed for Artificial Neural Networks (ANNs). We show that this mismatch destabilizes SNN training and degrades performance. We propose a novel proxy target framework to bridge the gap between discrete SNNs and continuous-control algorithms.
arXiv Detail & Related papers (2025-05-30T03:08:03Z)
- Adaptive Gradient Learning for Spiking Neural Networks by Exploiting Membrane Potential Dynamics [23.205286200919673]
Brain-inspired spiking neural networks (SNNs) are recognized as a promising avenue for achieving efficient, low-energy neuromorphic computing. As spikes propagate among neurons, the distribution of membrane potential dynamics (MPD) will deviate from the gradient-available interval of a fixed SG. Here, we propose adaptive gradient learning for SNNs by exploiting MPD, namely MPD-AGL.
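A hedged sketch of the general idea, assuming a rectangular surrogate whose window tracks the batch statistics of the membrane potential; MPD-AGL's actual adaptation rule may differ.

```python
import torch

def adaptive_sg(v, v_th):
    """Rectangular surrogate whose window follows the spread of the current
    membrane potential distribution, keeping gradients available as the
    distribution drifts across layers and timesteps. Illustrative only."""
    width = v.detach().std().clamp(min=1e-3)      # follow the MPD spread
    return ((v - v_th).abs() < width).float() / (2 * width)

v = torch.randn(32, 16) * 0.5 + 0.8               # dummy membrane potentials
print(adaptive_sg(v, v_th=1.0).mean())
```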
arXiv Detail & Related papers (2025-05-17T06:06:13Z)
- Directly Training Temporal Spiking Neural Network with Sparse Surrogate Gradient [8.516243389583702]
Brain-inspired Spiking Neural Networks (SNNs) have attracted much attention due to their event-based computing and energy-efficient features.
We propose Masked Surrogate Gradients (MSGs) to balance the effectiveness of training and the sparseness of the gradient, thereby improving the generalization ability of SNNs.
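A minimal sketch of gradient masking, assuming a random binary mask applied to a rectangular surrogate in the backward pass; the paper's masking criterion may be more elaborate.

```python
import torch

class MaskedSGSpike(torch.autograd.Function):
    """Spike whose surrogate gradient is randomly masked in the backward
    pass, sparsifying gradient flow. A sketch of the masking idea, not
    necessarily the paper's criterion."""

    @staticmethod
    def forward(ctx, v, v_th, keep):
        ctx.save_for_backward(v)
        ctx.v_th, ctx.keep = v_th, keep
        return (v >= v_th).float()

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        sg = ((v - ctx.v_th).abs() < 0.5).float()        # rectangular surrogate
        mask = (torch.rand_like(v) < ctx.keep).float() / ctx.keep
        return grad_out * sg * mask, None, None

v = torch.randn(4, 8, requires_grad=True)
MaskedSGSpike.apply(v, 1.0, 0.5).sum().backward()
print((v.grad != 0).float().mean())   # roughly keep * P(|v - v_th| < 0.5)
```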
arXiv Detail & Related papers (2024-06-28T04:21:32Z)
- Membrane Potential Distribution Adjustment and Parametric Surrogate Gradient in Spiking Neural Networks [3.485537704990941]
The surrogate gradient (SG) strategy is investigated and applied to circumvent the non-differentiability of spike generation and train SNNs from scratch.
We propose the parametric surrogate gradient (PSG) method to iteratively update SG and eventually determine an optimal surrogate gradient parameter.
Experimental results demonstrate that the proposed methods can be readily integrated with the backpropagation through time (BPTT) algorithm.
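A minimal sketch of a parametric surrogate, assuming a sigmoid surrogate whose slope `alpha` is itself differentiable so it can be updated alongside BPTT; `psg_spike` is a hypothetical name and this is not necessarily the paper's exact parameterization.

```python
import torch

def psg_spike(v, v_th, alpha):
    """Hard threshold in the forward pass, sigmoid surrogate of slope alpha
    in the backward pass. Because alpha enters the differentiable branch, it
    can be updated iteratively during training, loosely mirroring PSG."""
    soft = torch.sigmoid(alpha * (v - v_th))
    hard = (v >= v_th).float()
    return soft + (hard - soft).detach()   # forward: hard; backward: d(soft)

v = torch.randn(4, 8, requires_grad=True)
alpha = torch.tensor(4.0, requires_grad=True)
psg_spike(v, 1.0, alpha).sum().backward()
print(v.grad.abs().mean().item(), alpha.grad.item())
```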
arXiv Detail & Related papers (2023-04-26T05:02:41Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective in solving forward and inverse differential equation problems.
However, PINNs suffer training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
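Implicit gradient descent replaces the explicit update with one that evaluates the gradient at the new iterate, theta_new = theta - lr * grad L(theta_new), which is a proximal-point step. Below is a toy fixed-point implementation, assuming the step size is small enough for the inner iteration to contract; it is a sketch of the generic method, not the authors' training loop.

```python
import torch

def isgd_step(theta, loss_fn, lr=0.1, inner_iters=10):
    """One implicit gradient descent step: solve
    theta_new = theta - lr * grad(loss)(theta_new) by fixed-point iteration,
    i.e. a proximal-point update, which tolerates larger steps on stiff
    losses such as PINN residuals than explicit SGD."""
    theta_old = theta.detach().clone()
    theta = theta_old.clone().requires_grad_(True)
    for _ in range(inner_iters):
        (grad,) = torch.autograd.grad(loss_fn(theta), theta)
        with torch.no_grad():
            theta.copy_(theta_old - lr * grad)   # fixed-point update
    return theta.detach()

theta = torch.tensor([2.0, -1.0])
print(isgd_step(theta, lambda t: (t ** 2).sum()))  # -> theta / (1 + 2*lr)
```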
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Online Training Through Time for Spiking Neural Networks [66.7744060103562]
Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models.
Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency.
We propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning.
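A simplified single-layer sketch of forward-in-time learning; the decaying presynaptic trace is the OTTT-style ingredient, while the instantaneous error signal here is a hypothetical stand-in for the paper's derived gradient.

```python
import torch

def ottt_layer_step(w, x_seq, target, v_th=1.0, tau=2.0, lr=0.01, lam=0.5):
    """Forward-in-time learning in the spirit of OTTT: keep a decaying
    presynaptic trace and update the weights from an instantaneous per-step
    error, with no backpropagation through time. Illustrative only."""
    v = torch.zeros(x_seq.shape[1], w.shape[0])
    trace = torch.zeros_like(x_seq[0])
    for x_t in x_seq:
        trace = lam * trace + x_t                      # presynaptic trace
        v = v / tau + x_t @ w.t()                      # leak and integrate
        s = (v >= v_th).float()
        sg = ((v - v_th).abs() < 0.5).float()          # rectangular surrogate
        v = v - s * v_th                               # soft reset
        err = s - target                               # instantaneous error
        w = w - lr * ((err * sg).t() @ trace) / x_seq.shape[1]
    return w

w = torch.randn(3, 10) * 0.3
w = ottt_layer_step(w, torch.rand(6, 4, 10), torch.rand(4, 3))
```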
arXiv Detail & Related papers (2022-10-09T07:47:56Z)
- Temporal Efficient Training of Spiking Neural Network via Gradient Re-weighting [29.685909045226847]
Brain-inspired spiking neuron networks (SNNs) have attracted widespread research interest because of their event-driven and energy-efficient characteristics.
The current direct training approach with surrogate gradients results in SNNs with poor generalizability.
We introduce the temporal efficient training (TET) approach to compensate for the loss of momentum in the gradient descent with SG.
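The core of the TET objective is easy to state: supervise every timestep rather than only the time-averaged output. A minimal sketch (the paper also adds a regularization term, omitted here):

```python
import torch
import torch.nn.functional as F

def tet_loss(out_seq, target):
    """TET-style objective: apply the criterion at every timestep and
    average, instead of training only the time-averaged output.
    out_seq: (T, B, C) per-step logits; target: (B,) class labels."""
    return torch.stack([F.cross_entropy(o, target) for o in out_seq]).mean()

out_seq = torch.randn(4, 8, 10, requires_grad=True)   # T=4, B=8, 10 classes
tet_loss(out_seq, torch.randint(0, 10, (8,))).backward()
```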
arXiv Detail & Related papers (2022-02-24T08:02:37Z)
- On the Impact of Stable Ranks in Deep Nets [3.307203784120635]
We show that stable ranks appear layerwise essentially as linear factors whose effect accumulates exponentially depthwise.
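For reference, the standard definition of the stable rank used in such analyses (the paper's precise layerwise statement may carry additional constants):

```latex
% Stable rank: a perturbation-robust surrogate for matrix rank,
% always bounded above by the true rank.
\[
  \operatorname{srank}(W) \;=\; \frac{\lVert W \rVert_F^{2}}{\lVert W \rVert_{2}^{2}}
  \;=\; \frac{\sum_{i} \sigma_i^{2}(W)}{\sigma_{\max}^{2}(W)}
  \;\le\; \operatorname{rank}(W).
\]
```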
arXiv Detail & Related papers (2021-10-05T20:04:41Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
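For context, the classic rate-based conversion baseline that tandem-style methods build on is weight normalization by maximum ANN activations (Diehl et al.); the sketch below shows only this generic step, not the paper's progressive layer-wise finetuning.

```python
import torch

def weight_normalize(weights, layer_max_acts):
    """Classic rate-based ANN-to-SNN conversion baseline (threshold
    balancing): rescale each layer by the ratio of maximum ANN activations
    so integrate-and-fire rates approximate the ReLU activations."""
    scaled, prev_max = [], 1.0
    for w, a_max in zip(weights, layer_max_acts):
        scaled.append(w * prev_max / a_max)   # keep activations within [0, 1]
        prev_max = a_max
    return scaled

weights = [torch.randn(8, 4), torch.randn(2, 8)]
scaled = weight_normalize(weights, layer_max_acts=[3.2, 1.7])
```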
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
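One standard way to track a Hessian norm as a diagnostic is power iteration on Hessian-vector products; below is a small sketch (an illustrative implementation, not necessarily the authors' estimator).

```python
import torch

def hessian_spectral_norm(loss_fn, theta, iters=20):
    """Estimate the spectral norm of the Hessian of loss_fn at theta by
    power iteration on Hessian-vector products, the kind of curvature
    tracking the paper advocates as a diagnostic."""
    v = torch.randn_like(theta)
    v = v / v.norm()
    for _ in range(iters):
        _, hv = torch.autograd.functional.hvp(loss_fn, theta, v)
        v = hv / hv.norm().clamp(min=1e-12)
    _, hv = torch.autograd.functional.hvp(loss_fn, theta, v)
    return v.dot(hv).abs().item()

theta = torch.tensor([1.0, 2.0])
# Hessian of sum(t^4) is diag(12 * t^2) = diag(12, 48); estimate ~ 48.
print(hessian_spectral_norm(lambda t: (t ** 4).sum(), theta))
```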
arXiv Detail & Related papers (2020-04-20T18:12:56Z)