Smooth Exact Gradient Descent Learning in Spiking Neural Networks
- URL: http://arxiv.org/abs/2309.14523v1
- Date: Mon, 25 Sep 2023 20:51:00 GMT
- Title: Smooth Exact Gradient Descent Learning in Spiking Neural Networks
- Authors: Christian Klos, Raoul-Martin Memmesheimer
- Abstract summary: We demonstrate exact gradient descent learning based on spiking dynamics that change only continuously.
Our results show how non-disruptive learning is possible despite discrete spikes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artificial neural networks are highly successfully trained with
backpropagation. For spiking neural networks, however, a similar gradient
descent scheme seems prohibitive due to the sudden, disruptive (dis-)appearance
of spikes. Here, we demonstrate exact gradient descent learning based on
spiking dynamics that change only continuously. These are generated by neuron
models whose spikes vanish and appear at the end of a trial, where they do not
influence other neurons anymore. This also enables gradient-based spike
addition and removal. We apply our learning scheme to induce and continuously
move spikes to desired times, in single neurons and recurrent networks.
Further, it achieves competitive performance in a benchmark task using deep,
initially silent networks. Our results show how non-disruptive learning is
possible despite discrete spikes.
Related papers
- Topological obstruction to the training of shallow ReLU neural networks [0.0]
We study the interplay between the geometry of the loss landscape and the optimization trajectories of simple neural networks.
This paper reveals the presence of topological obstruction in the loss landscape of shallow ReLU neural networks trained using gradient flow.
arXiv Detail & Related papers (2024-10-18T19:17:48Z) - Take A Shortcut Back: Mitigating the Gradient Vanishing for Training Spiking Neural Networks [15.691263438655842]
Spiking Neural Network (SNN) is a biologically inspired neural network infrastructure that has recently garnered significant attention.
Training an SNN directly poses a challenge due to the undefined gradient of the firing spike process.
We propose a shortcut back-propagation method in our paper, which advocates for transmitting the gradient directly from the loss to the shallow layers.
arXiv Detail & Related papers (2024-01-09T10:54:41Z) - Learning fixed points of recurrent neural networks by reparameterizing
the network model [0.0]
In computational neuroscience, fixed points of recurrent neural networks are commonly used to model neural responses to static or slowly changing stimuli.
A natural approach is to use gradient descent on the Euclidean space of synaptic weights.
We show that this approach can lead to poor learning performance due to singularities that arise in the loss surface.
arXiv Detail & Related papers (2023-07-13T13:09:11Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Theoretical Characterization of How Neural Network Pruning Affects its
Generalization [131.1347309639727]
This work makes the first attempt to study how different pruning fractions affect the model's gradient descent dynamics and generalization.
It is shown that as long as the pruning fraction is below a certain threshold, gradient descent can drive the training loss toward zero.
More surprisingly, the generalization bound gets better as the pruning fraction gets larger.
arXiv Detail & Related papers (2023-01-01T03:10:45Z) - Dynamics-aware Adversarial Attack of Adaptive Neural Networks [75.50214601278455]
We investigate the dynamics-aware adversarial attack problem of adaptive neural networks.
We propose a Leaded Gradient Method (LGM) and show the significant effects of the lagged gradient.
Our LGM achieves impressive adversarial attack performance compared with the dynamic-unaware attack methods.
arXiv Detail & Related papers (2022-10-15T01:32:08Z) - Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z) - Backward Gradient Normalization in Deep Neural Networks [68.8204255655161]
We introduce a new technique for gradient normalization during neural network training.
The gradients are rescaled during the backward pass using normalization layers introduced at certain points within the network architecture.
Results on tests with very deep neural networks show that the new technique can do an effective control of the gradient norm.
arXiv Detail & Related papers (2021-06-17T13:24:43Z) - Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z) - Plateau Phenomenon in Gradient Descent Training of ReLU networks:
Explanation, Quantification and Avoidance [0.0]
In general, neural networks are trained by gradient type optimization methods.
The loss function decreases rapidly at the beginning of training but then, after a relatively small number of steps, significantly slow down.
The present work aims to identify and quantify the root causes of plateau phenomenon.
arXiv Detail & Related papers (2020-07-14T17:33:26Z) - Frosting Weights for Better Continual Training [22.554993259239307]
Training a neural network model can be a lifelong learning process and is a computationally intensive one.
Deep neural network models can suffer from catastrophic forgetting during retraining on new data.
We propose two generic ensemble approaches, gradient boosting and meta-learning, to solve the problem.
arXiv Detail & Related papers (2020-01-07T00:53:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.