Related papers: Smooth Exact Gradient Descent Learning in Spiking Neural Networks

Topological obstruction to the training of shallow ReLU neural networks [0.0]
We study the interplay between the geometry of the loss landscape and the optimization trajectories of simple neural networks. This paper reveals the presence of topological obstruction in the loss landscape of shallow ReLU neural networks trained using gradient flow.
arXiv Detail & Related papers (2024-10-18T19:17:48Z)
Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks [74.3099028063756]
We develop a new method with neuronal operations based on lateral connections and Hebbian learning. We show that Hebbian and anti-Hebbian learning on recurrent lateral connections can effectively extract the principal subspace of neural activities. Our method consistently solves continual learning for spiking neural networks with nearly zero forgetting.
arXiv Detail & Related papers (2024-02-19T09:29:37Z)
Take A Shortcut Back: Mitigating the Gradient Vanishing for Training Spiking Neural Networks [15.691263438655842]
The Spiking Neural Network (SNN) is a biologically inspired neural network infrastructure that has recently garnered significant attention. Training an SNN directly poses a challenge due to the undefined gradient of the firing spike process. We propose a shortcut back-propagation method, which transmits the gradient directly from the loss to the shallow layers.
arXiv Detail & Related papers (2024-01-09T10:54:41Z)
Learning fixed points of recurrent neural networks by reparameterizing the network model [0.0]
In computational neuroscience, fixed points of recurrent neural networks are commonly used to model neural responses to static or slowly changing stimuli. A natural approach is to use gradient descent on the Euclidean space of synaptic weights. We show that this approach can lead to poor learning performance due to singularities that arise in the loss surface.
arXiv Detail & Related papers (2023-07-13T13:09:11Z)
Spike-based computation using classical recurrent neural networks [1.9171404264679484]
Spiking neural networks are artificial neural networks in which communication between neurons consists only of events, also called spikes. We modify the dynamics of a well-known, easily trainable type of recurrent neural network to make it event-based. We show that this new network can achieve performance comparable to other types of spiking networks on the MNIST benchmark.
arXiv Detail & Related papers (2023-06-06T12:19:12Z)
Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise. We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations. We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
Theoretical Characterization of How Neural Network Pruning Affects its Generalization [131.1347309639727]
This work makes the first attempt to study how different pruning fractions affect the model's gradient descent dynamics and generalization. It is shown that as long as the pruning fraction is below a certain threshold, gradient descent can drive the training loss toward zero. More surprisingly, the generalization bound gets better as the pruning fraction gets larger.
arXiv Detail & Related papers (2023-01-01T03:10:45Z)
Dynamics-aware Adversarial Attack of Adaptive Neural Networks [75.50214601278455]
We investigate the dynamics-aware adversarial attack problem of adaptive neural networks. We propose a Leaded Gradient Method (LGM) and show the significant effects of the lagged gradient. Our LGM achieves impressive adversarial attack performance compared with the dynamic-unaware attack methods.
arXiv Detail & Related papers (2022-10-15T01:32:08Z)
Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption. They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware. A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
Benign Overfitting in Two-layer Convolutional Neural Networks [90.75603889605043]
We study the benign overfitting phenomenon in training a two-layer convolutional neural network (CNN). We show that when the signal-to-noise ratio satisfies a certain condition, a two-layer CNN trained by gradient descent can achieve arbitrarily small training and test loss. On the other hand, when this condition does not hold, overfitting becomes harmful and the obtained CNN can only achieve constant-level test loss.
arXiv Detail & Related papers (2022-02-14T07:45:51Z)
Backward Gradient Normalization in Deep Neural Networks [68.8204255655161]
We introduce a new technique for gradient normalization during neural network training. The gradients are rescaled during the backward pass using normalization layers introduced at certain points within the network architecture. Results on tests with very deep neural networks show that the new technique can effectively control the gradient norm.
arXiv Detail & Related papers (2021-06-17T13:24:43Z)
Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task. This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
Plateau Phenomenon in Gradient Descent Training of ReLU networks: Explanation, Quantification and Avoidance [0.0]
In general, neural networks are trained by gradient-type optimization methods. The loss function decreases rapidly at the beginning of training but then, after a relatively small number of steps, slows down significantly. The present work aims to identify and quantify the root causes of the plateau phenomenon.
arXiv Detail & Related papers (2020-07-14T17:33:26Z)
Optimized spiking neurons classify images with high accuracy through temporal coding with two spikes [1.7767466724342065]
Spike-based neuromorphic hardware promises to reduce the energy consumption of image classification and other deep learning applications. Previous methods for converting trained artificial neural networks to spiking neurons were inefficient because the neurons had to emit too many spikes. We show that a substantially more efficient conversion arises when one optimizes the spiking neuron model for that purpose.
arXiv Detail & Related papers (2020-01-31T10:11:45Z)
Frosting Weights for Better Continual Training [22.554993259239307]
Training a neural network model can be a lifelong learning process and is a computationally intensive one. Deep neural network models can suffer from catastrophic forgetting during retraining on new data. We propose two generic ensemble approaches, gradient boosting and meta-learning, to solve the problem.
arXiv Detail & Related papers (2020-01-07T00:53:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.