Synaptic Stripping: How Pruning Can Bring Dead Neurons Back To Life
- URL: http://arxiv.org/abs/2302.05818v1
- Date: Sat, 11 Feb 2023 23:55:50 GMT
- Title: Synaptic Stripping: How Pruning Can Bring Dead Neurons Back To Life
- Authors: Tim Whitaker, Darrell Whitley
- Abstract summary: We introduce Synaptic Stripping as a means to combat the dead neuron problem.
By automatically removing problematic connections during training, we can regenerate dead neurons.
We conduct several ablation studies to investigate these dynamics as a function of network width and depth.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Rectified Linear Units (ReLU) are the default choice for activation functions
in deep neural networks. While they demonstrate excellent empirical
performance, ReLU activations can fall victim to the dead neuron problem. In
these cases, the weights feeding into a neuron end up being pushed into a state
where the neuron outputs zero for all inputs. Consequently, the gradient is
also zero for all inputs, which means that the weights which feed into the
neuron cannot update. The neuron is not able to recover from direct back
propagation and model capacity is reduced as those parameters can no longer be
further optimized. Inspired by a neurological process of the same name, we
introduce Synaptic Stripping as a means to combat this dead neuron problem. By
automatically removing problematic connections during training, we can
regenerate dead neurons and significantly improve model capacity and parametric
utilization. Synaptic Stripping is easy to implement and results in sparse
networks that are more efficient than the dense networks they are derived from.
We conduct several ablation studies to investigate these dynamics as a function
of network width and depth and we conduct an exploration of Synaptic Stripping
with Vision Transformers on a variety of benchmark datasets.
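To make the mechanism concrete, the following is a minimal PyTorch sketch of the idea rather than the authors' implementation. It assumes that dead neurons are detected as hidden units whose ReLU output is zero for every example in a probe batch, and that the "problematic connections" to strip are the most negative incoming weights of each dead unit; the paper's exact criterion and pruning schedule may differ.

```python
import torch
import torch.nn as nn

def find_dead_neurons(linear: nn.Linear, x: torch.Tensor) -> torch.Tensor:
    """Return a boolean mask over output units whose ReLU output is zero for every probe input."""
    with torch.no_grad():
        act = torch.relu(linear(x))        # shape (batch, out_features)
        return (act == 0).all(dim=0)       # True where the unit never fires on the probe batch

def strip_synapses(linear: nn.Linear, dead: torch.Tensor, frac: float = 0.1) -> None:
    """Zero out a fraction of the most negative incoming weights of each dead unit (assumed criterion)."""
    with torch.no_grad():
        for i in torch.nonzero(dead).flatten():
            w = linear.weight[i]                     # incoming weights of dead unit i
            k = max(1, int(frac * w.numel()))
            idx = torch.topk(-w, k).indices          # indices of the k most negative weights
            linear.weight[i, idx] = 0.0              # remove them so the pre-activation can rise above zero

# Example: probe a batch periodically during training and strip the dead units.
layer = nn.Linear(64, 128)
probe = torch.randn(256, 64)
dead = find_dead_neurons(layer, probe)
strip_synapses(layer, dead)
```

Repeating this check every few epochs leaves a sparser layer than the dense one it started from, in line with the efficiency claim in the abstract.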
Related papers
- WaLiN-GUI: a graphical and auditory tool for neuron-based encoding [73.88751967207419]
Neuromorphic computing relies on spike-based, energy-efficient communication.
We develop a tool to identify suitable configurations for neuron-based encoding of sample-based data into spike trains.
The WaLiN-GUI is provided open source and with documentation.
arXiv Detail & Related papers (2023-10-25T20:34:08Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Improving Spiking Neural Network Accuracy Using Time-based Neurons [0.24366811507669117]
Research on neuromorphic computing systems based on low-power spiking neural networks using analog neurons is in the spotlight.
As technology scales down, analog neurons are difficult to scale, and they suffer from reduced voltage headroom/dynamic range and circuit nonlinearities.
This paper first models the nonlinear behavior of existing current-mirror-based voltage-domain neurons designed in a 28nm process, and shows that SNN inference accuracy can be severely degraded by the neuron's nonlinearity.
We propose a novel neuron, which processes incoming spikes in the time domain and greatly improves linearity, thereby improving inference accuracy compared to existing voltage-domain neurons.
arXiv Detail & Related papers (2022-01-05T00:24:45Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- Training Deep Spiking Auto-encoders without Bursting or Dying Neurons through Regularization [9.34612743192798]
Spiking neural networks are a promising approach towards next-generation models of the brain in computational neuroscience.
We apply end-to-end learning with membrane potential-based backpropagation to a spiking convolutional auto-encoder.
We show that applying regularization on membrane potential and spiking output successfully avoids both dead and bursting neurons.
arXiv Detail & Related papers (2021-09-22T21:27:40Z)
- SeReNe: Sensitivity based Regularization of Neurons for Structured Sparsity in Neural Networks [13.60023740064471]
SeReNe is a method for learning sparse topologies with a structure.
We define the sensitivity of a neuron as the variation of the network output.
By including the neuron sensitivity in the cost function as a regularization term, we are able to prune neurons with low sensitivity.
arXiv Detail & Related papers (2021-02-07T10:53:30Z)
- Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting [135.0863818867184]
Artificial neural variability (ANV) helps artificial neural networks learn some advantages from "natural" neural networks.
ANV plays as an implicit regularizer of the mutual information between the training data and the learned model.
It can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
arXiv Detail & Related papers (2020-11-12T06:06:33Z)
- Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
- Investigation and Analysis of Hyper and Hypo neuron pruning to selectively update neurons during Unsupervised Adaptation [8.845660219190298]
Pruning approaches look for low-salient neurons that contribute less to a model's decision.
This work investigates whether pruning approaches are successful in detecting neurons that are either high-salient (mostly active, or hyper) or low-salient (barely active, or hypo).
It shows that it may be possible to selectively adapt certain neurons (the hyper and hypo neurons) first, followed by a full-network fine-tuning.
arXiv Detail & Related papers (2020-01-06T19:46:57Z)
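The selective-adaptation idea in the last entry above can be illustrated with a short, assumption-laden sketch rather than the paper's actual method: salience is estimated here as the mean activation of each unit over an adaptation batch, the top and bottom quantiles are labelled hyper and hypo, and gradients of all other units are masked during the first adaptation phase. The `hyper_hypo_mask` helper and the quantile thresholds are illustrative only.

```python
import torch
import torch.nn as nn

def hyper_hypo_mask(layer: nn.Linear, data: torch.Tensor,
                    low_pct: float = 0.1, high_pct: float = 0.9) -> torch.Tensor:
    """Boolean mask over output units that are barely active (hypo) or mostly active (hyper)."""
    with torch.no_grad():
        act = torch.relu(layer(data)).mean(dim=0)                       # mean activity per unit
        lo, hi = torch.quantile(act, torch.tensor([low_pct, high_pct]))  # salience thresholds
        return (act <= lo) | (act >= hi)

layer = nn.Linear(64, 128)
mask = hyper_hypo_mask(layer, torch.randn(512, 64))

def keep_selected_grads(grad: torch.Tensor) -> torch.Tensor:
    # During the selective phase, keep gradient rows only for hyper/hypo units (bias left untouched for brevity).
    return grad * mask.unsqueeze(1).to(grad.dtype)

hook = layer.weight.register_hook(keep_selected_grads)
# ... run the selective adaptation steps, then hook.remove() and fine-tune the full network.
```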