Improving Deep Neural Network Random Initialization Through Neuronal
Rewiring
- URL: http://arxiv.org/abs/2207.08148v1
- Date: Sun, 17 Jul 2022 11:52:52 GMT
- Title: Improving Deep Neural Network Random Initialization Through Neuronal
Rewiring
- Authors: Leonardo Scabini, Bernard De Baets, and Odemir M. Bruno
- Abstract summary: We show that a higher neuronal strength variance may decrease performance, while a lower neuronal strength variance usually improves it.
A new method is then proposed to rewire neuronal connections according to a preferential attachment (PA) rule based on their strength.
In this sense, PA only reorganizes connections, while preserving the magnitude and distribution of the weights.
- Score: 14.484787903053208
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The deep learning literature is continuously updated with new architectures
and training techniques. However, weight initialization is overlooked by most
recent research, despite some intriguing findings regarding random weights. On
the other hand, recent works have been turning to Network Science to
understand the structure and dynamics of Artificial Neural Networks (ANNs)
after training. Therefore, in this work, we analyze the centrality of neurons
in randomly initialized networks. We show that a higher neuronal strength
variance may decrease performance, while a lower neuronal strength variance
usually improves it. A new method is then proposed to rewire neuronal
connections according to a preferential attachment (PA) rule based on their
strength, which significantly reduces the strength variance of layers
initialized by common methods. In this sense, PA rewiring only reorganizes
connections, while preserving the magnitude and distribution of the weights. We
show through an extensive statistical analysis in image classification that
performance is improved in most cases, both during training and testing, when
using both simple and complex architectures and learning schedules. Our results
show that, aside from the magnitude, the organization of the weights is also
relevant for better initialization of deep ANNs.
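The abstract describes the approach only at a high level, so the following is a minimal sketch of the quantities involved rather than the authors' implementation. It computes neuronal strength (here taken as the sum of absolute incoming weights of each output neuron, one common network-science convention) for a Kaiming-initialized layer, and illustrates that merely reassigning the same weight values to different connections can shrink the strength variance while leaving the layer's weight multiset, and hence the magnitude distribution, untouched. The greedy snake-dealing below is a stand-in illustration, not the paper's preferential-attachment rule.
```python
# Minimal sketch (NOT the authors' implementation): neuronal strength and an
# illustrative reorganization of a randomly initialized layer.
import numpy as np

rng = np.random.default_rng(0)
fan_out, fan_in = 128, 256

# Kaiming/He-style normal initialization; any common initializer would do here.
W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))

def neuronal_strength(W):
    """Strength of each output neuron: sum of absolute incoming weights."""
    return np.abs(W).sum(axis=1)

def balance_rewire(W):
    """Illustrative reorganization (not the paper's PA rule): sort the weight
    values by magnitude and deal them to output neurons in a snake order,
    which evens out per-neuron strength while preserving the exact multiset
    of weight values."""
    fan_out, fan_in = W.shape
    values = W.ravel()
    values = values[np.argsort(-np.abs(values))]            # largest magnitudes first
    snake = np.concatenate([np.arange(fan_out), np.arange(fan_out)[::-1]])
    rows = np.tile(snake, fan_in // 2 + 1)[: values.size]   # target row for each value
    W_new = np.empty_like(W)
    next_col = np.zeros(fan_out, dtype=int)
    for val, r in zip(values, rows):
        W_new[r, next_col[r]] = val
        next_col[r] += 1
    return W_new

W_rw = balance_rewire(W)
print("strength variance, original:   ", neuronal_strength(W).var())
print("strength variance, reorganized:", neuronal_strength(W_rw).var())
print("same weight values:", np.allclose(np.sort(W.ravel()), np.sort(W_rw.ravel())))
```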
Related papers
- How connectivity structure shapes rich and lazy learning in neural
circuits [14.236853424595333]
We investigate how the structure of the initial weights -- in particular their effective rank -- influences the network learning regime.
Our research highlights the pivotal role of initial weight structures in shaping learning regimes.
arXiv Detail & Related papers (2023-10-12T17:08:45Z)
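As a side note on the entry above: "effective rank" has a standard definition (Roy and Vetterli) as the exponential of the Shannon entropy of the normalized singular values. The snippet below only makes that term concrete; the paper itself may use a different variant.
```python
# One standard definition of effective rank: exp of the entropy of the
# normalized singular-value spectrum.  Shown only to make the term concrete.
import numpy as np

def effective_rank(W, eps=1e-12):
    s = np.linalg.svd(W, compute_uv=False)
    p = s / (s.sum() + eps)                  # normalized singular values
    entropy = -(p * np.log(p + eps)).sum()
    return float(np.exp(entropy))

rng = np.random.default_rng(0)
W_full = rng.normal(size=(128, 128))                            # generic dense init
W_low = rng.normal(size=(128, 4)) @ rng.normal(size=(4, 128))   # low-rank structure
print(effective_rank(W_full), effective_rank(W_low))            # high vs. low
```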
- Self-Expanding Neural Networks [24.812671965904727]
We introduce a natural gradient based approach which intuitively expands both the width and depth of a neural network.
We prove an upper bound on the "rate" at which neurons are added, and a computationally cheap lower bound on the expansion score.
We illustrate the benefits of such Self-Expanding Neural Networks with full connectivity and convolutions in both classification and regression problems.
arXiv Detail & Related papers (2023-07-10T12:49:59Z)
- Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z)
- Desire Backpropagation: A Lightweight Training Algorithm for Multi-Layer Spiking Neural Networks based on Spike-Timing-Dependent Plasticity [13.384228628766236]
Spiking neural networks (SNNs) are a viable alternative to conventional artificial neural networks.
We present desire backpropagation, a method to derive the desired spike activity of all neurons, including the hidden ones.
We trained three-layer networks to classify MNIST and Fashion-MNIST images and reached an accuracy of 98.41% and 87.56%, respectively.
arXiv Detail & Related papers (2022-11-10T08:32:13Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- Deep Convolutional Neural Networks with Unitary Weights [0.0]
We show that unitary convolutional neural networks deliver up to 32% faster inference speeds while maintaining competitive prediction accuracy.
Unlike prior art restricted to square synaptic weights, we expand the unitary networks to weights of any size and dimension.
arXiv Detail & Related papers (2021-02-23T18:36:13Z)
- Neural networks with late-phase weights [66.72777753269658]
We show that the solutions found by SGD can be further improved by ensembling a subset of the weights in late stages of learning.
At the end of learning, we recover a single model by taking a spatial average in weight space.
arXiv Detail & Related papers (2020-07-25T13:23:37Z)
- Neuroevolutionary Transfer Learning of Deep Recurrent Neural Networks through Network-Aware Adaptation [57.46377517266827]
This work introduces network-aware adaptive structure transfer learning (N-ASTL).
N-ASTL utilizes statistical information related to the source network's topology and weight distribution to inform how new input and output neurons are to be integrated into the existing structure.
Results show improvements over the prior state of the art, including the ability to transfer on challenging real-world datasets where transfer was not previously possible.
arXiv Detail & Related papers (2020-06-04T06:07:30Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
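The entry above mentions approximating the norm of the Hessian of the loss with respect to the weights. The sketch below shows one generic way to track such a quantity, power iteration on Hessian-vector products obtained by double backpropagation in PyTorch; it is not necessarily the estimator used in that paper, and the tiny model and batch are hypothetical.
```python
# Generic sketch: estimate the spectral norm of the loss Hessian w.r.t. one
# parameter tensor via power iteration on Hessian-vector products.
import torch

def hessian_spectral_norm(loss, param, iters=20):
    """Approximate the largest absolute eigenvalue of the Hessian of loss w.r.t. param."""
    grad = torch.autograd.grad(loss, param, create_graph=True)[0]
    v = torch.randn_like(param)
    v = v / v.norm()
    norm = v.new_tensor(0.0)
    for _ in range(iters):
        hv = torch.autograd.grad(grad, param, grad_outputs=v, retain_graph=True)[0]
        norm = hv.norm()
        v = hv / (norm + 1e-12)
    return norm.item()

# Hypothetical toy model and batch, just to exercise the estimator.
model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))
x, y = torch.randn(16, 10), torch.randint(0, 2, (16,))
loss = torch.nn.functional.cross_entropy(model(x), y)
print(hessian_spectral_norm(loss, model[0].weight))
```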
- Distance-Based Regularisation of Deep Networks for Fine-Tuning [116.71288796019809]
We develop an algorithm that constrains a hypothesis class to a small sphere centred on the initial pre-trained weights.
Empirical evaluation shows that our algorithm works well, corroborating our theoretical results.
arXiv Detail & Related papers (2020-02-19T16:00:47Z)
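To make the sphere constraint in the last entry concrete, here is a minimal sketch assuming a standard PyTorch fine-tuning loop (`loader`, `optimizer`, and the radius value are placeholders): after each update, the parameters are projected back onto an L2 ball centred on the pre-trained weights, applied per parameter tensor for simplicity. The exact algorithm and radius selection of that paper are not reproduced here.
```python
# Minimal sketch of a projection-style constraint around pre-trained weights.
import torch

@torch.no_grad()
def project_to_ball(model, pretrained_state, radius):
    """Pull each parameter tensor back inside an L2 ball of the given radius
    centred on its pre-trained value (a per-tensor simplification)."""
    for name, p in model.named_parameters():
        p0 = pretrained_state[name]
        diff = p - p0
        norm = diff.norm()
        if norm > radius:
            p.copy_(p0 + diff * (radius / norm))

# Hypothetical usage inside a fine-tuning loop:
# pretrained_state = {k: v.clone() for k, v in model.state_dict().items()}
# for x, y in loader:
#     loss = torch.nn.functional.cross_entropy(model(x), y)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
#     project_to_ball(model, pretrained_state, radius=1.0)
```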