Improving Deep Neural Network Random Initialization Through Neuronal
Rewiring
- URL: http://arxiv.org/abs/2207.08148v1
- Date: Sun, 17 Jul 2022 11:52:52 GMT
- Title: Improving Deep Neural Network Random Initialization Through Neuronal
Rewiring
- Authors: Leonardo Scabini, Bernard De Baets, and Odemir M. Bruno
- Abstract summary: We show that a higher neuronal strength variance may decrease performance, while a lower neuronal strength variance usually improves it.
A new method is then proposed to rewire neuronal connections according to a preferential attachment (PA) rule based on their strength.
In this sense, PA only reorganizes connections, while preserving the magnitude and distribution of the weights.
- Score: 14.484787903053208
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The deep learning literature is continuously updated with new architectures
and training techniques. However, weight initialization is overlooked by most
recent research, despite some intriguing findings regarding random weights. On
the other hand, recent works have been turning to Network Science to
understand the structure and dynamics of Artificial Neural Networks (ANNs)
after training. Therefore, in this work, we analyze the centrality of neurons
in randomly initialized networks. We show that a higher neuronal strength
variance may decrease performance, while a lower neuronal strength variance
usually improves it. A new method is then proposed to rewire neuronal
connections according to a preferential attachment (PA) rule based on their
strength, which significantly reduces the strength variance of layers
initialized by common methods. In this sense, PA rewiring only reorganizes
connections, while preserving the magnitude and distribution of the weights. We
show through an extensive statistical analysis in image classification that
performance is improved in most cases, both during training and testing, when
using both simple and complex architectures and learning schedules. Our results
show that, aside from the magnitude, the organization of the weights is also
relevant for better initialization of deep ANNs.
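The abstract describes the approach only at a high level, so the following is a minimal sketch of the quantities involved rather than the authors' implementation. It computes neuronal strength (here taken as the sum of absolute incoming weights of each output neuron, one common network-science convention) for a Kaiming-initialized layer, and illustrates that merely reassigning the same weight values to different connections can shrink the strength variance while leaving the layer's weight multiset, and hence the magnitude distribution, untouched. The greedy snake-dealing below is a stand-in illustration, not the paper's preferential-attachment rule.
```python
# Minimal sketch (NOT the authors' implementation): neuronal strength and an
# illustrative reorganization of a randomly initialized layer.
import numpy as np

rng = np.random.default_rng(0)
fan_out, fan_in = 128, 256

# Kaiming/He-style normal initialization; any common initializer would do here.
W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))

def neuronal_strength(W):
    """Strength of each output neuron: sum of absolute incoming weights."""
    return np.abs(W).sum(axis=1)

def balance_rewire(W):
    """Illustrative reorganization (not the paper's PA rule): sort the weight
    values by magnitude and deal them to output neurons in a snake order,
    which evens out per-neuron strength while preserving the exact multiset
    of weight values."""
    fan_out, fan_in = W.shape
    values = W.ravel()
    values = values[np.argsort(-np.abs(values))]            # largest magnitudes first
    snake = np.concatenate([np.arange(fan_out), np.arange(fan_out)[::-1]])
    rows = np.tile(snake, fan_in // 2 + 1)[: values.size]   # target row for each value
    W_new = np.empty_like(W)
    next_col = np.zeros(fan_out, dtype=int)
    for val, r in zip(values, rows):
        W_new[r, next_col[r]] = val
        next_col[r] += 1
    return W_new

W_rw = balance_rewire(W)
print("strength variance, original:   ", neuronal_strength(W).var())
print("strength variance, reorganized:", neuronal_strength(W_rw).var())
print("same weight values:", np.allclose(np.sort(W.ravel()), np.sort(W_rw.ravel())))
```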
Related papers
- How connectivity structure shapes rich and lazy learning in neural
circuits [14.236853424595333]
We investigate how the structure of the initial weights -- in particular their effective rank -- influences the network learning regime.
Our research highlights the pivotal role of initial weight structures in shaping learning regimes.
arXiv Detail & Related papers (2023-10-12T17:08:45Z)
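As a side note on the entry above: "effective rank" has a standard definition (Roy and Vetterli) as the exponential of the Shannon entropy of the normalized singular values. The snippet below only makes that term concrete; the paper itself may use a different variant.
```python
# One standard definition of effective rank: exp of the entropy of the
# normalized singular-value spectrum.  Shown only to make the term concrete.
import numpy as np

def effective_rank(W, eps=1e-12):
    s = np.linalg.svd(W, compute_uv=False)
    p = s / (s.sum() + eps)                  # normalized singular values
    entropy = -(p * np.log(p + eps)).sum()
    return float(np.exp(entropy))

rng = np.random.default_rng(0)
W_full = rng.normal(size=(128, 128))                            # generic dense init
W_low = rng.normal(size=(128, 4)) @ rng.normal(size=(4, 128))   # low-rank structure
print(effective_rank(W_full), effective_rank(W_low))            # high vs. low
```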
- Self-Expanding Neural Networks [24.812671965904727]
We introduce a natural gradient based approach which intuitively expands both the width and depth of a neural network.
We prove an upper bound on the "rate" at which neurons are added, and a computationally cheap lower bound on the expansion score.
We illustrate the benefits of such Self-Expanding Neural Networks with full connectivity and convolutions in both classification and regression problems.
arXiv Detail & Related papers (2023-07-10T12:49:59Z)
- Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z)
- Desire Backpropagation: A Lightweight Training Algorithm for Multi-Layer Spiking Neural Networks based on Spike-Timing-Dependent Plasticity [13.384228628766236]
Spiking neural networks (SNNs) are a viable alternative to conventional artificial neural networks.
We present desire backpropagation, a method to derive the desired spike activity of all neurons, including the hidden ones.
We trained three-layer networks to classify MNIST and Fashion-MNIST images and reached an accuracy of 98.41% and 87.56%, respectively.
arXiv Detail & Related papers (2022-11-10T08:32:13Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- Deep Convolutional Neural Networks with Unitary Weights [0.0]
We show that unitary convolutional neural networks deliver up to 32% faster inference speeds while maintaining competitive prediction accuracy.
Unlike prior art restricted to square synaptic weights, we expand the unitary networks to weights of any size and dimension.
arXiv Detail & Related papers (2021-02-23T18:36:13Z)
- Neural networks with late-phase weights [66.72777753269658]
We show that the solutions found by SGD can be further improved by ensembling a subset of the weights in late stages of learning.
At the end of learning, we recover a single model by taking a spatial average in weight space.
arXiv Detail & Related papers (2020-07-25T13:23:37Z)
- Neuroevolutionary Transfer Learning of Deep Recurrent Neural Networks through Network-Aware Adaptation [57.46377517266827]
This work introduces network-aware adaptive structure transfer learning (N-ASTL).
N-ASTL utilizes statistical information related to the source network's topology and weight distribution to inform how new input and output neurons are to be integrated into the existing structure.
Results show improvements over the prior state of the art, including the ability to transfer on challenging real-world datasets where transfer was not previously possible.
arXiv Detail & Related papers (2020-06-04T06:07:30Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
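The entry above mentions approximating the norm of the Hessian of the loss with respect to the weights. The sketch below shows one generic way to track such a quantity, power iteration on Hessian-vector products obtained by double backpropagation in PyTorch; it is not necessarily the estimator used in that paper, and the tiny model and batch are hypothetical.
```python
# Generic sketch: estimate the spectral norm of the loss Hessian w.r.t. one
# parameter tensor via power iteration on Hessian-vector products.
import torch

def hessian_spectral_norm(loss, param, iters=20):
    """Approximate the largest absolute eigenvalue of the Hessian of loss w.r.t. param."""
    grad = torch.autograd.grad(loss, param, create_graph=True)[0]
    v = torch.randn_like(param)
    v = v / v.norm()
    norm = v.new_tensor(0.0)
    for _ in range(iters):
        hv = torch.autograd.grad(grad, param, grad_outputs=v, retain_graph=True)[0]
        norm = hv.norm()
        v = hv / (norm + 1e-12)
    return norm.item()

# Hypothetical toy model and batch, just to exercise the estimator.
model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))
x, y = torch.randn(16, 10), torch.randint(0, 2, (16,))
loss = torch.nn.functional.cross_entropy(model(x), y)
print(hessian_spectral_norm(loss, model[0].weight))
```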
- Distance-Based Regularisation of Deep Networks for Fine-Tuning [116.71288796019809]
We develop an algorithm that constrains a hypothesis class to a small sphere centred on the initial pre-trained weights.
Empirical evaluation shows that our algorithm works well, corroborating our theoretical results.
arXiv Detail & Related papers (2020-02-19T16:00:47Z)
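To make the sphere constraint in the last entry concrete, here is a minimal sketch assuming a standard PyTorch fine-tuning loop (`loader`, `optimizer`, and the radius value are placeholders): after each update, the parameters are projected back onto an L2 ball centred on the pre-trained weights, applied per parameter tensor for simplicity. The exact algorithm and radius selection of that paper are not reproduced here.
```python
# Minimal sketch of a projection-style constraint around pre-trained weights.
import torch

@torch.no_grad()
def project_to_ball(model, pretrained_state, radius):
    """Pull each parameter tensor back inside an L2 ball of the given radius
    centred on its pre-trained value (a per-tensor simplification)."""
    for name, p in model.named_parameters():
        p0 = pretrained_state[name]
        diff = p - p0
        norm = diff.norm()
        if norm > radius:
            p.copy_(p0 + diff * (radius / norm))

# Hypothetical usage inside a fine-tuning loop:
# pretrained_state = {k: v.clone() for k, v in model.state_dict().items()}
# for x, y in loader:
#     loss = torch.nn.functional.cross_entropy(model(x), y)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
#     project_to_ball(model, pretrained_state, radius=1.0)
```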