An Experimental Study of Weight Initialization and Weight Inheritance
Effects on Neuroevolution
- URL: http://arxiv.org/abs/2009.09644v2
- Date: Sat, 26 Sep 2020 21:41:17 GMT
- Title: An Experimental Study of Weight Initialization and Weight Inheritance
Effects on Neuroevolution
- Authors: Zimeng Lyu, AbdElRahman ElSaid, Joshua Karns, Mohamed Mkaouer, Travis
Desell
- Abstract summary: In neuroevolution, weights typically need to be initialized at three different times: when initial genomes (ANNs) are created at the beginning of the search, when offspring genomes are generated by crossover, and when new nodes or edges are created during mutation.
This work explores the difference between using Xavier, Kaiming, and uniform random weight initialization methods, as well as novel Lamarckian weight inheritance methods for initializing new weights during crossover and mutation operations.
- Score: 2.3274138116397736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weight initialization is critical in being able to successfully train
artificial neural networks (ANNs), and even more so for recurrent neural
networks (RNNs) which can easily suffer from vanishing and exploding gradients.
In neuroevolution, where evolutionary algorithms are applied to neural
architecture search, weights typically need to be initialized at three
different times: when initial genomes (ANN architectures) are created at the
beginning of the search, when offspring genomes are generated by crossover, and
when new nodes or edges are created during mutation. This work explores the
difference between using Xavier, Kaiming, and uniform random weight
initialization methods, as well as novel Lamarckian weight inheritance methods
for initializing new weights during crossover and mutation operations. These
are examined using the Evolutionary eXploration of Augmenting Memory Models
(EXAMM) neuroevolution algorithm, which is capable of evolving RNNs with a
variety of modern memory cells (e.g., LSTM, GRU, MGU, UGRNN and Delta-RNN
cells) as well as recurrent connections with varying time skips through a high
performance island based distributed evolutionary algorithm. Results show that
with statistical significance, utilizing the Lamarckian strategies outperforms
Kaiming, Xavier and uniform random weight initialization, and can speed
neuroevolution by requiring fewer backpropagation epochs to be evaluated for
each generated RNN.
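The three initialization strategies the abstract compares are standard and can be sketched directly. The following is a minimal illustrative sketch, not EXAMM's actual implementation: the `lamarckian_crossover` function is a hypothetical stand-in for the paper's Lamarckian inheritance idea (offspring reuse trained parent weights rather than being re-initialized), shown here as a simple stochastic blend.

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_init(fan_in, fan_out, scale=0.5):
    # Plain uniform random initialization in [-scale, scale)
    return rng.uniform(-scale, scale, size=(fan_in, fan_out))

def xavier_init(fan_in, fan_out):
    # Glorot/Xavier uniform: limit scales with fan_in + fan_out,
    # keeping activation variance roughly constant across layers
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def kaiming_init(fan_in, fan_out):
    # He/Kaiming normal: variance scales with fan_in
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def lamarckian_crossover(w_parent_a, w_parent_b):
    # Hypothetical Lamarckian inheritance sketch: instead of
    # re-initializing offspring weights from scratch, recombine the
    # *trained* parent weights (here, an elementwise stochastic blend).
    t = rng.uniform(0.0, 1.0, size=w_parent_a.shape)
    return t * w_parent_a + (1.0 - t) * w_parent_b

wa = xavier_init(64, 32)
wb = kaiming_init(64, 32)
child = lamarckian_crossover(wa, wb)
print(child.shape)  # (64, 32)
```

The intuition behind inheritance is that each parent's weights already encode training progress, so offspring starting from a blend of them need fewer backpropagation epochs than offspring initialized randomly.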
Related papers
- Learning Internal Biological Neuron Parameters and Complexity-Based Encoding for Improved Spiking Neural Networks Performance [0.0]
This study introduces a novel approach by replacing the traditional perceptron model with a biologically inspired probabilistic meta neuron model. As a second key contribution, it presents a new biologically inspired classification framework that uniquely integrates SNNs with Lempel-Ziv complexity (LZC). Learning algorithms considered include backpropagation, spike-timing-dependent plasticity (STDP), and the Tempotron learning rule.
arXiv Detail & Related papers (2025-08-08T09:14:49Z) - Neuro-Evolutionary Approach to Physics-Aware Symbolic Regression [0.0]
We propose a neuro-evolutionary symbolic regression method that combines evolutionary-based search for optimal neural network topologies with gradient-based tuning of the network's parameters.
Our method employs a memory-based strategy and population perturbations to enhance exploitation and reduce the risk of being trapped in suboptimal NNs.
arXiv Detail & Related papers (2025-04-23T08:29:53Z) - Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval.
A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed.
The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z) - Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime [52.00917519626559]
This paper presents two models of neural-networks and their training applicable to neural networks of arbitrary width, depth and topology.
We also present an exact novel representor theory for layer-wise neural network training with unregularized gradient descent in terms of a local-extrinsic neural kernel (LeNK).
This representor theory gives insight into the role of higher-order statistics in neural network training and the effect of kernel evolution in neural-network kernel models.
arXiv Detail & Related papers (2024-05-24T06:30:36Z) - Discovering Physics-Informed Neural Networks Model for Solving Partial Differential Equations through Evolutionary Computation [5.8407437499182935]
This article proposes an evolutionary computation method aimed at discovering the PINNs model with higher approximation accuracy and faster convergence rate.
In experiments, the performance of different models that are searched through Bayesian optimization, random search and evolution is compared in solving Klein-Gordon, Burgers, and Lamé equations.
arXiv Detail & Related papers (2024-05-18T07:32:02Z) - Neural Functional Transformers [99.98750156515437]
This paper uses the attention mechanism to define a novel set of permutation equivariant weight-space layers called neural functional Transformers (NFTs).
NFTs respect weight-space permutation symmetries while incorporating the advantages of attention, which have exhibited remarkable success across multiple domains.
We also leverage NFTs to develop Inr2Array, a novel method for computing permutation invariant representations from the weights of implicit neural representations (INRs).
arXiv Detail & Related papers (2023-05-22T23:38:27Z) - How (Implicit) Regularization of ReLU Neural Networks Characterizes the
Learned Function -- Part II: the Multi-D Case of Two Layers with Random First
Layer [2.1485350418225244]
We give an exact macroscopic characterization of the generalization behavior of randomized, shallow NNs with ReLU activation.
We show that RSNs correspond to a generalized additive model (GAM)-typed regression in which infinitely many directions are considered.
arXiv Detail & Related papers (2023-03-20T21:05:47Z) - Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z) - Neuroevolution of Physics-Informed Neural Nets: Benchmark Problems and
Comparative Results [25.12291688711645]
Physics-informed neural networks (PINNs) are one of the key techniques at the forefront of recent advances.
PINNs' unique loss formulations lead to a high degree of complexity and ruggedness that may not be conducive for gradient descent.
Neuroevolution algorithms, with their superior global search capacity, may be a better choice for PINNs.
arXiv Detail & Related papers (2022-12-15T05:54:16Z) - Improving Deep Neural Network Random Initialization Through Neuronal
Rewiring [14.484787903053208]
We show that a higher neuronal strength variance may decrease performance, while a lower neuronal strength variance usually improves it.
A new method is then proposed to rewire neuronal connections according to a preferential attachment (PA) rule based on their strength.
In this sense, PA only reorganizes connections, while preserving the magnitude and distribution of the weights.
arXiv Detail & Related papers (2022-07-17T11:52:52Z) - Direct Mutation and Crossover in Genetic Algorithms Applied to
Reinforcement Learning Tasks [0.9137554315375919]
This paper will focus on applying neuroevolution using a simple genetic algorithm (GA) to find the weights of a neural network that produce optimally behaving agents.
We present two novel modifications that improve the data efficiency and speed of convergence when compared to the initial implementation.
arXiv Detail & Related papers (2022-01-13T07:19:28Z) - Dynamic Neural Diversification: Path to Computationally Sustainable
Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters, can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z) - Epigenetic evolution of deep convolutional models [81.21462458089142]
We build upon a previously proposed neuroevolution framework to evolve deep convolutional models.
We propose a convolutional layer layout which allows kernels of different shapes and sizes to coexist within the same layer.
The proposed layout enables the size and shape of individual kernels within a convolutional layer to be evolved with a corresponding new mutation operator.
arXiv Detail & Related papers (2021-04-12T12:45:16Z) - Optimizing Memory Placement using Evolutionary Graph Reinforcement
Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.