First Steps Towards a Runtime Analysis of Neuroevolution
- URL: http://arxiv.org/abs/2307.00799v2
- Date: Mon, 16 Oct 2023 14:40:37 GMT
- Title: First Steps Towards a Runtime Analysis of Neuroevolution
- Authors: Paul Fischer and Emil Lundt Larsen and Carsten Witt
- Abstract summary: We consider a simple setting in neuroevolution where an evolutionary algorithm optimizes the weights and activation functions of a simple artificial neural network.
We then define simple example functions to be learned by the network and conduct rigorous runtime analyses for networks with a single neuron and for a more advanced structure with several neurons and two layers.
Our results show that the proposed algorithm is generally efficient on two example problems designed for one neuron and efficient with at least constant probability on the example problem for a two-layer network.
- Score: 2.07180164747172
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a simple setting in neuroevolution where an evolutionary
algorithm optimizes the weights and activation functions of a simple artificial
neural network. We then define simple example functions to be learned by the
network and conduct rigorous runtime analyses for networks with a single neuron
and for a more advanced structure with several neurons and two layers. Our
results show that the proposed algorithm is generally efficient on two example
problems designed for one neuron and efficient with at least constant
probability on the example problem for a two-layer network. In particular, the
so-called harmonic mutation operator choosing steps of size $j$ with
probability proportional to $1/j$ turns out to be a good choice for the underlying
search space. However, for the case of one neuron, we also identify situations
with hard-to-overcome local optima. Experimental investigations of our
neuroevolutionary algorithm and a state-of-the-art CMA-ES support the
theoretical findings.
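As an illustration of the operator named in the abstract: a minimal Python sketch (not the paper's exact pseudocode) of a harmonic mutation that draws a step size $j$ from $\{1, \dots, r\}$ with probability proportional to $1/j$; the range bound $r$ and the single-weight update are assumptions made here for concreteness.

```python
import numpy as np

def harmonic_step(r, rng):
    """Sample a step size j in {1, ..., r} with P(j) proportional to 1/j."""
    j = np.arange(1, r + 1)
    p = (1.0 / j) / np.sum(1.0 / j)      # normalize by the harmonic number H_r
    return int(rng.choice(j, p=p))

def harmonic_mutation(weights, r, rng):
    """Perturb one uniformly chosen weight by a signed harmonic step."""
    w = weights.copy()
    i = rng.integers(len(w))
    w[i] += rng.choice((-1, 1)) * harmonic_step(r, rng)
    return w

rng = np.random.default_rng(0)
offspring = harmonic_mutation(np.zeros(4), r=64, rng=rng)
```

Small steps dominate, so the operator favors local refinement, yet every step size up to $r$ keeps non-negligible probability, which is what makes it a good fit for the integer-valued search space described above.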
Related papers
- Parallel-in-Time Solutions with Random Projection Neural Networks [0.07282584715927627]
This paper considers one of the fundamental parallel-in-time methods for the solution of ordinary differential equations, Parareal, and extends it by adopting a neural network as a coarse propagator.
We provide a theoretical analysis of the convergence properties of the proposed algorithm and show its effectiveness for several examples, including Lorenz and Burgers' equations.
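For illustration, a minimal sketch of the classic Parareal correction sweep with interchangeable propagators; here the coarse propagator is a plain Euler step standing in for the paper's neural network, and all names and constants are hypothetical.

```python
import numpy as np

def parareal(u0, n_steps, n_iters, coarse, fine):
    """Parareal sweep: correct a cheap coarse propagator with fine solves.
    `coarse` and `fine` both advance a state from t_n to t_{n+1}."""
    u = [u0] + [None] * n_steps
    for n in range(n_steps):                      # initial coarse sweep
        u[n + 1] = coarse(u[n])
    for _ in range(n_iters):
        fine_vals = [fine(u[n]) for n in range(n_steps)]  # parallelizable
        new = [u0]
        for n in range(n_steps):                  # sequential correction
            new.append(coarse(new[n]) + fine_vals[n] - coarse(u[n]))
        u = new
    return u

fine = lambda x: x * np.exp(-0.1)     # exact decay of u' = -u over dt = 0.1
coarse = lambda x: x * 0.9            # one cheap Euler step (NN stand-in)
print(parareal(1.0, n_steps=10, n_iters=3, coarse=coarse, fine=fine)[-1])
```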
arXiv Detail & Related papers (2024-08-19T07:32:41Z)
- Stochastic Gradient Descent for Two-layer Neural Networks [2.0349026069285423]
This paper presents a study on the convergence rates of the stochastic gradient descent (SGD) algorithm when applied to overparameterized two-layer neural networks.
Our approach combines the Neural Tangent Kernel (NTK) approximation with convergence analysis in the Reproducing Kernel Hilbert Space (RKHS) generated by the NTK.
Our research framework enables us to explore the intricate interplay between kernel methods and optimization processes, shedding light on the dynamics and convergence properties of neural networks.
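A generic illustration (not the paper's NTK-regime construction): plain one-sample SGD on an overparameterized two-layer ReLU network for a toy regression task, with all sizes and the learning rate assumed here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 64, 5, 200                            # samples, input dim, width
X = rng.standard_normal((n, d))
y = np.sin(X @ rng.standard_normal(d))          # toy regression target

A = rng.standard_normal((m, d)) / np.sqrt(d)    # hidden layer
b = rng.standard_normal(m) / np.sqrt(m)         # output layer

lr = 0.005
for _ in range(20000):
    i = rng.integers(n)                         # one-sample stochastic gradient
    h = np.maximum(A @ X[i], 0.0)               # ReLU features
    err = b @ h - y[i]
    grad_b = err * h                            # gradients from one forward pass
    grad_A = err * np.outer(b * (h > 0.0), X[i])
    b -= lr * grad_b
    A -= lr * grad_A

print(np.mean((np.maximum(X @ A.T, 0.0) @ b - y) ** 2))  # training MSE
```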
arXiv Detail & Related papers (2024-07-10T13:58:57Z)
- Optimized classification with neural ODEs via separability [0.0]
Classification of $N$ points becomes a simultaneous control problem when viewed through the lens of neural ordinary differential equations (neural ODEs).
In this study, we focus on estimating the number of neurons required for efficient cluster-based classification.
We propose a new constructive algorithm that simultaneously classifies clusters of $d$ points from any initial configuration.
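To make the control view concrete, a minimal sketch of Euler-integrating a batch of points under piecewise-constant controls, with assumed dynamics $\dot{x} = W\,\mathrm{relu}(Ax + b)$ and hypothetical parameters; the paper's constructive algorithm chooses these controls deliberately rather than at random.

```python
import numpy as np

def flow(x, controls, dt=0.05, steps=40):
    """Euler integration of dx/dt = W @ relu(A @ x + b) for a batch
    x of shape (N, d); each (A, b, W) is one piecewise-constant control."""
    for A, b, W in controls:
        for _ in range(steps):
            x = x + dt * np.maximum(x @ A.T + b, 0.0) @ W.T
    return x

rng = np.random.default_rng(0)
d, p = 2, 4
controls = [(rng.standard_normal((p, d)), rng.standard_normal(p),
             rng.standard_normal((d, p)))]
print(flow(rng.standard_normal((10, d)), controls).round(2))
```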
arXiv Detail & Related papers (2023-12-21T12:56:40Z)
- Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
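The symmetry in question can be checked directly: relabeling the hidden neurons of a two-layer ReLU network, by permuting the rows of the first weight matrix and the columns of the second, leaves the computed function unchanged. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, o = 3, 8, 2
W1, b1 = rng.standard_normal((h, d)), rng.standard_normal(h)
W2 = rng.standard_normal((o, h))

def net(x, W1, b1, W2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0)

perm = rng.permutation(h)                  # relabel the hidden neurons
x = rng.standard_normal(d)
out_a = net(x, W1, b1, W2)
out_b = net(x, W1[perm], b1[perm], W2[:, perm])
assert np.allclose(out_a, out_b)           # same function, permuted weights
```

Neural functionals that process weights must treat all such permuted weight settings identically, which is exactly the equivariance constraint the paper builds into its architectures.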
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
- On the Approximation and Complexity of Deep Neural Networks to Invariant Functions [0.0]
We study the approximation and complexity of deep neural networks to invariant functions.
We show that a broad range of invariant functions can be approximated by various types of neural network models.
We provide a feasible application that connects the parameter estimation and forecasting of high-resolution signals with our theoretical conclusions.
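One standard construction in this vein, a sketch of sum-pooling (Deep Sets-style) invariance rather than necessarily one of the paper's models: encode set elements, pool with a symmetric sum, and decode, which makes the output permutation invariant by construction.

```python
import numpy as np

def invariant_net(xs, W_enc, W_dec):
    """Permutation-invariant network: encode each element, sum, then decode.
    Sum pooling makes the output independent of the ordering of `xs`."""
    h = np.maximum(xs @ W_enc, 0.0).sum(axis=0)   # symmetric pooling
    return np.maximum(h @ W_dec, 0.0)

rng = np.random.default_rng(0)
W_enc, W_dec = rng.standard_normal((2, 16)), rng.standard_normal((16, 1))
xs = rng.standard_normal((5, 2))
assert np.allclose(invariant_net(xs, W_enc, W_dec),
                   invariant_net(xs[::-1], W_enc, W_dec))  # order irrelevant
```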
arXiv Detail & Related papers (2022-10-27T09:19:19Z)
- Acceleration techniques for optimization over trained neural network ensembles [1.0323063834827415]
We study optimization problems where the objective function is modeled through feedforward neural networks with rectified linear unit activation.
We present a mixed-integer linear program based on existing popular big-$M$ formulations for optimizing over a single neural network.
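A minimal sketch of the standard big-$M$ encoding of a single ReLU neuron $y = \max(0, wx + b)$ as a mixed-integer program, written with the PuLP modeling library for illustration; the constants and the choice of PuLP are assumptions here, not the paper's formulation, which covers whole networks and ensembles.

```python
# pip install pulp -- generic big-M modeling sketch for one ReLU neuron
import pulp

w, b, M = 2.0, 0.5, 10.0                  # M bounds |w*x + b| on the input box
prob = pulp.LpProblem("relu_encoding", pulp.LpMaximize)
x = pulp.LpVariable("x", -1, 1)
y = pulp.LpVariable("y", 0, M)
z = pulp.LpVariable("z", cat="Binary")    # z = 1 iff the neuron is active
prob += y                                  # objective: maximize the output
prob += y >= w * x + b                     # y at least the pre-activation
prob += y <= w * x + b + M * (1 - z)       # tight when active (z = 1)
prob += y <= M * z                         # forced to 0 when inactive (z = 0)
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.value(x), pulp.value(y))        # expect x = 1, y = 2.5
```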
arXiv Detail & Related papers (2021-12-13T20:50:54Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
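As one plausible way to quantify such diversity (an assumption made here for illustration, not the paper's exact metric): the mean pairwise angular separation of the hidden neurons' incoming weight vectors.

```python
import numpy as np

def neuron_diversity(W):
    """Mean pairwise (1 - |cosine|) between hidden neurons' incoming weights.
    One plausible diversity measure; the paper's exact metric may differ."""
    Wn = W / np.linalg.norm(W, axis=1, keepdims=True)
    cos = Wn @ Wn.T
    iu = np.triu_indices(len(W), k=1)
    return np.mean(1.0 - np.abs(cos[iu]))

rng = np.random.default_rng(0)
print(neuron_diversity(rng.standard_normal((16, 8))))   # high: varied neurons
print(neuron_diversity(np.ones((16, 8))))               # 0: identical neurons
```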
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
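The min-max training loop can be illustrated with scalar players standing in for the paper's neural networks (a toy sketch with an assumed objective, not their estimator): simultaneous gradient descent for the minimizing player and gradient ascent for the maximizing one.

```python
# Toy min-max game: min_u max_v f(u, v) = u*v + 0.1*u**2 - 0.1*v**2;
# both players here are scalars standing in for parameterized networks.
u, v, lr = 1.0, 1.0, 0.05
for _ in range(500):
    grad_u = v + 0.2 * u          # df/du for the descent (min) player
    grad_v = u - 0.2 * v          # df/dv for the ascent (max) player
    u, v = u - lr * grad_u, v + lr * grad_v
print(u, v)                        # converges toward the saddle at (0, 0)
```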
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L_1$ and $L_2$ regularizations suppress the increase of model complexity.
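The idea of linearly approximating a curve activation can be sketched with a simple interpolation experiment (a hypothetical proxy, not the paper's complexity measure): the number of linear pieces needed to reach a given error tracks how "curved", and hence how complex, the activation makes the model.

```python
import numpy as np

def piecewise_linear(f, knots):
    """Return a piecewise-linear interpolant of activation f on the knots."""
    ys = f(knots)
    return lambda x: np.interp(x, knots, ys)

x = np.linspace(-4, 4, 1001)
for k in (3, 5, 9, 17):                   # more pieces -> lower max error
    approx = piecewise_linear(np.tanh, np.linspace(-4, 4, k))
    print(k, np.max(np.abs(np.tanh(x) - approx(x))))
```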
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
- Parallelization Techniques for Verifying Neural Networks [52.917845265248744]
We introduce an algorithm that iteratively partitions the verification problem and explore two partitioning strategies.
We also introduce a highly parallelizable pre-processing algorithm that uses the neuron activation phases to simplify the neural network verification problems.
arXiv Detail & Related papers (2020-04-17T20:21:47Z)
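A generic sketch of partition-based verification (an illustration, not the paper's algorithm): propagate interval bounds through the network and bisect the widest input dimension whenever the bound is inconclusive; the resulting subproblems are independent, which is what makes the approach parallelizable.

```python
import numpy as np

def interval_bounds(lo, hi, layers):
    """Propagate an input box through the network with interval arithmetic;
    ReLU is applied between layers but not after the final affine layer."""
    for k, (W, b) in enumerate(layers):
        Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
        lo, hi = Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b
        if k < len(layers) - 1:
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

def verify(lo, hi, layers, depth=6):
    """Certify output[0] >= 0 on the box, bisecting the widest input
    dimension when inconclusive; subproblems are independent."""
    blo, _ = interval_bounds(lo, hi, layers)
    if blo[0] >= 0:
        return True
    if depth == 0:
        return False                      # inconclusive at maximum depth
    i = int(np.argmax(hi - lo))
    mid = (lo[i] + hi[i]) / 2
    l2, h2 = lo.copy(), hi.copy()
    l2[i], h2[i] = mid, mid
    return verify(lo, h2, layers, depth - 1) and verify(l2, hi, layers, depth - 1)

rng = np.random.default_rng(0)
layers = [(rng.standard_normal((4, 2)), rng.standard_normal(4)),
          (rng.standard_normal((1, 4)), np.ones(1) * 3.0)]
print(verify(np.array([-1.0, -1.0]), np.array([1.0, 1.0]), layers))
```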