Sparse Mutation Decompositions: Fine Tuning Deep Neural Networks with Subspace Evolution
- URL: http://arxiv.org/abs/2302.05832v1
- Date: Sun, 12 Feb 2023 01:27:26 GMT
- Title: Sparse Mutation Decompositions: Fine Tuning Deep Neural Networks with Subspace Evolution
- Authors: Tim Whitaker, Darrell Whitley
- Abstract summary: A popular subclass of neuroevolutionary methods, called evolution strategies, relies on dense noise perturbations to mutate networks.
We introduce an approach to alleviating this problem by decomposing dense mutations into low-dimensional subspaces.
We conduct the first large scale exploration of neuroevolutionary fine tuning and ensembling on the notoriously difficult ImageNet dataset.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neuroevolution is a promising area of research that combines evolutionary
algorithms with neural networks. A popular subclass of neuroevolutionary
methods, called evolution strategies, relies on dense noise perturbations to
mutate networks, which can be sample inefficient and challenging for large
models with millions of parameters. We introduce an approach to alleviating
this problem by decomposing dense mutations into low-dimensional subspaces.
Restricting mutations in this way can significantly reduce variance as networks
can handle stronger perturbations while maintaining performance, which enables
a more controlled and targeted evolution of deep networks. This approach is
uniquely effective for the task of fine tuning pre-trained models, which is an
increasingly valuable area of research as networks continue to scale in size
and open source models become more widely available. Furthermore, we show how
this work naturally connects to ensemble learning where sparse mutations
encourage diversity among children such that their combined predictions can
reliably improve performance. We conduct the first large scale exploration of
neuroevolutionary fine tuning and ensembling on the notoriously difficult
ImageNet dataset, where we see small generalization improvements with only a
single evolutionary generation using nearly a dozen different deep neural
network architectures.
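To make the core idea concrete, below is a minimal NumPy sketch of one sparse-mutation generation: a pre-trained parameter vector is copied, only a small random subspace of coordinates is perturbed, and the children's predictions are averaged into an ensemble. This is an illustrative reading of the abstract, not the authors' implementation; the function names (`sparse_mutate`, `evolve_and_ensemble`), the `predict` stub, and the default `subspace_fraction` and `sigma` values are assumptions.

```python
import numpy as np

def sparse_mutate(theta, subspace_fraction=0.01, sigma=0.1, rng=None):
    """Perturb only a random low-dimensional subspace of the parameters.

    Unlike dense evolution strategies, which add Gaussian noise to every
    weight, only a small fraction of coordinates is mutated. This lets the
    mutation strength (sigma) be larger without destroying the behavior of
    the pre-trained network. (Illustrative sketch, not the paper's code.)
    """
    rng = np.random.default_rng() if rng is None else rng
    child = theta.copy()
    k = max(1, int(subspace_fraction * theta.size))      # subspace dimension
    idx = rng.choice(theta.size, size=k, replace=False)  # random coordinates
    child[idx] += sigma * rng.standard_normal(k)         # perturb subspace only
    return child

def evolve_and_ensemble(theta, predict, x, n_children=8, **mutate_kwargs):
    """One evolutionary generation: spawn sparse-mutated children of a
    pre-trained parent and average their predictions (ensembling).

    `predict(params, x)` is an assumed stand-in for a forward pass that
    returns class probabilities for the inputs `x`.
    """
    children = [sparse_mutate(theta, **mutate_kwargs) for _ in range(n_children)]
    probs = np.stack([predict(c, x) for c in children])  # (n_children, batch, classes)
    return probs.mean(axis=0)                            # combined prediction
```

Because each child is perturbed in a different random subspace, the children tend to make decorrelated errors, which is what lets the simple average in `evolve_and_ensemble` improve on the parent; per the abstract, the paper evaluates exactly this single-generation setting at ImageNet scale.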
Related papers
- Message Passing Variational Autoregressive Network for Solving Intractable Ising Models [6.261096199903392]
Many deep neural networks have been used to solve Ising models, including autoregressive neural networks, convolutional neural networks, recurrent neural networks, and graph neural networks.
Here we propose a variational autoregressive architecture with a message passing mechanism, which can effectively utilize the interactions between spin variables.
The new network trained under an annealing framework outperforms existing methods in solving several prototypical Ising spin Hamiltonians, especially for larger spin systems at low temperatures.
arXiv Detail & Related papers (2024-04-09T11:27:07Z)
- AD-NEv++: The multi-architecture neuroevolution-based multivariate anomaly detection framework [0.794682109939797]
Anomaly detection tools and methods enable key analytical capabilities in modern cyber-physical and sensor-based systems.
We propose AD-NEv++, a three-stage neuroevolution-based method that synergistically combines subspace evolution, model evolution, and fine-tuning.
We show that AD-NEv++ improves on and outperforms state-of-the-art GNN (Graph Neural Network) architectures on all anomaly detection benchmarks.
arXiv Detail & Related papers (2024-03-25T08:40:58Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- Spiking Generative Adversarial Network with Attention Scoring Decoding [4.5727987473456055]
Spiking neural networks offer a closer approximation to brain-like processing.
We build a spiking generative adversarial network capable of handling complex images.
arXiv Detail & Related papers (2023-05-17T14:35:45Z)
- Multiobjective Evolutionary Pruning of Deep Neural Networks with Transfer Learning for improving their Performance and Robustness [15.29595828816055]
This work proposes MO-EvoPruneDeepTL, a multi-objective evolutionary pruning algorithm.
We use Transfer Learning to adapt the last layers of Deep Neural Networks, by replacing them with sparse layers evolved by a genetic algorithm.
Experiments show that our proposal achieves promising results on all objectives, and direct relations among them are presented.
arXiv Detail & Related papers (2023-02-20T19:33:38Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- An Artificial Neural Network Functionalized by Evolution [2.0625936401496237]
We propose a hybrid model which combines the tensor calculus of feed-forward neural networks with Pseudo-Darwinian mechanisms.
This allows for finding topologies that are well adapted to strategy elaboration, control problems, or pattern recognition tasks.
In particular, the model can provide adapted topologies at early evolutionary stages and exhibits 'structural convergence', which can find applications in robotics, big data, and artificial life.
arXiv Detail & Related papers (2022-05-16T14:49:58Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Epigenetic evolution of deep convolutional models [81.21462458089142]
We build upon a previously proposed neuroevolution framework to evolve deep convolutional models.
We propose a convolutional layer layout which allows kernels of different shapes and sizes to coexist within the same layer.
The proposed layout enables the size and shape of individual kernels within a convolutional layer to be evolved with a corresponding new mutation operator.
arXiv Detail & Related papers (2021-04-12T12:45:16Z)
- EvoPose2D: Pushing the Boundaries of 2D Human Pose Estimation using Accelerated Neuroevolution with Weight Transfer [82.28607779710066]
We explore the application of neuroevolution, a form of neural architecture search inspired by biological evolution, in the design of 2D human pose networks.
Our method produces network designs that are more efficient and more accurate than state-of-the-art hand-designed networks.
arXiv Detail & Related papers (2020-11-17T05:56:16Z)