Discovering Evolution Strategies via Meta-Black-Box Optimization
- URL: http://arxiv.org/abs/2211.11260v1
- Date: Mon, 21 Nov 2022 08:48:46 GMT
- Title: Discovering Evolution Strategies via Meta-Black-Box Optimization
- Authors: Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dallibard, Chris Lu, Satinder Singh, Sebastian Flennerhag
- Abstract summary: We propose to discover effective update rules for evolution strategies via meta-learning.
Our approach employs a search strategy parametrized by a self-attention-based architecture.
We show that it is possible to self-referentially train an evolution strategy from scratch, with the learned update rule used to drive the outer meta-learning loop.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optimizing functions without access to gradients is the remit of black-box
methods such as evolution strategies. While highly general, their learning
dynamics are often heuristic and inflexible - exactly the limitations
that meta-learning can address. Hence, we propose to discover effective update
rules for evolution strategies via meta-learning. Concretely, our approach
employs a search strategy parametrized by a self-attention-based architecture,
which guarantees the update rule is invariant to the ordering of the candidate
solutions. We show that meta-evolving this system on a small set of
representative low-dimensional analytic optimization problems is sufficient to
discover new evolution strategies capable of generalizing to unseen
optimization problems, population sizes and optimization horizons. Furthermore,
the same learned evolution strategy can outperform established neuroevolution
baselines on supervised and continuous control tasks. As additional
contributions, we ablate the individual neural network components of our
method; reverse engineer the learned strategy into an explicit heuristic form,
which remains highly competitive; and show that it is possible to
self-referentially train an evolution strategy from scratch, with the learned
update rule used to drive the outer meta-learning loop.
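The abstract's key structural requirement, an update rule that is invariant to the ordering of candidate solutions, can be illustrated without the learned network: any recombination weighting computed from fitness order statistics is permutation-invariant. The sketch below is a toy ES whose weights come from a softmax over fitness ranks; this softmax form, the `temperature`, and the fixed `sigma` are illustrative assumptions, not the paper's trained self-attention architecture.

```python
import numpy as np

def es_step(mean, sigma, fn, popsize=16, temperature=1.0, rng=None):
    """One ES step with recombination weights computed from fitness ranks
    via a softmax. The weights depend on candidates only through their
    ranks, so the update is invariant to candidate ordering -- a toy
    stand-in for the paper's self-attention parametrization (assumption:
    the actual learned rule is a trained attention network)."""
    if rng is None:
        rng = np.random.default_rng(0)
    noise = rng.standard_normal((popsize, mean.size))
    candidates = mean + sigma * noise
    fitness = np.array([fn(c) for c in candidates])
    ranks = fitness.argsort().argsort()          # rank 0 = best (minimization)
    z = -(ranks - ranks.mean()) / (ranks.std() + 1e-8)
    weights = np.exp(z / temperature)
    weights /= weights.sum()                     # convex recombination weights
    return mean + sigma * (weights @ noise)      # shift mean toward good noise

def sphere(x):
    return float(x @ x)

mean = np.full(5, 3.0)
rng = np.random.default_rng(42)
for _ in range(200):
    mean = es_step(mean, sigma=0.3, fn=sphere, rng=rng)
```

Because permutation invariance is baked into the weight computation rather than learned, this sketch cannot generalize across tasks the way the meta-evolved rule does; it only shows the invariance property the architecture guarantees.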
Related papers
- Can Learned Optimization Make Reinforcement Learning Less Difficult? [70.5036361852812]
We consider whether learned optimization can help overcome reinforcement learning difficulties.
Our method, Learned Optimization for Plasticity, Exploration and Non-stationarity (OPEN), meta-learns an update rule whose input features and output structure are informed by previously proposed solutions to these difficulties.
arXiv Detail & Related papers (2024-07-09T17:55:23Z) - Solving Deep Reinforcement Learning Tasks with Evolution Strategies and Linear Policy Networks [0.017476232824732776]
This study investigates how Evolution Strategies perform compared to gradient-based deep reinforcement learning methods.
We benchmark both deep policy networks and networks consisting of a single linear layer from observations to actions for three gradient-based methods.
Our results reveal that Evolution Strategies can find effective linear policies for many reinforcement learning benchmark tasks.
arXiv Detail & Related papers (2024-02-10T09:15:21Z) - Meta-Learning Strategies through Value Maximization in Neural Networks [7.285835869818669]
We present a learning effort framework capable of efficiently optimizing control signals on a fully normative objective.
We apply this framework to investigate the effect of approximations in common meta-learning algorithms.
Across settings, we find that control effort is most beneficial when applied to easier aspects of a task early in learning.
arXiv Detail & Related papers (2023-10-30T18:29:26Z) - Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies [50.10277748405355]
Noise-Reuse Evolution Strategies (NRES) is a general class of unbiased online evolution strategies methods.
We show NRES results in faster convergence than existing AD and ES methods in terms of wall-clock time and number of steps across a variety of applications.
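The core mechanic behind noise reuse can be shown with a standard antithetic ES gradient estimator evaluated against a fixed, pre-sampled noise matrix. This is a simplification: NRES is defined for unrolled online computation graphs and its unbiasedness argument does not reduce to this toy form, so treat the reuse here as purely illustrative.

```python
import numpy as np

def es_grad(fn, x, sigma, noise):
    """Antithetic ES gradient estimate using a fixed noise matrix.
    Reusing the same perturbations across consecutive steps is the rough
    idea behind noise-reuse strategies (assumption: NRES proper operates
    on unrolled computation graphs, not this single-evaluation form)."""
    f_plus = np.array([fn(x + sigma * e) for e in noise])
    f_minus = np.array([fn(x - sigma * e) for e in noise])
    return ((f_plus - f_minus)[:, None] * noise).mean(axis=0) / (2 * sigma)

def quad(x):
    return float(x @ x)   # true gradient is 2x

rng = np.random.default_rng(0)
noise = rng.standard_normal((512, 3))   # sampled once, reused at every step
x = np.array([1.0, -2.0, 0.5])
g = es_grad(quad, x, sigma=0.01, noise=noise)
```

With the noise held fixed, repeated calls at the same point return the identical estimate, which is what makes reuse attractive for variance reduction across nearby online steps.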
arXiv Detail & Related papers (2023-04-21T17:53:05Z) - Meta Mirror Descent: Optimiser Learning for Fast Convergence [85.98034682899855]
We take a different perspective starting from mirror descent rather than gradient descent, and meta-learning the corresponding Bregman divergence.
Within this paradigm, we formalise a novel meta-learning objective of minimising the regret bound of learning.
Unlike many meta-learned optimisers, it also supports convergence and generalisation guarantees and uniquely does so without requiring validation data.
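For a concrete instance of the mirror-descent family this paper meta-learns over, fixing the mirror map to negative entropy (whose Bregman divergence is the KL divergence) yields the exponentiated-gradient update on the probability simplex. The fixed divergence and step size below are assumptions for illustration; the paper's contribution is meta-learning the Bregman divergence itself.

```python
import numpy as np

def eg_step(p, grad, eta=0.5):
    """Mirror-descent step under the negative-entropy mirror map: on the
    simplex this is the exponentiated-gradient update. The Bregman
    divergence is fixed to KL here, whereas the paper meta-learns it."""
    q = p * np.exp(-eta * grad)
    return q / q.sum()           # re-normalize back onto the simplex

# Minimize the linear objective <c, p> over the simplex:
# mass should concentrate on the coordinate with the smallest cost.
c = np.array([0.9, 0.1, 0.5])
p = np.full(3, 1 / 3)
for _ in range(100):
    p = eg_step(p, c)
```

The update stays on the simplex by construction, which is the usual motivation for choosing a mirror map matched to the constraint geometry rather than plain gradient descent plus projection.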
arXiv Detail & Related papers (2022-03-05T11:41:13Z) - Bootstrapped Meta-Learning [48.017607959109924]
We propose an algorithm that tackles a challenging meta-optimisation problem by letting the meta-learner teach itself.
The algorithm first bootstraps a target from the meta-learner, then optimises the meta-learner by minimising the distance to that target under a chosen (pseudo-)metric.
We achieve a new state of the art for model-free agents on the Atari ALE benchmark, improve upon MAML in few-shot learning, and demonstrate how our approach opens up new possibilities.
arXiv Detail & Related papers (2021-09-09T18:29:05Z) - Population-Based Evolution Optimizes a Meta-Learning Objective [0.6091702876917279]
We propose that meta-learning and adaptive evolvability optimize for high performance after a set of learning iterations.
We demonstrate this claim with a simple evolutionary algorithm, Population-Based Meta Learning.
arXiv Detail & Related papers (2021-03-11T03:45:43Z) - Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z) - Meta-learning the Learning Trends Shared Across Tasks [123.10294801296926]
Gradient-based meta-learning algorithms excel at quick adaptation to new tasks with limited data.
Existing meta-learning approaches only depend on the current task information during the adaptation.
We propose a 'Path-aware' model-agnostic meta-learning approach.
arXiv Detail & Related papers (2020-10-19T08:06:47Z) - Evolving Inborn Knowledge For Fast Adaptation in Dynamic POMDP Problems [5.23587935428994]
In this paper, we exploit the highly adaptive nature of neuromodulated neural networks to evolve a controller that uses the latent space of an autoencoder in a POMDP.
The integration of inborn knowledge and online plasticity enabled fast adaptation and better performance in comparison to some non-evolutionary meta-reinforcement learning algorithms.
arXiv Detail & Related papers (2020-04-27T14:55:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed papers) and is not responsible for any consequences of its use.