Discovering Evolution Strategies via Meta-Black-Box Optimization
- URL: http://arxiv.org/abs/2211.11260v1
- Date: Mon, 21 Nov 2022 08:48:46 GMT
- Title: Discovering Evolution Strategies via Meta-Black-Box Optimization
- Authors: Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dallibard, Chris Lu, Satinder Singh, Sebastian Flennerhag
- Abstract summary: We propose to discover effective update rules for evolution strategies via meta-learning.
Our approach employs a search strategy parametrized by a self-attention-based architecture.
We show that it is possible to self-referentially train an evolution strategy from scratch, with the learned update rule used to drive the outer meta-learning loop.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optimizing functions without access to gradients is the remit of black-box
methods such as evolution strategies. While highly general, their learning
dynamics are often heuristic and inflexible - exactly the limitations
that meta-learning can address. Hence, we propose to discover effective update
rules for evolution strategies via meta-learning. Concretely, our approach
employs a search strategy parametrized by a self-attention-based architecture,
which guarantees the update rule is invariant to the ordering of the candidate
solutions. We show that meta-evolving this system on a small set of
representative low-dimensional analytic optimization problems is sufficient to
discover new evolution strategies capable of generalizing to unseen
optimization problems, population sizes and optimization horizons. Furthermore,
the same learned evolution strategy can outperform established neuroevolution
baselines on supervised and continuous control tasks. As additional
contributions, we ablate the individual neural network components of our
method; reverse engineer the learned strategy into an explicit heuristic form,
which remains highly competitive; and show that it is possible to
self-referentially train an evolution strategy from scratch, with the learned
update rule used to drive the outer meta-learning loop.
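The abstract's key structural requirement, an update rule that is invariant to the ordering of candidate solutions, can be illustrated without the learned network: any recombination weighting computed from fitness order statistics is permutation-invariant. The sketch below is a toy ES whose weights come from a softmax over fitness ranks; this softmax form, the `temperature`, and the fixed `sigma` are illustrative assumptions, not the paper's trained self-attention architecture.

```python
import numpy as np

def es_step(mean, sigma, fn, popsize=16, temperature=1.0, rng=None):
    """One ES step with recombination weights computed from fitness ranks
    via a softmax. The weights depend on candidates only through their
    ranks, so the update is invariant to candidate ordering -- a toy
    stand-in for the paper's self-attention parametrization (assumption:
    the actual learned rule is a trained attention network)."""
    if rng is None:
        rng = np.random.default_rng(0)
    noise = rng.standard_normal((popsize, mean.size))
    candidates = mean + sigma * noise
    fitness = np.array([fn(c) for c in candidates])
    ranks = fitness.argsort().argsort()          # rank 0 = best (minimization)
    z = -(ranks - ranks.mean()) / (ranks.std() + 1e-8)
    weights = np.exp(z / temperature)
    weights /= weights.sum()                     # convex recombination weights
    return mean + sigma * (weights @ noise)      # shift mean toward good noise

def sphere(x):
    return float(x @ x)

mean = np.full(5, 3.0)
rng = np.random.default_rng(42)
for _ in range(200):
    mean = es_step(mean, sigma=0.3, fn=sphere, rng=rng)
```

Because permutation invariance is baked into the weight computation rather than learned, this sketch cannot generalize across tasks the way the meta-evolved rule does; it only shows the invariance property the architecture guarantees.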
Related papers
- Can Learned Optimization Make Reinforcement Learning Less Difficult? [70.5036361852812]
We consider whether learned optimization can help overcome reinforcement learning difficulties.
Our method, Learned Optimization for Plasticity, Exploration and Non-stationarity (OPEN), meta-learns an update rule whose input features and output structure are informed by previously proposed solutions to these difficulties.
arXiv Detail & Related papers (2024-07-09T17:55:23Z) - Solving Deep Reinforcement Learning Tasks with Evolution Strategies and Linear Policy Networks [0.017476232824732776]
This study investigates how Evolution Strategies perform compared to gradient-based deep reinforcement learning methods.
We benchmark both deep policy networks and networks consisting of a single linear layer from observations to actions for three gradient-based methods.
Our results reveal that Evolution Strategies can find effective linear policies for many reinforcement learning benchmark tasks.
arXiv Detail & Related papers (2024-02-10T09:15:21Z) - Meta-Learning Strategies through Value Maximization in Neural Networks [7.285835869818669]
We present a learning effort framework capable of efficiently optimizing control signals on a fully normative objective.
We apply this framework to investigate the effect of approximations in common meta-learning algorithms.
Across settings, we find that control effort is most beneficial when applied to easier aspects of a task early in learning.
arXiv Detail & Related papers (2023-10-30T18:29:26Z) - Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies [50.10277748405355]
Noise-Reuse Evolution Strategies (NRES) is a general class of unbiased online evolution strategies methods.
We show NRES results in faster convergence than existing AD and ES methods in terms of wall-clock time and number of steps across a variety of applications.
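The core mechanic behind noise reuse can be shown with a standard antithetic ES gradient estimator evaluated against a fixed, pre-sampled noise matrix. This is a simplification: NRES is defined for unrolled online computation graphs and its unbiasedness argument does not reduce to this toy form, so treat the reuse here as purely illustrative.

```python
import numpy as np

def es_grad(fn, x, sigma, noise):
    """Antithetic ES gradient estimate using a fixed noise matrix.
    Reusing the same perturbations across consecutive steps is the rough
    idea behind noise-reuse strategies (assumption: NRES proper operates
    on unrolled computation graphs, not this single-evaluation form)."""
    f_plus = np.array([fn(x + sigma * e) for e in noise])
    f_minus = np.array([fn(x - sigma * e) for e in noise])
    return ((f_plus - f_minus)[:, None] * noise).mean(axis=0) / (2 * sigma)

def quad(x):
    return float(x @ x)   # true gradient is 2x

rng = np.random.default_rng(0)
noise = rng.standard_normal((512, 3))   # sampled once, reused at every step
x = np.array([1.0, -2.0, 0.5])
g = es_grad(quad, x, sigma=0.01, noise=noise)
```

With the noise held fixed, repeated calls at the same point return the identical estimate, which is what makes reuse attractive for variance reduction across nearby online steps.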
arXiv Detail & Related papers (2023-04-21T17:53:05Z) - Meta Mirror Descent: Optimiser Learning for Fast Convergence [85.98034682899855]
We take a different perspective starting from mirror descent rather than gradient descent, and meta-learning the corresponding Bregman divergence.
Within this paradigm, we formalise a novel meta-learning objective of minimising the regret bound of learning.
Unlike many meta-learned optimisers, it also supports convergence and generalisation guarantees and uniquely does so without requiring validation data.
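For a concrete instance of the mirror-descent family this paper meta-learns over, fixing the mirror map to negative entropy (whose Bregman divergence is the KL divergence) yields the exponentiated-gradient update on the probability simplex. The fixed divergence and step size below are assumptions for illustration; the paper's contribution is meta-learning the Bregman divergence itself.

```python
import numpy as np

def eg_step(p, grad, eta=0.5):
    """Mirror-descent step under the negative-entropy mirror map: on the
    simplex this is the exponentiated-gradient update. The Bregman
    divergence is fixed to KL here, whereas the paper meta-learns it."""
    q = p * np.exp(-eta * grad)
    return q / q.sum()           # re-normalize back onto the simplex

# Minimize the linear objective <c, p> over the simplex:
# mass should concentrate on the coordinate with the smallest cost.
c = np.array([0.9, 0.1, 0.5])
p = np.full(3, 1 / 3)
for _ in range(100):
    p = eg_step(p, c)
```

The update stays on the simplex by construction, which is the usual motivation for choosing a mirror map matched to the constraint geometry rather than plain gradient descent plus projection.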
arXiv Detail & Related papers (2022-03-05T11:41:13Z) - Bootstrapped Meta-Learning [48.017607959109924]
We propose an algorithm that tackles a challenging meta-optimisation problem by letting the meta-learner teach itself.
The algorithm first bootstraps a target from the meta-learner, then optimises the meta-learner by minimising the distance to that target under a chosen (pseudo-)metric.
We achieve a new state of the art for model-free agents on the Atari ALE benchmark, improve upon MAML in few-shot learning, and demonstrate how our approach opens up new possibilities.
arXiv Detail & Related papers (2021-09-09T18:29:05Z) - Population-Based Evolution Optimizes a Meta-Learning Objective [0.6091702876917279]
We propose that meta-learning and adaptive evolvability optimize for high performance after a set of learning iterations.
We demonstrate this claim with a simple evolutionary algorithm, Population-Based Meta Learning.
arXiv Detail & Related papers (2021-03-11T03:45:43Z) - Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z) - Meta-learning the Learning Trends Shared Across Tasks [123.10294801296926]
Gradient-based meta-learning algorithms excel at quick adaptation to new tasks with limited data.
Existing meta-learning approaches only depend on the current task information during the adaptation.
We propose a 'Path-aware' model-agnostic meta-learning approach.
arXiv Detail & Related papers (2020-10-19T08:06:47Z) - Evolving Inborn Knowledge For Fast Adaptation in Dynamic POMDP Problems [5.23587935428994]
In this paper, we exploit the highly adaptive nature of neuromodulated neural networks to evolve a controller that uses the latent space of an autoencoder in a POMDP.
The integration of inborn knowledge and online plasticity enabled fast adaptation and better performance in comparison to some non-evolutionary meta-reinforcement learning algorithms.
arXiv Detail & Related papers (2020-04-27T14:55:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed papers) and is not responsible for any consequences of its use.