Evolution Transformer: In-Context Evolutionary Optimization
- URL: http://arxiv.org/abs/2403.02985v1
- Date: Tue, 5 Mar 2024 14:04:13 GMT
- Title: Evolution Transformer: In-Context Evolutionary Optimization
- Authors: Robert Tjarko Lange, Yingtao Tian, Yujin Tang
- Abstract summary: We introduce Evolution Transformer, a causal Transformer architecture, which can flexibly characterize a family of Evolution Strategies.
We train the model weights using Evolutionary Algorithm Distillation, a technique for supervised optimization of sequence models.
We analyze the resulting properties of the Evolution Transformer and propose a technique to fully self-referentially train the Evolution Transformer.
- Score: 6.873777465945062
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Evolutionary optimization algorithms are often derived from loose biological
analogies and struggle to leverage information obtained during the sequential
course of optimization. An alternative promising approach is to leverage data
and directly discover powerful optimization principles via meta-optimization.
In this work, we follow such a paradigm and introduce Evolution Transformer, a
causal Transformer architecture, which can flexibly characterize a family of
Evolution Strategies. Given a trajectory of evaluations and search distribution
statistics, Evolution Transformer outputs a performance-improving update to the
search distribution. The architecture imposes a set of suitable inductive
biases, i.e. the invariance of the distribution update to the order of
population members within a generation and equivariance to the order of the
search dimensions. We train the model weights using Evolutionary Algorithm
Distillation, a technique for supervised optimization of sequence models using
teacher algorithm trajectories. The resulting model exhibits strong in-context
optimization performance and shows strong generalization capabilities to
otherwise challenging neuroevolution tasks. We analyze the resulting properties
of the Evolution Transformer and propose a technique to fully
self-referentially train the Evolution Transformer, starting from a random
initialization and bootstrapping its own learning progress. We provide an open
source implementation under https://github.com/RobertTLange/evosax.
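To make the update rule described in the abstract concrete, the following is a minimal, hypothetical JAX sketch (not the paper's code; `apply_fn`, `summarize_generation`, and all hyperparameters are illustrative assumptions) of a learned sequence model acting as the update step of a diagonal-Gaussian Evolution Strategy: per-generation summaries are pooled over population members (order invariance) and computed per search dimension with shared operations (order equivariance), while a causal model over the generation axis plays the role of the Evolution Transformer.

```python
# Hypothetical sketch, not the paper's implementation: a learned sequence model
# acting as the update rule of a diagonal-Gaussian Evolution Strategy.
import jax
import jax.numpy as jnp


def ask(rng, mean, std, popsize):
    """Sample a population from the current search distribution."""
    eps = jax.random.normal(rng, (popsize, mean.shape[0]))
    return mean + std * eps


def summarize_generation(x, fitness, mean, std):
    """Permutation-invariant, dimension-equivariant summary of one generation.

    Pooling over the population axis removes any dependence on the order of
    population members; every feature is computed per search dimension with
    shared operations, so permuting dimensions permutes the output the same way.
    """
    z = (x - mean) / std                                   # (popsize, dims)
    f = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
    w = jax.nn.softmax(-f)                                 # lower fitness = better
    return jnp.stack(
        [
            (w[:, None] * z).sum(axis=0),   # fitness-weighted recombination direction
            z.mean(axis=0),                 # first moment of the samples
            (z ** 2).mean(axis=0),          # second moment of the samples
        ],
        axis=-1,
    )                                                      # (dims, 3)


def tell(apply_fn, params, trajectory, mean, std):
    """Distribution update proposed by the sequence model.

    `apply_fn` stands in for a causal Transformer that attends over the
    trajectory of generation summaries and returns per-dimension updates.
    """
    delta_mean, delta_log_std = apply_fn(params, trajectory)
    return mean + std * delta_mean, std * jnp.exp(delta_log_std)


def run(rng, apply_fn, params, objective, dims=8, popsize=16, generations=32):
    mean, std = jnp.zeros(dims), jnp.ones(dims)
    summaries = []
    for _ in range(generations):
        rng, key = jax.random.split(rng)
        x = ask(key, mean, std, popsize)
        fitness = jax.vmap(objective)(x)                   # minimize objective
        summaries.append(summarize_generation(x, fitness, mean, std))
        mean, std = tell(apply_fn, params, jnp.stack(summaries), mean, std)
    return mean, std
```

Under this sketch, Evolutionary Algorithm Distillation would correspond to supervised training of `apply_fn` on trajectories recorded from a teacher algorithm, and the self-referential variant described in the abstract would bootstrap from the model's own optimization progress instead of an external teacher.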
Related papers
- Learning on Transformers is Provable Low-Rank and Sparse: A One-layer Analysis [63.66763657191476]
We show that efficient numerical training and inference algorithms such as low-rank computation achieve impressive performance for learning Transformer-based adaptation.
We analyze how magnitude-based pruning affects generalization while improving adaptation.
We conclude that proper magnitude-based pruning has only a slight effect on the testing performance.
arXiv Detail & Related papers (2024-06-24T23:00:58Z)
- Uncovering mesa-optimization algorithms in Transformers [61.06055590704677]
Some autoregressive models can learn as an input sequence is processed, without undergoing any parameter changes, and without being explicitly trained to do so.
We show that standard next-token prediction error minimization gives rise to a subsidiary learning algorithm that adjusts the model as new inputs are revealed.
Our findings explain in-context learning as a product of autoregressive loss minimization and inform the design of new optimization-based Transformer layers.
arXiv Detail & Related papers (2023-09-11T22:42:50Z)
- Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability [0.0]
We study gradient descent (GD)-based sparse training and evolution strategies (ES).
We find that ES explore diverse and flat local optima and do not preserve linear mode connectivity across sparsity levels and independent runs.
arXiv Detail & Related papers (2023-05-31T15:58:54Z)
- Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches; to the authors' knowledge, this is the first time a simple transformer-based model has done so.
arXiv Detail & Related papers (2023-05-26T00:43:02Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Full Stack Optimization of Transformer Inference: a Survey [58.55475772110702]
Transformer models achieve superior accuracy across a wide range of applications.
The amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate.
There has been an increased focus on making Transformer models more efficient.
arXiv Detail & Related papers (2023-02-27T18:18:13Z)
- evosax: JAX-based Evolution Strategies [0.0]
We release evosax: a JAX-based library of evolutionary optimization algorithms.
evosax implements 30 evolutionary optimization algorithms, including finite-difference-based and estimation-of-distribution evolution strategies as well as various genetic algorithms.
It is designed in a modular fashion and allows for flexible usage via a simple ask-evaluate-tell API.
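As a usage illustration of the ask-evaluate-tell pattern mentioned above, here is a short sketch in the style of the evosax README on a toy sphere objective; the class and method names (`CMA_ES`, `default_params`, `initialize`, `ask`, `tell`) follow older 0.x releases of the library and may differ in newer versions.

```python
# Sketch of evosax's ask-evaluate-tell loop on a toy sphere objective.
# Method names follow the 0.x API and may have changed in newer releases.
import jax
import jax.numpy as jnp
from evosax import CMA_ES

rng = jax.random.PRNGKey(0)
strategy = CMA_ES(popsize=32, num_dims=10)
es_params = strategy.default_params
state = strategy.initialize(rng, es_params)

for _ in range(100):
    rng, rng_ask = jax.random.split(rng)
    x, state = strategy.ask(rng_ask, state, es_params)   # propose candidates
    fitness = jnp.sum(x ** 2, axis=-1)                    # evaluate (sphere function)
    state = strategy.tell(x, fitness, state, es_params)   # update search distribution

print(state.best_member, state.best_fitness)
```

Because the strategy state is passed around as an explicit functional argument, a loop like this composes with JAX transformations such as jit and vmap, which is the point of the modular design mentioned above.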
arXiv Detail & Related papers (2022-12-08T10:34:42Z)
- Accelerating the Evolutionary Algorithms by Gaussian Process Regression with $\epsilon$-greedy acquisition function [2.7716102039510564]
We propose a novel method that estimates the elite individual in order to accelerate the convergence of optimization.
The proposed approach shows broad promise for estimating the elite individual and speeding up convergence.
arXiv Detail & Related papers (2022-10-13T07:56:47Z)
- Finetuning Pretrained Transformers into RNNs [81.72974646901136]
Transformers have outperformed recurrent neural networks (RNNs) in natural language generation.
A linear-complexity recurrent variant has proven well suited for autoregressive generation.
This work aims to convert a pretrained transformer into its efficient recurrent counterpart.
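For intuition on the linear-complexity recurrent variant mentioned above, here is a minimal sketch of the standard linear-attention recurrence, in which two running sums serve as the RNN state during autoregressive decoding; the elu(x)+1 feature map and the function names are illustrative assumptions rather than the paper's specific learned feature map.

```python
# Minimal sketch of linear attention as an RNN (not the paper's exact method):
# a positive feature map phi replaces softmax, so decoding only needs the
# running sums S = sum_i phi(k_i) v_i^T and z = sum_i phi(k_i).
import jax.numpy as jnp


def phi(x):
    # elu(x) + 1: one common positive feature map for linear attention
    return jnp.where(x > 0, x + 1.0, jnp.exp(x))


def recurrent_attention_step(state, q, k, v):
    """One autoregressive decoding step with O(1) cost per token.

    state = (S, z) with S of shape (d_k, d_v) and z of shape (d_k,).
    """
    S, z = state
    S = S + jnp.outer(phi(k), v)                  # accumulate phi(k) v^T
    z = z + phi(k)                                # accumulate phi(k)
    out = (phi(q) @ S) / (phi(q) @ z + 1e-6)      # attention output for this step
    return (S, z), out
```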
arXiv Detail & Related papers (2021-03-24T10:50:43Z)
- Evolutionary Variational Optimization of Generative Models [0.0]
We combine two popular optimization approaches to derive learning algorithms for generative models: variational optimization and evolutionary algorithms.
We show that evolutionary algorithms can effectively and efficiently optimize the variational bound.
In the category of "zero-shot" learning, we observed the evolutionary variational algorithm to significantly improve the state-of-the-art in many benchmark settings.
arXiv Detail & Related papers (2020-12-22T19:06:33Z)