Genetic Drift Regularization: on preventing Actor Injection from breaking Evolution Strategies
- URL: http://arxiv.org/abs/2405.04322v1
- Date: Tue, 7 May 2024 13:48:59 GMT
- Title: Genetic Drift Regularization: on preventing Actor Injection from breaking Evolution Strategies
- Authors: Paul Templier, Emmanuel Rachelson, Antoine Cully, Dennis G. Wilson
- Abstract summary: Genetic Drift Regularization (GDR) is a simple regularization method in the actor training loss that prevents the actor genome from drifting away from the ES.
We show that GDR can improve ES convergence on problems where RL learns well, but also helps RL training on other tasks.
- Score: 9.813386592472535
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Evolutionary Algorithms (EA) have been successfully used for the optimization of neural networks for policy search, but they remain sample inefficient and underperform in some cases compared to gradient-based reinforcement learning (RL). Various methods combine the two approaches, many of them training an RL algorithm on data from EA evaluations and injecting the RL actor into the EA population. However, when using Evolution Strategies (ES) as the EA, the RL actor can drift genetically far from the ES distribution, and injection can cause a collapse of the ES performance. Here, we highlight the phenomenon of genetic drift, where the actor genome and the ES population distribution progressively drift apart, leading to injection having a negative impact on the ES. We introduce Genetic Drift Regularization (GDR), a simple regularization method in the actor training loss that prevents the actor genome from drifting away from the ES. We show that GDR can improve ES convergence on problems where RL learns well, but also helps RL training on other tasks, and fixes the injection issues better than previous controlled injection methods.
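The abstract describes GDR only as a regularization term in the actor training loss, so the following is a minimal sketch of that idea, not the authors' implementation. It assumes the penalty is the squared L2 distance between the flattened actor parameters and the mean of the ES search distribution; the names `gdr_penalty`, `regularized_actor_loss`, `gdr_gradient`, and the weight `gdr_weight` are hypothetical and chosen for illustration.

```python
import numpy as np


def gdr_penalty(actor_genome, es_mean):
    """Squared L2 distance between the actor genome and the ES mean.

    Assumption: the abstract does not give the exact form of the
    regularizer; a squared Euclidean distance is used here for illustration.
    """
    diff = np.asarray(actor_genome, dtype=float) - np.asarray(es_mean, dtype=float)
    return float(np.dot(diff, diff))


def regularized_actor_loss(rl_loss, actor_genome, es_mean, gdr_weight=0.1):
    """Add the weighted drift penalty to the usual RL actor loss (hypothetical weight)."""
    return rl_loss + gdr_weight * gdr_penalty(actor_genome, es_mean)


def gdr_gradient(actor_genome, es_mean, gdr_weight=0.1):
    """Gradient of the weighted penalty with respect to the actor genome."""
    diff = np.asarray(actor_genome, dtype=float) - np.asarray(es_mean, dtype=float)
    return 2.0 * gdr_weight * diff


if __name__ == "__main__":
    actor = np.array([1.0, 2.0])
    es_mean = np.array([0.0, 0.0])
    print("penalty:", gdr_penalty(actor, es_mean))
    print("loss:", regularized_actor_loss(1.0, actor, es_mean))
    # One gradient step on the penalty alone pulls the actor toward the ES mean.
    stepped = actor - 1.0 * gdr_gradient(actor, es_mean)
    print("penalty after step:", gdr_penalty(stepped, es_mean))
```

In this sketch, a larger `gdr_weight` keeps the actor closer to the ES distribution at the cost of constraining the RL update; how that trade-off is set in the paper is not stated in the abstract.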
Related papers
- Reinforcement Learning-assisted Evolutionary Algorithm: A Survey and
Research Opportunities [63.258517066104446]
Reinforcement learning integrated as a component in the evolutionary algorithm has demonstrated superior performance in recent years.
We discuss the RL-EA integration method, the RL-assisted strategy adopted by RL-EA, and its applications according to the existing literature.
In the applications of RL-EA section, we also demonstrate the excellent performance of RL-EA on several benchmarks and a range of public datasets.
arXiv Detail & Related papers (2023-08-25T15:06:05Z) - Learning Better with Less: Effective Augmentation for Sample-Efficient
Visual Reinforcement Learning [57.83232242068982]
Data augmentation (DA) is a crucial technique for enhancing the sample efficiency of visual reinforcement learning (RL) algorithms.
It remains unclear which attributes of DA account for its effectiveness in achieving sample-efficient visual RL.
This work conducts comprehensive experiments to assess the impact of DA's attributes on its efficacy.
arXiv Detail & Related papers (2023-05-25T15:46:20Z) - Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution
Strategies [50.10277748405355]
Noise-Reuse Evolution Strategies (NRES) is a general class of unbiased online evolution strategies methods.
We show NRES results in faster convergence than existing AD and ES methods in terms of wall-clock time and number of steps across a variety of applications.
arXiv Detail & Related papers (2023-04-21T17:53:05Z) - Implicit Stochastic Gradient Descent for Training Physics-informed
Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective in solving forward and inverse differential equation problems.
However, PINNs suffer training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs and improve the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z) - Enabling surrogate-assisted evolutionary reinforcement learning via
policy embedding [28.272572839321104]
This paper proposes the PE-SAERL framework to enable surrogate-assisted evolutionary reinforcement learning via policy embedding.
Empirical results on 5 Atari games show that the proposed method can perform more efficiently than four state-of-the-art algorithms.
arXiv Detail & Related papers (2023-01-31T02:36:06Z) - Direct Mutation and Crossover in Genetic Algorithms Applied to
Reinforcement Learning Tasks [0.9137554315375919]
This paper will focus on applying neuroevolution using a simple genetic algorithm (GA) to find the weights of a neural network that produce optimally behaving agents.
We present two novel modifications that improve the data efficiency and speed of convergence when compared to the initial implementation.
arXiv Detail & Related papers (2022-01-13T07:19:28Z) - Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited
Data [125.7135706352493]
Generative adversarial networks (GANs) typically require ample data for training in order to synthesize high-fidelity images.
Recent studies have shown that training GANs with limited data remains formidable due to discriminator overfitting.
This paper introduces a novel strategy called Adaptive Pseudo Augmentation (APA) to encourage healthy competition between the generator and the discriminator.
arXiv Detail & Related papers (2021-11-12T18:13:45Z) - IE-GAN: An Improved Evolutionary Generative Adversarial Network Using a
New Fitness Function and a Generic Crossover Operator [20.100388977505002]
We propose an improved E-GAN framework called IE-GAN, which introduces a new fitness function and a generic crossover operator.
In particular, the proposed fitness function can model the evolutionary process of individuals more accurately.
The crossover operator, which has been commonly adopted in evolutionary algorithms, can enable offspring to imitate the superior gene expression of their parents.
arXiv Detail & Related papers (2021-07-25T13:55:07Z) - Adam revisited: a weighted past gradients perspective [57.54752290924522]
We propose a novel adaptive method, the weighted adaptive algorithm (WADA), to tackle the non-convergence issues.
We prove that WADA can achieve a weighted data-dependent regret bound, which could be better than the original regret bound of ADAGRAD.
arXiv Detail & Related papers (2021-01-01T14:01:52Z) - Accelerating Reinforcement Learning with a
Directional-Gaussian-Smoothing Evolution Strategy [3.404507240556492]
Evolution strategy (ES) has shown great promise in many challenging reinforcement learning (RL) tasks.
There are two limitations in the current ES practice that may hinder its further capabilities.
In this work, we employ a Directional Gaussian Smoothing Evolutionary Strategy (DGS-ES) to accelerate RL training.
We show that DGS-ES is highly scalable, possesses superior wall-clock time, and achieves competitive reward scores to other popular policy gradient and ES approaches.
arXiv Detail & Related papers (2020-02-21T01:05:57Z)