Related papers: Evolutionary Reinforcement Learning via Cooperative Coevolution

URL: http://arxiv.org/abs/2404.14763v3
Date: Thu, 1 Aug 2024 13:35:22 GMT
Title: Evolutionary Reinforcement Learning via Cooperative Coevolution
Authors: Chengpeng Hu, Jialin Liu, Xin Yao
Abstract summary: This paper proposes a novel cooperative coevolutionary reinforcement learning (CoERL) algorithm. Inspired by cooperative coevolution, CoERL periodically and adaptively decomposes the policy optimisation problem into multiple subproblems. Instead of using genetic operators, CoERL directly searches for partial gradients to update the policy. Experiments on six benchmark locomotion tasks demonstrate that CoERL outperforms seven state-of-the-art algorithms and baselines.
Score: 4.9267335834028625
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, evolutionary reinforcement learning has obtained much attention in various domains. Maintaining a population of actors, evolutionary reinforcement learning utilises the collected experiences to improve the behaviour policy through efficient exploration. However, the poor scalability of genetic operators limits the efficiency of optimising high-dimensional neural networks. To address this issue, this paper proposes a novel cooperative coevolutionary reinforcement learning (CoERL) algorithm. Inspired by cooperative coevolution, CoERL periodically and adaptively decomposes the policy optimisation problem into multiple subproblems and evolves a population of neural networks for each of the subproblems. Instead of using genetic operators, CoERL directly searches for partial gradients to update the policy. Updating policy with partial gradients maintains consistency between the behaviour spaces of parents and offspring across generations. The experiences collected by the population are then used to improve the entire policy, which enhances the sampling efficiency. Experiments on six benchmark locomotion tasks demonstrate that CoERL outperforms seven state-of-the-art algorithms and baselines. Ablation study verifies the unique contribution of CoERL's core ingredients.
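Illustration: the idea of decomposing the parameters into subproblems and updating each with a partial gradient can be sketched on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' implementation: the quadratic `fitness` stands in for an episode return, the random regrouping stands in for CoERL's adaptive decomposition, the antithetic evolution-strategy estimator stands in for the paper's partial-gradient search, and the reinforcement learning step that reuses collected experiences is omitted. All function names and hyperparameters are hypothetical.

```python
import random


def fitness(theta, target):
    """Toy stand-in for an episode return: higher is better."""
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


def random_groups(dim, n_groups, rng):
    """Randomly split parameter indices into disjoint subproblems."""
    idx = list(range(dim))
    rng.shuffle(idx)
    return [idx[i::n_groups] for i in range(n_groups)]


def partial_gradient(theta, group, target, rng, n_pairs=10, sigma=0.1):
    """Antithetic ES-style gradient estimate restricted to `group`."""
    grad = {i: 0.0 for i in group}
    for _ in range(n_pairs):
        noise = {i: rng.gauss(0.0, 1.0) for i in group}
        plus, minus = list(theta), list(theta)
        for i in group:  # perturb only this subproblem's coordinates
            plus[i] += sigma * noise[i]
            minus[i] -= sigma * noise[i]
        diff = fitness(plus, target) - fitness(minus, target)
        for i in group:
            grad[i] += diff * noise[i]
    scale = 1.0 / (2.0 * sigma * n_pairs)
    return {i: g * scale for i, g in grad.items()}


def coevolve(dim=12, n_groups=3, generations=200, regroup_every=10,
             lr=0.05, seed=0):
    """Run the toy loop and return the fitness after each generation."""
    rng = random.Random(seed)
    target = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    theta = [0.0] * dim
    groups = random_groups(dim, n_groups, rng)
    history = [fitness(theta, target)]
    for gen in range(generations):
        if gen % regroup_every == 0:  # periodic re-decomposition
            groups = random_groups(dim, n_groups, rng)
        for group in groups:
            grad = partial_gradient(theta, group, target, rng)
            for i in group:  # gradient ascent on this group's coordinates only
                theta[i] += lr * grad[i]
        history.append(fitness(theta, target))
    return history


if __name__ == "__main__":
    history = coevolve()
    print("initial fitness:", history[0])
    print("final fitness:", history[-1])
```

Because each update perturbs only one group of coordinates, every population searches a lower-dimensional space than the full parameter vector, which is the scalability argument the abstract makes against whole-network genetic operators.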

Related papers

Synergizing Reinforcement Learning and Genetic Algorithms for Neural Combinatorial Optimization [25.633698252033756]
We propose the Evolutionary Augmentation Mechanism (EAM) to synergize the learning efficiency of DRL with the global search power of GAs. EAM operates by generating solutions from a learned policy and refining them through domain-specific genetic operations such as crossover and mutation. EAM can be seamlessly integrated with state-of-the-art DRL solvers such as the Attention Model, POMO, and SymNCO.
arXiv Detail & Related papers (2025-06-11T05:17:30Z)
DARLEI: Deep Accelerated Reinforcement Learning with Evolutionary Intelligence [77.78795329701367]
We present DARLEI, a framework that combines evolutionary algorithms with parallelized reinforcement learning. We characterize DARLEI's performance under various conditions, revealing factors impacting diversity of evolved morphologies. We hope to extend DARLEI in future work to include interactions between diverse morphologies in richer environments.
arXiv Detail & Related papers (2023-12-08T16:51:10Z)
Evolutionary Strategy Guided Reinforcement Learning via MultiBuffer Communication [0.0]
We introduce a new Evolutionary Reinforcement Learning model which combines a particular family of evolutionary algorithms called Evolutionary Strategies with the off-policy Deep Reinforcement Learning algorithm TD3. The proposed algorithm is demonstrated to perform competitively with current Evolutionary Reinforcement Learning algorithms on MuJoCo control tasks.
arXiv Detail & Related papers (2023-06-20T13:41:57Z)
Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning. Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
WLD-Reg: A Data-dependent Within-layer Diversity Regularizer [98.78384185493624]
Neural networks are composed of multiple layers arranged in a hierarchical structure jointly trained with a gradient-based optimization. We propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer. We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks.
arXiv Detail & Related papers (2023-01-03T20:57:22Z)
Cooperative guidance of multiple missiles: a hybrid co-evolutionary approach [0.9176056742068814]
Cooperative guidance of multiple missiles is a challenging task with rigorous constraints of time and space consensus. This paper develops a novel natural co-evolutionary strategy (NCES) to address the issues of non-stationarity and continuous control faced by cooperative guidance. A hybrid co-evolutionary cooperative guidance law (HCCGL) is proposed by integrating the highly scalable co-evolutionary mechanism and the traditional guidance strategy.
arXiv Detail & Related papers (2022-08-15T12:59:38Z)
Learning Dynamics and Generalization in Reinforcement Learning [59.530058000689884]
We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training. We show that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly initialized networks and networks trained with policy gradient methods.
arXiv Detail & Related papers (2022-06-05T08:49:16Z)
Direct Mutation and Crossover in Genetic Algorithms Applied to Reinforcement Learning Tasks [0.9137554315375919]
This paper will focus on applying neuroevolution using a simple genetic algorithm (GA) to find the weights of a neural network that produce optimally behaving agents. We present two novel modifications that improve the data efficiency and speed of convergence when compared to the initial implementation.
arXiv Detail & Related papers (2022-01-13T07:19:28Z)
Behavior-based Neuroevolutionary Training in Reinforcement Learning [3.686320043830301]
This work presents a hybrid algorithm that combines neuroevolutionary optimization with value-based reinforcement learning. For this purpose, we consolidate different methods to generate and optimize agent policies, creating a diverse population. Our results indicate that combining methods can enhance the sample efficiency and learning speed for evolutionary approaches.
arXiv Detail & Related papers (2021-05-17T15:40:42Z)
Epigenetic evolution of deep convolutional models [81.21462458089142]
We build upon a previously proposed neuroevolution framework to evolve deep convolutional models. We propose a convolutional layer layout which allows kernels of different shapes and sizes to coexist within the same layer. The proposed layout enables the size and shape of individual kernels within a convolutional layer to be evolved with a corresponding new mutation operator.
arXiv Detail & Related papers (2021-04-12T12:45:16Z)
Lineage Evolution Reinforcement Learning [15.469857142001482]
Lineage evolution reinforcement learning is a derivative algorithm which accords with the general agent population learning system. Our experiments show that the idea of evolution with lineage improves the performance of the original reinforcement learning algorithm in some Atari 2600 games.
arXiv Detail & Related papers (2020-09-26T11:58:16Z)
Discrete Action On-Policy Learning with Action-Value Critic [72.20609919995086]
Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension. We construct a critic to estimate action-value functions, apply it on correlated actions, and combine these critic estimated action values to control the variance of gradient estimation. These efforts result in a new discrete action on-policy RL algorithm that empirically outperforms related on-policy algorithms relying on variance control techniques.
arXiv Detail & Related papers (2020-02-10T04:23:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.