Evolutionary Deep Reinforcement Learning Using Elite Buffer: A Novel
Approach Towards DRL Combined with EA in Continuous Control Tasks
- URL: http://arxiv.org/abs/2209.08480v1
- Date: Sun, 18 Sep 2022 05:56:41 GMT
- Title: Evolutionary Deep Reinforcement Learning Using Elite Buffer: A Novel
Approach Towards DRL Combined with EA in Continuous Control Tasks
- Authors: Marzieh Sadat Esmaeeli, Hamed Malek
- Abstract summary: This study aims to further investigate the efficiency of combining the two fields of deep reinforcement learning and evolutionary computation.
The "Evolutionary Deep Reinforcement Learning Using Elite Buffer" algorithm introduces a novel mechanism inspired by the interactive learning capability and hypothetical-outcome reasoning of the human brain.
According to the experimental results, the proposed method surpasses other well-known methods in environments of high complexity and dimensionality.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite its numerous applications and successes in many control tasks, deep
reinforcement learning still suffers from crucial problems and limitations,
including temporal credit assignment under sparse rewards, a lack of effective
exploration, and brittle convergence that is extremely sensitive to
hyperparameters. These problems in continuous control, together with the
success of evolutionary algorithms in addressing some of them, have given rise
to the idea of evolutionary reinforcement learning, which has attracted
considerable debate. Despite promising results in a few studies, a proper and
fitting solution to these problems and limitations is yet to be presented. The
present study aims to further investigate the combination of deep
reinforcement learning and evolutionary computation and to take a step towards
improving existing methods and addressing their open challenges. The
"Evolutionary Deep Reinforcement Learning Using Elite Buffer" algorithm
introduces a novel mechanism inspired by the interactive learning capability
and hypothetical-outcome reasoning of the human brain. In this method, an
elite buffer (inspired by experience generalization in the human mind),
together with crossover and mutation operators and interactive learning across
successive generations, improves efficiency, convergence, and progress in
continuous control. According to the experimental results, the proposed method
surpasses other well-known methods in environments of high complexity and
dimensionality and is superior in resolving the aforementioned problems and
limitations.
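The abstract describes the algorithm only at a high level, so the following is a guess at the loop it implies: a population evolved with crossover and mutation, a bounded elite buffer that accumulates the best individuals across generations, and an interactive-learning step that pulls one individual toward the elites. The toy fitness function and every name in this sketch are assumptions for illustration, not the authors' implementation:

```python
# Minimal sketch of an evolutionary RL loop with an elite buffer.
# Hypothetical stand-ins: fitness is a toy quadratic instead of an
# environment return, and the "interactive learning" step is a simple
# move toward the elite mean instead of a gradient-based DRL learner.
import numpy as np

rng = np.random.default_rng(0)
DIM, POP, ELITES, GENS = 8, 20, 4, 50
TARGET = rng.normal(size=DIM)          # optimum of the toy fitness

def fitness(theta):
    # Placeholder for an episode rollout returning cumulative reward.
    return -np.sum((theta - TARGET) ** 2)

def crossover(a, b):
    mask = rng.random(DIM) < 0.5       # uniform crossover
    return np.where(mask, a, b)

def mutate(theta, sigma=0.1):
    return theta + sigma * rng.normal(size=DIM)

population = [rng.normal(size=DIM) for _ in range(POP)]
elite_buffer = []                      # holds (fitness, params) pairs

for gen in range(GENS):
    scored = sorted(((fitness(p), p) for p in population),
                    key=lambda fp: fp[0], reverse=True)
    # Keep the best individuals of this generation in the elite buffer.
    elite_buffer.extend(scored[:ELITES])
    elite_buffer = sorted(elite_buffer, key=lambda fp: fp[0],
                          reverse=True)[:50]          # bounded size
    # "Interactive learning": nudge one individual toward elite knowledge.
    elite_mean = np.mean([p for _, p in elite_buffer[:ELITES]], axis=0)
    learner = population[0] + 0.5 * (elite_mean - population[0])
    # Next generation: learner plus offspring of the current elites.
    parents = [p for _, p in scored[:ELITES]]
    population = [learner] + [
        mutate(crossover(parents[rng.integers(ELITES)],
                         parents[rng.integers(ELITES)]))
        for _ in range(POP - 1)
    ]

print("best fitness:", max(fitness(p) for p in population))
```

In the actual method, fitness would presumably be an episodic return from the continuous-control environment, and the learner would be a gradient-based DRL agent sampling transitions from the elite buffer rather than averaging parameters.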
Related papers
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning must be near-optimal and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
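RLIF's central mechanism, using the expert's decision to intervene as the reward itself, fits in a short wrapper. This is a minimal sketch assuming a Gymnasium-style environment; `expert_wants_control` and the -1/0 reward scheme are illustrative stand-ins for the paper's intervention signal, not its code:

```python
# Illustrative sketch of RLIF's reward idea: the learner receives a
# negative reward whenever a (possibly suboptimal) expert chooses to
# intervene, and zero otherwise. `expert_wants_control` is a
# hypothetical placeholder for the human/expert intervention signal.
import gymnasium as gym

class InterventionRewardWrapper(gym.Wrapper):
    def __init__(self, env, expert_wants_control):
        super().__init__(env)
        self.expert_wants_control = expert_wants_control

    def step(self, action):
        obs, _, terminated, truncated, info = self.env.step(action)
        intervened = self.expert_wants_control(obs)
        # The task reward is discarded; intervention itself is the signal.
        reward = -1.0 if intervened else 0.0
        return obs, reward, terminated, truncated, info

# Hypothetical usage:
# env = InterventionRewardWrapper(gym.make("CartPole-v1"), my_expert_fn)
```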
- Causal Reinforcement Learning: A Survey [57.368108154871]
Reinforcement learning is an essential paradigm for solving sequential decision problems under uncertainty.
One of the main obstacles is that reinforcement learning agents lack a fundamental understanding of the world.
Causality offers a notable advantage as it can formalize knowledge in a systematic manner.
arXiv Detail & Related papers (2023-07-04T03:00:43Z)
- Evolutionary Strategy Guided Reinforcement Learning via MultiBuffer Communication [0.0]
We introduce a new Evolutionary Reinforcement Learning model which combines a particular family of evolutionary algorithms, Evolutionary Strategies, with the off-policy deep reinforcement learning algorithm TD3.
The proposed algorithm is demonstrated to perform competitively with current Evolutionary Reinforcement Learning algorithms on MuJoCo control tasks.
arXiv Detail & Related papers (2023-06-20T13:41:57Z)
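One plausible reading of the multi-buffer communication named in the title above is that each Evolution Strategies worker writes its rollout experience into its own replay buffer, and the TD3 learner samples minibatches across all of them. The sketch below shows only that data flow; the rollout and the TD3 update are stubbed out, and the whole structure is an assumption rather than the paper's code:

```python
# Sketch of ES/TD3 communication through per-worker replay buffers.
import random
import numpy as np

class ReplayBuffer:
    def __init__(self, capacity=10_000):
        self.data, self.capacity = [], capacity

    def add(self, transition):
        if len(self.data) >= self.capacity:
            self.data.pop(0)
        self.data.append(transition)

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))

def es_rollout(params):
    # Placeholder: would run an episode with a policy parameterized by
    # `params` and return (transitions, episodic_return).
    transitions = [(np.zeros(3), 0.0, np.zeros(3))] * 5
    return transitions, -np.sum(params ** 2)

rng = np.random.default_rng(1)
mean, sigma, pop_size = np.zeros(4), 0.1, 8
buffers = [ReplayBuffer() for _ in range(pop_size)]   # one per ES worker

for generation in range(10):
    noises = [rng.normal(size=4) for _ in range(pop_size)]
    returns = []
    for buf, eps in zip(buffers, noises):
        transitions, ret = es_rollout(mean + sigma * eps)
        for t in transitions:          # ES experience feeds the buffers
            buf.add(t)
        returns.append(ret)
    # Vanilla ES update on the population returns.
    adv = (np.array(returns) - np.mean(returns)) / (np.std(returns) + 1e-8)
    mean += 0.01 / (pop_size * sigma) * sum(a * e for a, e in zip(adv, noises))
    # A TD3 learner would now sample minibatches across all buffers
    # (the actual TD3 gradient step is omitted here):
    batch = [t for buf in buffers for t in buf.sample(4)]
```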
- Evolutionary Reinforcement Learning: A Survey [31.112066295496003]
Reinforcement learning (RL) is a machine learning approach that trains agents to maximize cumulative rewards through interactions with environments.
This article presents a comprehensive survey of state-of-the-art methods for integrating EC into RL, referred to as evolutionary reinforcement learning (EvoRL).
arXiv Detail & Related papers (2023-03-07T01:38:42Z)
- Assessing Quality-Diversity Neuro-Evolution Algorithms Performance in Hard Exploration Problems [10.871978893808533]
Quality-Diversity (QD) methods are evolutionary algorithms inspired by nature's ability to produce high-performing niche organisms.
In this paper, we highlight three candidate benchmarks exhibiting high-dimensional control problems with exploration difficulties.
We also provide open-source implementations in Jax, allowing practitioners to run fast and numerous experiments with limited compute resources.
arXiv Detail & Related papers (2022-11-24T18:04:12Z)
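Quality-Diversity is easiest to see through MAP-Elites, a canonical QD algorithm: candidate solutions compete only within the archive cell matching their behavior descriptor, so the archive retains one high-performing individual per niche. The following is a generic illustration of that principle, not code from the paper's Jax implementations:

```python
# Minimal MAP-Elites sketch: a 1-D behavior descriptor discretized
# into bins; each bin keeps only its best-scoring solution.
import numpy as np

rng = np.random.default_rng(2)
N_BINS, DIM = 10, 4
archive = {}                           # bin index -> (fitness, params)

def evaluate(theta):
    fit = -np.sum(theta ** 2)          # toy fitness
    descriptor = float(np.tanh(theta[0]) * 0.5 + 0.5)  # mapped to [0, 1]
    return fit, min(int(descriptor * N_BINS), N_BINS - 1)

for _ in range(2000):
    if archive:                        # mutate a random elite
        _, parent = archive[rng.choice(list(archive))]
        theta = parent + 0.1 * rng.normal(size=DIM)
    else:                              # bootstrap with a random solution
        theta = rng.normal(size=DIM)
    fit, cell = evaluate(theta)
    if cell not in archive or fit > archive[cell][0]:
        archive[cell] = (fit, theta)   # compete only within the niche

print(f"filled {len(archive)}/{N_BINS} niches")
```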
- Deep Causal Learning: Representation, Discovery and Inference [2.696435860368848]
Causal learning reveals the essential relationships that underpin phenomena and delineates the mechanisms by which the world evolves.
Traditional causal learning methods face numerous challenges and limitations, including high-dimensional variables, unstructured variables, optimization problems, unobserved confounders, selection biases, and estimation inaccuracies.
Deep causal learning, which leverages deep neural networks, offers innovative insights and solutions for addressing these challenges.
arXiv Detail & Related papers (2022-11-07T09:00:33Z)
- Autonomous Reinforcement Learning: Formalism and Benchmarking [106.25788536376007]
Real-world embodied learning, such as that performed by humans and animals, is situated in a continual, non-episodic world.
Common benchmark tasks in RL are episodic, with the environment resetting between trials to provide the agent with multiple attempts.
This discrepancy presents a major challenge when attempting to take RL algorithms developed for episodic simulated environments and run them on real-world platforms.
arXiv Detail & Related papers (2021-12-17T16:28:06Z)
- Towards sample-efficient episodic control with DAC-ML [0.5735035463793007]
The sample-inefficiency problem in Artificial Intelligence refers to the inability of current Deep Reinforcement Learning models to optimize action policies within a small number of episodes.
Recent studies have tried to overcome this limitation by adding memory systems and architectural biases to improve learning speed.
In this paper, we capitalize on the design principles of the Distributed Adaptive Control (DAC) theory of mind and brain to build a novel cognitive architecture.
arXiv Detail & Related papers (2020-12-26T16:38:08Z)
- Transfer Learning in Deep Reinforcement Learning: A Survey [64.36174156782333]
Reinforcement learning is a learning paradigm for solving sequential decision-making problems.
Recent years have witnessed remarkable progress in reinforcement learning, driven by the fast development of deep neural networks.
Transfer learning has arisen to tackle the various challenges faced by deep reinforcement learning.
arXiv Detail & Related papers (2020-09-16T18:38:54Z)
- Importance Weighted Policy Learning and Adaptation [89.46467771037054]
We study a complementary approach that is conceptually simple, general, and modular, building on recent improvements in off-policy learning.
The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior.
Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
arXiv Detail & Related papers (2020-09-10T14:16:58Z)
- Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient.
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
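The projection rule described in the summary above is compact enough to show directly: whenever two task gradients conflict (negative inner product), one is projected onto the normal plane of the other before the per-task gradients are summed. A minimal NumPy sketch over flattened gradient vectors, as an illustration rather than the paper's implementation:

```python
# Gradient-surgery core rule sketched with NumPy: if task gradients
# g_i and g_j conflict (g_i . g_j < 0), project g_i onto the normal
# plane of g_j before summing the per-task gradients.
import numpy as np

def gradient_surgery(grads, rng=np.random.default_rng(3)):
    projected = []
    for i, g in enumerate(grads):
        g = g.copy()
        others = [j for j in range(len(grads)) if j != i]
        rng.shuffle(others)            # visit other tasks in random order
        for j in others:
            dot = g @ grads[j]
            if dot < 0:                # conflicting gradient: project away
                g -= dot / (grads[j] @ grads[j]) * grads[j]
        projected.append(g)
    return np.sum(projected, axis=0)   # combined update direction

# Two deliberately conflicting task gradients:
g1, g2 = np.array([1.0, 0.5]), np.array([-1.0, 0.5])
print(gradient_surgery([g1, g2]))
```

After surgery, the combined direction no longer contains the components of each gradient that pointed against the other task, which is the mechanism the paper credits for stabilizing multi-task optimization.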