Enabling surrogate-assisted evolutionary reinforcement learning via
policy embedding
- URL: http://arxiv.org/abs/2301.13374v1
- Date: Tue, 31 Jan 2023 02:36:06 GMT
- Title: Enabling surrogate-assisted evolutionary reinforcement learning via
policy embedding
- Authors: Lan Tang, Xiaxi Li, Jinyuan Zhang, Guiying Li, Peng Yang and Ke Tang
- Abstract summary: This paper proposes a PE-SAERL Framework to enable surrogate-assisted evolutionary reinforcement learning via policy embedding.
Empirical results on 5 Atari games show that the proposed method can perform more efficiently than the four state-of-the-art algorithms.
- Score: 28.272572839321104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Evolutionary Reinforcement Learning (ERL), which applies Evolutionary
Algorithms (EAs) to optimize the weight parameters of Deep Neural Network (DNN)
based policies, has been widely regarded as an alternative to traditional
reinforcement learning methods. However, evaluating the iteratively generated
population usually requires a large amount of computational time and can be
prohibitively expensive, which may restrict the applicability of ERL. Surrogates
are often used to reduce the computational burden of evaluation in EAs.
Unfortunately, in ERL each individual policy usually comprises millions of DNN
weight parameters. This high-dimensional policy representation poses a great
challenge to applying surrogates in ERL to speed up training. This paper proposes
the PE-SAERL framework, which for the first time enables surrogate-assisted
evolutionary reinforcement learning via policy embedding (PE). Empirical results
on 5 Atari games show that the proposed method performs more efficiently than
four state-of-the-art algorithms. The training process is accelerated by up to 7x
on the tested games compared with its counterpart without the surrogate and PE.
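To make the idea concrete, the loop below is a minimal sketch of surrogate-assisted evolutionary search with a policy embedding: candidates are generated in the full weight space, mapped into a low-dimensional embedding, pre-screened by a cheap surrogate, and only the most promising ones receive true (expensive) environment evaluations. The random-projection embedding, the k-nearest-neighbour surrogate, and all names are illustrative assumptions, not the paper's actual components.

```python
# Minimal sketch of surrogate-assisted evolutionary RL with a policy embedding (PE).
# The random-projection embedding and k-NN surrogate are illustrative choices only.
import numpy as np

DIM = 100_000          # number of policy weights (stand-in for a DNN policy)
EMBED_DIM = 32         # low-dimensional embedding used by the surrogate
POP_SIZE = 20
SCREEN_KEEP = 5        # candidates that survive surrogate pre-screening

rng = np.random.default_rng(0)
projection = rng.normal(size=(DIM, EMBED_DIM)) / np.sqrt(DIM)  # policy embedding

def embed(weights):
    """Map a high-dimensional weight vector to a compact representation."""
    return weights @ projection

def true_fitness(weights):
    """Expensive evaluation: run the policy in the environment (mocked here)."""
    return -np.linalg.norm(weights - 1.0) / np.sqrt(DIM)

def surrogate_fitness(z, archive_z, archive_f, k=3):
    """Cheap k-NN surrogate operating in the embedding space."""
    d = np.linalg.norm(archive_z - z, axis=1)
    return archive_f[np.argsort(d)[:k]].mean()

parent = np.zeros(DIM)
archive_z = [embed(parent)]
archive_f = [true_fitness(parent)]

for gen in range(10):
    offspring = [parent + 0.1 * rng.normal(size=DIM) for _ in range(POP_SIZE)]
    # Pre-screen with the surrogate; only the top candidates get real rollouts.
    scores = [surrogate_fitness(embed(w), np.array(archive_z), np.array(archive_f))
              for w in offspring]
    survivors = [offspring[i] for i in np.argsort(scores)[-SCREEN_KEEP:]]
    evaluated = [(true_fitness(w), w) for w in survivors]
    for f, w in evaluated:
        archive_z.append(embed(w))
        archive_f.append(f)
    parent = max(evaluated, key=lambda t: t[0])[1]
    print(f"gen {gen}: best true fitness {max(f for f, _ in evaluated):.4f}")
```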
Related papers
- GPU-Accelerated Rule Evaluation and Evolution [10.60691612679966]
This paper introduces an innovative approach to boost the efficiency and scalability of Evolutionary Rule-based machine Learning (ERL).
The proposed method, AERL (Accelerated ERL), addresses the fitness-evaluation bottleneck in two ways.
First, by adopting GPU-optimized rule sets through a tensorized representation within the PyTorch framework, AERL mitigates the bottleneck and accelerates fitness evaluation significantly.
Second, AERL takes further advantage of the GPU by fine-tuning the rule coefficients via back-propagation, thereby improving search space exploration.
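As a rough illustration of the tensorized rule representation described above, the sketch below evaluates a batch of samples against many interval rules as a single tensor operation and fine-tunes per-rule coefficients by back-propagation; the rule encoding and all names are assumptions rather than AERL's actual data structures.

```python
# Sketch: evaluating many interval rules against a batch of samples as tensor ops,
# then tuning per-rule coefficients by back-propagation. Rule encoding is assumed.
# Moving the tensors to a GPU device would parallelize the evaluation further.
import torch

n_rules, n_features, n_samples = 64, 10, 4096
lower = torch.rand(n_rules, n_features) * 0.4            # rule lower bounds
upper = lower + 0.3                                       # rule upper bounds
coef = torch.zeros(n_rules, requires_grad=True)           # per-rule vote weights
X = torch.rand(n_samples, n_features)
y = torch.rand(n_samples)

def rule_votes(X):
    # inside[s, r, f] is True if sample s satisfies condition f of rule r
    inside = (X[:, None, :] >= lower) & (X[:, None, :] <= upper)
    match = inside.all(dim=-1).float()                     # rule fires on sample
    return match @ coef                                    # weighted vote per sample

opt = torch.optim.Adam([coef], lr=1e-2)
for step in range(200):
    loss = torch.nn.functional.mse_loss(rule_votes(X), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```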
arXiv Detail & Related papers (2024-06-03T22:24:12Z)
- REBEL: Reinforcement Learning via Regressing Relative Rewards [59.68420022466047]
We propose REBEL, a minimalist RL algorithm for the era of generative models.
In theory, we prove that fundamental RL algorithms like Natural Policy Gradient can be seen as variants of REBEL.
We find that REBEL provides a unified approach to language modeling and image generation, with performance stronger than or similar to PPO and DPO.
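One plausible reading of "regressing relative rewards" is a least-squares fit of the change in policy log-probabilities between a pair of responses to their reward difference; the toy loss below sketches that reading, with the exact form and the eta scale being assumptions.

```python
# Assumed reading of "regressing relative rewards": fit the policy's relative
# log-probability change on a pair of responses to their reward gap.
import torch

def rebel_style_loss(logp_new_a, logp_old_a, logp_new_b, logp_old_b,
                     reward_a, reward_b, eta=1.0):
    pred_gap = ((logp_new_a - logp_old_a) - (logp_new_b - logp_old_b)) / eta
    true_gap = reward_a - reward_b
    return torch.mean((pred_gap - true_gap) ** 2)

# Toy usage with random tensors standing in for per-response log-probs and rewards.
loss = rebel_style_loss(torch.randn(8), torch.randn(8), torch.randn(8),
                        torch.randn(8), torch.rand(8), torch.rand(8))
```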
arXiv Detail & Related papers (2024-04-25T17:20:45Z)
- Distilling Reinforcement Learning Policies for Interpretable Robot Locomotion: Gradient Boosting Machines and Symbolic Regression [53.33734159983431]
This paper introduces a novel approach to distill neural RL policies into more interpretable forms.
We train expert neural network policies using RL and distill them into (i) gradient boosting machines (GBMs), (ii) explainable boosting machines (EBMs), and (iii) symbolic policies.
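A generic way to realize such a distillation is supervised regression of the expert's actions on the states it visits; the sketch below illustrates that pattern with a gradient-boosted student (the mocked data and model choices are illustrative, not the paper's pipeline).

```python
# Generic policy-distillation sketch: fit a gradient-boosted regressor to imitate
# a neural expert on states it visits. Expert and data are mocked stand-ins.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
states = rng.normal(size=(5000, 8))                         # states visited by the expert
expert_actions = np.tanh(states @ rng.normal(size=(8,)))    # stand-in expert policy

student = GradientBoostingRegressor(n_estimators=200, max_depth=3)
student.fit(states, expert_actions)
print("distillation MSE:", np.mean((student.predict(states) - expert_actions) ** 2))
```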
arXiv Detail & Related papers (2024-03-21T11:54:45Z)
- ADAPTER-RL: Adaptation of Any Agent using Reinforcement Learning [0.0]
Adapters have proven effective in supervised learning contexts such as natural language processing and computer vision.
This paper presents an innovative adaptation strategy that improves training efficiency and the performance of the base agent.
Our proposed universal approach is not only compatible with pre-trained neural networks but also with rule-based agents, offering a means to integrate human expertise.
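A common way to implement such an adapter is to freeze the base agent and train a small residual module on its outputs, which works for neural and rule-based bases alike; the sketch below shows that generic pattern (module sizes and names are assumptions).

```python
# Generic adapter pattern: freeze a pre-trained base policy and train only a small
# residual adapter on its action logits. Sizes and names are illustrative.
import torch
import torch.nn as nn

class AdaptedPolicy(nn.Module):
    def __init__(self, base: nn.Module, n_actions: int, hidden: int = 32):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # base agent stays frozen
            p.requires_grad_(False)
        self.adapter = nn.Sequential(
            nn.Linear(n_actions, hidden), nn.ReLU(), nn.Linear(hidden, n_actions)
        )

    def forward(self, obs):
        logits = self.base(obs)
        return logits + self.adapter(logits)      # residual correction

base = nn.Linear(4, 2)                            # stand-in for a pre-trained agent
policy = AdaptedPolicy(base, n_actions=2)
out = policy(torch.randn(1, 4))                   # only adapter params receive grads
```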
arXiv Detail & Related papers (2023-11-20T04:54:51Z)
- Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning [53.97273491846883]
We propose DPE: an RL algorithm that blends offline sequence modeling and offline reinforcement learning with Double Policy Estimation.
We validate our method on multiple OpenAI Gym tasks with D4RL benchmarks.
arXiv Detail & Related papers (2023-08-28T20:46:07Z)
- Reinforcement Learning-assisted Evolutionary Algorithm: A Survey and Research Opportunities [63.258517066104446]
Reinforcement learning integrated as a component of evolutionary algorithms has demonstrated superior performance in recent years.
We discuss RL-EA integration methods, the RL-assisted strategies adopted by RL-EAs, and their applications according to the existing literature.
In the section on RL-EA applications, we also demonstrate the excellent performance of RL-EA on several benchmarks and a range of public datasets.
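As one concrete example of such an integration, the sketch below uses a simple epsilon-greedy value estimate to pick which mutation operator an EA applies each generation, rewarding the choice by the fitness improvement it produces; the operators and reward signal are illustrative assumptions.

```python
# Minimal sketch of RL-assisted operator selection in an EA: an epsilon-greedy
# learner picks a mutation operator each generation and is rewarded by the
# fitness improvement it produces. All specifics are illustrative.
import numpy as np

rng = np.random.default_rng(0)
operators = [lambda x: x + 0.1 * rng.normal(size=x.shape),   # small mutation
             lambda x: x + 1.0 * rng.normal(size=x.shape)]   # large mutation
q = np.zeros(len(operators))                                 # value estimate per operator
counts = np.zeros(len(operators))

def fitness(x):
    return -np.sum(x ** 2)

x = rng.normal(size=5)
for gen in range(200):
    op = int(rng.integers(len(operators))) if rng.random() < 0.1 else int(np.argmax(q))
    child = operators[op](x)
    reward = fitness(child) - fitness(x)          # improvement as the RL reward
    counts[op] += 1
    q[op] += (reward - q[op]) / counts[op]        # incremental mean update
    if reward > 0:
        x = child
```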
arXiv Detail & Related papers (2023-08-25T15:06:05Z)
- A reinforcement learning strategy for p-adaptation in high order solvers [0.0]
Reinforcement learning (RL) has emerged as a promising approach to automating decision processes.
This paper explores the application of RL techniques to optimise the order in the computational mesh when using high-order solvers.
arXiv Detail & Related papers (2023-06-14T07:01:31Z)
- Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework where exploratory trajectories that enable accurate learning of hidden reward functions are acquired.
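Learning a hidden reward from pair-wise trajectory preferences is commonly cast as a Bradley-Terry style objective in which the preferred trajectory should receive the higher predicted return; the sketch below shows that generic pattern with mocked data, not the paper's specific exploration scheme.

```python
# Generic preference-based reward learning: fit a reward model so that the
# preferred trajectory in each pair gets the higher total predicted reward
# (Bradley-Terry style loss). Data and model are mocked stand-ins.
import torch
import torch.nn as nn

reward_model = nn.Sequential(nn.Linear(6, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

def traj_return(traj):                       # traj: (T, state_dim) tensor
    return reward_model(traj).sum()

# Mock preference data: pairs of trajectories where the first is preferred.
pairs = [(torch.randn(20, 6) + 0.5, torch.randn(20, 6)) for _ in range(64)]

for epoch in range(50):
    loss = 0.0
    for preferred, other in pairs:
        margin = traj_return(preferred) - traj_return(other)
        loss = loss - nn.functional.logsigmoid(margin)   # maximize P(preferred > other)
    opt.zero_grad()
    loss.backward()
    opt.step()
```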
arXiv Detail & Related papers (2023-05-29T15:00:09Z)
- Supplementing Gradient-Based Reinforcement Learning with Simple Evolutionary Ideas [4.873362301533824]
We present a simple, sample-efficient algorithm for introducing large but directed learning steps in reinforcement learning (RL).
The methodology uses a population of RL agents training with a common experience buffer, with occasional crossovers and mutations of the agents in order to search efficiently through the policy space.
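The mechanism described, a population of learners sharing an experience buffer with occasional crossover and mutation of their parameters, can be sketched generically as below; the uniform crossover, mutation scale, and all names are assumptions, and the gradient-based RL updates are left as a stub.

```python
# Sketch: a population of agents that share experience and occasionally exchange
# parameters through crossover and mutation. Gradient-based updates are stubbed.
import copy
import numpy as np

rng = np.random.default_rng(0)

class Agent:
    def __init__(self):
        self.weights = rng.normal(size=256)
    def fitness(self):
        return -np.sum(self.weights ** 2)     # stand-in for episodic return

def crossover(a, b):
    child = copy.deepcopy(a)
    mask = rng.random(child.weights.shape) < 0.5
    child.weights[mask] = b.weights[mask]     # uniform parameter crossover
    return child

population = [Agent() for _ in range(8)]
shared_buffer = []                            # common experience buffer (unused stub)

for gen in range(20):
    # ... each agent would do gradient-based RL updates from shared_buffer here ...
    population.sort(key=lambda a: a.fitness(), reverse=True)
    child = crossover(population[0], population[1])
    child.weights += 0.05 * rng.normal(size=child.weights.shape)  # mutation
    population[-1] = child                    # replace the weakest agent
```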
arXiv Detail & Related papers (2023-05-10T09:46:53Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
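One common reading of this two-policy scheme is that a guide policy (e.g. derived from offline data or demonstrations) acts for the first h steps of each episode and the learning policy takes over afterwards, with h shrunk over training; the rollout sketch below illustrates that hand-off with mocked policies and environment.

```python
# Sketch of a jump-started rollout: a guide policy acts for the first h steps,
# then the exploration (learning) policy takes over; h is shrunk over training.
# Environment and policies are mocked stand-ins.
import numpy as np

rng = np.random.default_rng(0)
guide_policy = lambda obs: 0                       # e.g. a pre-trained/offline policy
exploration_policy = lambda obs: int(rng.integers(2))  # the policy being trained

def rollout(env_step, horizon, h):
    obs, total = 0.0, 0.0
    for t in range(horizon):
        action = guide_policy(obs) if t < h else exploration_policy(obs)
        obs, reward = env_step(obs, action)
        total += reward
    return total

mock_env_step = lambda obs, a: (obs + a, float(a == 0))

h = 50                                             # jump-start length (curriculum)
for iteration in range(10):
    ret = rollout(mock_env_step, horizon=100, h=h)
    h = max(0, h - 5)                              # gradually hand control to the learner
```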
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- Deep Networks with Fast Retraining [0.0]
This paper proposes a novel Moore-Penrose (MP) inverse-based fast retraining strategy for deep convolutional neural network (DCNN) learning.
In each training round, a random learning strategy is first used to control the number of convolutional layers trained in the backward pass.
Then, an MP inverse-based batch-by-batch learning strategy, which enables the network to be implemented without access to industrial-scale computational resources, is developed.
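An MP inverse-based retraining step typically means solving a layer's weights in closed form by least squares rather than gradient descent; the sketch below accumulates the normal equations batch by batch and solves for the output weights with a pseudoinverse, as a generic illustration rather than the paper's exact scheme.

```python
# Sketch: retrain a network's last linear layer in closed form with the
# Moore-Penrose pseudoinverse, accumulating statistics batch by batch so the
# full dataset never has to be held in memory at once. Generic illustration.
import numpy as np

rng = np.random.default_rng(0)
hidden_dim, n_classes = 128, 10

# Accumulate H^T H and H^T Y over batches (H = features from the frozen layers).
HtH = np.zeros((hidden_dim, hidden_dim))
HtY = np.zeros((hidden_dim, n_classes))
for _ in range(20):                               # 20 mocked batches
    H = rng.normal(size=(256, hidden_dim))        # stand-in for extracted features
    Y = np.eye(n_classes)[rng.integers(n_classes, size=256)]  # one-hot targets
    HtH += H.T @ H
    HtY += H.T @ Y

# Closed-form output weights via the pseudoinverse (small ridge term for stability).
W_out = np.linalg.pinv(HtH + 1e-3 * np.eye(hidden_dim)) @ HtY
```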
arXiv Detail & Related papers (2020-08-13T15:17:38Z)