Lean Evolutionary Reinforcement Learning by Multitasking with Importance Sampling
- URL: http://arxiv.org/abs/2203.10844v1
- Date: Mon, 21 Mar 2022 10:06:16 GMT
- Title: Lean Evolutionary Reinforcement Learning by Multitasking with Importance Sampling
- Authors: Nick Zhang, Abhishek Gupta, Zefeng Chen, and Yew-Soon Ong
- Abstract summary: We introduce a novel neuroevolutionary multitasking (NuEMT) algorithm to transfer information from a set of auxiliary tasks to the target (full length) RL task.
We demonstrate that the NuEMT algorithm achieves data-lean evolutionary RL, reducing expensive agent-environment interaction data requirements.
- Score: 20.9680985132322
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Studies have shown evolution strategies (ES) to be a promising approach for
reinforcement learning (RL) with deep neural networks. However, the issue of
high sample complexity persists in applications of ES to deep RL. In this
paper, we address the shortcoming of today's methods via a novel
neuroevolutionary multitasking (NuEMT) algorithm, designed to transfer
information from a set of auxiliary tasks (of short episode length) to the
target (full length) RL task at hand. The artificially generated auxiliary
tasks allow an agent to update and quickly evaluate policies on shorter time
horizons. The evolved skills are then transferred to guide the longer and
harder task towards an optimal policy. We demonstrate that the NuEMT algorithm
achieves data-lean evolutionary RL, reducing expensive agent-environment
interaction data requirements. Our key algorithmic contribution in this setting
is to introduce, for the first time, a multitask information transfer mechanism
based on the statistical importance sampling technique. In addition, an
adaptive resource allocation strategy is utilized to assign computational
resources to auxiliary tasks based on their gleaned usefulness. Experiments on
a range of continuous control tasks from the OpenAI Gym confirm that our
proposed algorithm is efficient compared to recent ES baselines.
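The core idea of the transfer mechanism can be illustrated in isolation. In importance sampling, fitness evaluations gathered under one search distribution (here, an auxiliary task's) are reweighted to estimate expectations under another (the target task's). The sketch below is a minimal, hypothetical illustration with 1-D Gaussian search distributions and a stand-in fitness function; it is not the authors' NuEMT implementation, and all names and parameter values are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D Gaussian search distributions over a policy parameter:
# one for an auxiliary (short-horizon) task, one for the target task.
mu_aux, sigma_aux = 0.5, 1.0   # auxiliary task search distribution
mu_tgt, sigma_tgt = 0.0, 1.0   # target task search distribution

def fitness(theta):
    # Stand-in for the episodic return of policy parameter theta.
    return -(theta - 1.0) ** 2

def log_pdf(x, mu, sigma):
    # Log-density of a Gaussian N(mu, sigma^2).
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

# Samples (and fitness evaluations) gathered cheaply on the auxiliary task.
thetas = rng.normal(mu_aux, sigma_aux, size=10_000)
returns = fitness(thetas)

# Importance weights re-express the auxiliary samples as if they had been
# drawn from the target task's search distribution.
w = np.exp(log_pdf(thetas, mu_tgt, sigma_tgt) - log_pdf(thetas, mu_aux, sigma_aux))
w /= w.sum()  # self-normalized importance sampling

# Estimated expected fitness under the target distribution, without
# drawing any fresh (expensive) samples from the target task.
est = np.sum(w * returns)

# Sanity check against a direct Monte Carlo estimate on the target task.
direct = fitness(rng.normal(mu_tgt, sigma_tgt, size=10_000)).mean()
```

Under N(0, 1), the true expected fitness is -(Var + (mean - 1)^2) = -2, so both estimates should land near that value; the importance-sampled estimate does so using only auxiliary-task evaluations, which is the source of the data savings.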
Related papers
- A Method for Fast Autonomy Transfer in Reinforcement Learning [3.8049020806504967]
This paper introduces a novel reinforcement learning (RL) strategy designed to facilitate rapid autonomy transfer.
Unlike traditional methods that require extensive retraining or fine-tuning, our approach integrates existing knowledge, enabling an RL agent to adapt swiftly to new settings.
arXiv Detail & Related papers (2024-07-29T23:48:07Z)
- How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback.
Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities.
We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
- An advantage based policy transfer algorithm for reinforcement learning with metrics of transferability [6.660458629649826]
Reinforcement learning (RL) can enable sequential decision-making in complex and high-dimensional environments.
Transfer RL algorithms can transfer knowledge from one or more source environments to a target environment.
This paper proposes an off-policy Advantage-based Policy Transfer algorithm, APT-RL, for fixed domain environments.
arXiv Detail & Related papers (2023-11-12T04:25:53Z)
- Supervised Pretraining Can Learn In-Context Reinforcement Learning [96.62869749926415]
In this paper, we study the in-context learning capabilities of transformers in decision-making problems.
We introduce and study Decision-Pretrained Transformer (DPT), a supervised pretraining method where the transformer predicts an optimal action.
We find that the pretrained transformer can be used to solve a range of RL problems in-context, exhibiting both exploration online and conservatism offline.
arXiv Detail & Related papers (2023-06-26T17:58:50Z)
- Human-Inspired Framework to Accelerate Reinforcement Learning [1.6317061277457001]
Reinforcement learning (RL) is crucial for data science decision-making but suffers from sample inefficiency.
This paper introduces a novel human-inspired framework to enhance RL algorithm sample efficiency.
arXiv Detail & Related papers (2023-02-28T13:15:04Z)
- The Cost of Learning: Efficiency vs. Efficacy of Learning-Based RRM for 6G [10.28841351455586]
Deep Reinforcement Learning (DRL) has become a valuable solution to automatically learn efficient resource management strategies in complex networks.
In many scenarios, the learning task is performed in the Cloud, while experience samples are generated directly by edge nodes or users.
This creates a friction between the need to speed up convergence towards an effective strategy and the cost of allocating network resources to transmit learning samples.
We propose a dynamic balancing strategy between the learning and data planes, which allows the centralized learning agent to quickly converge to an efficient resource allocation strategy.
arXiv Detail & Related papers (2022-11-30T11:26:01Z)
- Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation [17.165083095799712]
We study the problem of few-shot adaptation in the context of human-in-the-loop reinforcement learning.
We develop a meta-RL algorithm that enables fast policy adaptation with preference-based feedback.
arXiv Detail & Related papers (2022-11-20T03:55:09Z)
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs equally as well, or better, than meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- Semantic-Aware Collaborative Deep Reinforcement Learning Over Wireless Cellular Networks [82.02891936174221]
Collaborative deep reinforcement learning (CDRL) algorithms, in which multiple agents coordinate over a wireless network, are a promising approach.
In this paper, a novel semantic-aware CDRL method is proposed to enable a group of untrained agents with semantically-linked DRL tasks to collaborate efficiently across a resource-constrained wireless cellular network.
arXiv Detail & Related papers (2021-11-23T18:24:47Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.