Adaptable Automation with Modular Deep Reinforcement Learning and Policy Transfer
- URL: http://arxiv.org/abs/2012.01934v1
- Date: Fri, 27 Nov 2020 03:09:05 GMT
- Title: Adaptable Automation with Modular Deep Reinforcement Learning and Policy Transfer
- Authors: Zohreh Raziei, Mohsen Moghaddam
- Abstract summary: This article develops and tests a Hyper-Actor Soft Actor-Critic (HASAC) RL framework based on the notions of task modularization and transfer learning.
The HASAC framework is tested on a new virtual robotic manipulation benchmark, Meta-World.
Numerical experiments show superior performance by HASAC over state-of-the-art deep RL algorithms in terms of reward value, success rate, and task completion time.
- Score: 8.299945169799795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in deep Reinforcement Learning (RL) have created
unprecedented opportunities for intelligent automation, where a machine can
autonomously learn an optimal policy for performing a given task. However,
current deep RL algorithms predominantly specialize in a narrow range of tasks,
are sample inefficient, and lack sufficient stability, which in turn hinder
their industrial adoption. This article tackles this limitation by developing
and testing a Hyper-Actor Soft Actor-Critic (HASAC) RL framework based on the
notions of task modularization and transfer learning. The goal of the proposed
HASAC is to enhance the adaptability of an agent to new tasks by transferring
the learned policies of former tasks to the new task via a "hyper-actor". The
HASAC framework is tested on a new virtual robotic manipulation benchmark,
Meta-World. Numerical experiments show superior performance by HASAC over
state-of-the-art deep RL algorithms in terms of reward value, success rate, and
task completion time.
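The abstract describes the hyper-actor only at a high level. As a loose illustration, here is a minimal PyTorch sketch of one way a hyper-actor could reuse a bank of frozen per-task actor modules through a learned gate; the module names, the gating mechanism, and the dimensions are assumptions, not the paper's published implementation.
```python
# Hypothetical sketch of a "hyper-actor" blending frozen per-task actors;
# names and the gating mechanism are assumptions, not the HASAC code.
import torch
import torch.nn as nn

class TaskActor(nn.Module):
    """One modular actor, assumed already trained on a former task."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

class HyperActor(nn.Module):
    """Transfers former policies to a new task via a learned gate."""
    def __init__(self, actors: list, obs_dim: int):
        super().__init__()
        self.actors = nn.ModuleList(actors)
        for p in self.actors.parameters():   # reuse, don't overwrite,
            p.requires_grad_(False)          # the transferred policies
        self.gate = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, len(actors)), nn.Softmax(dim=-1),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        w = self.gate(obs)                                     # (B, K)
        acts = torch.stack([a(obs) for a in self.actors], 1)   # (B, K, A)
        return (w.unsqueeze(-1) * acts).sum(dim=1)             # blend

obs_dim, act_dim = 39, 4   # Meta-World-style sizes, for illustration only
hyper = HyperActor([TaskActor(obs_dim, act_dim) for _ in range(3)], obs_dim)
action = hyper(torch.randn(8, obs_dim))    # (8, 4) blended actions
```
Freezing the transferred actors and training only the gate is one design choice consistent with the stated goal of transferring former policies to a new task; the actual HASAC framework may train these components differently.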
Related papers
- Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents [9.529492371336286]
Reinforcement Learning (RL) has made significant strides in enabling artificial agents to learn diverse behaviors.
We propose a novel approach called Logical Specifications-guided Dynamic Task Sampling (LSTS).
LSTS learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification.
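As a rough illustration of dynamic task sampling over a high-level specification, the toy sketch below always trains on the earliest unmastered subgoal in an ordered specification; the subgoal names and the mastery-threshold rule are assumptions, not the LSTS procedure itself.
```python
# Toy sketch of dynamic task sampling over an ordered specification; the
# sampling rule is an illustrative assumption, not the LSTS algorithm.
import random

spec = ["reach_key", "pick_key", "reach_door", "open_door"]  # hypothetical
success = {task: 0.0 for task in spec}   # running success-rate estimates

def sample_task(threshold: float = 0.9) -> str:
    """Pick the first subgoal in the specification not yet mastered."""
    for task in spec:
        if success[task] < threshold:
            return task
    return spec[-1]                      # everything mastered: final task

def update(task: str, succeeded: bool, lr: float = 0.1) -> None:
    success[task] += lr * (float(succeeded) - success[task])

for episode in range(1000):
    task = sample_task()
    outcome = random.random() < 0.5      # stand-in for an RL rollout on task
    update(task, outcome)
```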
arXiv Detail & Related papers (2024-02-06T04:00:21Z)
- SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates, are extremely robust even under perturbations, and exhibit emergent recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
- Self-Supervised Curriculum Generation for Autonomous Reinforcement Learning without Task-Specific Knowledge [25.168236693829783]
A significant bottleneck in applying current reinforcement learning algorithms to real-world scenarios is the need to reset the environment between every episode.
We propose a novel ARL algorithm that can generate a curriculum adaptive to the agent's learning progress without task-specific knowledge.
arXiv Detail & Related papers (2023-11-15T18:40:10Z)
- Enhancing Robotic Manipulation: Harnessing the Power of Multi-Task Reinforcement Learning and Single Life Reinforcement Learning in Meta-World [0.0]
This research project aims to enable a robotic arm to execute seven distinct tasks within the Meta-World environment.
A trained model will serve as a source of prior data for the single-life RL algorithm.
An ablation study demonstrates that MT-QWALE successfully completes tasks, requiring only slightly more steps, even when the final goal position is hidden.
arXiv Detail & Related papers (2023-10-23T06:35:44Z)
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
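The two-policy mechanism is concrete enough to sketch: a guide policy drives the first h steps of each episode, the learning policy takes over afterwards, and the hand-over point h is walked toward zero as the learner improves. The toy environment and the fixed curriculum schedule below are illustrative assumptions, not the JSRL implementation.
```python
# Minimal sketch of a jump-start rollout in the spirit of JSRL.
import random

class ToyEnv:
    """1-D chain: start at 0, reach position 10 for reward 1 (simplified)."""
    def reset(self):
        self.pos = 0
        return self.pos
    def step(self, action):
        self.pos += action
        done = self.pos >= 10
        return self.pos, float(done), done

def jump_start_episode(env, guide, learner, h, max_steps=50):
    """Guide policy drives the first h steps, then the learner takes over."""
    obs, total = env.reset(), 0.0
    for t in range(max_steps):
        policy = guide if t < h else learner
        obs, reward, done = env.step(policy(obs))
        total += reward
        if done:
            break
    return total

guide = lambda obs: 1                          # competent prior policy
learner = lambda obs: random.choice([-1, 1])   # untrained exploration policy
for h in (10, 7, 4, 0):                        # curriculum: hand over earlier
    returns = [jump_start_episode(ToyEnv(), guide, learner, h)
               for _ in range(20)]
    # ...in real JSRL the learner would be updated from this experience...
    print(f"h={h}: mean return {sum(returns) / len(returns):.2f}")
```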
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- Lean Evolutionary Reinforcement Learning by Multitasking with Importance Sampling [20.9680985132322]
We introduce a novel neuroevolutionary multitasking (NuEMT) algorithm to transfer information from a set of auxiliary tasks to the target (full length) RL task.
We demonstrate that the NuEMT algorithm achieves data-lean evolutionary RL, reducing expensive agent-environment interaction data requirements.
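As a loose sketch of the multitasking idea, the snippet below screens an evolutionary population on a cheap short-horizon auxiliary task before spending full-length evaluations on the survivors; the paper's importance-sampling transfer is omitted, and every name is a placeholder.
```python
# Loose sketch of neuroevolutionary multitasking with short-horizon
# auxiliary tasks; not the NuEMT implementation.
import random

def episode_return(weights, horizon):
    """Stand-in for a policy rollout; real code would run the environment."""
    return horizon * sum(weights) / 100 + random.gauss(0, 0.1)

def nuemt_generation(population, full_horizon=100, aux_horizon=20, keep=5):
    # 1. Screen cheaply on the auxiliary (short-horizon) task.
    survivors = sorted(population,
                       key=lambda w: episode_return(w, aux_horizon),
                       reverse=True)[:keep]
    # 2. Spend expensive full-length evaluations only on the survivors.
    elite = max(survivors, key=lambda w: episode_return(w, full_horizon))
    # 3. Refill the population by mutating the elite.
    return [[x + random.gauss(0, 0.05) for x in elite] for _ in population]

population = [[random.gauss(0, 1) for _ in range(8)] for _ in range(20)]
for _ in range(10):
    population = nuemt_generation(population)
```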
arXiv Detail & Related papers (2022-03-21T10:06:16Z)
- URLB: Unsupervised Reinforcement Learning Benchmark [82.36060735454647]
We introduce the Unsupervised Reinforcement Learning Benchmark (URLB)
URLB consists of two phases: reward-free pre-training and downstream task adaptation with extrinsic rewards.
We provide twelve continuous control tasks from three domains for evaluation and open-source code for eight leading unsupervised RL methods.
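The two-phase protocol can be sketched generically: pre-train on an intrinsic reward while ignoring the extrinsic one, then adapt on the task reward. The agent and environment methods below are placeholders, not URLB's actual API.
```python
# Generic sketch of a reward-free pre-training / downstream adaptation
# protocol; method names are placeholders, not URLB's API.
def pretrain(agent, env, steps: int) -> None:
    obs = env.reset()
    for _ in range(steps):
        action = agent.act(obs)
        next_obs, _, done = env.step(action)       # extrinsic reward ignored
        r_int = agent.intrinsic_reward(obs, action, next_obs)  # e.g. curiosity
        agent.update(obs, action, r_int, next_obs)
        obs = env.reset() if done else next_obs

def adapt(agent, env, steps: int) -> None:
    obs = env.reset()
    for _ in range(steps):
        action = agent.act(obs)
        next_obs, r_ext, done = env.step(action)   # now the task reward
        agent.update(obs, action, r_ext, next_obs)
        obs = env.reset() if done else next_obs
```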
arXiv Detail & Related papers (2021-10-28T15:07:01Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
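A hypothetical sketch of such an addressing step: a learned scorer ranks trajectory summaries held in task-agnostic storage, and the top-k are selected for replay. The summary representation and the network are assumptions, not RAMa's architecture.
```python
# Hypothetical sketch of retrospective addressing over stored trajectories.
import torch
import torch.nn as nn

class Addresser(nn.Module):
    """Scores a trajectory summary for its expected usefulness on the task."""
    def __init__(self, summary_dim: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(summary_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, summaries: torch.Tensor) -> torch.Tensor:
        return self.score(summaries).squeeze(-1)

def select_batch(addresser, storage_summaries, k: int = 32):
    """Pick the k most promising stored trajectories for replay."""
    with torch.no_grad():
        scores = addresser(storage_summaries)
    return torch.topk(scores, k).indices

storage = torch.randn(1000, 16)     # 1000 stored trajectory summaries
idx = select_batch(Addresser(16), storage)
```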
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Safe-Critical Modular Deep Reinforcement Learning with Temporal Logic through Gaussian Processes and Control Barrier Functions [3.5897534810405403]
Reinforcement learning (RL) is a promising approach, but it has seen limited success in real-world applications.
In this paper, we propose a learning-based control framework consisting of several components.
We show that such an ECBF-based modular deep RL algorithm achieves near-perfect success rates and guards safety with high probability.
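Control barrier functions admit a small self-contained illustration: the filter below clamps an RL action so a 1-D integrator never leaves [-1, 1], enforcing the discrete-time barrier condition h(x') >= (1 - alpha) * h(x). This is a simplified stand-in for the paper's ECBF formulation, not its actual safety layer.
```python
# Minimal discrete-time control-barrier-function safety filter around an
# RL action, for a 1-D integrator kept inside [-1, 1] (simplified stand-in).
def cbf_filter(x: float, u_rl: float, alpha: float = 0.5) -> float:
    """Clamp u so h(x+u) >= (1-alpha)*h(x) for both barriers h = 1 -/+ x."""
    u_max = alpha * (1.0 - x)     # keeps h_upper(x) = 1 - x nonnegative
    u_min = -alpha * (1.0 + x)    # keeps h_lower(x) = 1 + x nonnegative
    return max(u_min, min(u_rl, u_max))

x = 0.9
for step in range(5):
    u = cbf_filter(x, u_rl=0.8)   # RL proposes an unsafe push to the right
    x += u
    print(f"step {step}: u={u:.3f}, x={x:.3f}")   # x never exceeds 1.0
```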
arXiv Detail & Related papers (2021-09-07T00:51:12Z)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
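Both ingredients reduce to short formulas over a Q-ensemble, so they can be sketched directly: the backup weight w(s,a) = sigmoid(-std(s,a) * T) + 0.5 and the UCB rule argmax(mean + lambda * std) follow the description above, while the temperature T and lambda are free hyperparameters.
```python
# Sketch of SUNRISE's two ingredients on a toy Q-ensemble.
import torch

def bellman_weights(target_q: torch.Tensor, temperature: float = 10.0):
    """target_q: (ensemble, batch) target Q-values for sampled (s', a')."""
    std = target_q.std(dim=0)                       # ensemble disagreement
    return torch.sigmoid(-std * temperature) + 0.5  # weight in (0.5, 1.0]

def ucb_action(q_values: torch.Tensor, lam: float = 1.0) -> int:
    """q_values: (ensemble, actions) Q-estimates for one state."""
    mean, std = q_values.mean(dim=0), q_values.std(dim=0)
    return int(torch.argmax(mean + lam * std))      # optimistic exploration

q_targets = torch.randn(5, 32)    # 5 ensemble members, batch of 32
w = bellman_weights(q_targets)    # multiply into each member's TD loss
q_sa = torch.randn(5, 4)          # Q-values for 4 discrete actions
a = ucb_action(q_sa)
```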
arXiv Detail & Related papers (2020-07-09T17:08:44Z)