Hierarchical reinforcement learning for efficient exploration and transfer
- URL: http://arxiv.org/abs/2011.06335v1
- Date: Thu, 12 Nov 2020 12:09:13 GMT
- Title: Hierarchical reinforcement learning for efficient exploration and transfer
- Authors: Lorenzo Steccanella, Simone Totaro, Damien Allonsius, Anders Jonsson
- Abstract summary: We present a novel hierarchical reinforcement learning framework based on the compression of an invariant state space.
Results indicate that the algorithm can successfully solve complex sparse-reward domains, and transfer knowledge to solve new, previously unseen tasks more quickly.
- Score: 7.70406430636194
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Sparse-reward domains are challenging for reinforcement learning algorithms
since significant exploration is needed before encountering reward for the
first time. Hierarchical reinforcement learning can facilitate exploration by
reducing the number of decisions necessary before obtaining a reward. In this
paper, we present a novel hierarchical reinforcement learning framework based
on the compression of an invariant state space that is common to a range of
tasks. The algorithm introduces subtasks which consist of moving between the
state partitions induced by the compression. Results indicate that the
algorithm can successfully solve complex sparse-reward domains, and transfer
knowledge to solve new, previously unseen tasks more quickly.
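The abstract describes the framework only at a high level, so here is a minimal sketch of the idea, assuming a grid-like state space: a compression function phi induces the state partitions, a subtask is "move to an adjacent partition", and a high-level learner chooses which partition to target. All names and the tabular epsilon-greedy learner below are hypothetical stand-ins, not the authors' implementation.

```python
import random
from collections import defaultdict

# Hypothetical sketch of partition-based subtasks; phi() and the tabular
# learner are stand-ins, not the paper's actual components.

def phi(state, cell=5):
    """Compress a 2-D state into a coarse partition id."""
    x, y = state
    return (x // cell, y // cell)

class TwoLevelAgent:
    def __init__(self, epsilon=0.1):
        self.q_high = defaultdict(float)  # value of (partition, target) pairs
        self.epsilon = epsilon

    def choose_subtask(self, state, neighbors):
        """High level: pick an adjacent partition to move toward."""
        p = phi(state)
        if random.random() < self.epsilon:
            return random.choice(neighbors)
        return max(neighbors, key=lambda t: self.q_high[(p, t)])

    @staticmethod
    def low_level_reward(target_partition, next_state):
        """Low level is rewarded only for entering the target partition."""
        return 1.0 if phi(next_state) == target_partition else 0.0
```

Because phi is shared across tasks, the low-level "move between partitions" policies are task-independent, which is plausibly where the reported transfer to unseen tasks comes from.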
Related papers
- Reconciling Spatial and Temporal Abstractions for Goal Representation [0.4813333335683418]
Goal representation affects the performance of Hierarchical Reinforcement Learning (HRL) algorithms.
Recent studies show that representations that preserve temporally abstract environment dynamics are successful in solving difficult problems.
We propose a novel three-layer HRL algorithm that introduces, at different levels of the hierarchy, both a spatial and a temporal goal abstraction.
arXiv Detail & Related papers (2024-01-18T10:33:30Z)
- Reward-Predictive Clustering [20.82575016038573]
We provide a clustering algorithm that enables the application of reward-predictive state abstractions to deep learning settings.
A convergence theorem and simulations show that the resulting reward-predictive deep network maximally compresses the agent's inputs.
arXiv Detail & Related papers (2022-11-07T03:13:26Z)
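As a toy illustration of what "reward-predictive" grouping means, the tabular refinement below clusters states that agree on per-action rewards and on the clusters of their successors; the paper's actual contribution is making such abstractions work in deep learning settings, which this sketch does not attempt.

```python
import numpy as np

def reward_predictive_partition(R, P, n_iters=10):
    """R[s, a]: expected reward; P[s, a]: deterministic next state.
    Returns labels such that states in one cluster agree on per-action
    rewards and on the cluster labels of their successors."""
    n_states = R.shape[0]
    labels = np.zeros(n_states, dtype=int)
    for _ in range(n_iters):
        # Signature: this state's rewards plus its successors' clusters.
        sigs = [tuple(R[s]) + tuple(labels[P[s]]) for s in range(n_states)]
        uniq = {sig: i for i, sig in enumerate(sorted(set(sigs)))}
        new = np.array([uniq[sig] for sig in sigs])
        if np.array_equal(new, labels):
            break  # refinement has converged
        labels = new
    return labels
```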
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate few-shot task generalization as a reinforcement learning problem in which a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
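For intuition, a subtask graph can be as simple as a set of precondition edges. The hypothetical structure below shows the kind of object MTSGI infers, not its inference procedure; the subtask names are invented.

```python
from dataclasses import dataclass, field

@dataclass
class SubtaskGraph:
    # Maps each subtask to the set of subtasks that must be done first.
    preconditions: dict = field(default_factory=dict)

    def eligible(self, completed):
        """Subtasks whose preconditions are all met and are not yet done."""
        return [s for s, pre in self.preconditions.items()
                if s not in completed and pre <= completed]

g = SubtaskGraph({"get_wood": set(),
                  "get_stone": set(),
                  "build_house": {"get_wood", "get_stone"}})
print(g.eligible({"get_wood"}))  # -> ['get_stone']
```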
- Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL [140.12803111221206]
In reinforcement learning, the graph Laplacian has proved to be a valuable tool in the task-agnostic setting.
We propose an alternative method that is able to recover, in a non-uniform-prior setting, the expressiveness and the desired properties of the Laplacian representation.
We find that our method succeeds as an alternative to the Laplacian in the non-uniform setting and scales to challenging continuous control environments.
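The exact objective is not given in this summary; below is a generic temporally contrastive (InfoNCE-style) loss in which successive states form positive pairs, sketched only to show the family of methods involved.

```python
import torch
import torch.nn.functional as F

def temporal_contrastive_loss(phi_s, phi_next, temperature=0.1):
    """phi_s, phi_next: (batch, dim) embeddings of s_t and s_{t+1}.
    Each state should be most similar to its own successor."""
    z1 = F.normalize(phi_s, dim=1)
    z2 = F.normalize(phi_next, dim=1)
    logits = z1 @ z2.T / temperature    # similarity of every (s, s') pair
    targets = torch.arange(z1.size(0))  # positives lie on the diagonal
    return F.cross_entropy(logits, targets)
```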
arXiv Detail & Related papers (2022-03-21T22:07:48Z)
- Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z)
- Attaining Interpretability in Reinforcement Learning via Hierarchical Primitive Composition [3.1078562713129765]
We propose a novel hierarchical reinforcement learning algorithm that improves interpretability by decomposing the original task into a hierarchy.
We show how the proposed scheme can be employed in practice by solving a pick-and-place task with a 6-DoF manipulator.
arXiv Detail & Related papers (2021-10-05T05:59:31Z)
- HAC Explore: Accelerating Exploration with Hierarchical Reinforcement Learning [8.889563735540696]
We propose HAC Explore (HACx), a new method that incorporates the exploration-bonus method Random Network Distillation (RND) into the hierarchical approach Hierarchical Actor-Critic (HAC).
HACx is the first RL method to solve a sparse-reward, continuous-control task that requires over 1,000 actions.
arXiv Detail & Related papers (2021-08-12T17:42:12Z)
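RND itself is a standard exploration bonus (a fixed random network plus a trained predictor whose error flags novel states); the sketch below shows that generic bonus, while the way HACx attaches it to the levels of HAC is the paper's contribution and is not reproduced here. Architecture and sizes are arbitrary.

```python
import torch
import torch.nn as nn

OBS_DIM = 8  # stand-in observation size

def make_net(in_dim, out_dim=64):
    return nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                         nn.Linear(128, out_dim))

target = make_net(OBS_DIM)     # fixed, randomly initialized network
predictor = make_net(OBS_DIM)  # trained to imitate the target
for p in target.parameters():
    p.requires_grad_(False)

def rnd_bonus(obs):
    """Novel observations are poorly predicted, so the prediction error
    doubles as an exploration bonus and as the predictor's loss."""
    return (predictor(obs) - target(obs)).pow(2).mean(dim=1)

opt = torch.optim.Adam(predictor.parameters(), lr=1e-4)
obs = torch.randn(32, OBS_DIM)  # dummy batch of observations
loss = rnd_bonus(obs).mean()
opt.zero_grad(); loss.backward(); opt.step()
```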
- MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty-aware classifier can solve challenging reinforcement learning problems.
We propose a novel method for computing the normalized maximum likelihood (NML) distribution.
We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z)
- Reannealing of Decaying Exploration Based On Heuristic Measure in Deep Q-Network [82.20059754270302]
We propose an algorithm based on the idea of reannealing that aims to encourage exploration only when it is needed.
We perform an illustrative case study showing that it has the potential both to accelerate training and to obtain a better policy.
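The summary does not specify the heuristic measure, so the sketch below pairs a generic restartable epsilon decay with a hypothetical stagnation trigger standing in for it.

```python
def epsilon(step, resets, decay=1e-4, eps_min=0.05):
    """Exponentially decayed epsilon, restarted at the last reanneal step."""
    steps_since_reset = step - (resets[-1] if resets else 0)
    return max(eps_min, (1.0 - decay) ** steps_since_reset)

def maybe_reanneal(step, recent_returns, resets, window=50):
    """Toy trigger: reanneal when returns have stopped improving."""
    if len(recent_returns) >= 2 * window:
        old = sum(recent_returns[-2 * window:-window]) / window
        new = sum(recent_returns[-window:]) / window
        if new <= old:  # no progress: raise epsilon to explore again
            resets.append(step)
```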
arXiv Detail & Related papers (2020-09-29T20:40:00Z)
- Sequential Transfer in Reinforcement Learning with a Generative Model [48.40219742217783]
We show how to reduce the sample complexity for learning new tasks by transferring knowledge from previously-solved ones.
We derive PAC bounds on its sample complexity which clearly demonstrate the benefits of using this kind of prior knowledge.
We empirically verify our theoretical findings in simple simulated domains.
arXiv Detail & Related papers (2020-07-01T19:53:35Z)
- SPACE: Structured Compression and Sharing of Representational Space for Continual Learning [10.06017287116299]
Incrementally learning tasks causes artificial neural networks to overwrite relevant information learned about older tasks, resulting in 'Catastrophic Forgetting'.
We propose SPACE, an algorithm that enables a network to learn continually and efficiently by partitioning the learnt space into a Core space.
We evaluate our algorithm on P-MNIST, CIFAR and a sequence of 8 different datasets, and achieve comparable accuracy to the state-of-the-art methods.
arXiv Detail & Related papers (2020-01-23T16:40:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.