On Credit Assignment in Hierarchical Reinforcement Learning
- URL: http://arxiv.org/abs/2203.03292v1
- Date: Mon, 7 Mar 2022 11:13:09 GMT
- Title: On Credit Assignment in Hierarchical Reinforcement Learning
- Authors: Joery A. de Vries, Thomas M. Moerland, Aske Plaat
- Abstract summary: Hierarchical Reinforcement Learning (HRL) has held longstanding promise to advance reinforcement learning.
We show how, e.g., a 1-step `hierarchical backup' can be seen as a conventional multistep backup with $n$ skip connections over time.
We develop a new hierarchical algorithm, Hier$Q_k(\lambda)$, for which we demonstrate that hierarchical credit assignment alone can already boost agent performance.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hierarchical Reinforcement Learning (HRL) has held longstanding promise to
advance reinforcement learning. Yet, it has remained a considerable challenge
to develop practical algorithms that exhibit some of these promises. To improve
our fundamental understanding of HRL, we investigate hierarchical credit
assignment from the perspective of conventional multistep reinforcement
learning. We show how e.g., a 1-step `hierarchical backup' can be seen as a
conventional multistep backup with $n$ skip connections over time connecting
each subsequent state to the first independent of actions inbetween.
Furthermore, we find that generalizing hierarchy to multistep return estimation
methods requires us to consider how to partition the environment trace, in
order to construct backup paths. We leverage these insights to develop a new
hierarchical algorithm Hier$Q_k(\lambda)$, for which we demonstrate that
hierarchical credit assignment alone can already boost agent performance (i.e.,
when eliminating generalization or exploration). Altogether, our work yields
fundamental insight into the nature of hierarchical backups and distinguishes
this as an additional basis for reinforcement learning research.
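To make the skip-connection view described in the abstract concrete, here is a minimal tabular sketch (our own illustration, not the paper's code): the first state-action pair of a trace is updated from every subsequent state, regardless of the actions taken in between. The function name, the Q-table indexing, and the update form are assumptions for exposition.

```python
import numpy as np

def hierarchical_one_step_backup(Q, states, actions, rewards,
                                 alpha=0.1, gamma=0.99):
    """Toy sketch of a 1-step hierarchical backup read as a multistep
    backup with n skip connections over time (tabular Q-learning).

    states  : [s_0, ..., s_n]      visited low-level states
    actions : [a_0, ..., a_{n-1}]  low-level actions taken
    rewards : [r_1, ..., r_n]      r_k received on the transition into s_k
    """
    s0, a0 = states[0], actions[0]
    n = len(rewards)
    for k in range(1, n + 1):
        # k-step discounted return from s_0, bootstrapped at s_k.
        g = sum(gamma ** i * rewards[i] for i in range(k))
        g += gamma ** k * np.max(Q[states[k]])
        # 'Skip connection': s_k backs up directly onto (s_0, a_0),
        # independent of the actions taken between s_0 and s_k.
        Q[s0, a0] += alpha * (g - Q[s0, a0])
    return Q
```

Note the contrast with a conventional 1-step backup, which connects s_0 only to s_1; here every state along the trace contributes a target for (s_0, a_0).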
Related papers
- Reinforcement Learning with Options and State Representation [105.82346211739433]
This thesis explores the reinforcement learning field and builds on existing methods to produce improved ones.
It addresses this goal by decomposing learning tasks hierarchically, an approach known as Hierarchical Reinforcement Learning.
arXiv Detail & Related papers (2024-03-16T08:30:55Z)
- Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality [55.88910947643436]
Self-supervised pre-training is essential for handling vast quantities of unlabeled data in practice.
HiDe-Prompt is an innovative approach that explicitly optimizes the hierarchical components with an ensemble of task-specific prompts and statistics.
Our experiments demonstrate the superior performance of HiDe-Prompt and its robustness to pre-training paradigms in continual learning.
arXiv Detail & Related papers (2023-10-11T06:51:46Z)
- Hierarchically Structured Task-Agnostic Continual Learning [0.0]
We take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle.
We propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths.
Our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, as is the case with many existing continual learning algorithms.
arXiv Detail & Related papers (2022-11-14T19:53:15Z)
- Possibility Before Utility: Learning And Using Hierarchical Affordances [21.556661319375255]
Reinforcement learning algorithms struggle on tasks with complex hierarchical dependency structures.
We present Hierarchical Affordance Learning (HAL), a method that learns a model of hierarchical affordances in order to prune impossible subtasks for more effective learning.
arXiv Detail & Related papers (2022-03-23T19:17:22Z)
- HCV: Hierarchy-Consistency Verification for Incremental Implicitly-Refined Classification [48.68128465443425]
Human beings learn and accumulate hierarchical knowledge over their lifetime.
Current incremental learning methods lack the ability to build a concept hierarchy by associating new concepts with old ones.
We propose Hierarchy-Consistency Verification (HCV) as an enhancement to existing continual learning methods.
arXiv Detail & Related papers (2021-10-21T13:54:00Z)
- Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z)
- Provable Hierarchy-Based Meta-Reinforcement Learning [50.17896588738377]
We analyze HRL in the meta-RL setting, where a learner learns latent hierarchical structure during meta-training for use in a downstream task.
We provide "diversity conditions" which, together with a tractable optimism-based algorithm, guarantee sample-efficient recovery of this natural hierarchy.
Our bounds incorporate common notions in HRL literature such as temporal and state/action abstractions, suggesting that our setting and analysis capture important features of HRL in practice.
arXiv Detail & Related papers (2021-10-18T17:56:02Z)
- Attaining Interpretability in Reinforcement Learning via Hierarchical Primitive Composition [3.1078562713129765]
We propose a novel hierarchical reinforcement learning algorithm that mitigates the aforementioned issues by decomposing the original task into a hierarchy.
We show how the proposed scheme can be employed in practice by solving a pick-and-place task with a 6-DoF manipulator.
arXiv Detail & Related papers (2021-10-05T05:59:31Z)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration (a minimal sketch of both ingredients follows this entry).
arXiv Detail & Related papers (2020-07-09T17:08:44Z)
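The following numpy sketch illustrates the two SUNRISE ingredients under assumed shapes and an assumed sigmoid-of-ensemble-std weighting; it is an illustration of the idea, not the authors' implementation, and the exact weighting form and hyperparameters are assumptions.

```python
import numpy as np

def weighted_bellman_targets(q_ensemble, next_states, rewards, dones,
                             gamma=0.99, temperature=10.0):
    """(a) Ensemble-based weighted Bellman backups: down-weight targets
    whose ensemble disagreement (std) is high. Each member of q_ensemble
    maps a batch of states to Q-values of shape (batch, n_actions)."""
    qs = np.stack([q(next_states) for q in q_ensemble])      # (E, B, A)
    q_mean, q_std = qs.mean(axis=0), qs.std(axis=0)          # (B, A)
    greedy = q_mean.argmax(axis=1)                           # (B,)
    next_v = q_mean[np.arange(len(greedy)), greedy]
    # Assumed weighting: sigmoid of the negative ensemble std at the
    # bootstrapped action, scaled by a temperature.
    std = q_std[np.arange(len(greedy)), greedy]
    weights = 1.0 / (1.0 + np.exp(std * temperature))
    targets = rewards + gamma * (1.0 - dones) * next_v
    return targets, weights  # weights scale each sample's Bellman error

def ucb_action(q_ensemble, state, lam=1.0):
    """(b) UCB exploration: act greedily w.r.t. mean + lam * std
    across the ensemble."""
    qs = np.stack([q(state[None]) for q in q_ensemble])[:, 0]  # (E, A)
    return int(np.argmax(qs.mean(axis=0) + lam * qs.std(axis=0)))
```

In an off-policy learner, the returned weights would multiply the per-sample TD errors in the critic loss, so that uncertain targets contribute less to the update.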