Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning
- URL: http://arxiv.org/abs/2306.11483v1
- Date: Tue, 20 Jun 2023 12:12:16 GMT
- Title: Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning
- Authors: Anna Penzkofer, Simon Schaefer, Florian Strohm, Mihai Bâce, Stefan
Leutenegger, Andreas Bulling
- Abstract summary: Int-HRL: Hierarchical RL with intention-based sub-goals that are inferred from human eye gaze.
Our evaluations show that replacing hand-crafted sub-goals with automatically extracted intentions leads to an HRL agent that is significantly more sample-efficient than previous methods.
- Score: 23.062590084580542
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: While deep reinforcement learning (RL) agents outperform humans on an
increasing number of tasks, training them requires data equivalent to decades
of human gameplay. Recent hierarchical RL methods have increased sample
efficiency by incorporating information inherent to the structure of the
decision problem but at the cost of having to discover or use human-annotated
sub-goals that guide the learning process. We show that intentions of human
players, i.e., the precursors of goal-oriented decisions, can be robustly
predicted from eye gaze even for the long-horizon sparse rewards task of
Montezuma's Revenge - one of the most challenging RL tasks in the Atari 2600
game suite. We propose Int-HRL: Hierarchical RL with intention-based sub-goals
that are inferred from human eye gaze. Our novel sub-goal extraction pipeline
is fully automatic and replaces the need for manual sub-goal annotation by
human experts. Our evaluations show that replacing hand-crafted sub-goals with
automatically extracted intentions leads to an HRL agent that is significantly
more sample-efficient than previous methods.
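The abstract does not detail the extraction pipeline, but the core idea of turning human gaze into discrete sub-goals can be sketched in a few lines: collect fixation points recorded while a human plays, aggregate their spatial density over the frame, and keep the most-attended regions as candidate sub-goal locations. The snippet below is a minimal, hypothetical sketch; the function name, grid resolution, and share threshold are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def extract_subgoals(gaze_xy, frame_shape=(210, 160), grid=(14, 10), min_share=0.05):
    """Bin gaze fixation points into a coarse grid over the Atari frame and
    keep cells that attract a notable share of all samples; the centres of
    those cells serve as candidate sub-goal locations. (Illustrative sketch,
    not the paper's pipeline.)"""
    gaze_xy = np.asarray(gaze_xy, dtype=float)        # (N, 2) array of (y, x) pixel coordinates
    cell_h = frame_shape[0] / grid[0]                  # height of one grid cell
    cell_w = frame_shape[1] / grid[1]                  # width of one grid cell
    rows = np.clip((gaze_xy[:, 0] // cell_h).astype(int), 0, grid[0] - 1)
    cols = np.clip((gaze_xy[:, 1] // cell_w).astype(int), 0, grid[1] - 1)
    counts = np.zeros(grid, dtype=int)
    np.add.at(counts, (rows, cols), 1)                 # histogram of gaze samples per cell
    share = counts / max(len(gaze_xy), 1)
    return [((r + 0.5) * cell_h, (c + 0.5) * cell_w)   # centres of cells above the threshold
            for r, c in zip(*np.where(share >= min_share))]

# Toy usage: gaze concentrated on two screen regions yields two sub-goal locations.
rng = np.random.default_rng(0)
gaze = np.vstack([rng.normal((50, 40), 5.0, size=(200, 2)),
                  rng.normal((150, 120), 5.0, size=(200, 2))])
print(extract_subgoals(gaze))
```

In a hierarchical setup such as the one the abstract describes, sub-goals extracted this way would be handed to the high-level policy, which selects one at a time, while the low-level policy is rewarded intrinsically for reaching the currently selected sub-goal.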
Related papers
- Leveraging Reward Consistency for Interpretable Feature Discovery in
Reinforcement Learning [69.19840497497503]
It is argued that the commonly used action-matching principle is more an explanation of deep neural networks (DNNs) than an interpretation of RL agents.
We propose to consider rewards, the essential objective of RL agents, as the basis for interpreting RL agents.
We verify and evaluate our method on the Atari 2600 games as well as Duckietown, a challenging self-driving car simulator environment.
arXiv Detail & Related papers (2023-09-04T09:09:54Z)
- Primitive Skill-based Robot Learning from Human Evaluative Feedback [28.046559859978597]
Reinforcement learning algorithms face challenges when dealing with long-horizon robot manipulation tasks in real-world environments.
We propose a novel framework, SEED, which leverages two approaches: reinforcement learning from human feedback (RLHF) and primitive skill-based reinforcement learning.
Our results show that SEED significantly outperforms state-of-the-art RL algorithms in sample efficiency and safety.
arXiv Detail & Related papers (2023-07-28T20:48:30Z)
- Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data [101.43350024175157]
Self-supervised learning has the potential to decrease the amount of human annotation and engineering effort required to learn control strategies.
Our work builds on prior work showing that reinforcement learning (RL) itself can be cast as a self-supervised problem.
We demonstrate that a self-supervised RL algorithm based on contrastive learning can solve real-world, image-based robotic manipulation tasks.
arXiv Detail & Related papers (2023-06-06T01:36:56Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resilience to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Light-weight probing of unsupervised representations for Reinforcement Learning [20.638410483549706]
We study whether linear probing can be a proxy evaluation task for the quality of unsupervised RL representations.
We show that the probing tasks are strongly rank correlated with the downstream RL performance on the Atari100k Benchmark.
This provides a more efficient method for exploring the space of pretraining algorithms and identifying promising pretraining recipes.
arXiv Detail & Related papers (2022-08-25T21:08:01Z)
- Contrastive Learning as Goal-Conditioned Reinforcement Learning [147.28638631734486]
In reinforcement learning (RL), it is easier to solve a task if given a good representation.
While deep RL should automatically acquire such good representations, prior work often finds that learning representations in an end-to-end fashion is unstable.
We show (contrastive) representation learning methods can be cast as RL algorithms in their own right.
arXiv Detail & Related papers (2022-06-15T14:34:15Z)
- Reward Uncertainty for Exploration in Preference-based Reinforcement Learning [88.34958680436552]
We present an exploration method specifically for preference-based reinforcement learning algorithms.
Our main idea is to design an intrinsic reward that measures novelty based on the learned reward.
Our experiments show that an exploration bonus derived from uncertainty in the learned reward improves both the feedback- and sample-efficiency of preference-based RL algorithms.
arXiv Detail & Related papers (2022-05-24T23:22:10Z)
- Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
- Machine versus Human Attention in Deep Reinforcement Learning Tasks [38.80270891345248]
We shed light on the inner workings of such trained models by analyzing the pixels that they attend to during task execution.
We compare the saliency maps of RL agents against visual attention models of human experts when learning to play Atari games.
arXiv Detail & Related papers (2020-10-29T20:58:45Z)
- Hierarchical Reinforcement Learning in StarCraft II with Human Expertise in Subgoals Selection [13.136763521789307]
We propose a new method to integrate HRL, experience replay and effective subgoal selection through an implicit curriculum design based on human expertise.
Our method can achieve better sample efficiency than flat and end-to-end RL methods, and provides an effective method for explaining the agent's performance.
arXiv Detail & Related papers (2020-08-08T04:56:30Z)
- Accelerating Reinforcement Learning Agent with EEG-based Implicit Human Feedback [10.138798960466222]
Augmenting Reinforcement Learning (RL) agents with human feedback can dramatically improve various aspects of learning.
Previous methods require a human observer to give inputs explicitly, burdening the human in the loop of the RL agent's learning process.
We investigate capturing humans' intrinsic reactions as implicit (and natural) feedback through EEG in the form of error-related potentials (ErrPs).
arXiv Detail & Related papers (2020-06-30T03:13:37Z)