Show me the Way: Intrinsic Motivation from Demonstrations
- URL: http://arxiv.org/abs/2006.12917v2
- Date: Wed, 13 Jan 2021 14:36:49 GMT
- Title: Show me the Way: Intrinsic Motivation from Demonstrations
- Authors: Léonard Hussenot, Robert Dadashi, Matthieu Geist, Olivier Pietquin
- Abstract summary: We show that complex exploration behaviors, reflecting different motivations, can be learnt and efficiently used by RL agents to solve tasks for which exhaustive exploration is prohibitive.
We propose to learn an exploration bonus from demonstrations that could transfer these motivations to an artificial agent with few assumptions about their rationale.
- Score: 44.87651595571687
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The study of exploration in the domain of decision making has a long history
but remains actively debated. From the vast literature that addressed this
topic for decades from various points of view (e.g., developmental psychology,
experimental design, artificial intelligence), intrinsic motivation emerged as
a concept that can practically be transferred to artificial agents. In
particular, in the recent field of Deep Reinforcement Learning (RL), agents
implement such a concept (mainly using a novelty argument) in the form of an
exploration bonus, added to the task reward, that encourages visiting the whole
environment. This approach is supported by the large amount of theory on RL for
which convergence to optimality assumes exhaustive exploration. Yet, human
beings and other mammals do not exhaustively explore the world, and their motivation
is not only based on novelty but also on various other factors (e.g.,
curiosity, fun, style, pleasure, safety, competition, etc.). They optimize for
life-long learning and train to learn transferable skills in playgrounds
without obvious goals. They also apply innate or learned priors to save time
and stay safe. For these reasons, we propose to learn an exploration bonus from
demonstrations that could transfer these motivations to an artificial agent
with few assumptions about their rationale. Using an inverse RL approach, we
show that complex exploration behaviors, reflecting different motivations, can
be learnt and efficiently used by RL agents to solve tasks for which exhaustive
exploration is prohibitive.
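As a concrete illustration of the interface described in the abstract, the sketch below adds a demonstration-derived exploration bonus to the task reward. It is a hypothetical stand-in rather than the authors' method: the paper learns the bonus with an inverse-RL objective, while here a simple kernel-density score over demonstration states plays that role, and the names `make_demo_bonus`, `shaped_reward`, `bandwidth` and `bonus_weight` are invented for the example.

```python
import numpy as np

# Hypothetical sketch: derive an exploration bonus from demonstration states and
# add it to the task reward. This is NOT the paper's algorithm (which learns the
# bonus with an inverse-RL objective); it only illustrates the shaping interface.

def make_demo_bonus(demo_states, bandwidth=1.0):
    """Return a bonus function that is large near states visited in demonstrations."""
    demo_states = np.asarray(demo_states, dtype=np.float64)

    def bonus(state):
        # Kernel-density proxy: closeness of `state` to the demonstration states.
        dists = np.linalg.norm(demo_states - np.asarray(state), axis=1)
        return float(np.exp(-(dists ** 2) / (2.0 * bandwidth ** 2)).mean())

    return bonus

def shaped_reward(task_reward, state, bonus_fn, bonus_weight=0.1):
    # Total reward seen by the RL agent: task reward plus weighted exploration bonus.
    return task_reward + bonus_weight * bonus_fn(state)

# Toy usage with 2-D states standing in for demonstrations.
demo = np.random.randn(100, 2)
bonus_fn = make_demo_bonus(demo)
print(shaped_reward(task_reward=0.0, state=np.zeros(2), bonus_fn=bonus_fn))
```

Once the bonus has been learned (or, as here, constructed), the agent simply optimizes the shaped reward with any standard RL algorithm.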
Related papers
- First Go, then Post-Explore: the Benefits of Post-Exploration in
Intrinsic Motivation [7.021281655855703]
Go-Explore achieved breakthrough performance on challenging reinforcement learning (RL) tasks with sparse rewards.
A key insight of Go-Explore was that successful exploration requires an agent to first return to an interesting state before exploring further.
We refer to such exploration after a goal is reached as 'post-exploration'.
arXiv Detail & Related papers (2022-12-06T18:56:47Z)
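A minimal sketch of the "first return, then post-explore" loop summarized above, under strong simplifying assumptions: the toy `ChainEnv` and its `restore` method are invented for illustration, and returning to a state is done by directly resetting the simulator, whereas real agents return via a goal-conditioned policy or by replaying stored actions.

```python
import random

# Toy sketch of post-exploration: pick a previously reached state, first return
# to it, then explore from there and archive any newly reached states.

class ChainEnv:
    """Deterministic chain: the state is an integer position, actions are -1/+1."""
    def __init__(self, length=50):
        self.length, self.pos = length, 0
    def reset(self):
        self.pos = 0
        return self.pos
    def restore(self, state):   # "first return", done here by resetting the simulator
        self.pos = state
        return self.pos
    def step(self, action):
        self.pos = max(0, min(self.length - 1, self.pos + action))
        return self.pos

def post_exploration(env, iterations=200, explore_steps=10, seed=0):
    rng = random.Random(seed)
    archive = {env.reset()}                       # archive of reached states
    for _ in range(iterations):
        env.restore(rng.choice(sorted(archive)))  # 1. return to an archived state
        for _ in range(explore_steps):            # 2. post-explore from it
            archive.add(env.step(rng.choice((-1, 1))))
    return archive

print(len(post_exploration(ChainEnv())))          # number of distinct states reached
```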
- Intrinsically-Motivated Reinforcement Learning: A Brief Introduction [0.0]
Reinforcement learning (RL) is one of the three basic paradigms of machine learning.
In this paper, we investigate the problem of improving exploration in RL and introduce intrinsically-motivated RL.
arXiv Detail & Related papers (2022-03-03T12:39:58Z)
- Long-Term Exploration in Persistent MDPs [68.8204255655161]
In this paper, we propose an exploration method called Rollback-Explore (RbExplore), which utilizes the concept of the persistent Markov decision process.
We test our algorithm in the hard-exploration Prince of Persia game, without rewards or domain knowledge.
arXiv Detail & Related papers (2021-09-21T13:47:04Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
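A toy illustration of the zero-sum surprise game described above, under the assumption that surprise can be approximated by a count-based negative log-frequency; the actual method instead trains two RL policies (an explorer maximizing the surprise it experiences and a controller minimizing it) with a learned density model.

```python
import math
from collections import Counter

# The two players' rewards for the same observation sum to zero: the explorer is
# paid the surprise of the observation, the controller its negative. Surprise is
# approximated here by a smoothed count-based negative log-frequency.

counts = Counter()

def surprise(obs):
    total = sum(counts.values())
    return -math.log((counts[obs] + 1) / (total + 2))   # rarer observation -> larger surprise

for obs in ["a", "a", "b", "a", "c"]:                    # stand-in observation stream
    s = surprise(obs)
    explorer_reward, controller_reward = s, -s
    counts[obs] += 1
    print(obs, round(explorer_reward, 3), round(controller_reward, 3))
```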
- Mutual Information State Intrinsic Control [91.38627985733068]
Intrinsically motivated RL attempts to remove the reliance on hand-shaped task rewards by defining an intrinsic reward function.
Motivated by the self-consciousness concept in psychology, we make a natural assumption that the agent knows what constitutes itself.
We mathematically formalize this reward as the mutual information between the agent state and the surrounding state.
arXiv Detail & Related papers (2021-03-15T03:03:36Z)
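A hedged sketch of the mutual-information reward mentioned above, restricted to the simplest tabular case: the agent/surrounding factorization and the joint visit counts are assumed given here, while the paper estimates the same quantity with learned models in high-dimensional state spaces.

```python
import numpy as np

# I(S_agent; S_surround) in nats, estimated from a table of joint visit counts.

def mutual_information(joint_counts):
    p_xy = joint_counts / joint_counts.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)     # marginal over the agent state
    p_y = p_xy.sum(axis=0, keepdims=True)     # marginal over the surrounding state
    mask = p_xy > 0
    return float((p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])).sum())

# Toy example: 3 agent states x 4 surrounding states with these visit counts.
counts = np.array([[10, 1, 1, 0],
                   [1, 12, 2, 1],
                   [0, 2, 9, 3]], dtype=float)
print(mutual_information(counts))   # larger when the agent state is informative about the surroundings
```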
- Fast active learning for pure exploration in reinforcement learning [48.98199700043158]
We show that bonuses that scale with $1/n$ bring faster learning rates, improving the known upper bounds with respect to the dependence on the horizon.
We also show that with an improved analysis of the stopping time, we can improve by a factor $H$ the sample complexity in the best-policy identification setting.
arXiv Detail & Related papers (2020-07-27T11:28:32Z)
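A toy comparison of the bonus scalings discussed above, contrasting a classical $\sqrt{1/n}$-style (UCB-like) bonus with a $1/n$-scaled one as the visit count $n$ grows; the constants, horizon factor and log term are placeholders rather than the paper's exact expressions.

```python
import numpy as np

# Illustrative bonus magnitudes as a function of the visit count n.

def sqrt_bonus(n, horizon=10, delta=0.05):
    return horizon * np.sqrt(np.log(1.0 / delta) / n)   # classical sqrt(1/n) scaling

def one_over_n_bonus(n, horizon=10, delta=0.05):
    return horizon * np.log(1.0 / delta) / n             # 1/n scaling, shrinks faster

visits = np.array([1, 10, 100, 1000])
print(sqrt_bonus(visits))
print(one_over_n_bonus(visits))
```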
- See, Hear, Explore: Curiosity via Audio-Visual Association [46.86865495827888]
A common formulation of curiosity-driven exploration uses the difference between the real future and the future predicted by a learned model.
In this paper, we introduce an alternative form of curiosity that rewards novel associations between different senses.
Our approach exploits multiple modalities to provide a stronger signal for more efficient exploration.
arXiv Detail & Related papers (2020-07-07T17:56:35Z)
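For contrast with the audio-visual approach, here is a minimal sketch of the standard prediction-error formulation of curiosity mentioned above, using a linear forward model on toy data; the multimodal method instead rewards novel associations between audio and vision, which is not implemented here.

```python
import numpy as np

# Prediction-error curiosity: the intrinsic reward is the error of a learned
# forward model predicting the next state from the current state and action.

rng = np.random.default_rng(0)
state_dim, action_dim = 4, 2
W = np.zeros((state_dim + action_dim, state_dim))         # linear forward model

def curiosity_reward(state, action, next_state):
    x = np.concatenate([state, action])
    return float(np.sum((x @ W - next_state) ** 2))        # surprise = squared prediction error

def update_model(state, action, next_state, lr=0.01):
    """One gradient step on the forward-model regression loss."""
    global W
    x = np.concatenate([state, action])
    W -= lr * np.outer(x, x @ W - next_state)

# Toy rollout on random dynamics: rewards change as the model adapts.
s = rng.standard_normal(state_dim)
for _ in range(5):
    a = rng.standard_normal(action_dim)
    s_next = rng.standard_normal(state_dim)
    print(round(curiosity_reward(s, a, s_next), 3))
    update_model(s, a, s_next)
    s = s_next
```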
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.