Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration
- URL: http://arxiv.org/abs/2109.08603v1
- Date: Fri, 17 Sep 2021 15:28:25 GMT
- Title: Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration
- Authors: Oliver Groth, Markus Wulfmeier, Giulia Vezzani, Vibhavari Dasagi, Tim Hertweck, Roland Hafner, Nicolas Heess, Martin Riedmiller
- Abstract summary: We argue that merely using curiosity for fast environment exploration or as a bonus reward for a specific task does not harness the full potential of this technique.
We propose to shift the focus towards retaining the behaviours which emerge during curiosity-based learning.
- Score: 20.38772636693469
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Curiosity-based reward schemes can present powerful exploration mechanisms
which facilitate the discovery of solutions for complex, sparse or long-horizon
tasks. However, as the agent learns to reach previously unexplored spaces and
the objective adapts to reward new areas, many behaviours emerge only to
disappear due to being overwritten by the constantly shifting objective. We
argue that merely using curiosity for fast environment exploration or as a
bonus reward for a specific task does not harness the full potential of this
technique and misses useful skills. Instead, we propose to shift the focus
towards retaining the behaviours which emerge during curiosity-based learning.
We posit that these self-discovered behaviours serve as valuable skills in an
agent's repertoire to solve related tasks. Our experiments demonstrate the
continuous shift in behaviour throughout training and the benefits of a simple
policy snapshot method to reuse discovered behaviour for transfer tasks.
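The abstract leaves the snapshot method at a high level. A minimal sketch of what such a snapshot-and-reuse scheme could look like is below; the class and method names are illustrative, not taken from the paper:

```python
import copy


class SnapshotLibrary:
    """Retains frozen copies of the curious policy so that behaviours
    survive even after the shifting intrinsic objective overwrites them
    in the live policy."""

    def __init__(self, snapshot_every: int):
        self.snapshot_every = snapshot_every
        self.snapshots = []

    def maybe_snapshot(self, step: int, policy) -> None:
        # Periodically freeze a full copy of the current parameters.
        if step % self.snapshot_every == 0:
            self.snapshots.append(copy.deepcopy(policy))

    def best_for_task(self, evaluate):
        # Transfer: score every retained behaviour on the new task and
        # return the snapshot that performs best.
        return max(self.snapshots, key=evaluate)
```

Because the curiosity objective keeps shifting, the live policy is a poor archive of its own past behaviours; retention therefore happens as a side channel during training, and transfer reduces to selecting among the frozen snapshots.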
Related papers
- A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning [58.107474025048866]
Forgetting refers to the loss or deterioration of previously acquired knowledge.
Beyond continual learning, forgetting is a prevalent phenomenon in many other research domains within deep learning.
arXiv Detail & Related papers (2023-07-16T16:27:58Z)
Learning Options via Compression [62.55893046218824]
We propose a new objective that combines the maximum likelihood objective with a penalty on the description length of the skills.
Our objective learns skills that solve downstream tasks in fewer samples compared to skills learned from only maximizing likelihood.
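Read literally, the objective trades reconstruction quality against the code length of the skills. A toy rendering under that reading (the weight `beta` and the exact decomposition are assumptions, not the paper's definitions):

```python
def options_compression_loss(traj_log_likelihood: float,
                             skill_log_prior: float,
                             beta: float = 0.1) -> float:
    """Maximum likelihood plus a description-length penalty on skills.

    traj_log_likelihood: log p(trajectory | decoded skills) -- fit term.
    skill_log_prior: log-probability of the chosen skill sequence under
        a prior; its negation is the code length (in nats) needed to
        describe the skills, which the penalty keeps short.
    """
    description_length = -skill_log_prior
    return -traj_log_likelihood + beta * description_length
```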
arXiv Detail & Related papers (2022-12-08T22:34:59Z)
Unsupervised Reinforcement Learning for Transferable Manipulation Skill Discovery [22.32327908453603]
Current reinforcement learning (RL) in robotics often struggles to generalize to new downstream tasks.
We propose a framework that pre-trains the agent in a task-agnostic manner without access to the task-specific reward.
We show that our approach achieves the most diverse interacting behavior and significantly improves sample efficiency in downstream tasks.
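A generic sketch of such reward-free pretraining, assuming a gym-style environment and an agent with `act`/`update` methods (both assumptions made for illustration):

```python
def pretrain_task_agnostic(env, agent, intrinsic_reward, steps: int):
    """Phase 1: explore using only an intrinsic signal; the task-specific
    reward from the environment is deliberately ignored."""
    obs = env.reset()
    for _ in range(steps):
        action = agent.act(obs)
        next_obs, _task_reward, done, _info = env.step(action)
        r_int = intrinsic_reward(obs, action, next_obs)
        agent.update(obs, action, r_int, next_obs, done)
        obs = env.reset() if done else next_obs
    return agent  # Phase 2 (fine-tuning on the downstream reward) not shown
```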
arXiv Detail & Related papers (2022-04-29T06:57:46Z)
Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
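A minimal sketch of how such a learned hierarchy might execute at test time, assuming the skills are pre-trained policies and the high-level controller re-selects among them periodically (all names illustrative):

```python
def run_hierarchy(env, high_level, skills, horizon: int, hold: int):
    """Two-level execution: the high-level controller picks a pre-trained
    skill, which then emits primitive actions for `hold` steps."""
    obs = env.reset()
    for t in range(horizon):
        if t % hold == 0:
            # Re-select: coarser skills can be held longer (faster
            # learning), finer skills give more precise control
            # (more generality).
            skill = skills[high_level.select(obs)]
        obs, _reward, done, _info = env.step(skill.act(obs))
        if done:
            obs = env.reset()
```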
arXiv Detail & Related papers (2021-10-20T22:29:32Z)
Learning Task Agnostic Skills with Data-driven Guidance [0.0]
This paper proposes a framework for guiding the skill discovery towards the subset of expert-visited states.
We apply our method in various reinforcement learning tasks and show that such a projection results in more useful behaviours.
arXiv Detail & Related papers (2021-08-04T06:53:10Z)
Open-Ended Learning Leads to Generally Capable Agents [12.079718607356178]
We define a universe of tasks within an environment domain and demonstrate the ability to train agents that are capable across this vast space and beyond.
The resulting space is exceptionally diverse in terms of the challenges posed to agents, and as such, even measuring the learning progress of an agent is an open research problem.
We show that through constructing an open-ended learning process, which dynamically changes the training task distributions and training objectives such that the agent never stops learning, we achieve consistent learning of new behaviours.
arXiv Detail & Related papers (2021-07-27T13:30:07Z)
Touch-based Curiosity for Sparse-Reward Tasks [15.766198618516137]
We use surprise from mismatches in touch feedback to guide exploration in hard sparse-reward reinforcement learning tasks.
Our approach, Touch-based Curiosity (ToC), learns what visible object interactions are supposed to "feel" like.
We test our approach on a range of touch-intensive robot arm tasks.
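One plausible reading of this is a forward model from visual features to expected touch readings, with the prediction error serving as the curiosity bonus; a PyTorch sketch under that assumption (architecture and dimensions are illustrative):

```python
import torch
import torch.nn as nn


class TouchPredictor(nn.Module):
    """Predicts the expected touch reading from a visual embedding."""

    def __init__(self, visual_dim: int, touch_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(visual_dim, 128),
            nn.ReLU(),
            nn.Linear(128, touch_dim),
        )

    def forward(self, visual_embedding: torch.Tensor) -> torch.Tensor:
        return self.net(visual_embedding)


def intrinsic_reward(predictor: TouchPredictor,
                     visual_embedding: torch.Tensor,
                     touch_reading: torch.Tensor) -> float:
    # Surprise: how far the actual touch feedback is from what the agent
    # expected this interaction to "feel" like.
    with torch.no_grad():
        predicted = predictor(visual_embedding)
    return ((predicted - touch_reading) ** 2).mean().item()
```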
arXiv Detail & Related papers (2021-04-01T12:49:29Z)
Bridging the Imitation Gap by Adaptive Insubordination [88.35564081175642]
We show that when the teaching agent makes decisions with access to privileged information unavailable to the learner, this information is marginalized during imitation learning, creating an imitation gap.
We propose 'Adaptive Insubordination' (ADVISOR) to address this gap.
ADVISOR dynamically weights imitation and reward-based reinforcement learning losses during training, enabling on-the-fly switching between imitation and exploration.
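The weighting itself can be sketched in one line; how the per-state `weight` is estimated (in ADVISOR, from an auxiliary policy that measures where imitation is followable) is abstracted away here:

```python
def advisor_loss(imitation_loss: float, rl_loss: float,
                 weight: float) -> float:
    """Blend the two losses per state: weight -> 1 where imitating the
    privileged teacher is feasible, weight -> 0 where the agent should
    fall back on reward-driven exploration instead."""
    assert 0.0 <= weight <= 1.0
    return weight * imitation_loss + (1.0 - weight) * rl_loss
```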
arXiv Detail & Related papers (2020-07-23T17:59:57Z)
Planning to Explore via Self-Supervised World Models [120.31359262226758]
We present Plan2Explore, a self-supervised reinforcement learning agent that takes a new approach to self-supervised exploration and fast adaptation to new tasks.
Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods.
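Plan2Explore's exploration signal comes from disagreement among an ensemble of learned one-step models; a simplified sketch of that reward (the latent-space machinery of the full method is omitted):

```python
import torch


def disagreement_reward(ensemble, state, action):
    # Each ensemble member predicts the next (latent) state; the variance
    # across predictions is largest exactly where the world model is still
    # uncertain, so seeking it drives the agent toward the unknown.
    predictions = torch.stack([model(state, action) for model in ensemble])
    return predictions.var(dim=0).mean()
```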
arXiv Detail & Related papers (2020-05-12T17:59:45Z)
Weakly-Supervised Reinforcement Learning for Controllable Behavior [126.04932929741538]
Reinforcement learning (RL) is a powerful framework for learning to take actions to solve tasks.
In many settings, an agent must winnow down the inconceivably large space of all possible tasks to the single task that it is currently being asked to solve.
We introduce a framework for using weak supervision to automatically disentangle this semantically meaningful subspace of tasks from the enormous space of nonsensical "chaff" tasks.
arXiv Detail & Related papers (2020-04-06T17:50:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.