Learning with AMIGo: Adversarially Motivated Intrinsic Goals
- URL: http://arxiv.org/abs/2006.12122v2
- Date: Tue, 23 Feb 2021 07:57:41 GMT
- Title: Learning with AMIGo: Adversarially Motivated Intrinsic Goals
- Authors: Andres Campero, Roberta Raileanu, Heinrich Küttler, Joshua B. Tenenbaum, Tim Rocktäschel, Edward Grefenstette
- Abstract summary: AMIGo is a goal-generating teacher that proposes Adversarially Motivated Intrinsic Goals.
We show that our method generates a natural curriculum of self-proposed goals which ultimately allows the agent to solve challenging procedurally-generated tasks.
- Score: 63.680207855344875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A key challenge for reinforcement learning (RL) consists of learning in
environments with sparse extrinsic rewards. In contrast to current RL methods,
humans are able to learn new skills with little or no reward by using various
forms of intrinsic motivation. We propose AMIGo, a novel agent incorporating --
as a form of meta-learning -- a goal-generating teacher that proposes
Adversarially Motivated Intrinsic Goals to train a goal-conditioned "student"
policy in the absence of (or alongside) environment reward. Specifically,
through a simple but effective "constructively adversarial" objective, the
teacher learns to propose increasingly challenging -- yet achievable -- goals
that allow the student to learn general skills for acting in a new environment,
independent of the task to be solved. We show that our method generates a
natural curriculum of self-proposed goals which ultimately allows the agent to
solve challenging procedurally-generated tasks where other forms of intrinsic
motivation and state-of-the-art RL methods fail.
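To make the "constructively adversarial" objective concrete, here is a minimal sketch of how the teacher's reward and its difficulty threshold might interact. The hyperparameters alpha and beta and the threshold update rule are illustrative stand-ins, not the paper's reference implementation:

```python
# Minimal sketch of AMIGo's "constructively adversarial" teacher reward.
# alpha, beta, and the update rule for t_star are hypothetical stand-ins
# for the paper's hyperparameters.

def teacher_reward(student_reached_goal: bool, steps_taken: int,
                   t_star: int, alpha: float = 0.7, beta: float = 0.3) -> float:
    """Positive reward if the student succeeds but needs more than t_star
    steps (the goal was challenging yet achievable); negative reward if
    the goal was too easy (solved quickly) or never reached."""
    if student_reached_goal and steps_taken >= t_star:
        return alpha      # challenging but achievable: what the teacher wants
    return -beta          # too easy, or too hard to reach at all


def update_threshold(t_star: int, recent_successes: int,
                     patience: int = 10) -> int:
    """Raise the difficulty threshold once the student succeeds often
    enough, producing a naturally rising curriculum of goals."""
    return t_star + 1 if recent_successes >= patience else t_star
```

The teacher only profits from goals the student reaches with effort, which is what pushes the self-proposed goals toward a rising curriculum.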
Related papers
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally demonstrate improved expected return on out-of-distribution goals, while still allowing goals to be specified with expressive structure.
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
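One plausible reading of the discretizing bottleneck, sketched under the assumption of a vector-quantization-style codebook (the paper's exact architecture may differ):

```python
import numpy as np

# Sketch of a discretizing bottleneck for goal embeddings, in the spirit
# of vector quantization; codebook size and dimensions are illustrative.

rng = np.random.default_rng(0)
codebook = rng.normal(size=(32, 8))   # 32 discrete codes, 8-dim embeddings

def discretize_goal(goal_embedding: np.ndarray) -> np.ndarray:
    """Snap a continuous goal embedding to its nearest codebook entry,
    forcing the policy to condition on a discrete, factorized goal."""
    distances = np.linalg.norm(codebook - goal_embedding, axis=1)
    return codebook[np.argmin(distances)]

goal = rng.normal(size=8)
print(discretize_goal(goal))
```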
- It Takes Four to Tango: Multiagent Selfplay for Automatic Curriculum Generation [107.10235120286352]
Training general-purpose reinforcement learning agents efficiently requires automatic generation of a goal curriculum.
We propose Curriculum Self Play (CuSP), an automated goal generation framework.
We demonstrate that our method succeeds at generating effective curricula of goals for a range of control tasks.
arXiv Detail & Related papers (2022-02-22T01:23:23Z)
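CuSP's internals aside, the asymmetric self-play loop that such curriculum generators build on can be sketched with a toy 1-D task; the setter and solver mechanics below are illustrative stand-ins only:

```python
import random

# Toy asymmetric self-play on a 1-D line: a setter proposes targets, a
# solver attempts them, and the target difficulty tracks the frontier of
# the solver's ability.

def solver_attempt(goal: int, skill: int, max_steps: int = 50) -> bool:
    """Random-walk solver whose forward drift kicks in once its skill
    exceeds roughly half the goal distance."""
    pos = 0
    for _ in range(max_steps):
        drift = 1 if skill > abs(goal) // 2 else 0
        pos += random.choice([-1, 1]) + drift
        if pos >= goal:
            return True
    return False

goal, skill = 2, 1
for _ in range(200):
    if solver_attempt(goal, skill):
        skill += 1                 # solver improves on success ...
        goal += 1                  # ... so the setter escalates
    else:
        goal = max(2, goal - 1)    # back off to keep goals achievable
print(f"final goal difficulty: {goal}, solver skill: {skill}")
```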
- Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning [28.808933152885874]
Unsupervised reinforcement learning aims to acquire skills without prior goal representations.
The intuitive approach of training in another, interaction-rich environment and then transferring disrupts the learned skills in the target environment because the dynamics differ.
We propose an unsupervised domain adaptation method to identify and acquire skills across dynamics.
arXiv Detail & Related papers (2021-10-25T14:40:48Z)
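A hedged sketch of what a "dynamics-aware" reward can look like, using a log-ratio correction of transition likelihoods (a common classifier-based construction; whether the paper uses this exact form is an assumption):

```python
import math

# Sketch of a dynamics-aware reward correction: penalize transitions that
# are likely under the source dynamics but unlikely under the target
# dynamics. The probabilities here are placeholders for learned estimates.

def dynamics_aware_reward(r: float,
                          p_target: float,   # P(s' | s, a), target dynamics
                          p_source: float,   # P(s' | s, a), source dynamics
                          eps: float = 1e-8) -> float:
    return r + math.log(p_target + eps) - math.log(p_source + eps)

print(dynamics_aware_reward(1.0, p_target=0.2, p_source=0.8))  # penalized
```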
- Intrinsically Motivated Goal-Conditioned Reinforcement Learning: a Short Survey [21.311739361361717]
Developmental approaches argue that learning agents must generate, select and learn to solve their own problems.
Recent years have seen a convergence of developmental approaches and deep reinforcement learning (RL) methods, forming the new domain of developmental machine learning.
This paper proposes a typology of these methods at the intersection of deep RL and developmental approaches, surveys recent approaches and discusses future avenues.
arXiv Detail & Related papers (2020-12-17T18:51:40Z)
- Efficient Robotic Object Search via HIEM: Hierarchical Policy Learning with Intrinsic-Extrinsic Modeling [33.89793938441333]
We present a novel policy learning paradigm for the object search task, based on hierarchical and interpretable modeling with an intrinsic-extrinsic reward setting.
Experiments on the House3D environment show that a robot trained with our model performs the object search task more efficiently and more interpretably.
arXiv Detail & Related papers (2020-10-16T19:21:38Z)
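A minimal sketch of the intrinsic-extrinsic reward setting such hierarchies operate in; the count-based novelty bonus and the weight lam are illustrative assumptions, not HIEM's actual terms:

```python
# The task (extrinsic) signal is mixed with an intrinsic bonus that decays
# as states are revisited, so novelty drives the low-level policy while
# the task reward drives the high-level one.

visit_counts = {}  # state -> number of visits

def combined_reward(state, r_extrinsic, lam=0.1):
    """Extrinsic task reward plus a decaying count-based novelty bonus."""
    visit_counts[state] = visit_counts.get(state, 0) + 1
    r_intrinsic = visit_counts[state] ** -0.5
    return r_extrinsic + lam * r_intrinsic

print(combined_reward((3, 4), r_extrinsic=0.0))  # first visit: bonus of 0.1
```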
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
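The value-disagreement idea admits a compact sketch: sample goals in proportion to the spread of an ensemble of value estimates, a proxy for epistemic uncertainty that peaks on goals that are neither mastered nor hopeless. The ensemble numbers below are toy values:

```python
import numpy as np

goals = ["near", "mid", "far"]
ensemble_values = np.array([
    [0.9, 0.5, 0.1],   # value head 1's estimate per goal
    [0.9, 0.2, 0.1],   # value head 2
    [0.9, 0.8, 0.1],   # value head 3
])

disagreement = ensemble_values.std(axis=0)   # epistemic-uncertainty proxy
probs = disagreement / disagreement.sum()    # sample hard-but-learnable goals
print(dict(zip(goals, probs.round(2))))      # "mid" dominates
```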
- MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration [52.48362697163477]
We model an exploration policy learning problem for meta-RL that is separated from exploitation policy learning.
We develop a new off-policy meta-RL framework that efficiently learns separate context-aware exploration and exploitation policies.
Experimental evaluation shows that our meta-RL method significantly outperforms state-of-the-art baselines on sparse-reward tasks.
arXiv Detail & Related papers (2020-06-15T06:56:18Z)
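The separation of exploration from exploitation can be sketched structurally; the information-gain proxy below is an illustrative stand-in for the paper's empowerment-driven objective, not its actual formulation:

```python
# Decoupled training signals: the exploration policy is rewarded for
# reducing uncertainty about the current task (a toy information-gain
# proxy), while the exploitation policy is trained on task reward alone,
# conditioned on the inferred context.

def exploration_reward(entropy_before: float, entropy_after: float) -> float:
    """Reward exploration for shrinking task-posterior entropy."""
    return entropy_before - entropy_after

def exploitation_reward(task_reward: float) -> float:
    """Exploitation is trained on the environment reward alone."""
    return task_reward

print(exploration_reward(2.3, 1.1))  # informative episode: positive signal
```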
- Generating Automatic Curricula via Self-Supervised Active Domain Randomization [11.389072560141388]
We extend the self-play framework to jointly learn a goal and environment curriculum.
Our method generates a coupled goal-task curriculum, where agents learn through progressively more difficult tasks and environment variations.
Our results show that co-evolving the environment difficulty together with the difficulty of goals set in each environment provides practical benefits on the goal-directed tasks tested.
arXiv Detail & Related papers (2020-02-18T22:45:29Z)
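A toy sketch of a coupled goal-and-environment curriculum, where both difficulty knobs co-evolve with success; the success model is a placeholder, not the paper's method:

```python
import random

# On success, both the goal difficulty and the environment randomization
# range widen; on failure, both contract, keeping the curriculum near the
# frontier of what the agent can handle.

goal_difficulty, rand_range = 1.0, 0.1

def attempt(difficulty: float, randomization: float) -> bool:
    # Toy proxy: harder goals in more varied environments fail more often.
    return random.random() > 0.3 * difficulty * (1.0 + randomization)

for _ in range(100):
    if attempt(goal_difficulty, rand_range):
        goal_difficulty *= 1.05    # harder goals ...
        rand_range *= 1.05         # ... in more varied environments
    else:
        goal_difficulty *= 0.95
        rand_range *= 0.95
print(round(goal_difficulty, 2), round(rand_range, 3))
```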
- Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning [102.05692309417047]
In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.
In the natural world, intelligent organisms learn from internal drives, bypassing the need for external signals.
We propose to formulate an intrinsic objective as the mutual information between the goal states and the controllable states.
arXiv Detail & Related papers (2020-02-05T19:21:20Z)
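For a tiny discrete toy distribution the proposed mutual-information objective can be computed exactly, which makes the quantity concrete (the paper optimizes an estimator of it rather than a closed form; the joint distribution below is invented for illustration):

```python
import numpy as np

# Exact mutual information between "goal states" and "controllable
# states" for a 2x2 toy joint distribution.

joint = np.array([[0.4, 0.1],    # P(goal_state, controllable_state)
                  [0.1, 0.4]])

p_goal = joint.sum(axis=1, keepdims=True)
p_ctrl = joint.sum(axis=0, keepdims=True)
mi = float((joint * np.log(joint / (p_goal * p_ctrl))).sum())
print(f"I(goal; controllable) = {mi:.3f} nats")  # > 0: the states co-vary
```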