Generating Automatic Curricula via Self-Supervised Active Domain
Randomization
- URL: http://arxiv.org/abs/2002.07911v2
- Date: Mon, 26 Oct 2020 18:24:29 GMT
- Title: Generating Automatic Curricula via Self-Supervised Active Domain
Randomization
- Authors: Sharath Chandra Raparthy, Bhairav Mehta, Florian Golemo, Liam Paull
- Abstract summary: We extend the self-play framework to jointly learn a goal and environment curriculum.
Our method generates a coupled goal-task curriculum, where agents learn through progressively more difficult tasks and environment variations.
Our results show that a curriculum of co-evolving the environment difficulty together with the difficulty of goals set in each environment provides practical benefits in the goal-directed tasks tested.
- Score: 11.389072560141388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Goal-directed Reinforcement Learning (RL) traditionally considers an agent
interacting with an environment, prescribing a real-valued reward to an agent
proportional to the completion of some goal. Goal-directed RL has seen large
gains in sample efficiency, due to the ease of reusing or generating new
experience by proposing goals. One approach, self-play, allows an agent to
"play" against itself by alternatively setting and accomplishing goals,
creating a learned curriculum through which an agent can learn to accomplish
progressively more difficult goals. However, self-play has been limited to goal
curriculum learning or learning progressively harder goals within a single
environment. Recent work on robotic agents has shown that varying the
environment during training, for example with domain randomization, leads to
more robust transfer. As a result, we extend the self-play framework to jointly
learn a goal and environment curriculum, leading to an approach that learns the
most fruitful domain randomization strategy with self-play. Our method,
Self-Supervised Active Domain Randomization (SS-ADR), generates a coupled
goal-task curriculum, where agents learn through progressively more difficult
tasks and environment variations. By encouraging the agent to try tasks that
are just outside of its current capabilities, SS-ADR builds a domain
randomization curriculum that enables state-of-the-art results on
various sim2real transfer tasks. Our results show that a curriculum of
co-evolving the environment difficulty together with the difficulty of goals
set in each environment provides practical benefits in the goal-directed tasks
tested.
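The coupled curriculum described above can be illustrated with a minimal, self-contained sketch. This is not the authors' SS-ADR implementation (which learns the randomization policy with self-play and a setter/solver reward); it is a toy loop under assumed simplifications: a hypothetical `Line1DEnv` whose single randomized parameter is step-dropping `noise`, and a simple schedule that jointly raises goal difficulty and environment variation after each success and backs off after each failure, keeping tasks just outside the agent's current capabilities.

```python
import random


class Line1DEnv:
    """Toy 1D environment (illustrative only): the agent starts at 0 and
    must reach position `goal` within `max_steps`, moving one unit per
    step. `noise` is the randomized environment parameter: the chance
    that a step is dropped."""

    def __init__(self, goal, noise, max_steps=50, seed=0):
        self.goal, self.noise, self.max_steps = goal, noise, max_steps
        self.rng = random.Random(seed)

    def rollout(self):
        pos, steps = 0, 0
        while pos != self.goal and steps < self.max_steps:
            if self.rng.random() >= self.noise:  # step may be dropped
                pos += 1 if self.goal > pos else -1
            steps += 1
        return pos == self.goal, steps


def self_play_curriculum(episodes=30):
    """Jointly grow goal difficulty and environment variation:
    on success, propose a harder goal and more randomization;
    on failure, ease both off so tasks stay near the frontier
    of the agent's ability."""
    goal, noise = 1, 0.0
    history = []
    for ep in range(episodes):
        solved, _ = Line1DEnv(goal, noise, seed=ep).rollout()
        history.append((goal, round(noise, 2), solved))
        if solved:  # push just past current capability
            goal += 1
            noise = min(noise + 0.02, 0.5)
        else:  # back off when the task is too hard
            goal = max(goal - 1, 1)
            noise = max(noise - 0.02, 0.0)
    return history


if __name__ == "__main__":
    print(self_play_curriculum()[-1])
```

The fixed +/-0.02 schedule stands in for the learned setter policy: the key property it preserves is that goal difficulty and environment difficulty co-evolve rather than being scheduled independently.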
Related papers
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in
Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z) - Autonomous Open-Ended Learning of Tasks with Non-Stationary
Interdependencies [64.0476282000118]
Intrinsic motivations have proven to generate a task-agnostic signal to properly allocate the training time amongst goals.
While the majority of works in the field of intrinsically motivated open-ended learning focus on scenarios where goals are independent from each other, only few of them studied the autonomous acquisition of interdependent tasks.
In particular, we first deepen the analysis of a previous system, showing the importance of incorporating information about the relationships between tasks at a higher level of the architecture.
Then we introduce H-GRAIL, a new system that extends the previous one by adding a new learning layer to store the autonomously acquired sequences.
arXiv Detail & Related papers (2022-05-16T10:43:01Z) - It Takes Four to Tango: Multiagent Selfplay for Automatic Curriculum
Generation [107.10235120286352]
Training general-purpose reinforcement learning agents efficiently requires automatic generation of a goal curriculum.
We propose Curriculum Self Play (CuSP), an automated goal generation framework.
We demonstrate that our method succeeds at generating an effective curriculum of goals for a range of control tasks.
arXiv Detail & Related papers (2022-02-22T01:23:23Z) - Automatic Goal Generation using Dynamical Distance Learning [5.797847756967884]
Reinforcement Learning (RL) agents can learn to solve complex sequential decision making tasks by interacting with the environment.
In the field of multi-goal RL, where agents are required to reach multiple goals to solve complex tasks, improving sample efficiency can be especially challenging.
We propose a method for automatic goal generation using a dynamical distance function (DDF) in a self-supervised fashion.
arXiv Detail & Related papers (2021-11-07T16:23:56Z) - Unsupervised Domain Adaptation with Dynamics-Aware Rewards in
Reinforcement Learning [28.808933152885874]
Unconditioned reinforcement learning aims to acquire skills without prior goal representations.
The intuitive approach of training in another interaction-rich environment disrupts the trained skills in the target environment.
We propose an unsupervised domain adaptation method to identify and acquire skills across dynamics.
arXiv Detail & Related papers (2021-10-25T14:40:48Z) - Follow the Object: Curriculum Learning for Manipulation Tasks with
Imagined Goals [8.98526174345299]
This paper introduces a notion of imaginary object goals.
For a given manipulation task, the object of interest is first trained to reach a desired target position on its own.
The object policy is then leveraged to build a predictive model of plausible object trajectories.
The proposed algorithm, Follow the Object, has been evaluated on 7 MuJoCo environments.
arXiv Detail & Related papers (2020-08-05T12:19:14Z) - Learning with AMIGo: Adversarially Motivated Intrinsic Goals [63.680207855344875]
AMIGo is a goal-generating teacher that proposes Adversarially Motivated Intrinsic Goals.
We show that our method generates a natural curriculum of self-proposed goals which ultimately allows the agent to solve challenging procedurally-generated tasks.
arXiv Detail & Related papers (2020-06-22T10:22:08Z) - Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z) - Mutual Information-based State-Control for Intrinsically Motivated
Reinforcement Learning [102.05692309417047]
In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.
In the natural world, intelligent organisms learn from internal drives, bypassing the need for external signals.
We propose to formulate an intrinsic objective as the mutual information between the goal states and the controllable states.
arXiv Detail & Related papers (2020-02-05T19:21:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.