Trying AGAIN instead of Trying Longer: Prior Learning for Automatic Curriculum Learning
- URL: http://arxiv.org/abs/2004.03168v1
- Date: Tue, 7 Apr 2020 07:30:27 GMT
- Title: Trying AGAIN instead of Trying Longer: Prior Learning for Automatic Curriculum Learning
- Authors: Rémy Portelas and Katja Hofmann and Pierre-Yves Oudeyer
- Abstract summary: A major challenge in the Deep RL (DRL) community is to train agents able to generalize over unseen situations.
We propose a two-stage ACL approach where 1) a teacher algorithm first learns to train a DRL agent with a high-exploration curriculum, and then 2) distills learned priors from the first run to generate an "expert curriculum" to re-train the same agent from scratch.
Besides demonstrating an average 50% improvement over the current state of the art, this work aims to give a first example of a new research direction oriented towards refining ACL techniques over multiple learners.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A major challenge in the Deep RL (DRL) community is to train agents able to
generalize over unseen situations, which is often approached by training them
on a diversity of tasks (or environments). A powerful method to foster
diversity is to procedurally generate tasks by sampling their parameters from a
multi-dimensional distribution, which makes it possible, in particular, to propose a
different task for each training episode. In practice, to get the high diversity of
training tasks necessary for generalization, one has to use complex procedural
generation systems. With such generators, it is hard to get prior knowledge about
which tasks are actually learnable at all (many generated tasks may be unlearnable),
what their relative difficulty is, and what the most efficient task-distribution
ordering for training would be. A typical solution in such
cases is to rely on some form of Automated Curriculum Learning (ACL) to adapt
the sampling distribution. One limitation of current approaches is that they must
explore the task space over time to detect progress niches, which wastes training
time. Additionally, we hypothesize that the induced noise in the training data may
impair the performance of brittle DRL learners. We address this problem by proposing
a two-stage ACL approach where 1) a teacher algorithm
first learns to train a DRL agent with a high-exploration curriculum, and then
2) distills learned priors from the first run to generate an "expert
curriculum" to re-train the same agent from scratch. Besides demonstrating an average
50% improvement over the current state of the art, the objective of this work is to
give a first example of a new research direction oriented towards refining ACL
techniques over multiple learners, which we call Classroom Teaching.
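
To make the two-stage recipe concrete, below is a minimal, self-contained Python sketch of the explore-then-distill structure, not the authors' implementation: a stage-1 teacher tracks absolute learning progress over discretized task bins with an epsilon-greedy bandit, and the curriculum it actually followed is then distilled into per-stage sampling distributions used to retrain a fresh learner from scratch. The `ToyAgent` competence model, the bin discretization, and all hyperparameters are illustrative assumptions.

```python
# Hedged sketch of the two-stage ACL idea from the abstract (toy setting).
# Stage 1: high-exploration teacher driven by learning progress.
# Stage 2: distilled "expert curriculum" replayed on a fresh learner.
import random

NUM_BINS = 10          # discretized task-difficulty bins (assumption)
EPISODES = 500         # training episodes per stage (assumption)
EXPLORATION_EPS = 0.3  # stage-1 exploration rate over bins (assumption)


class ToyAgent:
    """Stand-in for a DRL learner: competence on a bin grows only when the
    previous bin is partly mastered, mimicking tasks of increasing difficulty."""

    def __init__(self):
        self.competence = [0.0] * NUM_BINS

    def train_on(self, b):
        prereq = 1.0 if b == 0 else self.competence[b - 1]
        gain = 0.05 * prereq * (1.0 - self.competence[b])
        self.competence[b] = min(1.0, self.competence[b] + gain)
        return self.competence[b]


def stage1_teacher(agent):
    """Epsilon-greedy bandit over bins, rewarding absolute learning progress."""
    last_return = [0.0] * NUM_BINS
    progress = [0.0] * NUM_BINS
    history = []  # the curriculum actually followed (one bin per episode)
    for _ in range(EPISODES):
        if random.random() < EXPLORATION_EPS:
            b = random.randrange(NUM_BINS)
        else:
            b = max(range(NUM_BINS), key=lambda i: progress[i])
        r = agent.train_on(b)
        progress[b] = abs(r - last_return[b])
        last_return[b] = r
        history.append(b)
    return history


def distill_curriculum(history):
    """Distill the stage-1 run into per-stage bin-sampling distributions."""
    chunk = EPISODES // 5
    stages = []
    for start in range(0, EPISODES, chunk):
        window = history[start:start + chunk]
        stages.append([window.count(b) / len(window) for b in range(NUM_BINS)])
    return stages


def stage2_retrain(stages):
    """Retrain a fresh agent from scratch using the distilled expert curriculum."""
    agent = ToyAgent()
    per_stage = EPISODES // len(stages)
    for freq in stages:
        for _ in range(per_stage):
            b = random.choices(range(NUM_BINS), weights=freq)[0]
            agent.train_on(b)
    return agent


if __name__ == "__main__":
    random.seed(0)
    explorer = ToyAgent()
    expert_curriculum = distill_curriculum(stage1_teacher(explorer))
    retrained = stage2_retrain(expert_curriculum)
    print("stage-1 mean competence:", sum(explorer.competence) / NUM_BINS)
    print("stage-2 mean competence:", sum(retrained.competence) / NUM_BINS)
```

In the paper's setting the learner is a DRL agent trained on procedurally generated tasks rather than this toy competence model; the sketch only illustrates how a noisy exploratory first run can be converted into a cleaner curriculum for a second run.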
Related papers
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Reinforcement Learning with Success Induced Task Prioritization [68.8204255655161]
We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning.
The algorithm selects the task order that yields the fastest learning for agents.
We demonstrate that SITP matches or surpasses the results of other curriculum design methods.
arXiv Detail & Related papers (2022-12-30T12:32:43Z)
- Understanding the Complexity Gains of Single-Task RL with a Curriculum [83.46923851724408]
Reinforcement learning (RL) problems can be challenging without well-shaped rewards.
We provide a theoretical framework that reformulates a single-task RL problem as a multi-task RL problem defined by a curriculum.
We show that sequentially solving each task in the multi-task RL problem is more computationally efficient than solving the original single-task problem.
arXiv Detail & Related papers (2022-12-24T19:46:47Z)
- CLUTR: Curriculum Learning via Unsupervised Task Representation Learning [130.79246770546413]
CLUTR is a novel curriculum learning algorithm that decouples task representation and curriculum learning into a two-stage optimization.
We show CLUTR outperforms PAIRED, a principled and popular Unsupervised Environment Design (UED) method, in terms of generalization and sample efficiency in the challenging CarRacing and navigation environments.
arXiv Detail & Related papers (2022-10-19T01:45:29Z)
- Abstract Demonstrations and Adaptive Exploration for Efficient and Stable Multi-step Sparse Reward Reinforcement Learning [44.968170318777105]
This paper proposes a DRL exploration technique, termed A2, which integrates two components inspired by human experiences: Abstract demonstrations and Adaptive exploration.
A2 starts by decomposing a complex task into subtasks, and then provides the correct order of subtasks to learn.
We demonstrate that A2 can aid popular DRL algorithms to learn more efficiently and stably in these environments.
arXiv Detail & Related papers (2022-07-19T12:56:41Z)
- Generalizing to New Tasks via One-Shot Compositional Subgoals [23.15624959305799]
The ability to generalize to previously unseen tasks with little to no supervision is a key challenge in modern machine learning research.
We introduce CASE, which attempts to address these issues by training an Imitation Learning agent using adaptive "near future" subgoals.
Our experiments show that the proposed approach consistently outperforms the previous state-of-the-art compositional Imitation Learning approach by 30%.
arXiv Detail & Related papers (2022-05-16T14:30:11Z)
- Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems [42.973910399533054]
We introduce a curriculum learning algorithm, Variational Automatic Curriculum Learning (VACL), for solving cooperative multi-agent reinforcement learning problems.
Our VACL algorithm implements this variational paradigm with two practical components, task expansion and entity progression.
Experiment results show that VACL solves a collection of sparse-reward problems with a large number of agents.
arXiv Detail & Related papers (2021-11-08T16:35:08Z)
- TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL [23.719833581321033]
Training autonomous agents able to generalize to multiple tasks is a key target of Deep Reinforcement Learning (DRL) research.
In parallel to improving DRL algorithms, Automatic Curriculum Learning (ACL) studies how teacher algorithms can train DRL agents more efficiently by adapting task selection to their evolving abilities.
While multiple standard benchmarks exist to compare DRL agents, no such benchmark currently exists for ACL algorithms.
arXiv Detail & Related papers (2021-03-17T17:59:22Z)
- Meta Automatic Curriculum Learning [35.13646854355393]
We introduce the concept of Meta-ACL, and formalize it in the context of black-box RL learners.
We present AGAIN, a first instantiation of Meta-ACL, and showcase its benefits for curriculum generation over classical ACL.
arXiv Detail & Related papers (2020-11-16T14:56:42Z)
- Generalized Hindsight for Reinforcement Learning [154.0545226284078]
We argue that low-reward data collected while trying to solve one task provides little to no signal for solving that particular task.
We present Generalized Hindsight: an approximate inverse reinforcement learning technique for relabeling behaviors with the right tasks.
arXiv Detail & Related papers (2020-02-26T18:57:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.