Teacher-student curriculum learning for reinforcement learning
- URL: http://arxiv.org/abs/2210.17368v1
- Date: Mon, 31 Oct 2022 14:45:39 GMT
- Title: Teacher-student curriculum learning for reinforcement learning
- Authors: Yanick Schraner
- Abstract summary: Reinforcement learning (RL) is a popular paradigm for sequential decision making problems.
The sample inefficiency of deep reinforcement learning methods is a significant obstacle when applying RL to real-world problems.
We propose a teacher-student curriculum learning setting where we simultaneously train a teacher that selects tasks for the student while the student learns how to solve the selected task.
- Score: 1.7259824817932292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) is a popular paradigm for sequential decision
making problems. The past decade's advances in RL have led to breakthroughs in
many challenging domains such as video games, board games, robotics, and chip
design. The sample inefficiency of deep reinforcement learning methods is a
significant obstacle when applying RL to real-world problems. Transfer learning
has been applied to reinforcement learning such that the knowledge gained in
one task can be applied when training in a new task. Curriculum learning is
concerned with sequencing tasks or data samples such that knowledge can be
transferred between those tasks to learn a target task that would otherwise be
too difficult to solve. Designing a curriculum that improves sample efficiency
is a complex problem. In this thesis, we propose a teacher-student curriculum
learning setting where we simultaneously train a teacher that selects tasks for
the student while the student learns how to solve the selected task. Our method
is independent of human domain knowledge and manual curriculum design. We
evaluated our methods on two reinforcement learning benchmarks: grid world and
the challenging Google Football environment. With our method, we can improve
the sample efficiency and generality of the student compared to tabula-rasa
reinforcement learning.
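The teacher-student loop described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the teacher is a simple epsilon-greedy bandit rewarded by the student's recent learning progress, the student is a single scalar "skill" rather than an RL agent, and the names `BanditTeacher`, `ToyStudent`, and `run_curriculum` are hypothetical.

```python
import random


class BanditTeacher:
    """Hypothetical teacher: epsilon-greedy over tasks, rewarded by learning progress."""

    def __init__(self, num_tasks, epsilon=0.1, step_size=0.3, seed=0):
        self.rng = random.Random(seed)
        self.epsilon = epsilon
        self.step_size = step_size
        self.progress = [0.0] * num_tasks    # running estimate of |score change| per task
        self.last_score = [0.0] * num_tasks  # last score the student reported per task

    def select_task(self):
        # Explore a random task occasionally; otherwise pick the task
        # with the highest estimated learning progress (ties broken randomly).
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.progress))
        best = max(self.progress)
        candidates = [i for i, p in enumerate(self.progress) if p == best]
        return self.rng.choice(candidates)

    def update(self, task, score):
        # The teacher's reward is the absolute change in the student's score.
        gain = abs(score - self.last_score[task])
        self.last_score[task] = score
        self.progress[task] += self.step_size * (gain - self.progress[task])


class ToyStudent:
    """Stand-in for an RL agent (assumption): one scalar skill level."""

    def __init__(self, difficulties):
        self.difficulties = difficulties
        self.skill = 0.0

    def train_and_evaluate(self, task):
        difficulty = self.difficulties[task]
        # Practice only helps when the task is within reach and not yet mastered.
        if difficulty - 1.0 <= self.skill < difficulty:
            self.skill = min(difficulty, self.skill + 0.2)
        return min(1.0, self.skill / difficulty)


def run_curriculum(difficulties, num_steps, seed=0):
    """Train teacher and student together; return final skill and per-task counts."""
    teacher = BanditTeacher(len(difficulties), seed=seed)
    student = ToyStudent(difficulties)
    counts = [0] * len(difficulties)
    for _ in range(num_steps):
        task = teacher.select_task()
        score = student.train_and_evaluate(task)
        teacher.update(task, score)
        counts[task] += 1
    return student.skill, counts


if __name__ == "__main__":
    skill, counts = run_curriculum([1.0, 2.0, 3.0], num_steps=2000)
    print("final skill:", skill, "task counts:", counts)
```

In this toy setting the hardest task is out of reach at the start, so the student only improves if easier tasks are practiced first. The teacher receives no hand-designed ordering; it only observes score changes, which loosely mirrors the abstract's claim of independence from manual curriculum design.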
Related papers
- SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation [58.14969377419633]
We propose SPIRE, a system that first decomposes tasks into smaller learning subproblems and then combines imitation and reinforcement learning to maximize their strengths.
We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance.
arXiv Detail & Related papers (2024-10-23T17:42:07Z)
- Efficient Mitigation of Bus Bunching through Setter-Based Curriculum Learning [0.47518865271427785]
We propose a novel approach to curriculum learning that uses a Setter Model to automatically generate an action space, adversary strength, and bunching strength.
Our method for automated curriculum learning involves a curriculum that is dynamically chosen and learned by an adversary network.
arXiv Detail & Related papers (2024-05-23T18:26:55Z)
- YODA: Teacher-Student Progressive Learning for Language Models [82.0172215948963]
This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data from YODA improves SFT with significant performance gain.
arXiv Detail & Related papers (2024-01-28T14:32:15Z)
- Multitask Learning with No Regret: from Improved Confidence Bounds to Active Learning [79.07658065326592]
Quantifying uncertainty in the estimated tasks is of pivotal importance for many downstream applications, such as online or active learning.
We provide novel multitask confidence intervals in the challenging setting when neither the similarity between tasks nor the tasks' features are available to the learner.
We propose a novel online learning algorithm that achieves such improved regret without knowing this parameter in advance.
arXiv Detail & Related papers (2023-08-03T13:08:09Z)
- Transferring Knowledge for Reinforcement Learning in Contact-Rich Manipulation [10.219833196479142]
We address the challenge of transferring knowledge within a family of similar tasks by leveraging multiple skill priors.
Our method learns a latent action space representing the skill embedding from demonstrated trajectories for each prior task.
We have evaluated our method on a set of peg-in-hole insertion tasks and demonstrate better generalization to new tasks that have never been encountered during training.
arXiv Detail & Related papers (2022-09-19T10:31:13Z)
- Transferability in Deep Learning: A Survey [80.67296873915176]
The ability to acquire and reuse knowledge is known as transferability in deep learning.
We present this survey to connect different isolated areas in deep learning with their relation to transferability.
We implement a benchmark and an open-source library, enabling a fair evaluation of deep learning methods in terms of transferability.
arXiv Detail & Related papers (2022-01-15T15:03:17Z)
- Efficiently Identifying Task Groupings for Multi-Task Learning [55.80489920205404]
Multi-task learning can leverage information learned by one task to benefit the training of other tasks.
We suggest an approach to select which tasks should train together in multi-task learning models.
Our method determines task groupings in a single training run by co-training all tasks together and quantifying the extent to which one task's gradient would affect another task's loss.
arXiv Detail & Related papers (2021-09-10T02:01:43Z)
- Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft [18.845438529816004]
We explore curriculum learning in a complex, visual domain with many hard exploration challenges: Minecraft.
We find that learning progress is a reliable measure of learnability for automatically constructing an effective curriculum.
arXiv Detail & Related papers (2021-06-28T17:50:40Z)
- Curriculum Learning with Hindsight Experience Replay for Sequential Object Manipulation Tasks [1.370633147306388]
We present an algorithm that combines curriculum learning with Hindsight Experience Replay (HER) to learn sequential object manipulation tasks.
The algorithm exploits the recurrent structure inherent in many object manipulation tasks and implements the entire learning process in the original simulation without adjusting it to each source task.
arXiv Detail & Related papers (2020-08-21T08:59:28Z)
- Mutual Information Based Knowledge Transfer Under State-Action Dimension Mismatch [14.334987432342707]
We propose a new framework for transfer learning where the teacher and the student can have arbitrarily different state- and action-spaces.
To handle this mismatch, we produce embeddings which can systematically extract knowledge from the teacher policy and value networks.
We demonstrate successful transfer learning in situations when the teacher and student have different state- and action-spaces.
arXiv Detail & Related papers (2020-06-12T09:51:17Z)
- Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey [53.73359052511171]
Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback.
We present a framework for curriculum learning (CL) in RL, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals.
arXiv Detail & Related papers (2020-03-10T20:41:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.