REPAINT: Knowledge Transfer in Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2011.11827v3
- Date: Wed, 26 May 2021 05:25:23 GMT
- Title: REPAINT: Knowledge Transfer in Deep Reinforcement Learning
- Authors: Yunzhe Tao, Sahika Genc, Jonathan Chung, Tao Sun, Sunil Mallya
- Abstract summary: This work proposes the REPresentation And INstance Transfer (REPAINT) algorithm for knowledge transfer in deep reinforcement learning.
REPAINT not only transfers the representation of a pre-trained teacher policy in the on-policy learning, but also uses an advantage-based experience selection approach to transfer useful samples collected following the teacher policy in the off-policy learning.
- Score: 13.36223726517518
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accelerating learning processes for complex tasks by leveraging previously
learned tasks has been one of the most challenging problems in reinforcement
learning, especially when the similarity between source and target tasks is
low. This work proposes REPresentation And INstance Transfer (REPAINT)
algorithm for knowledge transfer in deep reinforcement learning. REPAINT not
only transfers the representation of a pre-trained teacher policy in the
on-policy learning, but also uses an advantage-based experience selection
approach to transfer useful samples collected following the teacher policy in
the off-policy learning. Our experimental results on several benchmark tasks
show that REPAINT significantly reduces the total training time in generic
cases of task similarity. In particular, when the source tasks are dissimilar
to, or sub-tasks of, the target tasks, REPAINT outperforms other baselines in
both training-time reduction and asymptotic performance of return scores.
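The advantage-based experience selection described above can be sketched as a simple filter: teacher-collected transitions are kept for off-policy updates only if their estimated advantage under the student's critic is positive. This is a minimal illustrative sketch, not the paper's implementation; the function names, the dictionary-based transition format, and the toy advantage estimate are all assumptions.

```python
def select_teacher_samples(transitions, advantage_fn, zeta=0.0):
    """Keep teacher-collected transitions whose estimated advantage
    (under the student's own critic) exceeds the threshold zeta.
    Low-advantage samples are discarded to limit negative transfer."""
    return [t for t in transitions
            if advantage_fn(t["state"], t["action"]) > zeta]

# Toy advantage estimate (hypothetical): a scalar score minus a baseline.
def toy_advantage(state, action):
    baseline = 0.5
    return state * action - baseline

batch = [
    {"state": 1.0, "action": 1.0},   # advantage  0.5 -> kept
    {"state": 0.2, "action": 1.0},   # advantage -0.3 -> dropped
    {"state": 2.0, "action": 0.5},   # advantage  0.5 -> kept
]
kept = select_teacher_samples(batch, toy_advantage, zeta=0.0)
print(len(kept))  # 2
```

In practice the advantage would come from the student's value network (e.g., a GAE-style estimate), and the threshold zeta controls how aggressively teacher experience is filtered.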
Related papers
- Mitigating Interference in the Knowledge Continuum through Attention-Guided Incremental Learning [17.236861687708096]
Attention-Guided Incremental Learning (AGILE) is a rehearsal-based CL approach that incorporates compact task attention to effectively reduce interference between tasks.
AGILE significantly improves generalization performance by mitigating task interference, and outperforms rehearsal-based approaches in several CL scenarios.
arXiv Detail & Related papers (2024-05-22T20:29:15Z) - Sharing Knowledge in Multi-Task Deep Reinforcement Learning [57.38874587065694]
We study the benefit of sharing representations among tasks to enable the effective use of deep neural networks in Multi-Task Reinforcement Learning.
We prove this by providing theoretical guarantees that highlight the conditions under which it is convenient to share representations among tasks.
arXiv Detail & Related papers (2024-01-17T19:31:21Z) - Replay-enhanced Continual Reinforcement Learning [37.34722105058351]
We introduce RECALL, a replay-enhanced method that greatly improves the plasticity of existing replay-based methods on new tasks.
Experiments on the Continual World benchmark show that RECALL performs significantly better than purely perfect memory replay.
arXiv Detail & Related papers (2023-11-20T06:21:52Z) - Online Continual Learning via the Knowledge Invariant and Spread-out Properties [4.109784267309124]
A key challenge in continual learning is catastrophic forgetting.
We propose a new method, named Online Continual Learning via the Knowledge Invariant and Spread-out Properties (OCLKISP)
We empirically evaluate our proposed method on four popular benchmarks for continual learning: Split CIFAR 100, Split SVHN, Split CUB200 and Split Tiny-Image-Net.
arXiv Detail & Related papers (2023-02-02T04:03:38Z) - ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches and automatically searches for the varying task weights.
arXiv Detail & Related papers (2023-01-30T02:27:02Z) - Provable Benefits of Representational Transfer in Reinforcement Learning [59.712501044999875]
We study the problem of representational transfer in RL, where an agent first pretrains in a number of source tasks to discover a shared representation.
We show that given generative access to source tasks, we can discover a representation, using which subsequent linear RL techniques quickly converge to a near-optimal policy.
arXiv Detail & Related papers (2022-05-29T04:31:29Z) - Relational Experience Replay: Continual Learning by Adaptively Tuning Task-wise Relationship [54.73817402934303]
We propose Relational Experience Replay (RER), a bi-level learning framework that adaptively tunes task-wise relationships to achieve a better stability-plasticity tradeoff.
RER can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z) - Return-Based Contrastive Representation Learning for Reinforcement Learning [126.7440353288838]
We propose a novel auxiliary task that forces the learnt representations to discriminate state-action pairs with different returns.
Our algorithm outperforms strong baselines on complex tasks in Atari games and DeepMind Control suite.
arXiv Detail & Related papers (2021-02-22T13:04:18Z) - Learning Invariant Representation for Continual Learning [5.979373021392084]
A key challenge in continual learning is catastrophic forgetting of previously learned tasks when the agent faces a new one.
We propose a new pseudo-rehearsal-based method, named learning Invariant Representation for Continual Learning (IRCL)
Disentangling the shared invariant representation helps to learn continually a sequence of tasks, while being more robust to forgetting and having better knowledge transfer.
arXiv Detail & Related papers (2021-01-15T15:12:51Z) - Unsupervised Transfer Learning for Spatiotemporal Predictive Networks [90.67309545798224]
We study how to transfer knowledge from a zoo of unsupervisedly learned models towards another network.
Our motivation is that models are expected to understand complex dynamics from different sources.
Our approach yields significant improvements on three benchmarks for spatiotemporal prediction, and benefits the target task even from less relevant pretrained models.
arXiv Detail & Related papers (2020-09-24T15:40:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.