Introspective Action Advising for Interpretable Transfer Learning
- URL: http://arxiv.org/abs/2306.12314v1
- Date: Wed, 21 Jun 2023 14:53:33 GMT
- Title: Introspective Action Advising for Interpretable Transfer Learning
- Authors: Joseph Campbell, Yue Guo, Fiona Xie, Simon Stepputtis, Katia Sycara
- Abstract summary: Transfer learning can be applied in deep reinforcement learning to accelerate the training of a policy in a target task.
We propose an alternative approach to transfer learning between tasks based on action advising, in which a teacher trained in a source task actively guides a student's exploration in a target task.
- Score: 7.673465837624365
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning can be applied in deep reinforcement learning to accelerate
the training of a policy in a target task by transferring knowledge from a
policy learned in a related source task. This is commonly achieved by copying
pretrained weights from the source policy to the target policy prior to
training, under the constraint that they use the same model architecture.
However, not only does this require a robust representation learned over a wide
distribution of states -- often failing to transfer between specialist models
trained over single tasks -- but it is largely uninterpretable and provides
little indication of what knowledge is transferred. In this work, we propose an
alternative approach to transfer learning between tasks based on action
advising, in which a teacher trained in a source task actively guides a
student's exploration in a target task. Through introspection, the teacher is
capable of identifying when advice is beneficial to the student and should be
given, and when it is not. Our approach allows knowledge transfer between
policies agnostic of the underlying representations, and we empirically show
that this leads to improved convergence rates in Gridworld and Atari
environments while providing insight into what knowledge is transferred.
Related papers
- Similarity-based Knowledge Transfer for Cross-Domain Reinforcement Learning [3.3148826359547523]
We develop a semi-supervised alignment loss to match different spaces with a set of encoder-decoders.
In comparison to prior works, our method does not require data to be aligned, paired or collected by expert policies.
arXiv Detail & Related papers (2023-12-05T19:26:01Z)
- IOB: Integrating Optimization Transfer and Behavior Transfer for Multi-Policy Reuse [50.90781542323258]
Reinforcement learning (RL) agents can transfer knowledge from source policies to a related target task.
Previous methods introduce additional components, such as hierarchical policies or estimations of source policies' value functions.
We propose a novel transfer RL method that selects the source policy without training extra components.
arXiv Detail & Related papers (2023-08-14T09:22:35Z)
- Provable Benefits of Representational Transfer in Reinforcement Learning [59.712501044999875]
We study the problem of representational transfer in RL, where an agent first pretrains in a number of source tasks to discover a shared representation.
We show that given generative access to source tasks, we can discover a representation, using which subsequent linear RL techniques quickly converge to a near-optimal policy.
arXiv Detail & Related papers (2022-05-29T04:31:29Z)
- Multi-Source Transfer Learning for Deep Model-Based Reinforcement Learning [0.6445605125467572]
A crucial challenge in reinforcement learning is to reduce the number of interactions with the environment that an agent requires to master a given task.
Transfer learning proposes to address this issue by re-using knowledge from previously learned tasks.
The goal of this paper is to address these issues with modular multi-source transfer learning techniques.
arXiv Detail & Related papers (2022-05-28T12:04:52Z)
- Continual Prompt Tuning for Dialog State Tracking [58.66412648276873]
A desirable dialog system should be able to continually learn new skills without forgetting old ones.
We present Continual Prompt Tuning, a parameter-efficient framework that not only avoids forgetting but also enables knowledge transfer between tasks.
arXiv Detail & Related papers (2022-03-13T13:22:41Z)
- Rethinking Learning Dynamics in RL using Adversarial Networks [79.56118674435844]
We present a learning mechanism for reinforcement learning of closely related skills parameterized via a skill embedding space.
The main contribution of our work is to formulate an adversarial training regime for reinforcement learning with the help of entropy-regularized policy gradient formulation.
arXiv Detail & Related papers (2022-01-27T19:51:09Z)
- Adaptive Policy Transfer in Reinforcement Learning [9.594432031144715]
We introduce a principled mechanism that can "Adapt-to-Learn", that is, adapt the source policy to learn to solve a target task.
We show that the presented method learns to seamlessly combine learning from adaptation and exploration and leads to a robust policy transfer algorithm.
arXiv Detail & Related papers (2021-05-10T22:42:03Z)
- Coverage as a Principle for Discovering Transferable Behavior in Reinforcement Learning [16.12658895065585]
We argue that representation alone is not enough for efficient transfer in challenging domains and explore how to transfer knowledge through behavior.
The behavior of pre-trained policies may be used for solving the task at hand (exploitation) or for collecting useful data to solve the problem (exploration).
arXiv Detail & Related papers (2021-02-24T16:51:02Z)
- Efficient Deep Reinforcement Learning via Adaptive Policy Transfer [50.51637231309424]
A Policy Transfer Framework (PTF) is proposed to accelerate Reinforcement Learning (RL).
Our framework learns when and which source policy is the best to reuse for the target policy and when to terminate it.
Experimental results show it significantly accelerates the learning process and surpasses state-of-the-art policy transfer methods.
arXiv Detail & Related papers (2020-02-19T07:30:57Z)
- Transfer Heterogeneous Knowledge Among Peer-to-Peer Teammates: A Model Distillation Approach [55.83558520598304]
We propose a brand new solution to reuse experiences and transfer value functions among multiple students via model distillation.
We also describe how to design an efficient communication protocol to exploit heterogeneous knowledge.
Our proposed framework, namely Learning and Teaching Categorical Reinforcement, shows promising performance on stabilizing and accelerating learning progress.
arXiv Detail & Related papers (2020-02-06T11:31:04Z)