Mutual Information Based Knowledge Transfer Under State-Action Dimension
Mismatch
- URL: http://arxiv.org/abs/2006.07041v1
- Date: Fri, 12 Jun 2020 09:51:17 GMT
- Authors: Michael Wan, Tanmay Gangwani, Jian Peng
- Abstract summary: We propose a new framework for transfer learning where the teacher and the student can have arbitrarily different state- and action-spaces.
To handle this mismatch, we produce embeddings which can systematically extract knowledge from the teacher policy and value networks.
We demonstrate successful transfer learning in situations when the teacher and student have different state- and action-spaces.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (RL) algorithms have achieved great success on a
wide variety of sequential decision-making tasks. However, many of these
algorithms suffer from high sample complexity when learning from scratch using
environmental rewards, due to issues such as credit-assignment and
high-variance gradients, among others. Transfer learning, in which knowledge
gained on a source task is applied to more efficiently learn a different but
related target task, is a promising approach to improve the sample complexity
in RL. Prior work has considered using pre-trained teacher policies to enhance
the learning of the student policy, albeit with the constraint that the teacher
and the student MDPs share the state-space or the action-space. In this paper,
we propose a new framework for transfer learning where the teacher and the
student can have arbitrarily different state- and action-spaces. To handle this
mismatch, we produce embeddings which can systematically extract knowledge from
the teacher policy and value networks, and blend it into the student networks.
To train the embeddings, we use a task-aligned loss and show that the
representations could be enriched further by adding a mutual information loss.
Using a set of challenging simulated robotic locomotion tasks involving
many-legged centipedes, we demonstrate successful transfer learning in
situations when the teacher and student have different state- and
action-spaces.
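As an illustration of the idea in the abstract, the sketch below maps a student state into the teacher's state space with a learned embedding, passes it through a frozen teacher layer, and blends the resulting features into the student's hidden layer. It is a minimal sketch under stated assumptions, not the paper's architecture: all dimensions, the weight names (`W_embed`, `W_teacher`, `W_student`, `W_out`), the additive blend, and the random stand-in for the pre-trained teacher are hypothetical, and the training losses (task-aligned and mutual information) are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: the student and teacher state sizes differ.
STUDENT_STATE_DIM = 12
TEACHER_STATE_DIM = 8
HIDDEN_DIM = 16
STUDENT_ACTION_DIM = 6

# Frozen teacher layer. Random weights stand in for a pre-trained network.
W_teacher = rng.normal(size=(TEACHER_STATE_DIM, HIDDEN_DIM))

# Learnable parts (assumed): a state embedding plus the student's own layers.
W_embed = rng.normal(size=(STUDENT_STATE_DIM, TEACHER_STATE_DIM)) * 0.1
W_student = rng.normal(size=(STUDENT_STATE_DIM, HIDDEN_DIM)) * 0.1
W_out = rng.normal(size=(HIDDEN_DIM, STUDENT_ACTION_DIM)) * 0.1


def student_policy_mean(student_states):
    """Return student action means for a batch of student states."""
    # Embed student states into the teacher's state space.
    embedded = student_states @ W_embed
    # Extract features from the frozen teacher layer.
    teacher_hidden = np.tanh(embedded @ W_teacher)
    # Compute the student's own features.
    student_hidden = np.tanh(student_states @ W_student)
    # Blend teacher knowledge into the student (additive blend is an assumption).
    blended = student_hidden + teacher_hidden
    return blended @ W_out


batch = rng.normal(size=(4, STUDENT_STATE_DIM))
actions = student_policy_mean(batch)
print(actions.shape)
```

Because the embedding is the only component that touches both state spaces, the teacher weights can stay fixed while the embedding and student layers are trained.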
Related papers
- Multitask Learning with No Regret: from Improved Confidence Bounds to
Active Learning [79.07658065326592]
Quantifying uncertainty in the estimated tasks is of pivotal importance for many downstream applications, such as online or active learning.
We provide novel multitask confidence intervals in the challenging setting when neither the similarity between tasks nor the tasks' features are available to the learner.
We propose a novel online learning algorithm that achieves such improved regret without knowing this parameter in advance.
arXiv Detail & Related papers (2023-08-03T13:08:09Z) - Evaluating the structure of cognitive tasks with transfer learning [67.22168759751541]
This study investigates the transferability of deep learning representations between different EEG decoding tasks.
We conduct extensive experiments using state-of-the-art decoding models on two recently released EEG datasets.
arXiv Detail & Related papers (2023-07-28T14:51:09Z) - NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision
Research [96.53307645791179]
We introduce the Never-Ending VIsual-classification Stream (NEVIS'22), a benchmark consisting of a stream of over 100 visual classification tasks.
Despite being limited to classification, the resulting stream has a rich diversity of tasks from OCR, to texture analysis, scene recognition, and so forth.
Overall, NEVIS'22 poses an unprecedented challenge for current sequential learning approaches due to the scale and diversity of tasks.
arXiv Detail & Related papers (2022-11-15T18:57:46Z) - Teacher-student curriculum learning for reinforcement learning [1.7259824817932292]
Reinforcement learning (RL) is a popular paradigm for sequential decision-making problems.
The sample inefficiency of deep reinforcement learning methods is a significant obstacle when applying RL to real-world problems.
We propose a teacher-student curriculum learning setting where we simultaneously train a teacher that selects tasks for the student while the student learns how to solve the selected task.
arXiv Detail & Related papers (2022-10-31T14:45:39Z) - Class-Incremental Learning via Knowledge Amalgamation [14.513858688486701]
Catastrophic forgetting has been a significant problem hindering the deployment of deep learning algorithms in the continual learning setting.
We put forward an alternative strategy to handle catastrophic forgetting with knowledge amalgamation (CFA).
CFA learns a student network from multiple heterogeneous teacher models specializing in previous tasks and can be applied to current offline methods.
arXiv Detail & Related papers (2022-09-05T19:49:01Z) - Learning Multi-Task Transferable Rewards via Variational Inverse
Reinforcement Learning [10.782043595405831]
We extend an empowerment-based regularization technique to situations with multiple tasks based on the framework of a generative adversarial network.
Under the multitask environments with unknown dynamics, we focus on learning a reward and policy from unlabeled expert examples.
Our proposed method derives the variational lower bound of the situational mutual information to optimize it.
arXiv Detail & Related papers (2022-06-19T22:32:41Z) - Multi-Source Transfer Learning for Deep Model-Based Reinforcement
Learning [0.6445605125467572]
A crucial challenge in reinforcement learning is to reduce the number of interactions with the environment that an agent requires to master a given task.
Transfer learning proposes to address this issue by re-using knowledge from previously learned tasks.
The goal of this paper is to address these issues with modular multi-source transfer learning techniques.
arXiv Detail & Related papers (2022-05-28T12:04:52Z) - Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation [51.21190751266442]
Domain adaptation (DA) tackles scenarios in which the test data does not fully follow the same distribution as the training data.
By learning from large-scale unlabeled samples, self-supervised learning has now become a new trend in deep learning.
We propose a novel Self-Supervised Graph Neural Network (SSG) to enable more effective inter-task information exchange and knowledge sharing.
arXiv Detail & Related papers (2022-04-08T03:37:56Z) - Transferability in Deep Learning: A Survey [80.67296873915176]
The ability to acquire and reuse knowledge is known as transferability in deep learning.
We present this survey to connect different isolated areas in deep learning with their relation to transferability.
We implement a benchmark and an open-source library, enabling a fair evaluation of deep learning methods in terms of transferability.
arXiv Detail & Related papers (2022-01-15T15:03:17Z) - Learning from Guided Play: A Scheduled Hierarchical Approach for
Improving Exploration in Adversarial Imitation Learning [7.51557557629519]
We present Learning from Guided Play (LfGP), a framework in which we leverage expert demonstrations of multiple auxiliary tasks in addition to a main task.
This affords many benefits: learning efficiency is improved for main tasks with challenging bottleneck transitions, expert data becomes reusable between tasks, and transfer learning through the reuse of learned auxiliary task models becomes possible.
arXiv Detail & Related papers (2021-12-16T14:58:08Z) - Curriculum Learning for Reinforcement Learning Domains: A Framework and
Survey [53.73359052511171]
Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback.
We present a framework for curriculum learning (CL) in RL, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals.
arXiv Detail & Related papers (2020-03-10T20:41:24Z)