Robust Knowledge Transfer in Tiered Reinforcement Learning
- URL: http://arxiv.org/abs/2302.05534v3
- Date: Thu, 13 Jun 2024 14:54:41 GMT
- Title: Robust Knowledge Transfer in Tiered Reinforcement Learning
- Authors: Jiawei Huang, Niao He
- Abstract summary: We study the Tiered Reinforcement Learning setting, where the goal is to transfer knowledge from the low-tier (source) task to the high-tier (target) task.
Unlike previous work, we do not assume the low-tier and high-tier tasks share the same dynamics or reward functions.
We propose novel online learning algorithms such that the high-tier task can achieve constant regret on partial states, depending on the task similarity.
- Score: 22.303882476904295
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the Tiered Reinforcement Learning setting, a parallel transfer learning framework in which the goal is to transfer knowledge from the low-tier (source) task to the high-tier (target) task, reducing the exploration risk of the latter while the two tasks are solved in parallel. Unlike previous work, we do not assume the low-tier and high-tier tasks share the same dynamics or reward functions, and we focus on robust knowledge transfer without prior knowledge of the task similarity. We identify a natural and necessary condition for our objective, which we call "Optimal Value Dominance". Under this condition, we propose novel online learning algorithms such that the high-tier task achieves constant regret on partial states, depending on the task similarity, and retains near-optimal regret when the two tasks are dissimilar, while the low-tier task remains near-optimal without making any sacrifice. Moreover, we further study the setting with multiple low-tier tasks and propose a novel transfer-source selection mechanism that aggregates the information from all low-tier tasks, allowing provable benefits on a much larger state-action space.
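To make the protocol concrete, here is a minimal runnable sketch of the tiered idea, not the paper's actual algorithm: the low-tier learner runs standard Q-learning unchanged, while the high-tier learner follows the transferred action on a state only when a conservative estimate of the source value dominates its own optimistic estimate. The two toy chain MDPs, the fixed confidence margin, and all hyperparameters below are hypothetical.

```python
# Hypothetical sketch of tiered (parallel) knowledge transfer; not the
# paper's algorithm. The low tier learns as usual; the high tier trusts
# the transferred action only where a conservative source estimate still
# dominates its own optimistic estimate.
import numpy as np

rng = np.random.default_rng(0)
S, A, H = 5, 2, 8  # states, actions, episode horizon

def make_chain(slip):
    """Toy chain MDP: action 1 moves right (with slip), reward at the end."""
    P = np.zeros((S, A, S))
    for s in range(S):
        P[s, 0, max(s - 1, 0)] = 1.0
        P[s, 1, min(s + 1, S - 1)] = 1.0 - slip
        P[s, 1, max(s - 1, 0)] += slip
    R = np.zeros((S, A))
    R[S - 1, :] = 1.0
    return P, R

def step(P, R, s, a):
    return rng.choice(S, p=P[s, a]), R[s, a]

low_task, high_task = make_chain(slip=0.1), make_chain(slip=0.2)  # similar tasks
Q_low = np.zeros((S, A))
Q_high = np.full((S, A), float(H))  # optimistic init drives exploration
alpha, gamma, margin = 0.2, 0.95, 0.5

for episode in range(2000):
    # Low-tier task: plain epsilon-greedy Q-learning; no sacrifice for transfer.
    s = 0
    for _ in range(H):
        a = rng.integers(A) if rng.random() < 0.1 else int(np.argmax(Q_low[s]))
        s2, r = step(*low_task, s, a)
        Q_low[s, a] += alpha * (r + gamma * Q_low[s2].max() - Q_low[s, a])
        s = s2
    # High-tier task: per-state robustness check before trusting the transfer.
    s = 0
    for _ in range(H):
        a_src = int(np.argmax(Q_low[s]))
        if Q_low[s, a_src] - margin >= Q_high[s].max():
            a = a_src                      # transfer verified on this state
        else:
            a = int(np.argmax(Q_high[s]))  # optimistic exploration fallback
        s2, r = step(*high_task, s, a)
        Q_high[s, a] += alpha * (r + gamma * Q_high[s2].max() - Q_high[s, a])
        s = s2

print("low-tier greedy:", Q_low.argmax(1), "high-tier greedy:", Q_high.argmax(1))
```

When the two tasks are similar, the dominance check passes on most states and the high-tier learner inherits useful behavior early; when they diverge, the check fails and it falls back to its own optimistic exploration, mirroring the constant-regret versus near-optimal-regret dichotomy claimed above.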
Related papers
- Continual Deep Reinforcement Learning with Task-Agnostic Policy Distillation [0.0]
This paper addresses the problem of continual learning by introducing the Task-Agnostic Policy Distillation (TAPD) framework.
By utilizing task-agnostic distilled knowledge, the agent can solve downstream tasks more efficiently.
arXiv Detail & Related papers (2024-11-25T16:18:39Z) - Multitask Learning with No Regret: from Improved Confidence Bounds to Active Learning [79.07658065326592]
Quantifying uncertainty in the estimated tasks is of pivotal importance for many downstream applications, such as online or active learning.
We provide novel multitask confidence intervals in the challenging setting when neither the similarity between tasks nor the tasks' features are available to the learner.
We further propose a novel online learning algorithm that achieves improved regret without requiring knowledge of the task similarity in advance.
arXiv Detail & Related papers (2023-08-03T13:08:09Z) - Understanding the Transferability of Representations via Task-Relatedness [8.425690424016986]
We propose a novel analysis of the transferability of pre-trained models' representations to downstream tasks in terms of their relatedness to a given reference task.
Our experiments using state-of-the-art pre-trained models show the effectiveness of task-relatedness in explaining transferability on various vision and language tasks.
arXiv Detail & Related papers (2023-07-03T08:06:22Z) - ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously yields lower accuracy than learning the target task alone, a phenomenon known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches and automatically searches over varying task weights.
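As a rough illustration of the fork-and-merge idea (not the authors' exact procedure), the sketch below forks the parameters, trains each branch under a different auxiliary-task weight, and merges by searching interpolation coefficients against target-task validation loss. The `losses` and `val_loss` callables are assumed, hypothetical interfaces.

```python
# Hypothetical fork/merge sketch: fork parameters, train branches with
# different auxiliary-task weights, merge via a searched interpolation
# that minimizes target-task validation loss.
import copy
import torch

def merge(m0, m1, lam):
    out = copy.deepcopy(m0)
    with torch.no_grad():
        for p, q0, q1 in zip(out.parameters(), m0.parameters(), m1.parameters()):
            p.copy_((1 - lam) * q0 + lam * q1)
    return out

def forkmerge_round(model, losses, val_loss, task_weights=(0.0, 0.5, 1.0),
                    steps=100, lr=1e-3, grid=11):
    branches = []
    for w in task_weights:                 # fork: one branch per task weighting
        branch = copy.deepcopy(model)
        opt = torch.optim.SGD(branch.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            target_loss, aux_loss = losses(branch)  # assumed user-supplied
            (target_loss + w * aux_loss).backward()
            opt.step()
        branches.append(branch)
    # merge: keep whichever interpolation does best on target validation data
    return min(
        (merge(branches[0], b, lam) for b in branches[1:]
         for lam in torch.linspace(0, 1, grid)),
        key=val_loss,
    )
```

Searching the merge coefficient on validation data is what lets the procedure suppress a harmful auxiliary weighting after the fact, instead of committing to one in advance.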
arXiv Detail & Related papers (2023-01-30T02:27:02Z) - Learning Options via Compression [62.55893046218824]
We propose a new objective that combines the maximum likelihood objective with a penalty on the description length of the skills.
Our objective learns skills that solve downstream tasks in fewer samples than skills learned by maximizing likelihood alone.
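Schematically, and in our own notation rather than the paper's, such a compression objective trades reconstruction likelihood against the code length of the skills:

$$\min_{\theta,\phi}\ \mathbb{E}_{\tau \sim \mathcal{D}}\Big[-\log p_\theta\big(\tau \mid z_\phi(\tau)\big) + \lambda\,\ell\big(z_\phi(\tau)\big)\Big]$$

where $\tau$ is a demonstrated trajectory, $z_\phi(\tau)$ its inferred skill sequence, $\ell(\cdot)$ a description-length penalty, and $\lambda$ a trade-off weight; $\lambda = 0$ recovers pure maximum likelihood.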
arXiv Detail & Related papers (2022-12-08T22:34:59Z) - Transferring Knowledge for Reinforcement Learning in Contact-Rich Manipulation [10.219833196479142]
We address the challenge of transferring knowledge within a family of similar tasks by leveraging multiple skill priors.
Our method learns a latent action space representing the skill embedding from demonstrated trajectories for each prior task.
We evaluate our method on a set of peg-in-hole insertion tasks and demonstrate better generalization to new tasks never encountered during training.
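One common way to realize such a latent skill space, given here only as a hedged sketch with illustrative sizes (this need not match the authors' architecture), is a small variational autoencoder over fixed-length action sequences from the demonstrations:

```python
# Hypothetical sketch: a VAE over demonstrated action sequences yields a
# latent "skill" space; sizes and architecture are illustrative only.
import torch
import torch.nn as nn

class SkillVAE(nn.Module):
    def __init__(self, action_dim=7, seq_len=10, latent_dim=4):
        super().__init__()
        flat = action_dim * seq_len
        self.enc = nn.Sequential(nn.Linear(flat, 128), nn.ReLU(),
                                 nn.Linear(128, 2 * latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                 nn.Linear(128, flat))

    def forward(self, actions):  # actions: (batch, seq_len, action_dim)
        h = self.enc(actions.flatten(1))
        mu, logvar = h.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = self.dec(z).view_as(actions)
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(-1).mean()
        return recon, kl
```

Training one such model per prior task yields one skill embedding per prior; a downstream policy can then act in these latent spaces rather than in raw actions.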
arXiv Detail & Related papers (2022-09-19T10:31:13Z) - A Unified Meta-Learning Framework for Dynamic Transfer Learning [42.34180707803632]
We propose a generic meta-learning framework L2E for modeling the knowledge transferability on dynamic tasks.
L2E enjoys the following properties: (1) effective knowledge transferability across dynamic tasks; (2) fast adaptation to the new target task; (3) mitigation of catastrophic forgetting on historical target tasks; and (4) flexibility in incorporating any existing static transfer learning algorithms.
arXiv Detail & Related papers (2022-07-05T02:56:38Z) - Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z) - Latent Skill Planning for Exploration and Transfer [49.25525932162891]
In this paper, we investigate how these two approaches can be integrated into a single reinforcement learning agent.
We leverage the idea of partial amortization for fast adaptation at test time.
We demonstrate the benefits of our design decisions across a suite of challenging locomotion tasks.
arXiv Detail & Related papers (2020-11-27T18:40:03Z) - Measuring and Harnessing Transference in Multi-Task Learning [58.48659733262734]
Multi-task learning can leverage information learned by one task to benefit the training of other tasks.
We analyze the dynamics of information transfer, or transference, across tasks throughout training.
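As a hedged sketch of one way such transference can be quantified (our simplification, not necessarily the paper's exact statistic): apply a single gradient update using only task i's loss and measure the relative change in task j's loss. The `loss_i` and `loss_j` callables are assumed interfaces.

```python
# Hypothetical transference probe: does one update on task i help task j?
import copy
import torch

def transference(model, loss_i, loss_j, lr=1e-2):
    probe = copy.deepcopy(model)
    opt = torch.optim.SGD(probe.parameters(), lr=lr)
    before = loss_j(probe).item()
    loss_i(probe).backward()   # one update using only task i's gradient
    opt.step()
    after = loss_j(probe).item()
    return 1.0 - after / before  # > 0 when task i's update helps task j
```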
arXiv Detail & Related papers (2020-10-29T08:25:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.