Sample Efficient Myopic Exploration Through Multitask Reinforcement
Learning with Diverse Tasks
- URL: http://arxiv.org/abs/2403.01636v2
- Date: Wed, 6 Mar 2024 04:34:01 GMT
- Title: Sample Efficient Myopic Exploration Through Multitask Reinforcement
Learning with Diverse Tasks
- Authors: Ziping Xu, Zifan Xu, Runxuan Jiang, Peter Stone, Ambuj Tewari
- Abstract summary: This paper shows that when an agent is trained on a sufficiently diverse set of tasks, a generic policy-sharing algorithm with myopic exploration design can be sample-efficient.
To the best of our knowledge, this is the first theoretical demonstration of the "exploration benefits" of MTRL.
- Score: 53.44714413181162
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multitask Reinforcement Learning (MTRL) approaches have gained increasing
attention for its wide applications in many important Reinforcement Learning
(RL) tasks. However, while recent advancements in MTRL theory have focused on
the improved statistical efficiency by assuming a shared structure across
tasks, exploration--a crucial aspect of RL--has been largely overlooked. This
paper addresses this gap by showing that when an agent is trained on a
sufficiently diverse set of tasks, a generic policy-sharing algorithm with
myopic exploration design like $\epsilon$-greedy that are inefficient in
general can be sample-efficient for MTRL. To the best of our knowledge, this is
the first theoretical demonstration of the "exploration benefits" of MTRL. It
may also shed light on the enigmatic success of the wide applications of myopic
exploration in practice. To validate the role of diversity, we conduct
experiments on synthetic robotic control environments, where the diverse task
set aligns with the task selection by automatic curriculum learning, which is
empirically shown to improve sample-efficiency.
Related papers
- M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation [0.7564784873669823]
We propose Multimodal Contrastive Unsupervised Reinforcement Learning (M2CURL)
Our approach employs a novel multimodal self-supervised learning technique that learns efficient representations and contributes to faster convergence of RL algorithms.
We evaluate M2CURL on the Tactile Gym 2 simulator and we show that it significantly enhances the learning efficiency in different manipulation tasks.
arXiv Detail & Related papers (2024-01-30T14:09:35Z) - Distribution Matching for Multi-Task Learning of Classification Tasks: a
Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework, where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks with little, or non-overlapping annotations.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z) - Provable Benefits of Multi-task RL under Non-Markovian Decision Making
Processes [56.714690083118406]
In multi-task reinforcement learning (RL) under Markov decision processes (MDPs), the presence of shared latent structures has been shown to yield significant benefits to the sample efficiency compared to single-task RL.
We investigate whether such a benefit can extend to more general sequential decision making problems, such as partially observable MDPs (POMDPs) and more general predictive state representations (PSRs)
We propose a provably efficient algorithm UMT-PSR for finding near-optimal policies for all PSRs, and demonstrate that the advantage of multi-task learning manifests if the joint model class of PSR
arXiv Detail & Related papers (2023-10-20T14:50:28Z) - Equitable Multi-task Learning [18.65048321820911]
Multi-task learning (MTL) has achieved great success in various research domains, such as CV, NLP and IR.
We propose a novel multi-task optimization method, named EMTL, to achieve equitable MTL.
Our method stably outperforms state-of-the-art methods on the public benchmark datasets of two different research domains.
arXiv Detail & Related papers (2023-06-15T03:37:23Z) - Learning Better with Less: Effective Augmentation for Sample-Efficient
Visual Reinforcement Learning [57.83232242068982]
Data augmentation (DA) is a crucial technique for enhancing the sample efficiency of visual reinforcement learning (RL) algorithms.
It remains unclear which attributes of DA account for its effectiveness in achieving sample-efficient visual RL.
This work conducts comprehensive experiments to assess the impact of DA's attributes on its efficacy.
arXiv Detail & Related papers (2023-05-25T15:46:20Z) - Understanding the Complexity Gains of Single-Task RL with a Curriculum [83.46923851724408]
Reinforcement learning (RL) problems can be challenging without well-shaped rewards.
We provide a theoretical framework that reformulates a single-task RL problem as a multi-task RL problem defined by a curriculum.
We show that sequentially solving each task in the multi-task RL problem is more computationally efficient than solving the original single-task problem.
arXiv Detail & Related papers (2022-12-24T19:46:47Z) - Active Multi-Task Representation Learning [50.13453053304159]
We give the first formal study on resource task sampling by leveraging the techniques from active learning.
We propose an algorithm that iteratively estimates the relevance of each source task to the target task and samples from each source task based on the estimated relevance.
arXiv Detail & Related papers (2022-02-02T08:23:24Z) - Improved Context-Based Offline Meta-RL with Attention and Contrastive
Learning [1.3106063755117399]
We improve upon one of the SOTA OMRL algorithms, FOCAL, by incorporating intra-task attention mechanism and inter-task contrastive learning objectives.
Theoretical analysis and experiments are presented to demonstrate the superior performance, efficiency and robustness of our end-to-end and model free method.
arXiv Detail & Related papers (2021-02-22T05:05:16Z) - Efficient Reinforcement Learning in Resource Allocation Problems Through
Permutation Invariant Multi-task Learning [6.247939901619901]
We show that in certain settings, the available data can be dramatically increased through a form of multi-task learning.
We provide a theoretical performance bound for the gain in sample efficiency under this setting.
This motivates a new approach to multi-task learning, which involves the design of an appropriate neural network architecture and a prioritized task-sampling strategy.
arXiv Detail & Related papers (2021-02-18T14:13:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.