Leveraging Temporally Extended Behavior Sharing for Multi-task Reinforcement Learning
- URL: http://arxiv.org/abs/2509.20766v2
- Date: Mon, 29 Sep 2025 00:16:21 GMT
- Title: Leveraging Temporally Extended Behavior Sharing for Multi-task Reinforcement Learning
- Authors: Gawon Lee, Daesol Cho, H. Jin Kim,
- Abstract summary: Multi-task reinforcement learning (MTRL) offers a promising approach to improve sample efficiency.<n>Applying MTRL to robotics remains challenging due to the high cost of collecting diverse task data.<n>We propose MT-L'evy, a novel exploration strategy that enhances sample efficiency in MTRL environments.
- Score: 24.95250207680073
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-task reinforcement learning (MTRL) offers a promising approach to improve sample efficiency and generalization by training agents across multiple tasks, enabling knowledge sharing between them. However, applying MTRL to robotics remains challenging due to the high cost of collecting diverse task data. To address this, we propose MT-L\'evy, a novel exploration strategy that enhances sample efficiency in MTRL environments by combining behavior sharing across tasks with temporally extended exploration inspired by L\'evy flight. MT-L\'evy leverages policies trained on related tasks to guide exploration towards key states, while dynamically adjusting exploration levels based on task success ratios. This approach enables more efficient state-space coverage, even in complex robotics environments. Empirical results demonstrate that MT-L\'evy significantly improves exploration and sample efficiency, supported by quantitative and qualitative analyses. Ablation studies further highlight the contribution of each component, showing that combining behavior sharing with adaptive exploration strategies can significantly improve the practicality of MTRL in robotics applications.
Related papers
- Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning [27.472039054277644]
Rep-MTL exploits the representation-level task saliency to quantify interactions between task-specific optimization and shared representation learning.<n>Rep-MTL aims to mitigate negative transfer by maintaining the effective training of individual tasks instead pure conflict-solving.
arXiv Detail & Related papers (2025-07-28T17:59:28Z) - Novelty-Guided Data Reuse for Efficient and Diversified Multi-Agent Reinforcement Learning [7.36961322800571]
deep Multi-Agent Reinforcement Learning (MARL) has demonstrated its potential to tackle complex cooperative tasks.<n>We introduce a novel sample reuse approach that dynamically adjusts policy updates based on observation novelty.<n>We name our method Multi-Agent Novelty-GuidEd sample Reuse (MANGER)
arXiv Detail & Related papers (2024-12-20T03:09:18Z) - Sample Efficient Myopic Exploration Through Multitask Reinforcement
Learning with Diverse Tasks [53.44714413181162]
This paper shows that when an agent is trained on a sufficiently diverse set of tasks, a generic policy-sharing algorithm with myopic exploration design can be sample-efficient.
To the best of our knowledge, this is the first theoretical demonstration of the "exploration benefits" of MTRL.
arXiv Detail & Related papers (2024-03-03T22:57:44Z) - Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents [9.529492371336286]
Reinforcement Learning (RL) has made significant strides in enabling artificial agents to learn diverse behaviors.
We propose a novel approach, called Logical Specifications-guided Dynamic Task Sampling (LSTS)
LSTS learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification.
arXiv Detail & Related papers (2024-02-06T04:00:21Z) - Provable Benefits of Multi-task RL under Non-Markovian Decision Making
Processes [56.714690083118406]
In multi-task reinforcement learning (RL) under Markov decision processes (MDPs), the presence of shared latent structures has been shown to yield significant benefits to the sample efficiency compared to single-task RL.
We investigate whether such a benefit can extend to more general sequential decision making problems, such as partially observable MDPs (POMDPs) and more general predictive state representations (PSRs)
We propose a provably efficient algorithm UMT-PSR for finding near-optimal policies for all PSRs, and demonstrate that the advantage of multi-task learning manifests if the joint model class of PSR
arXiv Detail & Related papers (2023-10-20T14:50:28Z) - Equitable Multi-task Learning [18.65048321820911]
Multi-task learning (MTL) has achieved great success in various research domains, such as CV, NLP and IR.
We propose a novel multi-task optimization method, named EMTL, to achieve equitable MTL.
Our method stably outperforms state-of-the-art methods on the public benchmark datasets of two different research domains.
arXiv Detail & Related papers (2023-06-15T03:37:23Z) - Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL)
It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z) - QMP: Q-switch Mixture of Policies for Multi-Task Behavior Sharing [18.127823952220123]
Multi-task reinforcement learning (MTRL) aims to learn several tasks simultaneously for better sample efficiency than learning them separately.<n>We introduce a new framework for sharing behavioral policies across tasks, which can be used in addition to existing MTRL methods.
arXiv Detail & Related papers (2023-02-01T18:58:20Z) - Learning Action Translator for Meta Reinforcement Learning on
Sparse-Reward Tasks [56.63855534940827]
This work introduces a novel objective function to learn an action translator among training tasks.
We theoretically verify that the value of the transferred policy with the action translator can be close to the value of the source policy.
We propose to combine the action translator with context-based meta-RL algorithms for better data collection and more efficient exploration during meta-training.
arXiv Detail & Related papers (2022-07-19T04:58:06Z) - Skill-based Meta-Reinforcement Learning [65.31995608339962]
We devise a method that enables meta-learning on long-horizon, sparse-reward tasks.
Our core idea is to leverage prior experience extracted from offline datasets during meta-learning.
arXiv Detail & Related papers (2022-04-25T17:58:19Z) - Multitask Adaptation by Retrospective Exploration with Learned World
Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.