SMAUG: A Sliding Multidimensional Task Window-Based MARL Framework for
Adaptive Real-Time Subtask Recognition
- URL: http://arxiv.org/abs/2403.01816v1
- Date: Mon, 4 Mar 2024 08:04:41 GMT
- Title: SMAUG: A Sliding Multidimensional Task Window-Based MARL Framework for
Adaptive Real-Time Subtask Recognition
- Authors: Wenjing Zhang, Wei Zhang
- Abstract summary: Subtask-based multi-agent reinforcement learning (MARL) methods enable agents to learn how to tackle different subtasks.
A Sliding Multidimensional Task Window-Based Multi-Agent Reinforcement Learning framework (SMAUG) is proposed for adaptive real-time subtask recognition.
Experiments on StarCraft II show that SMAUG not only demonstrates performance superiority in comparison with all baselines but also presents a more prominent and swift rise in rewards during the initial training stage.
- Score: 11.236363226878975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instead of making behavioral decisions directly from the exponentially
expanding joint observational-action space, subtask-based multi-agent
reinforcement learning (MARL) methods enable agents to learn how to tackle
different subtasks. Most existing subtask-based MARL methods are based on
hierarchical reinforcement learning (HRL). However, these approaches often
limit the number of subtasks, perform subtask recognition periodically, and can
only identify and execute a specific subtask within the predefined fixed time
period, which makes them inflexible and not suitable for diverse and dynamic
scenarios with constantly changing subtasks. To break through the above
restrictions, a \textbf{S}liding \textbf{M}ultidimensional t\textbf{A}sk window
based m\textbf{U}lti-agent reinforcement learnin\textbf{G} framework (SMAUG) is
proposed for adaptive real-time subtask recognition. It leverages a sliding
multidimensional task window to extract essential information of subtasks from
trajectory segments concatenated based on observed and predicted trajectories
in varying lengths. An inference network is designed to iteratively predict
future trajectories with the subtask-oriented policy network. Furthermore,
intrinsic motivation rewards are defined to promote subtask exploration and
behavior diversity. SMAUG can be integrated with any Q-learning-based approach.
Experiments on StarCraft II show that SMAUG not only demonstrates performance
superiority in comparison with all baselines but also presents a more prominent
and swift rise in rewards during the initial training stage.
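For intuition only, here is a minimal Python sketch of the windowing idea described above: observed and predicted trajectory steps are concatenated and sliced into trailing segments of several lengths, which could then be passed to a subtask-recognition module. The function name, window lengths, and array shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sliding_task_windows(observed, predicted, window_lengths=(2, 4, 8)):
    # Concatenate observed and predicted steps into one trajectory of shape
    # (T, feat_dim), then take trailing windows of several lengths.
    # The window lengths and shapes are assumptions for illustration only.
    trajectory = np.concatenate([observed, predicted], axis=0)
    windows = []
    for length in window_lengths:
        if len(trajectory) >= length:
            windows.append(trajectory[-length:])  # most recent `length` steps
    return windows

# Hypothetical usage: 6 observed steps and 2 predicted steps, 5 features each.
obs = np.random.randn(6, 5)
pred = np.random.randn(2, 5)
for w in sliding_task_windows(obs, pred):
    print(w.shape)  # (2, 5), (4, 5), (8, 5)
```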
Related papers
- Identifying Selections for Unsupervised Subtask Discovery [12.22188797558089]
We provide a theory to identify, and experiments to verify the existence of selection variables in data.
These selections serve as subgoals that indicate subtasks and guide policy.
In light of this idea, we develop a sequential non-negative matrix factorization (seq-NMF) method to learn these subgoals and extract meaningful behavior patterns as subtasks.
arXiv Detail & Related papers (2024-10-28T23:47:43Z) - TemPrompt: Multi-Task Prompt Learning for Temporal Relation Extraction in RAG-based Crowdsourcing Systems [21.312052922118585]
Temporal relation extraction (TRE) aims to grasp the evolution of events or actions, and thus shape the workflow of associated tasks.
We propose a multi-task prompt learning framework for TRE (TemPrompt), incorporating prompt tuning and contrastive learning to tackle these issues.
arXiv Detail & Related papers (2024-06-21T01:52:37Z) - ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt [67.8934749027315]
We propose a unified framework for graph hybrid pre-training which injects the task identification and position identification into GNNs.
We also propose a novel pre-training paradigm based on a group of $k$-nearest neighbors.
arXiv Detail & Related papers (2023-10-23T12:11:13Z) - Semantically Aligned Task Decomposition in Multi-Agent Reinforcement
Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA)
SAMA prompts pretrained language models with chain-of-thought that can suggest potential goals, provide suitable goal decomposition and subgoal allocation as well as self-reflection-based replanning.
SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z) - Fast Inference and Transfer of Compositional Task Structures for
Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z) - Continual Object Detection via Prototypical Task Correlation Guided
Gating Mechanism [120.1998866178014]
We present a flexible framework for continual object detection via pROtotypical taSk corrElaTion guided gaTing mechAnism (ROSETTA).
Concretely, a unified framework is shared by all tasks while task-aware gates are introduced to automatically select sub-models for specific tasks.
Experiments on COCO-VOC, KITTI-Kitchen, class-incremental detection on VOC and sequential learning of four tasks show that ROSETTA yields state-of-the-art performance.
arXiv Detail & Related papers (2022-05-06T07:31:28Z) - LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent
Reinforcement Learning [122.47938710284784]
We propose a novel framework for learning dynamic subtask assignment (LDSA) in cooperative MARL.
To reasonably assign agents to different subtasks, we propose an ability-based subtask selection strategy.
We show that LDSA learns reasonable and effective subtask assignment for better collaboration.
arXiv Detail & Related papers (2022-05-05T10:46:16Z) - Meta Reinforcement Learning with Autonomous Inference of Subtask
Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)