Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2511.02304v1
- Date: Tue, 04 Nov 2025 06:37:36 GMT
- Title: Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning
- Authors: Beyazit Yalcinkaya, Marcell Vazquez-Chanlatte, Ameesh Shah, Hanna Krasowski, Sanjit A. Seshia,
- Abstract summary: We study the problem of learning multi-task, multi-agent policies for cooperative, temporal objectives, under centralized training, decentralized execution.<n>We present Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning (ACC-MARL), a framework for learning task-conditioned, decentralized team policies.
- Score: 6.085436102697102
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of learning multi-task, multi-agent policies for cooperative, temporal objectives, under centralized training, decentralized execution. In this setting, using automata to represent tasks enables the decomposition of complex tasks into simpler sub-tasks that can be assigned to agents. However, existing approaches remain sample-inefficient and are limited to the single-task case. In this work, we present Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning (ACC-MARL), a framework for learning task-conditioned, decentralized team policies. We identify the main challenges to ACC-MARL's feasibility in practice, propose solutions, and prove the correctness of our approach. We further show that the value functions of learned policies can be used to assign tasks optimally at test time. Experiments show emergent task-aware, multi-step coordination among agents, e.g., pressing a button to unlock a door, holding the door, and short-circuiting tasks.
Related papers
- Decentralizing Multi-Agent Reinforcement Learning with Temporal Causal Information [6.445203664352597]
We study how providing high-level symbolic knowledge to agents can help address unique challenges of this setting.<n>In particular, we extend the formal tools used to check the compatibility of local policies with the team task.<n>We empirically demonstrate that symbolic knowledge about the temporal evolution of events in the environment can significantly expedite the learning process in DMARL.
arXiv Detail & Related papers (2025-06-09T14:53:03Z) - Cross-Task Experiential Learning on LLM-based Multi-Agent Collaboration [63.90193684394165]
We introduce multi-agent cross-task experiential learning (MAEL), a novel framework that endows LLM-driven agents with explicit cross-task learning and experience accumulation.<n>During the experiential learning phase, we quantify the quality for each step in the task-solving workflow and store the resulting rewards.<n>During inference, agents retrieve high-reward, task-relevant experiences as few-shot examples to enhance the effectiveness of each reasoning step.
arXiv Detail & Related papers (2025-05-29T07:24:37Z) - Active Fine-Tuning of Multi-Task Policies [54.65568433408307]
We propose AMF (Active Multi-task Fine-tuning) to maximize multi-task policy performance under a limited demonstration budget.<n>We derive performance guarantees for AMF under regularity assumptions and demonstrate its empirical effectiveness in complex and high-dimensional environments.
arXiv Detail & Related papers (2024-10-07T13:26:36Z) - Variational Offline Multi-agent Skill Discovery [47.924414207796005]
We propose two novel auto-encoder schemes to simultaneously capture subgroup- and temporal-level abstractions and form multi-agent skills.<n>Our method can be applied to offline multi-task data, and the discovered subgroup skills can be transferred across relevant tasks without retraining.<n> Empirical evaluations on StarCraft tasks indicate that our approach significantly outperforms existing hierarchical multi-agent reinforcement learning (MARL) methods.
arXiv Detail & Related papers (2024-05-26T00:24:46Z) - Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL)
It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z) - Learning Complex Teamwork Tasks Using a Given Sub-task Decomposition [11.998708550268978]
We propose an approach which uses an expert-provided decomposition of a task into simpler multi-agent sub-tasks.
In each sub-task, a subset of the entire team is trained to acquire sub-task-specific policies.
The sub-teams are then merged and transferred to the target task, where their policies are collectively fine-tuned to solve the more complex target task.
arXiv Detail & Related papers (2023-02-09T21:24:56Z) - Learning Task Embeddings for Teamwork Adaptation in Multi-Agent
Reinforcement Learning [13.468555224407764]
We show that a team of agents is able to adapt to novel tasks when provided with task embeddings.
We propose three MATE training paradigms: independent MATE, centralised MATE, and mixed MATE.
We show that the embeddings learned by MATE identify tasks and provide useful information which agents leverage during adaptation to novel tasks.
arXiv Detail & Related papers (2022-07-05T18:23:20Z) - LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent
Reinforcement Learning [122.47938710284784]
We propose a novel framework for learning dynamic subtask assignment (LDSA) in cooperative MARL.
To reasonably assign agents to different subtasks, we propose an ability-based subtask selection strategy.
We show that LDSA learns reasonable and effective subtask assignment for better collaboration.
arXiv Detail & Related papers (2022-05-05T10:46:16Z) - Modular Adaptive Policy Selection for Multi-Task Imitation Learning
through Task Division [60.232542918414985]
Multi-task learning often suffers from negative transfer, sharing information that should be task-specific.
This is done by using proto-policies as modules to divide the tasks into simple sub-behaviours that can be shared.
We also demonstrate its ability to autonomously divide the tasks into both shared and task-specific sub-behaviours.
arXiv Detail & Related papers (2022-03-28T15:53:17Z) - Multi-Agent Policy Transfer via Task Relationship Modeling [28.421365805638953]
We try to discover and exploit common structures among tasks for more efficient transfer.
We propose to learn effect-based task representations as a common space of tasks, using an alternatively fixed training scheme.
As a result, the proposed method can help transfer learned cooperation knowledge to new tasks after training on a few source tasks.
arXiv Detail & Related papers (2022-03-09T01:49:21Z) - Reward Machines for Cooperative Multi-Agent Reinforcement Learning [30.84689303706561]
In cooperative multi-agent reinforcement learning, a collection of agents learns to interact in a shared environment to achieve a common goal.
We propose the use of reward machines (RM) -- Mealy machines used as structured representations of reward functions -- to encode the team's task.
The proposed novel interpretation of RMs in the multi-agent setting explicitly encodes required teammate interdependencies, allowing the team-level task to be decomposed into sub-tasks for individual agents.
arXiv Detail & Related papers (2020-07-03T23:08:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.