Learning Task Embeddings for Teamwork Adaptation in Multi-Agent
Reinforcement Learning
- URL: http://arxiv.org/abs/2207.02249v2
- Date: Mon, 20 Nov 2023 17:40:06 GMT
- Title: Learning Task Embeddings for Teamwork Adaptation in Multi-Agent
Reinforcement Learning
- Authors: Lukas Schäfer, Filippos Christianos, Amos Storkey, Stefano V.
Albrecht
- Abstract summary: We show that a team of agents is able to adapt to novel tasks when provided with task embeddings.
We propose three MATE training paradigms: independent MATE, centralised MATE, and mixed MATE.
We show that the embeddings learned by MATE identify tasks and provide useful information which agents leverage during adaptation to novel tasks.
- Score: 13.468555224407764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Successful deployment of multi-agent reinforcement learning often requires
agents to adapt their behaviour. In this work, we discuss the problem of
teamwork adaptation in which a team of agents needs to adapt their policies to
solve novel tasks with limited fine-tuning. Motivated by the intuition that
agents need to be able to identify and distinguish tasks in order to adapt
their behaviour to the current task, we propose to learn multi-agent task
embeddings (MATE). These task embeddings are trained using an encoder-decoder
architecture optimised for reconstruction of the transition and reward
functions, which uniquely identify tasks. We show that a team of agents is able
to adapt to novel tasks when provided with task embeddings. We propose three
MATE training paradigms: independent MATE, centralised MATE, and mixed MATE,
which vary in the information used for the task encoding. We show that the
embeddings learned by MATE identify tasks and provide useful information which
agents leverage during adaptation to novel tasks.
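To make the idea concrete, below is a minimal PyTorch sketch of the kind of encoder-decoder described in the abstract: an encoder maps experienced transitions to a task embedding, and a decoder is trained to reconstruct the next observation and reward from that embedding. This is an illustration under assumed names and dimensions, not the authors' implementation; per the abstract, independent MATE would encode only each agent's local observations and actions, centralised MATE the joint information, and mixed MATE a combination of both.

```python
# Illustrative sketch of a MATE-style task encoder-decoder
# (hypothetical names and dimensions; not the authors' code).
import torch
import torch.nn as nn

class TaskEncoder(nn.Module):
    """Maps one transition (obs, act, rew, next_obs) to a task embedding z."""
    def __init__(self, obs_dim, act_dim, embed_dim=16, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim + act_dim + 1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, embed_dim),
        )

    def forward(self, obs, act, rew, next_obs):
        return self.net(torch.cat([obs, act, rew, next_obs], dim=-1))

class TaskDecoder(nn.Module):
    """Reconstructs (next_obs, reward) from (obs, act, z)."""
    def __init__(self, obs_dim, act_dim, embed_dim=16, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, obs_dim + 1),
        )

    def forward(self, obs, act, z):
        out = self.net(torch.cat([obs, act, z], dim=-1))
        return out[..., :-1], out[..., -1:]  # predicted next_obs, reward

def reconstruction_loss(enc, dec, obs, act, rew, next_obs):
    """Train the embedding to reconstruct the transition and reward
    functions, which uniquely identify a task."""
    z = enc(obs, act, rew, next_obs)
    pred_next, pred_rew = dec(obs, act, z)
    return ((pred_next - next_obs) ** 2).mean() + ((pred_rew - rew) ** 2).mean()
```

At adaptation time, the learned embedding would be concatenated to each agent's policy input so the policy can condition its behaviour on the identified task.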
Related papers
- PEMT: Multi-Task Correlation Guided Mixture-of-Experts Enables Parameter-Efficient Transfer Learning [28.353530290015794]
We propose PEMT, a novel parameter-efficient fine-tuning framework based on multi-task transfer learning.
We conduct experiments on a broad range of tasks over 17 datasets.
(arXiv, 2024-02-23)
- Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks [101.40633115037983]
Instruction tuning (IT) achieves impressive zero-shot generalization results by training large language models (LLMs) on a massive amount of diverse tasks with instructions.
How to select new tasks to improve the performance and generalizability of IT models remains an open question.
We propose active instruction tuning, a novel framework that identifies informative tasks based on prompt uncertainty and then actively tunes the models on the selected tasks.
(arXiv, 2023-11-01)
- Language-guided Task Adaptation for Imitation Learning [40.1007184209417]
We introduce a novel setting in which an agent must learn a task from a demonstration of a related task, with the difference between the tasks communicated in natural language.
The proposed setting allows reusing demonstrations from other tasks via low-effort language descriptions, and can also be used to provide feedback that corrects agent errors.
(arXiv, 2023-01-24)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate few-shot task generalization as a reinforcement learning problem in which a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
(arXiv, 2022-05-25)
- LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning [122.47938710284784]
We propose a novel framework for learning dynamic subtask assignment (LDSA) in cooperative MARL.
To reasonably assign agents to different subtasks, we propose an ability-based subtask selection strategy.
We show that LDSA learns reasonable and effective subtask assignment for better collaboration.
(arXiv, 2022-05-05)
- Modular Adaptive Policy Selection for Multi-Task Imitation Learning through Task Division [60.232542918414985]
Multi-task learning often suffers from negative transfer, sharing information that should be task-specific.
Our approach counters this by using proto-policies as modules that divide the tasks into simple sub-behaviours which can be shared.
We also demonstrate the method's ability to autonomously divide the tasks into both shared and task-specific sub-behaviours.
(arXiv, 2022-03-28)
- Multi-Agent Policy Transfer via Task Relationship Modeling [28.421365805638953]
We try to discover and exploit common structures among tasks for more efficient transfer.
We propose to learn effect-based task representations as a common space of tasks, using an alternately fixed training scheme.
As a result, the proposed method can help transfer learned cooperation knowledge to new tasks after training on a few source tasks.
(arXiv, 2022-03-09)
- Behaviour-conditioned policies for cooperative reinforcement learning tasks [41.74498230885008]
In various real-world tasks, an agent needs to cooperate with unknown partner agent types.
Deep reinforcement learning models can be trained to deliver the required functionality but are known to suffer from sample inefficiency and slow learning.
We suggest a method in which we synthetically produce populations of agents with different behavioural patterns, together with ground-truth data of their behaviour.
We additionally suggest an agent architecture that can efficiently use the generated data and gain meta-learning capability.
(arXiv, 2021-10-04)
- Adaptive Procedural Task Generation for Hard-Exploration Problems [78.20918366839399]
We introduce Adaptive Procedural Task Generation (APT-Gen) to facilitate reinforcement learning in hard-exploration problems.
At the heart of our approach is a task generator that learns to create tasks from a parameterized task space via a black-box procedural generation module.
To enable curriculum learning in the absence of a direct indicator of learning progress, we propose to train the task generator by balancing the agent's performance in the generated tasks and the similarity to the target tasks.
(arXiv, 2020-07-01)
- Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling [126.69933134648541]
We present a meta-reinforcement learning algorithm that is efficient and extrapolates well when faced with out-of-distribution tasks at test time.
Our method is based on a simple insight: dynamics models can be adapted efficiently and consistently with off-policy data. A toy sketch of this adapt-then-relabel idea follows this list.
(arXiv, 2020-06-12)
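As referenced in the last entry above, here is a toy sketch of the model-identification idea: adapt a dynamics model to a new task with a few gradient steps on off-policy data, then use the adapted model to relabel past experience. Every name, dimension, and hyperparameter here is hypothetical; this illustrates the general mechanism, not the paper's implementation.

```python
# Toy sketch of "model identification and experience relabeling"
# (hypothetical names and hyperparameters; not the authors' code).
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """Predicts (next_state, reward) from (state, action)."""
    def __init__(self, state_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + act_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, state_dim + 1),
        )

    def forward(self, s, a):
        out = self.net(torch.cat([s, a], dim=-1))
        return out[..., :-1], out[..., -1:]  # next_state, reward

def adapt(model, batch, steps=5, lr=1e-3):
    """Identify the new task: a few gradient steps on its off-policy data."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    s, a, r, s2 = batch
    for _ in range(steps):
        pred_s2, pred_r = model(s, a)
        loss = ((pred_s2 - s2) ** 2).mean() + ((pred_r - r) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

def relabel(model, s, a):
    """Relabel old experience with the adapted model so an off-policy
    learner can reuse it as if it came from the new task."""
    with torch.no_grad():
        s2, r = model(s, a)
    return s, a, r, s2
```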
This list is automatically generated from the titles and abstracts of the papers on this site.