Evaluating Generalization and Transfer Capacity of Multi-Agent Reinforcement Learning Across Variable Number of Agents
- URL: http://arxiv.org/abs/2111.14177v1
- Date: Sun, 28 Nov 2021 15:29:46 GMT
- Title: Evaluating Generalization and Transfer Capacity of Multi-Agent Reinforcement Learning Across Variable Number of Agents
- Authors: Bengisu Guresti, Nazim Kemal Ure
- Abstract summary: Multi-agent Reinforcement Learning (MARL) problems often require cooperation among agents in order to solve a task.
Centralization and decentralization are two approaches used for cooperation in MARL.
We adopt the centralized training with decentralized execution paradigm and investigate the generalization and transfer capacity of the trained models across a variable number of agents.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-agent Reinforcement Learning (MARL) problems often require
cooperation among agents in order to solve a task. Centralization and
decentralization are two approaches used for cooperation in MARL. While fully
decentralized methods are prone to converging to suboptimal solutions due to
partial observability and nonstationarity, methods involving centralization
suffer from scalability limitations and the lazy agent problem. The
centralized training with decentralized execution paradigm brings out the best
of these two approaches; however, centralized training still has an upper
limit of scalability, not only in acquired coordination performance but also
in model size and training time. In this work, we adopt the centralized
training with decentralized execution paradigm and investigate the
generalization and transfer capacity of the trained models across a variable
number of agents. This capacity is assessed by training models with a variable
number of agents in a specific MARL problem and then performing greedy
evaluations with a variable number of agents for each training configuration.
Thus, we analyze the evaluation performance for each combination of agent
counts at training versus evaluation time, as sketched below. We perform
experimental evaluations in predator-prey and traffic junction environments
and demonstrate that it is possible to obtain similar or higher evaluation
performance by training with fewer agents. We conclude that the optimal number
of agents for training may differ from the target number of agents, and we
argue that transfer across a large number of agents can be a more efficient
way to scale up than directly increasing the number of agents during training.
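To make the evaluation protocol concrete, here is a minimal sketch of the train-versus-evaluate agent-count grid; `train_ctde`, `greedy_eval`, and the specific agent counts are hypothetical placeholders, not the authors' code.

```python
import numpy as np

TRAIN_COUNTS = [2, 4, 8]       # agent counts used during training (assumed)
EVAL_COUNTS = [2, 4, 8, 16]    # agent counts used at evaluation time (assumed)

def train_ctde(n_agents):
    """Stand-in for centralized training with decentralized execution;
    a real version would return decentralized per-agent policies."""
    return {"n_trained": n_agents}

def greedy_eval(policy, n_agents, episodes=100):
    """Stand-in for greedy (argmax-action) evaluation of the learned
    decentralized policy copied across n_agents agents."""
    return float(np.random.rand())  # placeholder for mean episode return

# Evaluation performance for every (training count, evaluation count) pair.
grid = np.zeros((len(TRAIN_COUNTS), len(EVAL_COUNTS)))
for i, n_train in enumerate(TRAIN_COUNTS):
    policy = train_ctde(n_train)
    for j, n_eval in enumerate(EVAL_COUNTS):
        grid[i, j] = greedy_eval(policy, n_eval)

print(grid)  # rows: training agent count; columns: evaluation agent count
```

Reading across a row shows how a model trained with a fixed agent count transfers to other agent counts; reading down a column shows which training count works best for a given target count.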
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
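A minimal sketch of the step-wise credit idea named in this summary, assuming a generic per-step reward model; `step_reward_fn` and all details are hypothetical stand-ins rather than StepAgent's actual method.

```python
def stepwise_returns(trajectory, step_reward_fn, gamma=0.99):
    """trajectory: list of (state, action) pairs.
    step_reward_fn: scores each individual step (assumed to exist, e.g. a
    learned reward model comparing the agent's step to an expert's)."""
    rewards = [step_reward_fn(s, a) for s, a in trajectory]
    returns, g = [], 0.0
    for r in reversed(rewards):        # discounted return-to-go per step
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

# Toy usage: every step gets a dense signal instead of one terminal reward.
traj = [("s0", "a0"), ("s1", "a1"), ("s2", "a2")]
print(stepwise_returns(traj, lambda s, a: 1.0))  # [~2.97, ~1.99, 1.0]
```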
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - Enhancing Heterogeneous Multi-Agent Cooperation in Decentralized MARL via GNN-driven Intrinsic Rewards [1.179778723980276]
Multi-agent Reinforcement Learning (MARL) is emerging as a key framework for sequential decision-making and control tasks.
The deployment of these systems in real-world scenarios often requires decentralized training, a diverse set of agents, and learning from infrequent environmental reward signals.
We propose the CoHet algorithm, which utilizes a novel Graph Neural Network (GNN) based intrinsic motivation to facilitate the learning of heterogeneous agent policies.
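A minimal sketch of a GNN-driven intrinsic reward in this spirit, assuming a curiosity-style bonus equal to a graph network's prediction error on each agent's next observation; the tiny mean-aggregation network and all shapes are illustrative assumptions, not CoHet itself.

```python
import numpy as np

rng = np.random.default_rng(0)
W_msg = rng.normal(scale=0.3, size=(8, 8))   # toy message weights
W_out = rng.normal(scale=0.3, size=(8, 8))   # toy readout weights

def gnn_predict(obs, adj):
    """One round of mean-neighbor message passing plus a linear readout.
    obs: (n_agents, 8) observations; adj: (n_agents, n_agents) 0/1 graph."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1.0)
    messages = (adj @ np.tanh(obs @ W_msg)) / deg  # aggregate neighbor info
    return (obs + messages) @ W_out                # predicted next obs

def intrinsic_rewards(obs, next_obs, adj):
    """Per-agent prediction error used as an intrinsic motivation bonus."""
    return np.linalg.norm(gnn_predict(obs, adj) - next_obs, axis=1)

obs = rng.normal(size=(4, 8))
next_obs = rng.normal(size=(4, 8))
adj = (rng.random((4, 4)) < 0.5).astype(float)
print(intrinsic_rewards(obs, next_obs, adj))  # one bonus per agent
```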
arXiv Detail & Related papers (2024-08-12T21:38:40Z)
- Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization [63.554226552130054]
Generalization poses a significant challenge in Multi-agent Reinforcement Learning (MARL).
The extent to which an agent is influenced by unseen co-players depends on the agent's policy and the specific scenario.
We present the Level of Influence (LoI), a metric quantifying the interaction intensity among agents within a given scenario and environment.
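As a rough illustration only: one could probe interaction intensity by measuring how much an ego agent's return shifts when its co-players are swapped for unseen ones. This return-gap probe is an assumption for illustration; the paper's LoI metric has its own precise definition.

```python
import random
import statistics

def mean_return(ego_policy, coplayers, episodes=50):
    """Hypothetical stand-in for rolling out ego_policy alongside the given
    co-player policies and averaging the ego agent's episode return."""
    random.seed(hash((ego_policy, coplayers)) % 2**32)
    return random.gauss(1.0, 0.1)  # placeholder rollout result

def influence_gap(ego_policy, seen_pool, unseen_pool):
    """Large gap => this scenario couples the ego agent tightly to others."""
    seen = statistics.mean(mean_return(ego_policy, c) for c in seen_pool)
    unseen = statistics.mean(mean_return(ego_policy, c) for c in unseen_pool)
    return abs(seen - unseen)

print(influence_gap("ego", [("a",), ("b",)], [("x",), ("y",)]))
```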
arXiv Detail & Related papers (2023-10-11T06:09:26Z)
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- A Variational Approach to Mutual Information-Based Coordination for Multi-Agent Reinforcement Learning [17.893310647034188]
We propose a new mutual information framework for multi-agent reinforcement learning.
Applying policy iteration to maximize the derived lower bound, we propose a practical algorithm named variational maximum mutual information multi-agent actor-critic.
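For context, a common variational lower bound on mutual information (the Barber-Agakov bound) looks like the following; whether this is the exact bound derived in the paper is an assumption here, and `q` stands in for the learned variational posterior.

```python
import numpy as np

def mi_lower_bound(p_xy, q_x_given_y):
    """Barber-Agakov bound: I(X;Y) >= H(X) + E_{p(x,y)}[log q(x|y)].
    p_xy: joint distribution, shape (|X|, |Y|);
    q_x_given_y: variational posterior, same shape, columns sum to 1."""
    p_x = p_xy.sum(axis=1)
    h_x = -np.sum(p_x * np.log(p_x))              # entropy H(X)
    e_log_q = np.sum(p_xy * np.log(q_x_given_y))  # cross term under p(x,y)
    return h_x + e_log_q

p_xy = np.array([[0.3, 0.1], [0.1, 0.5]])
q = p_xy / p_xy.sum(axis=0, keepdims=True)  # true posterior => bound is tight
print(mi_lower_bound(p_xy, q))              # equals I(X;Y) in this case
```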
arXiv Detail & Related papers (2023-03-01T12:21:30Z)
- Taming Multi-Agent Reinforcement Learning with Estimator Variance Reduction [12.94372063457462]
Centralised training with decentralised execution (CT-DE) serves as the foundation of many leading multi-agent reinforcement learning (MARL) algorithms.
It suffers from a critical drawback due to its reliance on learning from a single sample of the joint-action at a given state.
We propose an enhancement tool that accommodates any actor-critic MARL method.
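One generic way to attack that single-sample drawback, shown purely as an illustrative assumption rather than the paper's actual estimator: average the centralized critic over several resampled joint actions at the same state.

```python
import random

def q_value(state, joint_action):
    """Hypothetical centralized critic."""
    random.seed(hash((state, joint_action)) % 2**32)
    return random.gauss(0.0, 1.0)

def single_sample_estimate(state, my_action, others_policy):
    """What vanilla CT-DE uses: one sampled joint action."""
    return q_value(state, (my_action,) + others_policy(state))

def averaged_estimate(state, my_action, others_policy, k=16):
    """Same expectation, lower variance: average over k joint-action draws."""
    total = sum(q_value(state, (my_action,) + others_policy(state))
                for _ in range(k))
    return total / k

others_policy = lambda s: (random.choice("LR"), random.choice("LR"))
print(single_sample_estimate("s0", "a", others_policy))
print(averaged_estimate("s0", "a", others_policy))
```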
arXiv Detail & Related papers (2022-09-02T13:44:00Z)
- Scalable Multi-Agent Model-Based Reinforcement Learning [1.95804735329484]
We propose a new method called MAMBA which utilizes Model-Based Reinforcement Learning (MBRL) to further leverage centralized training in cooperative environments.
We argue that communication between agents is enough to sustain a world model for each agent during the execution phase, while imaginary rollouts can be used for training, removing the necessity to interact with the environment.
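A minimal sketch of the training-on-imagination idea, with a toy linear world model standing in for the learned one; per-agent models and the communication channel mentioned above are omitted, and everything here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(scale=0.1, size=(4, 4))  # toy learned dynamics matrix
b = rng.normal(scale=0.1, size=4)       # toy action coupling

def world_model_step(state, action):
    """Imagined transition s' = A s + b * a: no environment call needed."""
    return A @ state + b * action

def imagine_rollout(policy, start_state, horizon=10):
    """Generate a trajectory entirely inside the learned model."""
    states, actions = [start_state], []
    for _ in range(horizon):
        a = policy(states[-1])
        actions.append(a)
        states.append(world_model_step(states[-1], a))
    return states, actions  # training data without touching the environment

policy = lambda s: float(np.tanh(s.sum()))
states, actions = imagine_rollout(policy, rng.normal(size=4))
print(len(states), len(actions))  # 11 imagined states, 10 actions
```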
arXiv Detail & Related papers (2022-05-25T08:35:00Z)
- DSDF: An approach to handle stochastic agents in collaborative multi-agent reinforcement learning [0.0]
We show how stochasticity of agents, which could be a result of malfunction or aging of robots, can add to the uncertainty in coordination.
Our solution, DSDF, tunes the discount factor for each agent according to its uncertainty and uses the resulting values to update the utility networks of the individual agents.
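A minimal sketch of that discount-tuning idea, assuming a simple linear schedule from uncertainty to discount; the schedule, bounds, and tabular update below are illustrative assumptions, not the paper's exact procedure.

```python
def tuned_gamma(uncertainty, gamma_max=0.99, gamma_min=0.5):
    """uncertainty in [0, 1]; more uncertain agents look less far ahead."""
    u = min(max(uncertainty, 0.0), 1.0)
    return gamma_max - u * (gamma_max - gamma_min)

def td_update(q, state, action, reward, next_state_value, uncertainty, lr=0.1):
    """Tabular TD update using the agent's own uncertainty-tuned discount."""
    gamma_i = tuned_gamma(uncertainty)
    target = reward + gamma_i * next_state_value
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + lr * (target - old)
    return q

q = td_update({}, "s0", "a0", reward=1.0, next_state_value=0.8, uncertainty=0.7)
print(q)
```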
arXiv Detail & Related papers (2021-09-14T12:02:28Z)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
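For reference, the successor-feature decomposition that UneVEn builds on expresses task rewards as r_w(s, a) = phi(s, a) · w, so one set of successor features yields a Q-value for every task weight vector; the toy shapes and random features below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N_STATES, N_ACTIONS, D = 5, 3, 4
phi = rng.random((N_STATES, N_ACTIONS, D))  # shared reward features
psi = rng.random((N_STATES, N_ACTIONS, D))  # successor features:
                                            # E[sum_t gamma^t phi_t]

def q_for_task(w):
    """Q-values for the task with reward weights w, obtained for free
    from the shared successor features."""
    return psi @ w  # shape (N_STATES, N_ACTIONS)

w_task_a, w_task_b = rng.random(D), rng.random(D)
# Related tasks are handled simultaneously by swapping the weight vector.
print(q_for_task(w_task_a)[0], q_for_task(w_task_b)[0])
```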
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework that can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
- Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic [54.2180984002807]
Multi-agent adversarial inverse reinforcement learning (MA-AIRL) is a recent approach that applies single-agent AIRL to multi-agent problems.
We propose a multi-agent inverse RL algorithm that is more sample-efficient and scalable than previous works.
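To unpack the actor-attention-critic ingredient named in the title: the centralized critic can attend over the other agents' encodings when scoring one agent's state-action pair. The single-head attention below is a generic illustration, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(3)
D = 8
W_q, W_k, W_v = (rng.normal(scale=D ** -0.5, size=(D, D)) for _ in range(3))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(ego_embed, other_embeds):
    """ego_embed: (D,); other_embeds: (n_others, D).
    Returns an attention-weighted summary of the other agents."""
    q = W_q @ ego_embed
    keys, values = other_embeds @ W_k.T, other_embeds @ W_v.T
    weights = softmax(keys @ q / np.sqrt(D))
    return weights @ values  # (D,) context fed into the ego agent's critic

ego, others = rng.normal(size=D), rng.normal(size=(3, D))
print(attend(ego, others).shape)  # context vector consumed by the critic
```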
arXiv Detail & Related papers (2020-02-24T20:30:45Z)