On the Use and Misuse of Absorbing States in Multi-agent Reinforcement
Learning
- URL: http://arxiv.org/abs/2111.05992v1
- Date: Wed, 10 Nov 2021 23:45:08 GMT
- Title: On the Use and Misuse of Absorbing States in Multi-agent Reinforcement
Learning
- Authors: Andrew Cohen and Ervin Teng and Vincent-Pierre Berges and Ruo-Ping
Dong and Hunter Henry and Marwan Mattar and Alexander Zook and Sujoy Ganguly
- Abstract summary: Current MARL algorithms assume that the number of agents within a group remains fixed throughout an experiment.
In many practical problems, an agent may terminate before their teammates.
We present a novel architecture for an existing state-of-the-art MARL algorithm which uses attention instead of a fully connected layer with absorbing states.
- Score: 55.95253619768565
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The creation and destruction of agents in cooperative multi-agent
reinforcement learning (MARL) is a critically under-explored area of research.
Current MARL algorithms often assume that the number of agents within a group
remains fixed throughout an experiment. However, in many practical problems, an
agent may terminate before their teammates. This early termination issue
presents a challenge: the terminated agent must learn from the group's success
or failure which occurs beyond its own existence. We refer to propagating value
from rewards earned by remaining teammates to terminated agents as the
Posthumous Credit Assignment problem. Current MARL methods handle this problem
by placing these agents in an absorbing state until the entire group of agents
reaches a termination condition. Although absorbing states enable existing
algorithms and APIs to handle terminated agents without modification, they
introduce practical problems in training efficiency and resource use.
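As a rough illustration of the absorbing-state convention (a minimal Python sketch, not the authors' implementation; the observation sizes, the alive/absorbed flag bit, and the helper name are assumptions), terminated agents can be kept in the joint input of a fixed-size fully connected network by substituting a sentinel observation for each of them:

```python
import numpy as np

OBS_DIM = 8      # per-agent observation size (assumed for illustration)
MAX_AGENTS = 4   # fixed group size the fully connected network was built for

# Sentinel "absorbing" observation: zeros plus a flag bit marking the agent as absorbed.
ABSORBING_OBS = np.concatenate([np.zeros(OBS_DIM), [1.0]])

def joint_observation(live_obs: dict) -> np.ndarray:
    """Build a fixed-size joint observation for a fully connected network.

    Agents missing from `live_obs` (i.e. already terminated) are replaced by the
    absorbing sentinel so the concatenated vector always has the same length.
    """
    per_agent = []
    for agent_id in range(MAX_AGENTS):
        if agent_id in live_obs:
            per_agent.append(np.concatenate([live_obs[agent_id], [0.0]]))  # flag 0: alive
        else:
            per_agent.append(ABSORBING_OBS)                                # flag 1: absorbed
    return np.concatenate(per_agent)  # shape: (MAX_AGENTS * (OBS_DIM + 1),)
```

Every absorbed agent still occupies a full slot of the input and continues to be stepped through the learner, which is the source of the training-efficiency and resource-use concerns noted above.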
In this work, we first demonstrate that sample complexity increases with the
quantity of absorbing states in a toy supervised learning task for a fully
connected network, while attention is more robust to variable-size input. Then,
we present a novel architecture for an existing state-of-the-art MARL algorithm
which uses attention instead of a fully connected layer with absorbing states.
Finally, we demonstrate that this novel architecture significantly outperforms
the standard architecture on tasks in which agents are created or destroyed
within episodes as well as standard multi-agent coordination tasks.
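For contrast, here is a minimal sketch of the general idea of attending over a variable-size agent set (written in PyTorch with assumed module names and dimensions; it is not the paper's exact architecture): the critic processes however many agents are currently alive, so no absorbing-state padding is required.

```python
import torch
import torch.nn as nn

class AttentionCritic(nn.Module):
    """Toy centralized critic that attends over a variable number of agent embeddings."""

    def __init__(self, obs_dim: int = 8, embed_dim: int = 32):
        super().__init__()
        self.embed = nn.Linear(obs_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        self.value_head = nn.Linear(embed_dim, 1)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, n_alive_agents, obs_dim); n_alive_agents may differ between calls,
        # so terminated agents are simply omitted instead of being padded with sentinels.
        h = self.embed(obs)
        h, _ = self.attn(h, h, h)       # self-attention over the set of live agents
        pooled = h.mean(dim=1)          # permutation-invariant pooling
        return self.value_head(pooled)  # (batch, 1) value estimate

critic = AttentionCritic()
print(critic(torch.randn(2, 3, 8)).shape)  # 3 live agents -> torch.Size([2, 1])
print(critic(torch.randn(2, 5, 8)).shape)  # 5 live agents -> torch.Size([2, 1])
```

The same module handles three or five live agents without padding or retraining, which is the property the toy experiment and the attention-based architecture described above rely on.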
Related papers
- Deep Multi-Agent Reinforcement Learning for Decentralized Active Hypothesis Testing [11.639503711252663] (arXiv, 2023-09-14T01:18:04Z): We tackle the multi-agent active hypothesis testing (AHT) problem by introducing a novel algorithm rooted in the framework of deep multi-agent reinforcement learning. We present a comprehensive set of experimental results that effectively showcase the agents' ability to learn collaborative strategies and enhance performance.
- MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794] (arXiv, 2023-05-27T02:14:09Z): Diffusion models (DMs) have recently achieved great success in various scenarios, including offline reinforcement learning. We propose MADiff, a novel generative multi-agent learning framework, to tackle this problem. Our experiments show the superior performance of MADiff compared to baseline algorithms in a wide range of multi-agent learning tasks.
- An Algorithm For Adversary Aware Decentralized Networked MARL [0.0] (arXiv, 2023-05-09T16:02:31Z): We identify vulnerabilities in the consensus updates of existing MARL algorithms. We provide an algorithm that allows non-adversarial agents to reach a consensus in the presence of adversaries.
- ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency [65.28061634546577] (arXiv, 2022-11-29T10:22:55Z): Multi-agent reinforcement learning (MARL) suffers from the non-stationarity problem. In this paper, we propose bidirectional action-dependent Q-learning (ACE). ACE outperforms the state-of-the-art algorithms on Google Research Football and the StarCraft Multi-Agent Challenge.
- Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation [6.09506921406322] (arXiv, 2022-11-03T20:02:45Z): We propose a novel architecture for multi-agent reinforcement learning (MARL) which uses local information intelligently to compute paths for all the agents in a decentralized manner. InforMARL aggregates information about the local neighborhood of agents for both the actor and the critic using a graph neural network and can be used in conjunction with any standard MARL algorithm.
- Off-Beat Multi-Agent Reinforcement Learning [62.833358249873704] (arXiv, 2022-05-27T02:21:04Z): We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent. We propose a novel episodic memory, LeGEM, for model-free MARL algorithms. We evaluate LeGEM on various multi-agent scenarios with off-beat actions, including the Stag-Hunter Game, Quarry Game, Afforestation Game, and StarCraft II micromanagement tasks.
- Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944] (arXiv, 2021-07-23T20:06:32Z): We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning. The shared exploration goal is selected from multiple projected state spaces via a normalized entropy-based technique. We demonstrate that CMAE consistently outperforms baselines on various tasks.
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562] (arXiv, 2020-10-06T19:08:47Z): We propose a novel MARL approach called Universal Value Exploration (UneVEn). UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features. Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
- Information State Embedding in Partially Observable Cooperative Multi-Agent Reinforcement Learning [19.617644643147948] (arXiv, 2020-04-02T16:03:42Z): We introduce the concept of an information state embedding that serves to compress agents' histories. We quantify how the compression error influences the resulting value functions for decentralized control. The proposed embed-then-learn pipeline opens the black box of existing (partially observable) MARL algorithms.