On the Use and Misuse of Absorbing States in Multi-agent Reinforcement
Learning
- URL: http://arxiv.org/abs/2111.05992v1
- Date: Wed, 10 Nov 2021 23:45:08 GMT
- Title: On the Use and Misuse of Absorbing States in Multi-agent Reinforcement
Learning
- Authors: Andrew Cohen and Ervin Teng and Vincent-Pierre Berges and Ruo-Ping
Dong and Hunter Henry and Marwan Mattar and Alexander Zook and Sujoy Ganguly
- Abstract summary: Current MARL algorithms assume that the number of agents within a group remains fixed throughout an experiment.
In many practical problems, an agent may terminate before their teammates.
We present a novel architecture for an existing state-of-the-art MARL algorithm which uses attention instead of a fully connected layer with absorbing states.
- Score: 55.95253619768565
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The creation and destruction of agents in cooperative multi-agent
reinforcement learning (MARL) is a critically under-explored area of research.
Current MARL algorithms often assume that the number of agents within a group
remains fixed throughout an experiment. However, in many practical problems, an
agent may terminate before their teammates. This early termination issue
presents a challenge: the terminated agent must learn from the group's success
or failure which occurs beyond its own existence. We refer to propagating value
from rewards earned by remaining teammates to terminated agents as the
Posthumous Credit Assignment problem. Current MARL methods handle this problem
by placing these agents in an absorbing state until the entire group of agents
reaches a termination condition. Although absorbing states enable existing
algorithms and APIs to handle terminated agents without modification, they
introduce practical problems in training efficiency and resource use.
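As a rough illustration of the absorbing-state convention (a minimal Python sketch, not the authors' implementation; the observation sizes, the alive/absorbed flag bit, and the helper name are assumptions), terminated agents can be kept in the joint input of a fixed-size fully connected network by substituting a sentinel observation for each of them:

```python
import numpy as np

OBS_DIM = 8      # per-agent observation size (assumed for illustration)
MAX_AGENTS = 4   # fixed group size the fully connected network was built for

# Sentinel "absorbing" observation: zeros plus a flag bit marking the agent as absorbed.
ABSORBING_OBS = np.concatenate([np.zeros(OBS_DIM), [1.0]])

def joint_observation(live_obs: dict) -> np.ndarray:
    """Build a fixed-size joint observation for a fully connected network.

    Agents missing from `live_obs` (i.e. already terminated) are replaced by the
    absorbing sentinel so the concatenated vector always has the same length.
    """
    per_agent = []
    for agent_id in range(MAX_AGENTS):
        if agent_id in live_obs:
            per_agent.append(np.concatenate([live_obs[agent_id], [0.0]]))  # flag 0: alive
        else:
            per_agent.append(ABSORBING_OBS)                                # flag 1: absorbed
    return np.concatenate(per_agent)  # shape: (MAX_AGENTS * (OBS_DIM + 1),)
```

Every absorbed agent still occupies a full slot of the input and continues to be stepped through the learner, which is the source of the training-efficiency and resource-use concerns noted above.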
In this work, we first demonstrate that sample complexity increases with the
quantity of absorbing states in a toy supervised learning task for a fully
connected network, while attention is more robust to variable-size input. Then,
we present a novel architecture for an existing state-of-the-art MARL algorithm
which uses attention instead of a fully connected layer with absorbing states.
Finally, we demonstrate that this novel architecture significantly outperforms
the standard architecture on tasks in which agents are created or destroyed
within episodes as well as standard multi-agent coordination tasks.
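For contrast, here is a minimal sketch of the general idea of attending over a variable-size agent set (written in PyTorch with assumed module names and dimensions; it is not the paper's exact architecture): the critic processes however many agents are currently alive, so no absorbing-state padding is required.

```python
import torch
import torch.nn as nn

class AttentionCritic(nn.Module):
    """Toy centralized critic that attends over a variable number of agent embeddings."""

    def __init__(self, obs_dim: int = 8, embed_dim: int = 32):
        super().__init__()
        self.embed = nn.Linear(obs_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        self.value_head = nn.Linear(embed_dim, 1)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, n_alive_agents, obs_dim); n_alive_agents may differ between calls,
        # so terminated agents are simply omitted instead of being padded with sentinels.
        h = self.embed(obs)
        h, _ = self.attn(h, h, h)       # self-attention over the set of live agents
        pooled = h.mean(dim=1)          # permutation-invariant pooling
        return self.value_head(pooled)  # (batch, 1) value estimate

critic = AttentionCritic()
print(critic(torch.randn(2, 3, 8)).shape)  # 3 live agents -> torch.Size([2, 1])
print(critic(torch.randn(2, 5, 8)).shape)  # 5 live agents -> torch.Size([2, 1])
```

The same module handles three or five live agents without padding or retraining, which is the property the toy experiment and the attention-based architecture described above rely on.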
Related papers
- Deep Multi-Agent Reinforcement Learning for Decentralized Active Hypothesis Testing [11.639503711252663] (arXiv, 2023-09-14T01:18:04Z): We tackle the multi-agent active hypothesis testing (AHT) problem by introducing a novel algorithm rooted in the framework of deep multi-agent reinforcement learning. We present a comprehensive set of experimental results that effectively showcase the agents' ability to learn collaborative strategies and enhance performance.
- MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794] (arXiv, 2023-05-27T02:14:09Z): Diffusion models (DMs) have recently achieved great success in various scenarios, including offline reinforcement learning. We propose MADiff, a novel generative multi-agent learning framework, to tackle this problem. Our experiments show the superior performance of MADiff compared to baseline algorithms in a wide range of multi-agent learning tasks.
- An Algorithm For Adversary Aware Decentralized Networked MARL [0.0] (arXiv, 2023-05-09T16:02:31Z): We identify vulnerabilities in the consensus updates of existing MARL algorithms. We provide an algorithm that allows non-adversarial agents to reach a consensus in the presence of adversaries.
- ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency [65.28061634546577] (arXiv, 2022-11-29T10:22:55Z): Multi-agent reinforcement learning (MARL) suffers from the non-stationarity problem. In this paper, we propose bidirectional action-dependent Q-learning (ACE). ACE outperforms the state-of-the-art algorithms on Google Research Football and the StarCraft Multi-Agent Challenge.
- Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation [6.09506921406322] (arXiv, 2022-11-03T20:02:45Z): We propose a novel architecture for multi-agent reinforcement learning (MARL) which uses local information intelligently to compute paths for all the agents in a decentralized manner. InforMARL aggregates information about the local neighborhood of agents for both the actor and the critic using a graph neural network and can be used in conjunction with any standard MARL algorithm.
- Off-Beat Multi-Agent Reinforcement Learning [62.833358249873704] (arXiv, 2022-05-27T02:21:04Z): We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent. We propose a novel episodic memory, LeGEM, for model-free MARL algorithms. We evaluate LeGEM on various multi-agent scenarios with off-beat actions, including the Stag-Hunter Game, Quarry Game, Afforestation Game, and StarCraft II micromanagement tasks.
- Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944] (arXiv, 2021-07-23T20:06:32Z): We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning. The shared exploration goal is selected from multiple projected state spaces via a normalized entropy-based technique. We demonstrate that CMAE consistently outperforms baselines on various tasks.
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562] (arXiv, 2020-10-06T19:08:47Z): We propose a novel MARL approach called Universal Value Exploration (UneVEn). UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features. Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
- Information State Embedding in Partially Observable Cooperative Multi-Agent Reinforcement Learning [19.617644643147948] (arXiv, 2020-04-02T16:03:42Z): We introduce the concept of an information state embedding that serves to compress agents' histories. We quantify how the compression error influences the resulting value functions for decentralized control. The proposed embed-then-learn pipeline opens the black box of existing (partially observable) MARL algorithms.