Q-Mixing Network for Multi-Agent Pathfinding in Partially Observable
Grid Environments
- URL: http://arxiv.org/abs/2108.06148v1
- Date: Fri, 13 Aug 2021 09:44:47 GMT
- Title: Q-Mixing Network for Multi-Agent Pathfinding in Partially Observable
Grid Environments
- Authors: Vasilii Davydov, Alexey Skrynnik, Konstantin Yakovlev, Aleksandr I.
Panov
- Abstract summary: We consider the problem of multi-agent navigation in partially observable grid environments.
We suggest a reinforcement learning approach in which the agents first learn policies that map observations to actions and then follow these policies to reach their goals.
- Score: 62.997667081978825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider the problem of multi-agent navigation in partially
observable grid environments. This problem is challenging for centralized
planning approaches as they typically rely on full knowledge of the
environment. We suggest a reinforcement learning approach in which the agents
first learn policies that map observations to actions and then follow these
policies to reach their goals. To tackle the challenge of learning cooperative
behavior, i.e., that in many cases agents need to yield to each other to
accomplish a mission, we use a mixing Q-network that complements the learning
of individual policies. In the experimental evaluation, we show that such an
approach leads to plausible results and scales well to a large number of agents.
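A mixing Q-network of this kind is commonly implemented in the style of QMIX value factorization: a hypernetwork maps the global state to non-negative mixing weights that combine the per-agent Q-values into a joint value. The sketch below is a minimal illustration of that idea only; the names (MixingNetwork, n_agents, state_dim, embed_dim) and layer sizes are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed, not the paper's code) of a QMIX-style mixing network:
# per-agent Q-values are combined into a joint Q_tot using weights generated
# from the global state, kept non-negative to preserve monotonic mixing.
import torch
import torch.nn as nn


class MixingNetwork(nn.Module):
    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.n_agents = n_agents
        self.embed_dim = embed_dim
        # Hypernetworks map the global state to mixing weights and biases.
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(
            nn.Linear(state_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, 1)
        )

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        batch = agent_qs.size(0)
        qs = agent_qs.view(batch, 1, self.n_agents)
        # Absolute values keep dQ_tot/dQ_i >= 0 (monotonic mixing).
        w1 = torch.abs(self.hyper_w1(state)).view(batch, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(batch, 1, self.embed_dim)
        hidden = torch.relu(torch.bmm(qs, w1) + b1)
        w2 = torch.abs(self.hyper_w2(state)).view(batch, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(batch, 1, 1)
        q_tot = torch.bmm(hidden, w2) + b2
        return q_tot.view(batch, 1)


# Example: mixing the Q-values of 4 agents given a 64-dimensional global state.
mixer = MixingNetwork(n_agents=4, state_dim=64)
q_tot = mixer(torch.randn(8, 4), torch.randn(8, 64))
```

The absolute-value constraint on the generated weights keeps the joint value monotone in each agent's Q-value, so decentralized greedy action selection remains consistent with the centrally trained joint value.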
Related papers
- Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent-modeling module that infers others' goals and learns corresponding goal-conditioned policies, and a planning module.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z)
- Joint Intrinsic Motivation for Coordinated Exploration in Multi-Agent Deep Reinforcement Learning [0.0]
We propose an approach for rewarding strategies where agents collectively exhibit novel behaviors.
JIM rewards joint trajectories based on a centralized measure of novelty designed to function in continuous environments.
Results show that joint exploration is crucial for solving tasks where the optimal strategy requires a high level of coordination.
arXiv Detail & Related papers (2024-02-06T13:02:00Z)
- Inverse Factorized Q-Learning for Cooperative Multi-agent Imitation Learning [13.060023718506917]
Imitation learning (IL) is the problem of learning to mimic expert behaviors from demonstrations in cooperative multi-agent systems.
We introduce a novel multi-agent IL algorithm designed to address these challenges.
Our approach enables centralized learning by leveraging mixing networks to aggregate decentralized Q-functions.
arXiv Detail & Related papers (2023-10-10T17:11:20Z)
- Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via Planning and Learning [46.354187895184154]
The Multi-agent Pathfinding (MAPF) problem asks for a set of conflict-free paths for a set of agents confined to a graph.
In this work, we investigate the decentralized MAPF setting, in which the central controller that possesses all the information on the agents' locations and goals is absent.
We focus on the practically important lifelong variant of MAPF, which involves continuously assigning new goals to the agents once they reach their previous ones.
arXiv Detail & Related papers (2023-10-02T13:51:32Z)
- Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL) that combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
- Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning [11.91425153754564]
We show that in environments with a highly multi-modal reward landscape, value decomposition and parameter sharing can be problematic and lead to undesired outcomes.
In contrast, policy gradient (PG) methods with individual policies provably converge to an optimal solution in these cases.
We present practical suggestions on implementing multi-agent PG algorithms for either high rewards or diverse emergent behaviors.
arXiv Detail & Related papers (2022-06-15T13:03:05Z)
- Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944]
We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning.
The goal is selected from multiple projected state spaces via a normalized entropy-based technique.
We demonstrate that CMAE consistently outperforms baselines on various tasks.
arXiv Detail & Related papers (2021-07-23T20:06:32Z)
- Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as Dif-MAML.
We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML objective.
Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
arXiv Detail & Related papers (2020-10-06T16:51:09Z)
- A Decentralized Policy Gradient Approach to Multi-task Reinforcement Learning [13.733491423871383]
We develop a framework for solving multi-task reinforcement learning problems.
The goal is to learn a common policy that operates effectively in different environments.
We highlight two fundamental challenges in MTRL that are not present in its single-task counterpart.
arXiv Detail & Related papers (2020-06-08T03:28:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.