POGEMA: Partially Observable Grid Environment for Multiple Agents
- URL: http://arxiv.org/abs/2206.10944v1
- Date: Wed, 22 Jun 2022 09:39:50 GMT
- Title: POGEMA: Partially Observable Grid Environment for Multiple Agents
- Authors: Alexey Skrynnik, Anton Andreychuk, Konstantin Yakovlev, Aleksandr I. Panov
- Abstract summary: POGEMA is a sandbox for challenging partially observable multi-agent pathfinding (PO-MAPF) problems.
It can be tailored to a variety of PO-MAPF settings and can serve as an excellent testing ground for planning and learning methods.
- Score: 64.88759709443819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce POGEMA (https://github.com/AIRI-Institute/pogema), a sandbox for
challenging partially observable multi-agent pathfinding (PO-MAPF) problems.
This is a grid-based environment that was specifically designed to be a
flexible, tunable, and scalable benchmark. It can be tailored to a variety of
PO-MAPF settings, making it an excellent testing ground for planning methods,
learning methods, and their combination, and helping to close the gap between
AI planning and learning.
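For concreteness, here is a minimal usage sketch based on the pogema package's public README; the exact API (e.g. the Gymnasium-style five-value step return) may differ across versions:

```python
from pogema import pogema_v0, GridConfig

# A small PO-MAPF instance: 8 agents on a 16x16 grid with 30% obstacle
# density; each agent sees an egocentric window of radius 5.
config = GridConfig(num_agents=8, size=16, density=0.3, seed=1,
                    obs_radius=5, max_episode_steps=128)
env = pogema_v0(grid_config=config)

obs, info = env.reset()
while True:
    # Random policy: one discrete move per agent.
    obs, reward, terminated, truncated, info = env.step(env.sample_actions())
    if all(terminated) or all(truncated):
        break
```

Tuning parameters such as num_agents, density, and obs_radius is what makes the benchmark flexible, tunable, and scalable in practice.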
Related papers
- MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale [46.35418789518417]
Multi-agent pathfinding is a challenging computational problem that typically requires finding collision-free paths for multiple agents in a shared environment.
We have created a foundation model for the MAPF problems called MAPF-GPT.
Using imitation learning, we have trained a policy on a set of sub-optimal expert trajectories that can generate actions in conditions of partial observability.
We show that MAPF-GPT notably outperforms the current best-performing learnable-MAPF solvers on a diverse range of problem instances.
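The summary's core idea, supervised imitation of expert trajectories under partial observability, reduces to behavior cloning. A minimal sketch follows; the network, tensor shapes, and five-action move set are illustrative placeholders, not MAPF-GPT's actual transformer architecture:

```python
import torch
import torch.nn as nn

# Hypothetical data: `observations` are flattened egocentric views of
# expert states, `expert_actions` the actions the expert took in them.
observations = torch.randn(1024, 121)          # e.g. 11x11 local window, flattened
expert_actions = torch.randint(0, 5, (1024,))  # stay/up/down/left/right

policy = nn.Sequential(nn.Linear(121, 256), nn.ReLU(), nn.Linear(256, 5))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    # Behavior cloning: maximize the likelihood of expert actions given
    # partial observations, i.e. a supervised classification loss.
    logits = policy(observations)
    loss = loss_fn(logits, expert_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```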
arXiv Detail & Related papers (2024-08-29T12:55:10Z) - POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation [76.67608003501479]
We introduce and specify an evaluation protocol defining a range of domain-related metrics computed on the basis of the primary evaluation indicators.
The results of such a comparison, which involves a variety of state-of-the-art MARL, search-based, and hybrid methods, are presented.
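As an illustration of such a protocol, the sketch below aggregates two commonly used primary indicators for multi-agent navigation, cooperative and individual success rates, from hypothetical per-episode logs; the benchmark's actual metric definitions are specified in the paper.

```python
from dataclasses import dataclass

@dataclass
class EpisodeResult:
    # Hypothetical per-episode log: which agents reached their goals,
    # and the step at which the episode ended.
    reached_goal: list[bool]
    steps: int

def summarize(results: list[EpisodeResult]) -> dict[str, float]:
    n = len(results)
    # Cooperative success rate: episodes where *every* agent succeeded.
    csr = sum(all(r.reached_goal) for r in results) / n
    # Individual success rate: fraction of agents (over all episodes) that succeeded.
    flags = [f for r in results for f in r.reached_goal]
    isr = sum(flags) / len(flags)
    # Average episode length, a makespan-style indicator.
    avg_steps = sum(r.steps for r in results) / n
    return {"CSR": csr, "ISR": isr, "avg_steps": avg_steps}
```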
arXiv Detail & Related papers (2024-07-20T16:37:21Z) - Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies, and a planning module that uses these inferences to plan the agent's own behavior.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
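Goal inference of the kind the opponent modeling module performs is often formalized as Bayesian inverse planning over candidate goals. The sketch below shows that classic formulation with a Boltzmann-rational action model; HOP's module is learned, so this is only a conceptual stand-in:

```python
import numpy as np

def update_goal_posterior(prior, observed_action, q_values, beta=2.0):
    """One Bayesian update of P(goal | actions) for an observed agent.

    prior           : (G,) probabilities over G candidate goals.
    q_values        : (G, A) hypothetical action values; q_values[g, a] scores
                      action a if the agent were pursuing goal g (e.g. negative
                      shortest-path distance after taking a).
    observed_action : index of the action the agent actually took.
    beta            : rationality temperature of the softmax action model.
    """
    # Likelihood of the observed action under each goal hypothesis,
    # assuming a Boltzmann-rational agent.
    logits = beta * q_values
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    posterior = prior * probs[:, observed_action]
    return posterior / posterior.sum()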
arXiv Detail & Related papers (2024-06-12T08:48:06Z) - Scalable Mechanism Design for Multi-Agent Path Finding [87.40027406028425]
Multi-Agent Path Finding (MAPF) involves determining paths for multiple agents to travel simultaneously and collision-free through a shared area toward given goal locations.
Finding an optimal solution is often computationally infeasible, making the use of approximate, suboptimal algorithms essential.
We introduce the problem of scalable mechanism design for MAPF and propose three strategyproof mechanisms, two of which even use approximate MAPF algorithms.
arXiv Detail & Related papers (2024-01-30T14:26:04Z) - Decentralized Monte Carlo Tree Search for Partially Observable Multi-agent Pathfinding [49.730902939565986]
Multi-Agent Pathfinding problem involves finding a set of conflict-free paths for a group of agents confined to a graph.
In this study, we focus on the decentralized MAPF setting, where the agents may observe the other agents only locally.
We propose a decentralized multi-agent Monte Carlo Tree Search (MCTS) method for MAPF tasks.
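For reference, a generic (centralized, single-agent) UCT skeleton is sketched below; the paper's contribution is a decentralized multi-agent variant of this loop, which the sketch does not reproduce. Here `step`, `is_terminal`, and `reward` are caller-supplied placeholders.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}             # action -> Node
        self.visits, self.value = 0, 0.0

def uct_search(root_state, actions, step, is_terminal, reward,
               n_iters=1000, c=1.4, horizon=50):
    root = Node(root_state)
    for _ in range(n_iters):
        node = root
        # 1. Selection: descend through fully expanded nodes via UCB1.
        while node.children and len(node.children) == len(actions):
            node = max(node.children.values(),
                       key=lambda ch: ch.value / (ch.visits + 1e-9)
                       + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9)))
        # 2. Expansion: try one previously untried action.
        if not is_terminal(node.state):
            a = random.choice([a for a in actions if a not in node.children])
            node.children[a] = Node(step(node.state, a), parent=node)
            node = node.children[a]
        # 3. Simulation: random rollout up to a fixed horizon.
        state, depth = node.state, 0
        while not is_terminal(state) and depth < horizon:
            state, depth = step(state, random.choice(actions)), depth + 1
        value = reward(state)
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits, node.value = node.visits + 1, node.value + value
            node = node.parent
    # Act greedily with respect to visit counts at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```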
arXiv Detail & Related papers (2023-12-26T06:57:22Z) - Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning [17.69984142788365]
Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area.
We investigate how suitable reinforcement learning is for this challenging problem.
We propose a computationally feasible egocentric map representation based on frontiers, and a novel reward term based on total variation.
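The total-variation idea is concrete enough to sketch: on a binary (or fractional) coverage map, anisotropic TV is the sum of absolute differences between adjacent cells, and penalizing its growth discourages ragged, fragmented coverage. The weighting and exact reward formulation below are assumptions, not the paper's:

```python
import numpy as np

def total_variation(coverage: np.ndarray) -> float:
    """Anisotropic total variation of a 2D coverage map: summed absolute
    differences between vertically and horizontally adjacent cells.
    Lower TV means a smoother, less fragmented covered region."""
    dx = np.abs(np.diff(coverage, axis=0)).sum()
    dy = np.abs(np.diff(coverage, axis=1)).sum()
    return float(dx + dy)

def tv_reward(prev_map: np.ndarray, new_map: np.ndarray, weight: float = 0.1) -> float:
    # Hypothetical reward term: penalize growth of the map's total
    # variation after the agent's latest coverage step.
    return -weight * (total_variation(new_map) - total_variation(prev_map))
```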
arXiv Detail & Related papers (2023-06-29T14:32:06Z) - Learning Logic Specifications for Soft Policy Guidance in POMCP [71.69251176275638]
Partially Observable Monte Carlo Planning (POMCP) is an efficient solver for Partially Observable Markov Decision Processes (POMDPs).
POMCP suffers from sparse reward functions, i.e., rewards are obtained only when the final goal is reached.
In this paper, we use inductive logic programming to learn logic specifications from traces of POMCP executions.
arXiv Detail & Related papers (2023-03-16T09:37:10Z) - Multi-Agent Terraforming: Efficient Multi-Agent Path Finding via Environment Manipulation [12.401344261399613]
Multi-agent pathfinding is concerned with planning collision-free paths for a team of agents from their start to goal locations in an environment cluttered with obstacles.
We introduce a new extension of MAPF, which we call Terraforming MAPF (tMAPF), where some agents are responsible for moving obstacles to clear the way for other agents.
We present extensions of two state-of-the-art algorithms, CBS and PBS, in order to tackle tMAPF, and demonstrate that they can consistently outperform the best solution possible under a static-obstacle setting.
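CBS and PBS both revolve around detecting the first collision between single-agent plans and then branching on constraints. A minimal sketch of that conflict check is given below; the constraint branching itself, and the tMAPF handling of movable obstacles, are omitted:

```python
def first_conflict(paths):
    """Find the earliest vertex or edge conflict among agents' paths.

    `paths` maps agent id -> list of grid cells, one per timestep.
    Returns (agent_i, agent_j, timestep, kind) or None.
    """
    agents = list(paths)
    horizon = max(len(p) for p in paths.values())
    pad = lambda p, t: p[min(t, len(p) - 1)]   # agents wait at their goals
    for t in range(horizon):
        for i in range(len(agents)):
            for j in range(i + 1, len(agents)):
                a, b = agents[i], agents[j]
                # Vertex conflict: same cell at the same timestep.
                if pad(paths[a], t) == pad(paths[b], t):
                    return a, b, t, "vertex"
                # Edge conflict: agents swap cells between t-1 and t.
                if t > 0 and pad(paths[a], t) == pad(paths[b], t - 1) \
                        and pad(paths[b], t) == pad(paths[a], t - 1):
                    return a, b, t, "edge"
    return None
```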
arXiv Detail & Related papers (2022-03-20T12:18:35Z) - Explanation Generation for Multi-Modal Multi-Agent Path Finding with Optimal Resource Utilization using Answer Set Programming [1.7132914341329848]
Real-world applications of multi-modal MAPF (mMAPF) require flexibility and explainability.
This paper introduces a method for generating explanations for queries regarding the feasibility and optimality of solutions.
arXiv Detail & Related papers (2020-08-08T18:34:34Z) - Adaptive Informative Path Planning with Multimodal Sensing [36.16721115973077]
We study the problem of adaptive informative path planning with multimodal sensing (AIPPMS; MS for Multimodal Sensing).
We frame AIPPMS as a Partially Observable Markov Decision Process (POMDP) and solve it with online planning.
We evaluate our method on two domains: a simulated search-and-rescue scenario and a challenging extension to the classic RockSample problem.
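Framing a problem as a POMDP means online planning operates on a belief, i.e., a posterior over hidden states. For a discrete model this update is exact and compact, as the sketch below shows; the array-based models are illustrative placeholders, and POMCP-style online planners approximate this posterior with particles rather than computing it exactly:

```python
import numpy as np

def belief_update(belief, action, observation, T, Z):
    """Exact Bayes filter step for a discrete POMDP.

    belief : (S,) current distribution over hidden states.
    T      : (A, S, S) transition model, T[a, s, s'] = P(s' | s, a).
    Z      : (A, S, O) observation model, Z[a, s', o] = P(o | s', a).
    """
    predicted = belief @ T[action]                       # predict: sum_s b(s) P(s'|s,a)
    posterior = predicted * Z[action][:, observation]    # correct with P(o|s',a)
    return posterior / posterior.sum()
```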
arXiv Detail & Related papers (2020-03-21T20:28:57Z)