Scalable Anytime Planning for Multi-Agent MDPs
- URL: http://arxiv.org/abs/2101.04788v1
- Date: Tue, 12 Jan 2021 22:50:17 GMT
- Title: Scalable Anytime Planning for Multi-Agent MDPs
- Authors: Shushman Choudhury, Jayesh K. Gupta, Peter Morales, Mykel J.
Kochenderfer
- Abstract summary: We present a scalable tree search planning algorithm for large multi-agent sequential decision problems that require dynamic collaboration.
Our algorithm comprises three elements: online planning with Monte Carlo Tree Search (MCTS), factored representations of local agent interactions with coordination graphs, and the iterative Max-Plus method for joint action selection.
- Score: 37.69939216970677
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a scalable tree search planning algorithm for large multi-agent
sequential decision problems that require dynamic collaboration. Teams of
agents need to coordinate decisions in many domains, but naive approaches fail
due to the exponential growth of the joint action space with the number of
agents. We circumvent this complexity through an anytime approach that allows
us to trade computation for approximation quality and also dynamically
coordinate actions. Our algorithm comprises three elements: online planning
with Monte Carlo Tree Search (MCTS), factored representations of local agent
interactions with coordination graphs, and the iterative Max-Plus method for
joint action selection. We evaluate our approach on the benchmark SysAdmin
domain with static coordination graphs and achieve comparable performance with
much lower computation cost than our MCTS baselines. We also introduce a
multi-drone delivery domain with dynamic, i.e., state-dependent coordination
graphs, and demonstrate how our approach scales to large problems on this
domain that are intractable for other MCTS methods. We provide an open-source
implementation of our algorithm at
https://github.com/JuliaPOMDP/FactoredValueMCTS.jl.
Related papers
- Performance-Aware Self-Configurable Multi-Agent Networks: A Distributed Submodular Approach for Simultaneous Coordination and Network Design [3.5527561584422465]
We present AlterNAting COordination and Network-Design Algorithm (Anaconda)
Anaconda is a scalable algorithm that also enjoys near-optimality guarantees.
We demonstrate in simulated scenarios of area monitoring and compare it with a state-of-the-art algorithm.
arXiv Detail & Related papers (2024-09-02T18:11:33Z) - Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z) - Scalable Mechanism Design for Multi-Agent Path Finding [87.40027406028425]
Multi-Agent Path Finding (MAPF) involves determining paths for multiple agents to travel simultaneously and collision-free through a shared area toward given goal locations.
Finding an optimal solution is often computationally infeasible, making the use of approximate, suboptimal algorithms essential.
We introduce the problem of scalable mechanism design for MAPF and propose three strategyproof mechanisms, two of which even use approximate MAPF algorithms.
arXiv Detail & Related papers (2024-01-30T14:26:04Z) - Continuous Monte Carlo Graph Search [61.11769232283621]
Continuous Monte Carlo Graph Search ( CMCGS) is an extension of Monte Carlo Tree Search (MCTS) to online planning.
CMCGS takes advantage of the insight that, during planning, sharing the same action policy between several states can yield high performance.
It can be scaled up through parallelization, and it outperforms the Cross-Entropy Method (CEM) in continuous control with learned dynamics models.
arXiv Detail & Related papers (2022-10-04T07:34:06Z) - Learning Cooperation and Online Planning Through Simulation and Graph
Convolutional Network [5.505634045241288]
We introduce a simulation based online planning algorithm, that we call SiCLOP, for multi-agent cooperative environments.
Specifically, SiCLOP tailors Monte Carlo Tree Search (MCTS) and uses Coordination Graph (CG) and Graph Neural Network (GCN) to learn cooperation.
It also improves scalability through an effective pruning of action space.
arXiv Detail & Related papers (2021-10-16T05:54:32Z) - MACRPO: Multi-Agent Cooperative Recurrent Policy Optimization [17.825845543579195]
We propose a new multi-agent actor-critic method called textitMulti-Agent Cooperative Recurrent Proximal Policy Optimization (MACRPO)
We use a recurrent layer in critic's network architecture and propose a new framework to use a meta-trajectory to train the recurrent layer.
We evaluate our algorithm on three challenging multi-agent environments with continuous and discrete action spaces.
arXiv Detail & Related papers (2021-09-02T12:43:35Z) - Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal
Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z) - Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.