A Novel Point-based Algorithm for Multi-agent Control Using the Common
Information Approach
- URL: http://arxiv.org/abs/2304.04346v1
- Date: Mon, 10 Apr 2023 01:27:43 GMT
- Authors: Dengwang Tang, Ashutosh Nayyar, Rahul Jain
- Abstract summary: We propose a new algorithm for multi-agent control problems, called the coordinator's heuristic search value iteration (CHSVI).
The algorithm combines the CI approach and point-based POMDP algorithms for large action spaces.
We demonstrate the algorithm through optimally solving several benchmark problems.
- Score: 8.733794945008562
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The Common Information (CI) approach provides a systematic way to transform a
multi-agent stochastic control problem to a single-agent partially observed
Markov decision problem (POMDP) called the coordinator's POMDP. However, such a
POMDP can be hard to solve due to its extraordinarily large action space. We
propose a new algorithm for multi-agent stochastic control problems, called
coordinator's heuristic search value iteration (CHSVI), that combines the CI
approach and point-based POMDP algorithms for large action spaces. We
demonstrate the algorithm through optimally solving several benchmark problems.
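The paper's CHSVI implementation is not reproduced here. As an illustration of the point-based POMDP machinery the abstract refers to, the sketch below performs a single point-based (HSVI-style) Bellman backup at one belief point on a made-up two-state POMDP. The model (`T`, `Z`, `R`), its numbers, and the function name `point_backup` are illustrative assumptions, not the paper's; CHSVI additionally handles the coordinator's prescription-valued action space, which this sketch does not attempt.

```python
import numpy as np

# Tiny illustrative POMDP (2 states / 2 actions / 2 observations).
# All numbers are made up for illustration; this is NOT the paper's model.
gamma = 0.95
S, A, O = 2, 2, 2
# T[a, s, s']: transition probability from s to s' under action a
T = np.array([[[0.9, 0.1], [0.1, 0.9]],    # action 0
              [[0.5, 0.5], [0.5, 0.5]]])   # action 1
# Z[a, s', o]: probability of observation o after landing in s' under a
Z = np.array([[[0.85, 0.15], [0.15, 0.85]],
              [[0.5, 0.5], [0.5, 0.5]]])
# R[a, s]: immediate reward
R = np.array([[1.0, -1.0],
              [0.1, 0.1]])

def point_backup(b, Gamma):
    """One point-based Bellman backup at belief b against alpha-vector set Gamma.

    Returns the new alpha vector and its value at b.
    """
    best_alpha, best_val = None, -np.inf
    for a in range(A):
        alpha_a = R[a].copy()
        for o in range(O):
            # M[s, s'] = T[a, s, s'] * Z[a, s', o]: joint transition-observation kernel
            M = T[a] * Z[a][:, o][None, :]
            # Pick the alpha in Gamma that is best at the (unnormalized)
            # successor belief, then fold it back through M.
            scores = [b @ (M @ alpha) for alpha in Gamma]
            alpha_a = alpha_a + gamma * (M @ Gamma[int(np.argmax(scores))])
        val = b @ alpha_a
        if val > best_val:
            best_alpha, best_val = alpha_a, val
    return best_alpha, best_val
```

Starting `Gamma` from the uniform lower bound `R.min() / (1 - gamma)` (as HSVI-style solvers do for their lower-bound set), each backup can only raise the value at the chosen belief point; HSVI's belief-point selection via upper/lower-bound gaps is omitted here.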
Related papers
- Learning to Solve the Min-Max Mixed-Shelves Picker-Routing Problem via Hierarchical and Parallel Decoding [0.3867363075280544]
The Mixed-Shelves Picker Routing Problem (MSPRP) is a fundamental challenge in logistics, where pickers must navigate a mixed-shelves environment to retrieve SKUs efficiently.
We propose a novel hierarchical and parallel decoding approach for solving the min-max variant of the MSPRP via multi-agent reinforcement learning.
Experiments show state-of-the-art performance in both solution quality and inference speed, particularly for large-scale and out-of-distribution instances.
arXiv Detail & Related papers (2025-02-14T15:42:30Z)
- A Multiagent Path Search Algorithm for Large-Scale Coalition Structure Generation [61.08720171136229]
Coalition structure generation is a fundamental computational problem in multiagent systems.
We develop SALDAE, a multiagent path finding algorithm for CSG that operates on a graph of coalition structures.
arXiv Detail & Related papers (2025-02-14T15:21:27Z)
- Exploring Multi-Agent Reinforcement Learning for Unrelated Parallel Machine Scheduling [2.3034630097498883]
The study introduces the Reinforcement Learning environment and conducts empirical analyses.
The experiments employ various deep neural network policies for single- and Multi-Agent approaches.
While Single-Agent algorithms perform adequately in reduced scenarios, Multi-Agent approaches face challenges in cooperative learning but offer scalable capacity.
arXiv Detail & Related papers (2024-11-12T08:27:27Z)
- Decentralised Q-Learning for Multi-Agent Markov Decision Processes with a Satisfiability Criterion [0.0]
We propose a reinforcement learning algorithm to solve a multi-agent Markov decision process (MMDP)
The goal is to lower the time average cost of each agent to below a pre-specified agent-specific bound.
arXiv Detail & Related papers (2023-11-21T13:56:44Z)
- Reinforcement Learning Methods for Wordle: A POMDP/Adaptive Control Approach [0.3093890460224435]
We address the solution of the popular Wordle puzzle, using new reinforcement learning methods.
For the Wordle puzzle, these methods yield on-line solution strategies that are very close to optimal at relatively modest computational cost.
arXiv Detail & Related papers (2022-11-15T03:46:41Z)
- Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new policy gradient algorithm for TMDPs, obtained by a simple extension of the proximal policy optimization (PPO) algorithm.
We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z)
- Emergence of Theory of Mind Collaboration in Multiagent Systems [65.97255691640561]
We propose an adaptive training algorithm to develop effective collaboration between agents with ToM.
We evaluate our algorithm on two games, where it surpasses all previous decentralized-execution algorithms that do not model ToM.
arXiv Detail & Related papers (2021-09-30T23:28:00Z)
- Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach [128.62787284435007]
We propose the mean-field proximal policy optimization (MF-PPO) algorithm, at the core of which is a permutation-invariant actor-critic neural architecture.
We prove that MF-PPO attains the globally optimal policy at a sublinear rate of convergence.
In particular, we show that the inductive bias introduced by the permutation-invariant neural architecture enables MF-PPO to outperform existing competitors.
arXiv Detail & Related papers (2021-05-18T04:35:41Z)
- Planning in Markov Decision Processes with Gap-Dependent Sample Complexity [48.98199700043158]
We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algorithm for planning in a Markov Decision Process.
We prove an upper bound on the number of calls to the generative model needed for MDP-GapE to identify a near-optimal action with high probability.
arXiv Detail & Related papers (2020-06-10T15:05:51Z)
- A Novel Multi-Agent System for Complex Scheduling Problems [2.294014185517203]
This paper presents the conception and implementation of a multi-agent system that is applicable to various problem domains.
We simulate an NP-hard scheduling problem to demonstrate the validity of our approach.
This paper highlights the advantages of the agent-based approach, like the reduction in layout complexity, improved control of complicated systems, and extendability.
arXiv Detail & Related papers (2020-04-20T14:04:58Z)
- Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.