Continuous-Time Distributed Dynamic Programming for Networked Multi-Agent Markov Decision Processes
- URL: http://arxiv.org/abs/2307.16706v7
- Date: Thu, 13 Jun 2024 11:39:59 GMT
- Title: Continuous-Time Distributed Dynamic Programming for Networked Multi-Agent Markov Decision Processes
- Authors: Donghwan Lee, Han-Dong Lim, Do Wan Kim
- Abstract summary: This paper investigates continuous-time distributed dynamic programming (DP) algorithms for networked multi-agent Markov decision problems (MAMDPs).
In our study, we adopt a distributed multi-agent framework where individual agents have access only to their own rewards, lacking insights into the rewards of other agents.
- Score: 7.464789724562025
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The main goal of this paper is to investigate continuous-time distributed dynamic programming (DP) algorithms for networked multi-agent Markov decision problems (MAMDPs). In our study, we adopt a distributed multi-agent framework where individual agents have access only to their own rewards, lacking insight into the rewards of other agents. Moreover, each agent can share its parameters with neighboring agents through a communication network represented by a graph. We first introduce a novel distributed DP algorithm, inspired by the distributed optimization method of Wang and Elia. Next, a second distributed DP algorithm is derived through a decoupling process. The convergence of both algorithms is proved from a systems and control perspective. The study in this paper sets the stage for new distributed temporal difference learning algorithms.
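To convey the flavor of such a continuous-time scheme, here is a minimal sketch (not the paper's exact Wang-Elia-style dynamics, which carry additional coupling variables): each agent integrates a Bellman residual built from its own reward only, plus a graph-Laplacian consensus term that pulls its parameters toward its neighbors'. The toy MDP, the line-graph network, and all constants are illustrative assumptions.

```python
import numpy as np

# Toy consensus-coupled policy evaluation (forward-Euler ODE integration).
# Hypothetical setup: shared transitions, private rewards, line-graph network.
rng = np.random.default_rng(0)
n_states, n_agents = 4, 3
gamma, kappa, dt = 0.9, 50.0, 0.005          # discount, coupling gain, step size

P = rng.random((n_states, n_states))
P /= P.sum(axis=1, keepdims=True)            # shared transition matrix
R = rng.random((n_agents, n_states))         # agent i sees only reward R[i]
L = np.array([[1., -1., 0.],
              [-1., 2., -1.],
              [0., -1., 1.]])                # Laplacian of agent 0 -- 1 -- 2

V = np.zeros((n_agents, n_states))           # per-agent value estimates
for _ in range(40_000):
    bellman = R + gamma * V @ P.T - V        # local Bellman residual, own reward only
    V += dt * (bellman - kappa * L @ V)      # V_i' = local DP term + consensus term

# With strong coupling, every V_i approaches (up to an O(1/kappa) gap)
# the value function of the reward averaged across agents.
V_avg = np.linalg.solve(np.eye(n_states) - gamma * P, R.mean(axis=0))
print("max gap to averaged-reward value:", np.max(np.abs(V - V_avg)))
```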
Related papers
- Scalable spectral representations for multi-agent reinforcement learning in network MDPs [13.782868855372774]
Network Markov Decision Processes (MDPs), a popular model for multi-agent control, pose a significant challenge to efficient learning.
We first derive scalable spectral local representations for network MDPs, which induce a network linear subspace for the local $Q$-function of each agent.
We design a scalable algorithmic framework for continuous state-action network MDPs, and provide end-to-end guarantees for the convergence of our algorithm.
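A hedged sketch of the core idea, under the simplifying assumption of a single tabular agent: a truncated SVD of the transition kernel yields spectral features in which a Bellman backup (and hence the $Q$-function) becomes linear. The paper's local, per-agent construction for network MDPs is more elaborate; sizes and rank below are arbitrary choices.

```python
import numpy as np

# Truncated SVD of a tabular transition kernel -> spectral features in which
# a Bellman backup is linear. Sizes and rank are arbitrary assumptions.
rng = np.random.default_rng(1)
nS, nA, rank, gamma = 20, 4, 8, 0.95

P = rng.random((nS * nA, nS)) ** 4           # kernel, one row per (s, a) pair
P /= P.sum(axis=1, keepdims=True)
r = rng.random(nS * nA)                      # reward per (s, a) pair

U, S, Vt = np.linalg.svd(P, full_matrices=False)
phi = U[:, :rank] * S[:rank]                 # spectral features phi(s, a)
mu = Vt[:rank]                               # next-state factors mu(s')

V = rng.random(nS)                           # any value-function estimate
q_exact = r + gamma * P @ V                  # exact Bellman backup
q_linear = r + gamma * phi @ (mu @ V)        # the same backup in the subspace

# The gap shrinks as `rank` grows; at rank = nS the two coincide exactly.
print("rank-%d backup error: %.2e" % (rank, np.max(np.abs(q_exact - q_linear))))
```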
arXiv Detail & Related papers (2024-10-22T17:45:45Z)
- Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax [73.03684002513218]
We enhance Deep InfoMax (DIM) to enable automatic matching of learned representations to a selected prior distribution.
We show that such modification allows for learning uniformly and normally distributed representations.
The results indicate a moderate trade-off between the performance on the downstream tasks and quality of DM.
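A minimal sketch of the noise-injection idea, with a simple moment-matching penalty standing in for the paper's discriminator-based matching; the encoder architecture, noise scale, and prior choice are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Noise injection + prior matching by first/second moments (a stand-in for
# the paper's discriminator). Encoder, noise scale, and prior are assumptions.
torch.manual_seed(0)
enc = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 8))
opt = torch.optim.Adam(enc.parameters(), lr=1e-3)

for step in range(200):
    x = torch.randn(256, 32)                     # stand-in input batch
    z = enc(x)
    z_noisy = z + 0.1 * torch.randn_like(z)      # inject Gaussian noise
    mean_pen = z_noisy.mean(0).pow(2).mean()     # target prior mean: 0
    cov_pen = (torch.cov(z_noisy.T) - torch.eye(8)).pow(2).mean()  # target cov: I
    loss = mean_pen + cov_pen                    # pull representation toward N(0, I)
    opt.zero_grad(); loss.backward(); opt.step()

print("final matching loss: %.4f" % loss.item())
```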
arXiv Detail & Related papers (2024-10-09T15:40:04Z)
- Decentralized Monte Carlo Tree Search for Partially Observable Multi-agent Pathfinding [49.730902939565986]
The Multi-Agent Pathfinding (MAPF) problem involves finding a set of conflict-free paths for a group of agents confined to a graph.
In this study, we focus on the decentralized MAPF setting, where the agents may observe the other agents only locally.
We propose a decentralized multi-agent Monte Carlo Tree Search (MCTS) method for MAPF tasks.
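For intuition, a hedged sketch of the simplest member of the MCTS family, flat UCB over the root actions with random rollouts, on a toy graph; the paper's method builds full per-agent trees under partial observability, and the graph, horizon, and constants here are illustrative assumptions.

```python
import math, random

# Flat UCB over root actions with random rollouts on a toy graph; node 3 is
# the goal. Graph, horizon, and constants are illustrative assumptions.
random.seed(0)
ADJ = {0: [1, 2], 1: [0, 3], 2: [0], 3: [3]}

def rollout(node, horizon=8):
    """Random walk; reward 1 if the goal (node 3) is reached in time."""
    for _ in range(horizon):
        if node == 3:
            return 1.0
        node = random.choice(ADJ[node])
    return 0.0

def flat_ucb(root, iters=2000, c=1.4):
    acts = ADJ[root]
    n, w = [0] * len(acts), [0.0] * len(acts)
    for t in range(1, iters + 1):
        unvisited = [j for j in range(len(acts)) if n[j] == 0]
        j = unvisited[0] if unvisited else max(
            range(len(acts)),
            key=lambda k: w[k] / n[k] + c * math.sqrt(math.log(t) / n[k]))
        w[j] += rollout(acts[j]); n[j] += 1      # simulate, then backpropagate
    return acts[max(range(len(acts)), key=lambda k: n[k])]

print("best first move from node 0:", flat_ucb(0))   # moving to node 1 wins
```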
arXiv Detail & Related papers (2023-12-26T06:57:22Z)
- Decentralised Q-Learning for Multi-Agent Markov Decision Processes with a Satisfiability Criterion [0.0]
We propose a reinforcement learning algorithm to solve a multi-agent Markov decision process (MMDP).
The goal is to lower the time average cost of each agent to below a pre-specified agent-specific bound.
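A hedged single-agent sketch of the satisfiability flavor: the agent acts cost-greedily whenever its running time-average cost exceeds its bound, and is otherwise free to explore. The paper's multi-agent coordination is more involved; the toy MDP, update rule, and constants are assumptions.

```python
import numpy as np

# Satisficing average-cost Q-learning on a toy 2-state MDP (assumed setup).
rng = np.random.default_rng(2)
n_states, n_actions = 2, 2
cost = np.array([[1.0, 3.0], [2.0, 0.5]])      # cost[s, a]
bound, alpha = 1.2, 0.05                       # agent-specific average-cost bound

Q = np.zeros((n_states, n_actions))            # relative cost-to-go estimates
s, avg_cost, t = 0, 0.0, 0
for _ in range(50_000):
    if avg_cost > bound:
        a = int(np.argmin(Q[s]))               # above the bound: act cost-greedily
    else:
        a = int(rng.integers(n_actions))       # satisfied: free to explore
    c = cost[s, a]
    s2 = int(rng.integers(n_states))           # toy uniform transitions
    # average-cost (RVI-style) update using the running average as the gain
    Q[s, a] += alpha * (c - avg_cost + Q[s2].min() - Q[s, a])
    t += 1
    avg_cost += (c - avg_cost) / t             # running time-average cost
    s = s2

print("time-average cost %.3f vs bound %.1f" % (avg_cost, bound))
```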
arXiv Detail & Related papers (2023-11-21T13:56:44Z)
- MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
Diffusion models (DMs) have recently achieved great success in various scenarios, including offline reinforcement learning.
We propose MADiff, a novel generative framework for offline multi-agent learning.
Our experiments show the superior performance of MADiff compared to baseline algorithms in a wide range of multi-agent learning tasks.
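A hedged sketch of the diffusion backbone alone: a toy DDPM-style model trained to denoise flat "joint trajectory" vectors. MADiff additionally couples the agents with attention and conditions on returns; the network, noise schedule, and synthetic data below are simplified assumptions.

```python
import torch
import torch.nn as nn

# Toy DDPM-style denoiser for flat "joint trajectory" vectors (assumed setup).
torch.manual_seed(0)
T, dim = 50, 8
betas = torch.linspace(1e-4, 0.05, T)
alpha_bar = torch.cumprod(1 - betas, dim=0)

net = nn.Sequential(nn.Linear(dim + 1, 64), nn.ReLU(), nn.Linear(64, dim))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(500):
    x0 = torch.randn(128, dim) * 0.1 + 1.0        # stand-in offline trajectories
    t = torch.randint(0, T, (128,))
    eps = torch.randn_like(x0)
    ab = alpha_bar[t].unsqueeze(1)
    xt = ab.sqrt() * x0 + (1 - ab).sqrt() * eps   # forward noising process
    pred = net(torch.cat([xt, t.unsqueeze(1) / T], dim=1))
    loss = (pred - eps).pow(2).mean()             # predict the injected noise
    opt.zero_grad(); loss.backward(); opt.step()

print("final denoising loss: %.4f" % loss.item())
```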
arXiv Detail & Related papers (2023-05-27T02:14:09Z)
- Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space.
Also, we propose a novel framework to adopt the multi-agent options in the MARL process.
We show that the proposed algorithm can effectively capture agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperform prior works that use single-agent options or no options.
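A hedged sketch of the objective alone: Monte Carlo estimates of the expected cover time of a random walk on a small graph, with and without an extra "option" edge, showing why options that shorten cover time aid exploration. The paper learns options over the joint state space of many agents with deep models; the graph and sample sizes here are assumptions.

```python
import numpy as np

# Monte Carlo estimate of random-walk cover time on a 5-node chain, with and
# without one extra "option" edge. Graph and sample sizes are assumptions.
rng = np.random.default_rng(4)

def cover_time(adj, trials=200, cap=10_000):
    """Average steps for a random walk from node 0 to visit every node."""
    total = 0
    for _ in range(trials):
        node, seen, steps = 0, {0}, 0
        while len(seen) < len(adj) and steps < cap:
            node = int(rng.choice(adj[node])); seen.add(node); steps += 1
        total += steps
    return total / trials

chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
with_option = {**chain, 0: [1, 4], 4: [3, 0]}    # option links the chain's ends

print("cover time without option: %.1f" % cover_time(chain))
print("cover time with option:    %.1f" % cover_time(with_option))
```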
arXiv Detail & Related papers (2022-10-07T00:40:59Z)
- Emergence of Theory of Mind Collaboration in Multiagent Systems [65.97255691640561]
We propose an adaptive training algorithm to develop effective collaboration between agents with Theory of Mind (ToM).
We evaluate our algorithm on two games, where it surpasses all previous decentralized-execution algorithms that do not model ToM.
arXiv Detail & Related papers (2021-09-30T23:28:00Z)
- Learning to Coordinate via Multiple Graph Neural Networks [16.226702761758595]
MGAN is a new algorithm that combines graph convolutional networks and value-decomposition methods.
We demonstrate the graph network's representation-learning ability by visualizing its output.
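A hedged sketch of the combination described above: one graph message-passing round to embed the agents, per-agent Q-heads, and a VDN-style additive joint value. MGAN's actual multi-graph architecture and mixing details differ; all shapes and the communication graph here are assumptions.

```python
import torch
import torch.nn as nn

# Graph message passing + per-agent Q-heads + additive value decomposition.
torch.manual_seed(0)
n_agents, obs_dim, n_actions = 3, 10, 4
adj = torch.tensor([[0., 1., 0.],
                    [1., 0., 1.],
                    [0., 1., 0.]])               # line-graph communication
adj_norm = adj / adj.sum(1, keepdim=True)

encode = nn.Linear(obs_dim, 32)
q_head = nn.Linear(32, n_actions)

obs = torch.randn(n_agents, obs_dim)             # stand-in local observations
h = torch.relu(encode(obs))
h = torch.relu(h + adj_norm @ h)                 # aggregate neighbor features
q = q_head(h)                                    # per-agent Q-values
q_tot = q.max(dim=1).values.sum()                # VDN-style additive joint value
print("greedy actions:", q.argmax(1).tolist(), "Q_tot: %.3f" % q_tot.item())
```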
arXiv Detail & Related papers (2021-04-08T04:33:00Z)
- MS*: A New Exact Algorithm for Multi-agent Simultaneous Multi-goal Sequencing and Path Finding [10.354181009277623]
In multi-agent applications such as surveillance and logistics, fleets of mobile agents are often expected to coordinate and safely visit a large number of goal locations.
In this article, we introduce a new algorithm called MS* which computes an optimal solution for this multi-agent problem.
Numerical results show that the new algorithm can solve the multi-agent problem with 20 agents and 50 goals within a minute of CPU time on a standard laptop.
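For contrast with that scalability, a hedged brute-force baseline for the same joint goal-assignment-and-sequencing objective, using Manhattan distances and ignoring inter-agent collisions (both simplifications relative to MS*); the starts and goals are made-up data.

```python
import itertools

# Brute-force goal assignment + sequencing for two agents, minimizing makespan.
def dist(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])   # Manhattan distance on a grid

def tour_len(start, goals):
    """Path length of visiting `goals` in the given order from `start`."""
    total, cur = 0, start
    for g in goals:
        total += dist(cur, g); cur = g
    return total

starts = [(0, 0), (5, 5)]
goals = [(1, 4), (4, 1), (2, 2), (5, 0)]

best_cost, best_plan = float("inf"), None
for labels in itertools.product(range(len(starts)), repeat=len(goals)):
    cost, plan = 0, []
    for i, s in enumerate(starts):
        share = [g for g, l in zip(goals, labels) if l == i]
        order = min(itertools.permutations(share), key=lambda p: tour_len(s, p))
        cost = max(cost, tour_len(s, order))     # makespan = slowest agent
        plan.append(order)
    if cost < best_cost:
        best_cost, best_plan = cost, plan

print("optimal makespan:", best_cost)
print("per-agent goal sequences:", best_plan)
```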
arXiv Detail & Related papers (2021-03-18T01:57:35Z)
- Multi-Agent Decentralized Belief Propagation on Graphs [0.0]
We consider the problem of interactive partially observable Markov decision processes (I-POMDPs).
We propose a decentralized belief propagation algorithm for the problem.
Our work appears to be the first study of a decentralized belief propagation algorithm for networked multi-agent I-POMDPs.
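A hedged sketch of decentralized belief updating in the same spirit: each agent Bayes-updates a local belief over a hidden state from its own noisy observation, then geometrically pools beliefs with its neighbors. The paper's I-POMDP machinery (beliefs over other agents' models) is far richer; the likelihoods, mixing weights, and observation model here are illustrative assumptions.

```python
import numpy as np

# Local Bayes updates + geometric pooling of beliefs over a small network.
rng = np.random.default_rng(3)
n_agents, n_hypotheses, true_state = 3, 4, 2
like = np.full((n_hypotheses, n_hypotheses), 0.1)
np.fill_diagonal(like, 0.7)                      # like[state, obs] = P(obs | state)
W = np.array([[.5, .5, 0.],
              [.25, .5, .25],
              [0., .5, .5]])                     # row-stochastic mixing weights

beliefs = np.full((n_agents, n_hypotheses), 1.0 / n_hypotheses)
for step in range(30):
    for i in range(n_agents):
        # noisy private observation (approximately matching the likelihood)
        obs = true_state if rng.random() < 0.7 else int(rng.integers(n_hypotheses))
        beliefs[i] *= like[:, obs]               # local Bayes update
    beliefs = np.exp(W @ np.log(beliefs))        # geometric pooling with neighbors
    beliefs /= beliefs.sum(axis=1, keepdims=True)

print("agent beliefs in the true state:", np.round(beliefs[:, true_state], 3))
```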
arXiv Detail & Related papers (2020-11-06T18:16:26Z)
- Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
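A hedged sketch of the learned-teammate-model ingredient in isolation: fit an empirical policy from logged teammate behavior and query it wherever a planner's rollouts need the teammate's move. The paper trains neural approximators inside full decentralized MCTS; the logged data and states below are illustrative assumptions.

```python
from collections import Counter, defaultdict
import random

# Empirical teammate-policy model fit from logged (state, action) pairs.
random.seed(0)
log = [("s0", "left"), ("s0", "left"), ("s0", "right"),
       ("s1", "stay"), ("s1", "stay")]             # observed teammate behavior

counts = defaultdict(Counter)
for state, action in log:
    counts[state][action] += 1                     # count-based policy estimate

def teammate_model(state):
    """Sample the teammate's action from the learned empirical policy."""
    actions, weights = zip(*counts[state].items())
    return random.choices(actions, weights=weights)[0]

# Inside a rollout, the planner queries the model instead of the real teammate:
print([teammate_model("s0") for _ in range(5)], teammate_model("s1"))
```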
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.