Policy Evaluation in Decentralized POMDPs with Belief Sharing
- URL: http://arxiv.org/abs/2302.04151v2
- Date: Tue, 16 May 2023 11:43:37 GMT
- Title: Policy Evaluation in Decentralized POMDPs with Belief Sharing
- Authors: Mert Kayaalp, Fatima Ghadieh, Ali H. Sayed
- Abstract summary: We consider a cooperative policy evaluation task in which agents are not assumed to observe the environment state directly.
We propose a fully decentralized belief forming strategy that relies on individual updates and on localized interactions over a communication network.
- Score: 39.550233049869036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most works on multi-agent reinforcement learning focus on scenarios where the state of the environment is fully observable. In this work, we consider a cooperative policy evaluation task in which agents are not assumed to observe the environment state directly. Instead, agents only have access to noisy observations and to belief vectors. It is well known that finding global posterior distributions under multi-agent settings is generally NP-hard. As a remedy, we propose a fully decentralized belief-forming strategy that relies on individual updates and on localized interactions over a communication network. In addition to exchanging beliefs, agents exploit the communication network by exchanging value function parameter estimates as well. We show analytically that the proposed strategy allows information to diffuse over the network, which in turn keeps the agents' parameters within a bounded difference of a centralized baseline. The simulations consider a multi-sensor target tracking application.
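As a rough illustration of the update pattern the abstract describes, here is a minimal sketch in which each agent performs a local Bayesian belief update, pools beliefs with its neighbors, and runs a TD(0) step on linear value-function parameters followed by a neighborhood combine step. The geometric pooling rule, the learning rates, and all function names are assumptions for illustration; the paper's exact update equations may differ.

```python
import numpy as np

def local_bayes_update(belief, transition, likelihood):
    """Local belief update: propagate through the transition model,
    then reweight by this agent's observation likelihood."""
    predicted = transition.T @ belief        # time update (prediction)
    posterior = likelihood * predicted       # measurement update
    return posterior / posterior.sum()

def pool_beliefs(beliefs, weights):
    """Geometric (log-linear) pooling of own and neighbors' beliefs.
    The pooling rule is an assumption here; it is a common choice in
    decentralized social learning."""
    log_pool = sum(w * np.log(b + 1e-12) for w, b in zip(weights, beliefs))
    pooled = np.exp(log_pool)
    return pooled / pooled.sum()

def diffusion_td_step(theta, neighbor_thetas, weights, feat, feat_next,
                      reward, gamma=0.95, lr=0.05):
    """TD(0) update on linear value parameters, followed by a combine
    step over the neighborhood. `neighbor_thetas` holds the adapted
    iterates received from neighbors over the network."""
    td_error = reward + gamma * feat_next @ theta - feat @ theta
    adapted = theta + lr * td_error * feat   # adapt: local TD step
    return sum(w * t for w, t in zip(weights, [adapted] + neighbor_thetas))
```

The adapt-then-combine ordering mirrors diffusion strategies in distributed learning, where each agent first incorporates its own data and only then averages with its neighbors.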
Related papers
- Networked Communication for Mean-Field Games with Function Approximation and Empirical Mean-Field Estimation [59.01527054553122]
Decentralised agents can learn equilibria in Mean-Field Games from a single, non-episodic run of the empirical system.
We introduce function approximation to the existing setting, drawing on the Munchausen Online Mirror Descent method.
We additionally provide new algorithms that allow agents to estimate the global empirical distribution based on a local neighbourhood.
arXiv Detail & Related papers (2024-08-21T13:32:46Z) - Federated Temporal Difference Learning with Linear Function Approximation under Environmental Heterogeneity [44.2308932471393]
In a low-heterogeneity regime, exchanging model estimates leads to linear convergence speedups in the number of agents.
arXiv Detail & Related papers (2023-02-04T17:53:55Z) - Decentralized Multi-agent Filtering [12.02857497237958]
This paper addresses the considerations that come along with adopting decentralized communication for multi-agent localization applications in discrete state spaces.
We extend the original formulation of the Bayes filter, a foundational probabilistic tool for discrete state estimation, by appending a greedy belief-sharing step (a toy sketch of this pattern follows the list below).
arXiv Detail & Related papers (2023-01-21T02:41:32Z) - Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
This work examines adaptive distributed learning strategies designed to operate under communication constraints.
We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
arXiv Detail & Related papers (2021-12-03T19:23:48Z) - Multi-Agent MDP Homomorphic Networks [100.74260120972863]
In cooperative multi-agent systems, complex symmetries arise between different configurations of the agents and their local observations.
Existing work on symmetries in single-agent reinforcement learning can only be generalized to the fully centralized setting.
This paper introduces Multi-Agent MDP Homomorphic Networks, a class of networks that allows distributed execution using only local information.
arXiv Detail & Related papers (2021-10-09T07:46:25Z) - Distributed Q-Learning with State Tracking for Multi-agent Networked Control [61.63442612938345]
This paper studies distributed Q-learning for Linear Quadratic Regulator (LQR) in a multi-agent network.
We devise a state tracking (ST) based Q-learning algorithm to design optimal controllers for agents.
arXiv Detail & Related papers (2020-12-22T22:03:49Z) - Multi-Agent Decentralized Belief Propagation on Graphs [0.0]
We consider the problem of interactive partially observable Markov decision processes (I-POMDPs).
We propose a decentralized belief propagation algorithm for this problem.
Our work appears to be the first study of a decentralized belief propagation algorithm for networked multi-agent I-POMDPs.
arXiv Detail & Related papers (2020-11-06T18:16:26Z) - Multi-Agent Trust Region Policy Optimization [34.91180300856614]
We show that the policy update of TRPO can be transformed into a distributed consensus optimization problem for multi-agent cases.
We propose a decentralized MARL algorithm, which we call multi-agent TRPO (MATRPO).
arXiv Detail & Related papers (2020-10-15T17:49:47Z) - Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion.
We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2020-03-19T16:51:51Z)
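The "Decentralized Multi-agent Filtering" entry above extends the discrete Bayes filter with a greedy belief-sharing step. Below is a toy sketch of that pattern; the greedy rule used here (merge with the lowest-entropy neighbor belief) and all function names are assumptions for illustration, not that paper's exact procedure.

```python
import numpy as np

def bayes_filter_step(belief, transition, likelihood):
    """Standard discrete Bayes filter: predict, then correct."""
    predicted = transition.T @ belief        # prediction through dynamics
    posterior = likelihood * predicted       # correction by observation
    return posterior / posterior.sum()

def entropy(p):
    """Shannon entropy, used as a simple confidence proxy."""
    return -np.sum(p * np.log(p + 1e-12))

def greedy_belief_share(own_belief, neighbor_beliefs):
    """Greedily merge with the single most confident (lowest-entropy)
    neighbor belief via a normalized elementwise product; the selection
    rule is an assumption for illustration."""
    best = min(neighbor_beliefs, key=entropy)
    merged = own_belief * best
    return merged / merged.sum()
```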
This list is automatically generated from the titles and abstracts of the papers listed on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.