Policy Evaluation in Decentralized POMDPs with Belief Sharing
- URL: http://arxiv.org/abs/2302.04151v2
- Date: Tue, 16 May 2023 11:43:37 GMT
- Title: Policy Evaluation in Decentralized POMDPs with Belief Sharing
- Authors: Mert Kayaalp, Fatima Ghadieh, Ali H. Sayed
- Abstract summary: We consider a cooperative policy evaluation task in which agents are not assumed to observe the environment state directly.
We propose a fully decentralized belief forming strategy that relies on individual updates and on localized interactions over a communication network.
- Score: 39.550233049869036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most works on multi-agent reinforcement learning focus on scenarios where the
state of the environment is fully observable. In this work, we consider a
cooperative policy evaluation task in which agents are not assumed to observe
the environment state directly. Instead, agents only have access to noisy
observations and belief vectors. It is well known that finding global
posterior distributions in multi-agent settings is generally NP-hard. As a
remedy, we propose a fully decentralized belief forming strategy that relies on
individual updates and on localized interactions over a communication network.
In addition to the exchange of the beliefs, agents exploit the communication
network by exchanging value function parameter estimates as well. We
analytically show that the proposed strategy allows information to diffuse over
the network, which in turn allows the agents' parameters to have a bounded
difference with a centralized baseline. A multi-sensor target tracking
application is considered in the simulations.
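The strategy sketched in the abstract interleaves local updates with diffusion of beliefs and value-function parameters over the communication network. A minimal sketch of the combine step follows; the network topology, combination matrix, and variable names are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Combination matrix for a hypothetical 3-agent line network
# (doubly stochastic: every row and every column sums to 1).
A = np.array([[0.50, 0.50, 0.00],
              [0.50, 0.25, 0.25],
              [0.00, 0.25, 0.75]])

def combine(estimates, A):
    """Combine step of a diffusion strategy: agent k replaces its estimate
    with a convex combination of its neighbors', weighted by column k of A."""
    return A.T @ estimates

# Repeated combining alone already drives the agents toward consensus;
# interleaving local belief/parameter updates yields the full strategy.
theta = np.array([[1.0], [4.0], [7.0]])   # one scalar parameter per agent
for _ in range(50):
    theta = combine(theta, A)
```

With a doubly stochastic matrix on a connected network, repeated combining converges to the network average, which is what keeps each agent's parameters within a bounded distance of the centralized baseline.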
Related papers
- Zeroth-order Asynchronous Learning with Bounded Delays with a Use-case in Resource Allocation in Communication Networks [12.216015676346032]
This paper focuses on a scenario where agents collaborate toward a unified mission while potentially having distinct tasks.
Within this context, the objective for the agents is to optimize their local parameters based on the aggregate of local reward functions.
This paper presents theoretical convergence analyses and establishes a convergence rate for the proposed approach.
arXiv Detail & Related papers (2023-11-08T11:12:27Z) - Federated Natural Policy Gradient Methods for Multi-task Reinforcement Learning [49.65958529941962]
Federated reinforcement learning (RL) enables collaborative decision making of multiple distributed agents without sharing local data trajectories.
In this work, we consider a multi-task setting, in which each agent has its own private reward function corresponding to different tasks.
We learn a globally optimal policy that maximizes the sum of the discounted total rewards of all the agents in a decentralized manner.
arXiv Detail & Related papers (2023-11-01T00:15:18Z) - Federated Temporal Difference Learning with Linear Function Approximation under Environmental Heterogeneity [44.2308932471393]
We show that, in a low-heterogeneity regime, exchanging model estimates leads to linear convergence speedups in the number of agents.
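The scheme this entry describes pairs local temporal-difference updates with periodic exchange of model estimates. A hedged sketch, assuming TD(0) with linear features and simple averaging as the exchange rule (the paper's exact protocol may differ):

```python
import numpy as np

def td0_step(theta, phi_s, phi_next, reward, gamma=0.9, alpha=0.1):
    """One local TD(0) update with linear value approximation
    V(s) = phi(s)^T theta; td_error is the one-step Bellman residual."""
    td_error = reward + gamma * phi_next @ theta - phi_s @ theta
    return theta + alpha * td_error * phi_s

def exchange(thetas):
    """Model-estimate exchange: each agent adopts the network average."""
    avg = np.mean(thetas, axis=0)
    return [avg.copy() for _ in thetas]
```

Averaging N independent local iterates reduces the gradient-noise variance by roughly 1/N, which is the intuition behind the linear speedup in the number of agents.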
arXiv Detail & Related papers (2023-02-04T17:53:55Z) - Decentralized Multi-agent Filtering [12.02857497237958]
This paper addresses the considerations that come with adopting decentralized communication for multi-agent localization applications in discrete state spaces.
We extend the original formulation of the Bayes filter, a foundational probabilistic tool for discrete state estimation, by appending a step of greedy belief sharing.
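The extension described here appends a belief-sharing step to the standard discrete Bayes filter. A sketch of both pieces; the sharing rule shown is a generic log-linear pool, not the paper's greedy rule, and all names are illustrative:

```python
import numpy as np

def bayes_filter_step(belief, T, likelihood):
    """Discrete Bayes filter: predict with transition matrix T (rows sum
    to 1), correct with the observation likelihood, then renormalize."""
    predicted = T.T @ belief            # predict
    posterior = predicted * likelihood  # correct
    return posterior / posterior.sum()  # normalize

def share_beliefs(beliefs, W):
    """Illustrative belief-sharing step: each agent pools its neighbors'
    beliefs geometrically, with weights taken from the rows of W."""
    log_pool = W @ np.log(np.clip(beliefs, 1e-12, None))
    pooled = np.exp(log_pool)
    return pooled / pooled.sum(axis=1, keepdims=True)
```

Each filtering cycle would run `bayes_filter_step` locally at every agent, then `share_beliefs` once over the network before the next prediction.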
arXiv Detail & Related papers (2023-01-21T02:41:32Z) - Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
This work examines adaptive distributed learning strategies designed to operate under communication constraints.
We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
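One common way to model a communication constraint in such streaming settings is to quantize what agents exchange. A sketch of an adapt-then-combine diffusion step with quantized exchanges; the quantizer, step sizes, and matrix are assumptions for illustration, not the paper's design:

```python
import numpy as np

def quantize(x, step=0.1):
    """Uniform quantizer: a simple stand-in for a communication constraint."""
    return step * np.round(x / step)

def adapt_then_combine(thetas, grads, A, mu=0.05, step=0.1):
    """Diffusion (adapt-then-combine) under quantized communication:
    each agent takes a local gradient step, then combines its neighbors'
    quantized iterates via combination matrix A (columns sum to 1)."""
    psi = thetas - mu * grads          # adapt: local gradient step on streaming data
    return A.T @ quantize(psi, step)   # combine: only quantized values travel
```

The quantization error acts as bounded noise in the combine step, which is why such schemes are typically analyzed for convergence to a neighborhood of the optimum rather than the exact minimizer.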
arXiv Detail & Related papers (2021-12-03T19:23:48Z) - Multi-Agent MDP Homomorphic Networks [100.74260120972863]
In cooperative multi-agent systems, complex symmetries arise between different configurations of the agents and their local observations.
Existing work on symmetries in single agent reinforcement learning can only be generalized to the fully centralized setting.
This paper introduces Multi-Agent MDP Homomorphic Networks, a class of networks that allows distributed execution using only local information.
arXiv Detail & Related papers (2021-10-09T07:46:25Z) - Distributed Q-Learning with State Tracking for Multi-agent Networked Control [61.63442612938345]
This paper studies distributed Q-learning for Linear Quadratic Regulator (LQR) in a multi-agent network.
We devise a state tracking (ST) based Q-learning algorithm to design optimal controllers for agents.
arXiv Detail & Related papers (2020-12-22T22:03:49Z) - Multi-Agent Decentralized Belief Propagation on Graphs [0.0]
We consider the problem of interactive partially observable Markov decision processes (I-POMDPs).
We propose a decentralized belief propagation algorithm for the problem.
Our work appears to be the first study of a decentralized belief propagation algorithm for networked multi-agent I-POMDPs.
arXiv Detail & Related papers (2020-11-06T18:16:26Z) - Multi-Agent Trust Region Policy Optimization [34.91180300856614]
We show that the policy update of TRPO can be transformed into a distributed consensus optimization problem for multi-agent cases.
We propose a decentralized MARL algorithm, which we call multi-agent TRPO (MATRPO)
arXiv Detail & Related papers (2020-10-15T17:49:47Z) - Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion.
We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2020-03-19T16:51:51Z)
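The key structural idea in QMIX is that the joint value is a monotonic function of the per-agent values, so each agent's greedy action stays consistent with the joint greedy action. A minimal sketch of such a mixing function; real QMIX generates the weights with state-conditioned hypernetworks and uses ELU activations, so the fixed weights and ReLU here are simplifying assumptions:

```python
import numpy as np

def monotonic_mix(agent_qs, w1, b1, w2, b2):
    """QMIX-style monotonic mixing sketch: taking absolute values keeps the
    mixing weights nonnegative, which guarantees dQ_tot/dQ_i >= 0."""
    hidden = np.maximum(agent_qs @ np.abs(w1) + b1, 0.0)  # ReLU for brevity
    return hidden @ np.abs(w2) + b2

# Hypothetical fixed mixing weights for a 2-agent example.
w1 = np.array([[1.0, 0.5],
               [0.2, 1.0]])
b1 = np.zeros(2)
w2 = np.array([1.0, 1.0])
b2 = 0.0
q_tot = monotonic_mix(np.array([1.0, 1.0]), w1, b1, w2, b2)
```

Because the mixing is monotone in every agent's Q-value, raising any single agent's value can never lower the joint value, which is what makes decentralized greedy action selection consistent with centralized training.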
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.