Policy Evaluation in Decentralized POMDPs with Belief Sharing
- URL: http://arxiv.org/abs/2302.04151v2
- Date: Tue, 16 May 2023 11:43:37 GMT
- Title: Policy Evaluation in Decentralized POMDPs with Belief Sharing
- Authors: Mert Kayaalp, Fatima Ghadieh, Ali H. Sayed
- Abstract summary: We consider a cooperative policy evaluation task in which agents are not assumed to observe the environment state directly.
We propose a fully decentralized belief forming strategy that relies on individual updates and on localized interactions over a communication network.
- Score: 39.550233049869036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most works on multi-agent reinforcement learning focus on scenarios where the
state of the environment is fully observable. In this work, we consider a
cooperative policy evaluation task in which agents are not assumed to observe
the environment state directly. Instead, agents only have access to noisy
observations and belief vectors. It is well known that finding global
posterior distributions in multi-agent settings is generally NP-hard. As a
remedy, we propose a fully decentralized belief forming strategy that relies on
individual updates and on localized interactions over a communication network.
In addition to the exchange of the beliefs, agents exploit the communication
network by exchanging value function parameter estimates as well. We
analytically show that the proposed strategy allows information to diffuse over
the network, which in turn allows the agents' parameters to have a bounded
difference with a centralized baseline. A multi-sensor target tracking
application is considered in the simulations.
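The strategy sketched in the abstract interleaves local updates with diffusion of beliefs and value-function parameters over the communication network. A minimal sketch of the combine step follows; the network topology, combination matrix, and variable names are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Combination matrix for a hypothetical 3-agent line network
# (doubly stochastic: every row and every column sums to 1).
A = np.array([[0.50, 0.50, 0.00],
              [0.50, 0.25, 0.25],
              [0.00, 0.25, 0.75]])

def combine(estimates, A):
    """Combine step of a diffusion strategy: agent k replaces its estimate
    with a convex combination of its neighbors', weighted by column k of A."""
    return A.T @ estimates

# Repeated combining alone already drives the agents toward consensus;
# interleaving local belief/parameter updates yields the full strategy.
theta = np.array([[1.0], [4.0], [7.0]])   # one scalar parameter per agent
for _ in range(50):
    theta = combine(theta, A)
```

With a doubly stochastic matrix on a connected network, repeated combining converges to the network average, which is what keeps each agent's parameters within a bounded distance of the centralized baseline.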
Related papers
- Zeroth-order Asynchronous Learning with Bounded Delays with a Use-case in Resource Allocation in Communication Networks [12.216015676346032]
This paper focuses on a scenario where agents collaborate toward a unified mission while potentially having distinct tasks.
Within this context, the objective for the agents is to optimize their local parameters based on the aggregate of local reward functions.
This paper presents theoretical convergence analyses and establishes a convergence rate for the proposed approach.
arXiv Detail & Related papers (2023-11-08T11:12:27Z) - Federated Natural Policy Gradient Methods for Multi-task Reinforcement Learning [49.65958529941962]
Federated reinforcement learning (RL) enables collaborative decision making of multiple distributed agents without sharing local data trajectories.
In this work, we consider a multi-task setting, in which each agent has its own private reward function corresponding to different tasks.
We learn a globally optimal policy that maximizes the sum of the discounted total rewards of all the agents in a decentralized manner.
arXiv Detail & Related papers (2023-11-01T00:15:18Z) - Federated Temporal Difference Learning with Linear Function Approximation under Environmental Heterogeneity [44.2308932471393]
We show that, in a low-heterogeneity regime, exchanging model estimates leads to linear convergence speedups in the number of agents.
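The scheme this entry describes pairs local temporal-difference updates with periodic exchange of model estimates. A hedged sketch, assuming TD(0) with linear features and simple averaging as the exchange rule (the paper's exact protocol may differ):

```python
import numpy as np

def td0_step(theta, phi_s, phi_next, reward, gamma=0.9, alpha=0.1):
    """One local TD(0) update with linear value approximation
    V(s) = phi(s)^T theta; td_error is the one-step Bellman residual."""
    td_error = reward + gamma * phi_next @ theta - phi_s @ theta
    return theta + alpha * td_error * phi_s

def exchange(thetas):
    """Model-estimate exchange: each agent adopts the network average."""
    avg = np.mean(thetas, axis=0)
    return [avg.copy() for _ in thetas]
```

Averaging N independent local iterates reduces the gradient-noise variance by roughly 1/N, which is the intuition behind the linear speedup in the number of agents.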
arXiv Detail & Related papers (2023-02-04T17:53:55Z) - Decentralized Multi-agent Filtering [12.02857497237958]
This paper addresses the considerations that come with adopting decentralized communication for multi-agent localization applications in discrete state spaces.
We extend the original formulation of the Bayes filter, a foundational probabilistic tool for discrete state estimation, by appending a step of greedy belief sharing.
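The extension described here appends a belief-sharing step to the standard discrete Bayes filter. A sketch of both pieces; the sharing rule shown is a generic log-linear pool, not the paper's greedy rule, and all names are illustrative:

```python
import numpy as np

def bayes_filter_step(belief, T, likelihood):
    """Discrete Bayes filter: predict with transition matrix T (rows sum
    to 1), correct with the observation likelihood, then renormalize."""
    predicted = T.T @ belief            # predict
    posterior = predicted * likelihood  # correct
    return posterior / posterior.sum()  # normalize

def share_beliefs(beliefs, W):
    """Illustrative belief-sharing step: each agent pools its neighbors'
    beliefs geometrically, with weights taken from the rows of W."""
    log_pool = W @ np.log(np.clip(beliefs, 1e-12, None))
    pooled = np.exp(log_pool)
    return pooled / pooled.sum(axis=1, keepdims=True)
```

Each filtering cycle would run `bayes_filter_step` locally at every agent, then `share_beliefs` once over the network before the next prediction.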
arXiv Detail & Related papers (2023-01-21T02:41:32Z) - Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
This work examines adaptive distributed learning strategies designed to operate under communication constraints.
We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
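One common way to model a communication constraint in such streaming settings is to quantize what agents exchange. A sketch of an adapt-then-combine diffusion step with quantized exchanges; the quantizer, step sizes, and matrix are assumptions for illustration, not the paper's design:

```python
import numpy as np

def quantize(x, step=0.1):
    """Uniform quantizer: a simple stand-in for a communication constraint."""
    return step * np.round(x / step)

def adapt_then_combine(thetas, grads, A, mu=0.05, step=0.1):
    """Diffusion (adapt-then-combine) under quantized communication:
    each agent takes a local gradient step, then combines its neighbors'
    quantized iterates via combination matrix A (columns sum to 1)."""
    psi = thetas - mu * grads          # adapt: local gradient step on streaming data
    return A.T @ quantize(psi, step)   # combine: only quantized values travel
```

The quantization error acts as bounded noise in the combine step, which is why such schemes are typically analyzed for convergence to a neighborhood of the optimum rather than the exact minimizer.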
arXiv Detail & Related papers (2021-12-03T19:23:48Z) - Multi-Agent MDP Homomorphic Networks [100.74260120972863]
In cooperative multi-agent systems, complex symmetries arise between different configurations of the agents and their local observations.
Existing work on symmetries in single agent reinforcement learning can only be generalized to the fully centralized setting.
This paper introduces Multi-Agent MDP Homomorphic Networks, a class of networks that allows distributed execution using only local information.
arXiv Detail & Related papers (2021-10-09T07:46:25Z) - Distributed Q-Learning with State Tracking for Multi-agent Networked Control [61.63442612938345]
This paper studies distributed Q-learning for Linear Quadratic Regulator (LQR) in a multi-agent network.
We devise a state tracking (ST) based Q-learning algorithm to design optimal controllers for agents.
arXiv Detail & Related papers (2020-12-22T22:03:49Z) - Multi-Agent Decentralized Belief Propagation on Graphs [0.0]
We consider the problem of interactive partially observable Markov decision processes (I-POMDPs).
We propose a decentralized belief propagation algorithm for the problem.
Our work appears to be the first study of a decentralized belief propagation algorithm for networked multi-agent I-POMDPs.
arXiv Detail & Related papers (2020-11-06T18:16:26Z) - Multi-Agent Trust Region Policy Optimization [34.91180300856614]
We show that the policy update of TRPO can be transformed into a distributed consensus optimization problem for multi-agent cases.
We propose a decentralized MARL algorithm, which we call multi-agent TRPO (MATRPO)
arXiv Detail & Related papers (2020-10-15T17:49:47Z) - Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion.
We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2020-03-19T16:51:51Z)
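The key structural idea in QMIX is that the joint value is a monotonic function of the per-agent values, so each agent's greedy action stays consistent with the joint greedy action. A minimal sketch of such a mixing function; real QMIX generates the weights with state-conditioned hypernetworks and uses ELU activations, so the fixed weights and ReLU here are simplifying assumptions:

```python
import numpy as np

def monotonic_mix(agent_qs, w1, b1, w2, b2):
    """QMIX-style monotonic mixing sketch: taking absolute values keeps the
    mixing weights nonnegative, which guarantees dQ_tot/dQ_i >= 0."""
    hidden = np.maximum(agent_qs @ np.abs(w1) + b1, 0.0)  # ReLU for brevity
    return hidden @ np.abs(w2) + b2

# Hypothetical fixed mixing weights for a 2-agent example.
w1 = np.array([[1.0, 0.5],
               [0.2, 1.0]])
b1 = np.zeros(2)
w2 = np.array([1.0, 1.0])
b2 = 0.0
q_tot = monotonic_mix(np.array([1.0, 1.0]), w1, b1, w2, b2)
```

Because the mixing is monotone in every agent's Q-value, raising any single agent's value can never lower the joint value, which is what makes decentralized greedy action selection consistent with centralized training.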
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.