Networked Communication for Mean-Field Games with Function Approximation and Empirical Mean-Field Estimation
- URL: http://arxiv.org/abs/2408.11607v1
- Date: Wed, 21 Aug 2024 13:32:46 GMT
- Title: Networked Communication for Mean-Field Games with Function Approximation and Empirical Mean-Field Estimation
- Authors: Patrick Benjamin, Alessandro Abate
- Abstract summary: Decentralised agents can learn equilibria in Mean-Field Games from a single, non-episodic run of the empirical system.
We introduce function approximation to the existing setting, drawing on the Munchausen Online Mirror Descent method.
We additionally provide new algorithms that allow agents to estimate the global empirical distribution based on a local neighbourhood.
- Score: 59.01527054553122
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent works have provided algorithms by which decentralised agents, which may be connected via a communication network, can learn equilibria in Mean-Field Games from a single, non-episodic run of the empirical system. However, these algorithms are given for tabular settings: this computationally limits the size of players' observation space, meaning that the algorithms are not able to handle anything but small state spaces, nor to generalise beyond policies depending on the ego player's state to so-called 'population-dependent' policies. We address this limitation by introducing function approximation to the existing setting, drawing on the Munchausen Online Mirror Descent method that has previously been employed only in finite-horizon, episodic, centralised settings. While this permits us to include the population's mean-field distribution in the observation for each player's policy, it is arguably unrealistic to assume that decentralised agents would have access to this global information: we therefore additionally provide new algorithms that allow agents to estimate the global empirical distribution based on a local neighbourhood, and to improve this estimate via communication over a given network. Our experiments showcase how the communication network allows decentralised agents to estimate the mean-field distribution for population-dependent policies, and that exchanging policy information helps networked agents to outperform both independent and even centralised agents in function-approximation settings, by an even greater margin than in tabular settings.
Related papers
- Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically identical agents.
Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z) - Distributed Policy Gradient for Linear Quadratic Networked Control with Limited Communication Range [23.500806437272487]
We show that the exact gradient can be approximated using only local information.
Compared with the centralized optimal controller, the performance gap decreases to zero exponentially as the communication and control ranges increase.
arXiv Detail & Related papers (2024-03-05T15:38:54Z) - Distributed Online Rollout for Multivehicle Routing in Unmapped Environments [0.8437187555622164]
We present a fully distributed, online, and scalable reinforcement learning algorithm for the well-known multivehicle routing problem.
Agents self-organize into local clusters and independently apply a multiagent rollout scheme locally to each cluster.
Our algorithm achieves approximately a factor of two cost improvement over the base policy for a range of radii bounded from below and above by two and three times the critical sensing radius, respectively.
arXiv Detail & Related papers (2023-05-24T22:06:44Z) - Policy Evaluation in Decentralized POMDPs with Belief Sharing [39.550233049869036]
We consider a cooperative policy evaluation task in which agents are not assumed to observe the environment state directly.
We propose a fully decentralized belief forming strategy that relies on individual updates and on localized interactions over a communication network.
arXiv Detail & Related papers (2023-02-08T15:54:15Z) - Multi-Agent MDP Homomorphic Networks [100.74260120972863]
In cooperative multi-agent systems, complex symmetries arise between different configurations of the agents and their local observations.
Existing work on symmetries in single-agent reinforcement learning can only be generalized to the fully centralized setting.
This paper introduces Multi-Agent MDP Homomorphic Networks, a class of networks that allows distributed execution using only local information.
arXiv Detail & Related papers (2021-10-09T07:46:25Z) - Dimension-Free Rates for Natural Policy Gradient in Multi-Agent Reinforcement Learning [22.310861786709538]
We propose a scalable algorithm for cooperative multi-agent reinforcement learning.
We show that our algorithm converges to the globally optimal policy with a dimension-free statistical and computational complexity.
arXiv Detail & Related papers (2021-09-23T23:38:15Z) - Learning Connectivity for Data Distribution in Robot Teams [96.39864514115136]
We propose a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNNs).
Our approach enables multi-agent algorithms based on global state information to function by ensuring it is available at each robot.
We train the distributed GNN communication policies via reinforcement learning using the average Age of Information as the reward function and show that it improves training stability compared to task-specific reward functions.
arXiv Detail & Related papers (2021-03-08T21:48:55Z) - An Online Learning Approach to Interpolation and Extrapolation in Domain Generalization [53.592597682854944]
We recast generalization over sub-groups as an online game between a player minimizing risk and an adversary presenting new test distributions.
We show that ERM is provably minimax-optimal for both tasks.
arXiv Detail & Related papers (2021-02-25T19:06:48Z) - Cooperative Multi-Agent Reinforcement Learning with Partial Observations [16.895704973433382]
We propose a distributed zeroth-order policy optimization method for Multi-Agent Reinforcement Learning (MARL).
It allows the agents to compute the local policy gradients needed to update their local policy functions using local estimates of the global accumulated rewards.
We show that the proposed distributed zeroth-order policy optimization method with constant stepsize converges to the neighborhood of a policy that is a stationary point of the global objective function.
arXiv Detail & Related papers (2020-06-18T19:36:22Z) - Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)