Dimension-Free Rates for Natural Policy Gradient in Multi-Agent
Reinforcement Learning
- URL: http://arxiv.org/abs/2109.11692v1
- Date: Thu, 23 Sep 2021 23:38:15 GMT
- Authors: Carlo Alfano, Patrick Rebeschini
- Abstract summary: We propose a scalable algorithm for cooperative multi-agent reinforcement learning.
We show that our algorithm converges to the globally optimal policy with a dimension-free statistical and computational complexity.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cooperative multi-agent reinforcement learning is a decentralized paradigm in
sequential decision making where agents distributed over a network iteratively
collaborate with neighbors to maximize global (network-wide) notions of
rewards. Exact computations typically involve a complexity that scales
exponentially with the number of agents. To address this curse of
dimensionality, we design a scalable algorithm based on the Natural Policy
Gradient framework that uses local information and only requires agents to
communicate with neighbors within a certain range. Under standard assumptions
on the spatial decay of correlations for the transition dynamics of the
underlying Markov process and the localized learning policy, we show that our
algorithm converges to the globally optimal policy with a dimension-free
statistical and computational complexity, incurring a localization error that
does not depend on the number of agents and converges to zero exponentially
fast as a function of the range of communication.
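
To make the mechanism concrete, below is a minimal illustrative sketch, not the authors' algorithm or code. It assumes a toy stateless cooperative game on a line graph in place of the paper's Markov dynamics, and uses the standard fact that natural policy gradient with tabular softmax policies reduces to a multiplicative-weights update on estimated Q-values. All names and parameters (KAPPA, ETA, local_reward, the graph topology) are hypothetical choices for illustration; the paper's localization error bound is of the flavor "error decays exponentially in the communication range κ", but the precise constants are not given in the abstract.

```python
# Hypothetical sketch of a kappa-localized NPG step for N agents on a line
# graph. A stateless cooperative game stands in for the Markov process; each
# agent estimates a kappa-truncated Q-value from rewards of agents within
# communication range KAPPA only, then applies the multiplicative-weights
# form of the softmax NPG update.
import numpy as np

rng = np.random.default_rng(0)
N, A = 8, 3          # number of agents, actions per agent (toy sizes)
KAPPA = 2            # communication range: agent i sees rewards of |j - i| <= KAPPA
ETA = 0.5            # NPG step size
SAMPLES = 2000       # Monte Carlo samples per iteration

def local_reward(i, a):
    """Agent i's reward depends only on its own and its neighbors' actions,
    mimicking the spatial structure behind the paper's decay assumptions."""
    left = a[i - 1] if i > 0 else a[i]
    right = a[i + 1] if i < N - 1 else a[i]
    return float(a[i] == left) + float(a[i] == right)  # coordinate with neighbors

def npg_step(policies):
    """One localized NPG iteration: estimate kappa-truncated Q-values from
    samples, then apply the multiplicative-weights form of the NPG update."""
    q_hat = np.zeros((N, A))
    counts = np.zeros((N, A))
    for _ in range(SAMPLES):
        a = np.array([rng.choice(A, p=policies[i]) for i in range(N)])
        for i in range(N):
            # kappa-truncated global reward: average only over nearby agents
            nbrs = range(max(0, i - KAPPA), min(N, i + KAPPA + 1))
            r_local = np.mean([local_reward(j, a) for j in nbrs])
            q_hat[i, a[i]] += r_local
            counts[i, a[i]] += 1
    q_hat /= np.maximum(counts, 1)
    # For tabular softmax policies, the NPG update is equivalent to
    # multiplicative weights on the (estimated) Q-values: the action-independent
    # baseline is absorbed by the normalization.
    new = policies * np.exp(ETA * q_hat)
    return new / new.sum(axis=1, keepdims=True)

policies = np.full((N, A), 1.0 / A)  # start from uniform policies
for t in range(20):
    policies = npg_step(policies)
greedy = np.argmax(policies, axis=1)
print("greedy coordination reward per agent:",
      np.mean([local_reward(i, greedy) for i in range(N)]))
```

Note the scaling behavior this sketch is meant to illustrate: each agent's update reads rewards only within distance KAPPA, so the per-agent cost does not grow with N, while the gap to the true (global) Q-value is the localization error that, under the paper's spatial-decay assumptions, vanishes exponentially as KAPPA grows.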