Local Optimization Achieves Global Optimality in Multi-Agent
Reinforcement Learning
- URL: http://arxiv.org/abs/2305.04819v1
- Date: Mon, 8 May 2023 16:20:03 GMT
- Title: Local Optimization Achieves Global Optimality in Multi-Agent
Reinforcement Learning
- Authors: Yulai Zhao, Zhuoran Yang, Zhaoran Wang, Jason D. Lee
- Abstract summary: We present a multi-agent PPO algorithm in which the local policy of each agent is updated similarly to vanilla PPO.
We prove that with standard regularity conditions on the Markov game and problem-dependent quantities, our algorithm converges to the globally optimal policy at a sublinear rate.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Policy optimization methods with function approximation are widely used in
multi-agent reinforcement learning. However, it remains elusive how to design
such algorithms with statistical guarantees. Leveraging a multi-agent
performance difference lemma that characterizes the landscape of multi-agent
policy optimization, we find that the localized action value function serves as
an ideal descent direction for each local policy. Motivated by this observation,
we present a multi-agent PPO algorithm in which the local policy of each agent
is updated similarly to vanilla PPO. We prove that with standard regularity
conditions on the Markov game and problem-dependent quantities, our algorithm
converges to the globally optimal policy at a sublinear rate. We extend our
algorithm to the off-policy setting and introduce pessimism into policy
evaluation, which our experiments corroborate. To our knowledge, this is the first
provably convergent multi-agent PPO algorithm in cooperative Markov games.
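The per-agent update the abstract describes, in which each local policy takes a vanilla-PPO-style clipped step along its localized advantage, can be sketched as follows. This is an illustrative tabular sketch under assumed details, not the paper's implementation: the softmax parameterization, the function `ppo_clip_step`, and all hyperparameters are assumptions for the example.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over action logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def ppo_clip_step(theta, actions, advantages, old_probs, clip_eps=0.2, lr=0.1):
    """One clipped-PPO update of a single agent's tabular softmax policy.

    theta      : logits over this agent's local actions (1-D array)
    actions    : sampled local action indices
    advantages : localized advantage estimates for those samples
    old_probs  : probabilities of the sampled actions under the
                 data-collecting (old) policy
    """
    grad = np.zeros_like(theta)
    probs = softmax(theta)
    for a, adv, old_p in zip(actions, advantages, old_probs):
        ratio = probs[a] / old_p
        # As in vanilla PPO, the gradient flows only while the ratio is
        # inside the clip region in the direction the advantage pushes.
        if (adv >= 0 and ratio < 1 + clip_eps) or \
           (adv < 0 and ratio > 1 - clip_eps):
            # d ratio / d theta = ratio * (score of softmax policy),
            # where the score is e_a - probs.
            score = -probs.copy()
            score[a] += 1.0
            grad += adv * ratio * score
    return theta + lr * grad / max(len(actions), 1)
```

In the full multi-agent algorithm each agent would run such a step on its own local policy, with the advantages computed from the localized action value function rather than a joint one.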