Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games
- URL: http://arxiv.org/abs/2205.13746v1
- Date: Fri, 27 May 2022 03:24:12 GMT
- Title: Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games
- Authors: Sihan Zeng, Thinh T. Doan, Justin Romberg
- Abstract summary: We study the problem of finding Nash equilibrium in a two-player zero-sum game.
Our main contribution is to show that under proper choices of the regularization parameter, the gradient descents to the Nash equilibrium of the original unregularized problem.
- Score: 16.09467599829253
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of finding the Nash equilibrium in a two-player zero-sum
Markov game. Due to its formulation as a minimax optimization program, a
natural approach to solve the problem is to perform gradient descent/ascent
with respect to each player in an alternating fashion. However, due to the
non-convexity/non-concavity of the underlying objective function, theoretical
understandings of this method are limited. In our paper, we consider solving an
entropy-regularized variant of the Markov game. The regularization introduces
structure into the optimization landscape that make the solutions more
identifiable and allow the problem to be solved more efficiently. Our main
contribution is to show that under proper choices of the regularization
parameter, the gradient descent ascent algorithm converges to the Nash
equilibrium of the original unregularized problem. We explicitly characterize
the finite-time performance of the last iterate of our algorithm, which vastly
improves over the existing convergence bound of the gradient descent ascent
algorithm without regularization. Finally, we complement the analysis with
numerical simulations that illustrate the accelerated convergence of the
algorithm.
Related papers
Err
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.