Related papers: Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games

Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games

URL: http://arxiv.org/abs/2205.13746v1
Date: Fri, 27 May 2022 03:24:12 GMT
Title: Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games
Authors: Sihan Zeng, Thinh T. Doan, Justin Romberg
Abstract summary: We study the problem of finding Nash equilibrium in a two-player zero-sum game. Our main contribution is to show that under proper choices of the regularization parameter, the gradient descents to the Nash equilibrium of the original unregularized problem.
Score: 16.09467599829253
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the problem of finding the Nash equilibrium in a two-player zero-sum Markov game. Due to its formulation as a minimax optimization program, a natural approach to solve the problem is to perform gradient descent/ascent with respect to each player in an alternating fashion. However, due to the non-convexity/non-concavity of the underlying objective function, theoretical understandings of this method are limited. In our paper, we consider solving an entropy-regularized variant of the Markov game. The regularization introduces structure into the optimization landscape that make the solutions more identifiable and allow the problem to be solved more efficiently. Our main contribution is to show that under proper choices of the regularization parameter, the gradient descent ascent algorithm converges to the Nash equilibrium of the original unregularized problem. We explicitly characterize the finite-time performance of the last iterate of our algorithm, which vastly improves over the existing convergence bound of the gradient descent ascent algorithm without regularization. Finally, we complement the analysis with numerical simulations that illustrate the accelerated convergence of the algorithm.

Related papers

Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm is only accessible to noisy evaluations of the objective function it queries. We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z)
Efficient Low-rank Identification via Accelerated Iteratively Reweighted Nuclear Norm Minimization [8.879403568685499]
We introduce an adaptive updating strategy for smoothing parameters. This behavior transforms the algorithm into one that effectively solves problems after a few iterations. We prove the global proposed experiment, guaranteeing that every iteration is a critical one.
arXiv Detail & Related papers (2024-06-22T02:37:13Z)
Strictly Low Rank Constraint Optimization -- An Asymptotically $\mathcal{O}(\frac{1}{t^2})$ Method [5.770309971945476]
We propose a class of non-text and non-smooth problems with textitrank regularization to promote sparsity in optimal solution. We show that our algorithms are able to achieve a singular convergence of $Ofrac(t2)$, which is exactly same as Nesterov's optimal convergence for first-order methods on smooth convex problems.
arXiv Detail & Related papers (2023-07-04T16:55:41Z)
PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium [58.26573117273626]
We consider the non-AL equilibrium nonconptotic objective function in two-player zero-sum continuous games. Our novel insights into the particle-based algorithms for continuous distribution strategies are presented.
arXiv Detail & Related papers (2023-03-02T05:08:15Z)
First-Order Algorithms for Nonlinear Generalized Nash Equilibrium Problems [88.58409977434269]
We consider the problem of computing an equilibrium in a class of nonlinear generalized Nash equilibrium problems (NGNEPs) Our contribution is to provide two simple first-order algorithmic frameworks based on the quadratic penalty method and the augmented Lagrangian method. We provide nonasymptotic theoretical guarantees for these algorithms.
arXiv Detail & Related papers (2022-04-07T00:11:05Z)
Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization [40.21627891283402]
This paper investigates the problem of computing the equilibrium of competitive games. Motivated by the algorithmic role of entropy regularization, we develop provably efficient extragradient methods.
arXiv Detail & Related papers (2021-05-31T17:51:15Z)
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step. Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
Provably Convergent Working Set Algorithm for Non-Convex Regularized Regression [0.0]
This paper proposes a working set algorithm for non-regular regularizers with convergence guarantees. Our results demonstrate high gain compared to the full problem solver for both block-coordinates or a gradient solver.
arXiv Detail & Related papers (2020-06-24T07:40:31Z)
Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope. Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
Explicit Regularization of Stochastic Gradient Methods through Duality [9.131027490864938]
We propose randomized Dykstra-style algorithms based on randomized dual coordinate ascent. For accelerated coordinate descent, we obtain a new algorithm that has better convergence properties than existing gradient methods in the interpolating regime.
arXiv Detail & Related papers (2020-03-30T20:44:56Z)
Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks. In this paper we analyze a variant of OptimisticOA algorithm for nonconcave minmax problems. Our experiments show that adaptive GAN non-adaptive gradient algorithms can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.