Better Regularization for Sequential Decision Spaces: Fast Convergence
Rates for Nash, Correlated, and Team Equilibria
- URL: http://arxiv.org/abs/2105.12954v1
- Date: Thu, 27 May 2021 06:10:24 GMT
- Title: Better Regularization for Sequential Decision Spaces: Fast Convergence
Rates for Nash, Correlated, and Team Equilibria
- Authors: Gabriele Farina, Christian Kroer, Tuomas Sandholm
- Abstract summary: We study the application of iterative first-order methods to the problem of computing equilibria of large-scale two-player extensive-form games.
By instantiating first-order methods with our regularizers, we develop the first accelerated first-order methods for computing correlated equilibria and ex-ante coordinated team equilibria.
- Score: 121.36609493711292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the application of iterative first-order methods to the problem of
computing equilibria of large-scale two-player extensive-form games.
First-order methods must typically be instantiated with a regularizer that
serves as a distance-generating function for the decision sets of the players.
For the case of two-player zero-sum games, the state-of-the-art theoretical
convergence rate for Nash equilibrium is achieved by using the dilated entropy
function. In this paper, we introduce a new entropy-based distance-generating
function for two-player zero-sum games, and show that this function achieves
significantly better strong convexity properties than the dilated entropy,
while maintaining the same easily-implemented closed-form proximal mapping.
Extensive numerical simulations show that these superior theoretical properties
translate into better numerical performance as well.
We then generalize our new entropy distance function, as well as general
dilated distance functions, to the scaled extension operator. The scaled
extension operator is a way to recursively construct convex sets, which
generalizes the decision polytope of extensive-form games, as well as the
convex polytopes corresponding to correlated and team equilibria. By
instantiating first-order methods with our regularizers, we develop the first
accelerated first-order methods for computing correlated equilibria and ex-ante
coordinated team equilibria. Our methods have a guaranteed $1/T$ rate of
convergence, along with linear-time proximal updates.
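To make the abstract's main objects concrete, here is a hedged sketch of the standard definitions from this line of work (the symbols $\mathcal{J}$, $p_j$, $\alpha_j$, $h$, and the displayed forms are common-usage reconstructions, not verbatim from the paper):

```latex
% Dilated distance-generating function on a sequence-form polytope: each
% information set j has a local regularizer \varphi_j on its simplex,
% scaled by the parent-sequence value x_{p_j} (weights \alpha_j = 1 give
% the "weight-one" case; \varphi_j = negative entropy gives dilated entropy):
\varphi(x) \;=\; \sum_{j \in \mathcal{J}} \alpha_j \, x_{p_j} \,
  \varphi_j\!\left( \frac{x_j}{x_{p_j}} \right).

% Closed-form entropy prox on a simplex \Delta^n, the building block behind
% an "easily-implemented closed-form proximal mapping":
\arg\min_{x \in \Delta^n} \Big\{ \langle g, x \rangle + D_\varphi(x \,\|\, y) \Big\}
  \quad\Longrightarrow\quad x_a \;\propto\; y_a \, e^{-g_a}.

% Scaled extension of convex sets \mathcal{X}, \mathcal{Y} via a nonnegative
% affine map h on \mathcal{X}:
\mathcal{X} \,\triangleleft_h\, \mathcal{Y}
  \;:=\; \big\{ (x, y) \;:\; x \in \mathcal{X},\; y \in h(x)\,\mathcal{Y} \big\}.
```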
Related papers
- On the Optimality of Dilated Entropy and Lower Bounds for Online Learning in Extensive-Form Games [44.861519860614735]
First-order methods are arguably the most scalable algorithms for equilibrium computation in large extensive-form games.
A distance-generating function, acting as a regularizer for the strategy space, must be chosen.
We establish that the weight-one dilated entropy (DilEnt) distance-generating function is optimal up to logarithmic factors.
arXiv Detail & Related papers (2024-10-30T19:03:33Z)
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
- Two-Timescale Q-Learning with Function Approximation in Zero-Sum Stochastic Games [31.554420227087043]
We propose a two-timescale smoothed $Q$-learning algorithm with function approximation that is payoff-based, convergent, rational, and symmetric between the two players.
In two-timescale $Q$-learning, the fast timescale is updated in the spirit of gradient descent, while the slow timescale is updated as a convex combination of its previous value and the latest fast-timescale iterate.
The key novelty lies in constructing a valid Lyapunov function to capture the evolution of the slow timescale.
arXiv Detail & Related papers (2023-12-08T08:39:36Z)
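A minimal sketch of the two-timescale structure summarized in the entry above. The sizes, discount factor, random transition model, and step-size schedules are illustrative assumptions, not the paper's specification:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 3
q_fast = np.zeros((n_states, n_actions))  # fast-timescale Q-estimate
q_slow = np.zeros((n_states, n_actions))  # slow-timescale iterate

for t in range(1, 10_000):
    alpha = 1.0 / t ** 0.6  # fast step size
    beta = 1.0 / t          # slow step size (asymptotically much smaller)

    # Stand-in transition/payoff sample.
    s, a = rng.integers(n_states), rng.integers(n_actions)
    r, s_next = rng.normal(), rng.integers(n_states)

    # Fast timescale: gradient-descent-style temporal-difference step.
    td_error = r + 0.9 * q_slow[s_next].max() - q_fast[s, a]
    q_fast[s, a] += alpha * td_error

    # Slow timescale: convex combination of its previous value and the
    # latest fast-timescale iterate, as described in the entry above.
    q_slow = (1 - beta) * q_slow + beta * q_fast
```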
- Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback [75.29048190099523]
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions.
In this paper, we design a fully adaptive OGD algorithm, AdaOGD, that does not require a priori knowledge of these parameters.
arXiv Detail & Related papers (2023-10-21T18:38:13Z)
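For context on the entry above: a classic OGD step in a strongly monotone setting uses a step size of order $1/(\mu t)$, which requires knowing the monotonicity modulus $\mu$ in advance; removing that requirement is what an adaptive method like AdaOGD targets. A hedged sketch on an assumed toy objective, not the paper's algorithm:

```python
import numpy as np

# Illustrative strongly monotone field: mu * z is the gradient of the toy
# objective (mu / 2) * ||z||^2, so the modulus mu is explicit.
mu = 0.5
x = np.ones(2)

def gradient(z):
    return mu * z

for t in range(1, 1_000):
    eta = 1.0 / (mu * t)       # classic step size; note it needs mu up front
    x = x - eta * gradient(x)  # projection omitted in this unconstrained toy
```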
- Global Nash Equilibrium in Non-convex Multi-player Game: Theory and Algorithms [66.8634598612777]
We show that a Nash equilibrium (NE) is acceptable to all players in a multi-player game, in the sense that no player can benefit by deviating unilaterally.
arXiv Detail & Related papers (2023-01-19T11:36:50Z)
- Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium [157.0902680672422]
We consider learning Nash equilibria in two-player zero-sum Markov Games with nonlinear function approximation.
We propose a novel online learning algorithm to find a Nash equilibrium by minimizing the duality gap.
arXiv Detail & Related papers (2022-08-10T14:21:54Z)
- Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games [16.09467599829253]
We study the problem of finding a Nash equilibrium in a two-player zero-sum Markov game.
Our main contribution is to show that, under proper choices of the regularization parameter, gradient descent ascent converges to the Nash equilibrium of the original unregularized problem.
arXiv Detail & Related papers (2022-05-27T03:24:12Z)
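A hedged sketch of the regularized gradient descent ascent idea summarized in the entry above, on an assumed bilinear zero-sum objective $x^\top A y$ with quadratic regularization (the payoff matrix, step size, and `tau` are illustrative, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))  # illustrative payoff matrix (an assumption)
x, y = np.ones(3) / 3, np.ones(3) / 3
eta, tau = 0.05, 0.1         # step size and regularization strength

for _ in range(5_000):
    # Gradient descent ascent on the regularized objective
    #   x^T A y + (tau/2)||x||^2 - (tau/2)||y||^2.
    gx = A @ y + tau * x    # gradient w.r.t. the min player x
    gy = A.T @ x - tau * y  # gradient w.r.t. the max player y
    x, y = x - eta * gx, y + eta * gy
```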
- Optimal Correlated Equilibria in General-Sum Extensive-Form Games: Fixed-Parameter Algorithms, Hardness, and Two-Sided Column-Generation [78.48747645545944]
We study the problem of finding optimal correlated equilibria of various sorts in extensive-form games.
We introduce a new algorithm for computing optimal equilibria in all three notions.
arXiv Detail & Related papers (2022-03-14T15:21:18Z)
- Provably Convergent Working Set Algorithm for Non-Convex Regularized Regression [0.0]
This paper proposes a working set algorithm for non-convex regularizers with convergence guarantees.
Our results demonstrate substantial gains over solving the full problem, for both block-coordinate and gradient solvers.
arXiv Detail & Related papers (2020-06-24T07:40:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.