Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games
- URL: http://arxiv.org/abs/2401.15240v2
- Date: Wed, 1 May 2024 21:52:12 GMT
- Title: Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games
- Authors: Yang Cai, Haipeng Luo, Chen-Yu Wei, Weiqiang Zheng
- Abstract summary: We provide an uncoupled policy optimization algorithm that attains a near-optimal $\tilde{O}(T^{-1})$ convergence rate for computing a correlated equilibrium.
Our algorithm is constructed by combining two main elements: (i) smooth value updates and (ii) the optimistic-follow-the-regularized-leader algorithm with the log barrier regularizer.
- Score: 44.95137108337898
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study policy optimization algorithms for computing correlated equilibria in multi-player general-sum Markov Games. Previous results achieve $O(T^{-1/2})$ convergence rate to a correlated equilibrium and an accelerated $O(T^{-3/4})$ convergence rate to the weaker notion of coarse correlated equilibrium. In this paper, we improve both results significantly by providing an uncoupled policy optimization algorithm that attains a near-optimal $\tilde{O}(T^{-1})$ convergence rate for computing a correlated equilibrium. Our algorithm is constructed by combining two main elements (i) smooth value updates and (ii) the optimistic-follow-the-regularized-leader algorithm with the log barrier regularizer.
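As a rough illustration of element (ii), the sketch below runs optimistic follow-the-regularized-leader (OFTRL) with a log barrier regularizer for a single learner on the probability simplex. It is not the paper's algorithm: it omits the smooth value updates and the multi-player Markov game structure, and it assumes, for illustration only, that the optimistic prediction is the previous loss vector, that the step size `eta` is fixed, and that each update is solved by bisection on the Lagrange multiplier. The function names `log_barrier_argmin` and `oftrl_log_barrier` are hypothetical.

```python
from typing import List, Sequence


def log_barrier_argmin(g: Sequence[float], eta: float, iters: int = 100) -> List[float]:
    """Minimize <x, g> - (1/eta) * sum_i log(x_i) over the probability simplex.

    Stationarity gives x_i = 1 / (eta * (g_i + lam)); we search for the
    multiplier lam that makes the coordinates sum to one.
    """
    n = len(g)
    g_min = min(g)
    lo = -g_min + 1.0 / eta  # here the coordinates sum to at least 1
    hi = -g_min + n / eta    # here the coordinates sum to at most 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        total = sum(1.0 / (eta * (gi + mid)) for gi in g)
        if total > 1.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    x = [1.0 / (eta * (gi + lam)) for gi in g]
    s = sum(x)
    return [xi / s for xi in x]  # renormalize away bisection error


def oftrl_log_barrier(losses: Sequence[Sequence[float]], eta: float) -> List[List[float]]:
    """Play OFTRL against a loss sequence (assumption: prediction = last loss)."""
    n = len(losses[0])
    cumulative = [0.0] * n
    prediction = [0.0] * n
    plays = []
    for loss in losses:
        # Optimistic step: cumulative loss plus a guess for the next loss.
        g = [c + m for c, m in zip(cumulative, prediction)]
        plays.append(log_barrier_argmin(g, eta))
        cumulative = [c + l for c, l in zip(cumulative, loss)]
        prediction = list(loss)
    return plays


if __name__ == "__main__":
    # Action 0 always costs 1, action 1 costs 0: mass should shift to action 1.
    for t, p in enumerate(oftrl_log_barrier([[1.0, 0.0]] * 5, eta=0.5)):
        print(t, [round(v, 3) for v in p])
```

Because the log barrier diverges at the boundary of the simplex, every iterate keeps strictly positive probability on each action, which is one reason this regularizer is a common choice in optimistic no-regret analyses.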
Related papers
- Stability and Generalization of Push-Sum Based Decentralized Optimization over Directed Graphs [55.77845440440496]
Push-based decentralized communication enables optimization over communication networks, where information exchange may be asymmetric.
We develop a unified uniform-stability framework for the Stochastic Gradient Push (SGP) algorithm.
A key technical ingredient is an imbalance-aware generalization bound through two quantities.
arXiv Detail & Related papers (2026-02-24T05:32:03Z)
- Near Optimal Convergence to Coarse Correlated Equilibrium in General-Sum Markov Games [0.0]
No-regret learning dynamics play a central role in game theory, enabling decentralized convergence to equilibrium.
We improve the convergence rate to CCE in general-sum games, reducing it from the previously best-known rate of $\mathcal{O}(\log^5 T / T)$ to a sharper $\mathcal{O}(\log T / T)$.
This matches the best known convergence rate for CE in terms of $T$, the number of iterations, while also improving the dependence on the action set size.
arXiv Detail & Related papers (2025-11-04T00:54:54Z)
- Distributed Optimization via Energy Conservation Laws in Dilated Coordinates [5.35599092568615]
This paper introduces an energy conservation approach for analyzing continuous-time dynamical systems in dilated coordinates.
Convergence rates can be explicitly expressed in terms of the inverse time-dilation factor.
Its accelerated convergence behavior is benchmarked against various state-of-the-art distributed optimization algorithms on practical, large-scale problems.
arXiv Detail & Related papers (2024-09-28T08:02:43Z)
- Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback [75.29048190099523]
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions.
In this paper, we design a fully adaptive OGD algorithm, \textsf{AdaOGD}, that does not require a priori knowledge of these parameters.
arXiv Detail & Related papers (2023-10-21T18:38:13Z)
- Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games [63.60117916422867]
This paper focuses on the most basic setting of competitive multi-agent RL, namely two-player zero-sum Markov games.
We propose a single-loop policy optimization method with symmetric updates from both agents, where the policy is updated via the entropy-regularized optimistic multiplicative weights update (OMWU) method.
Our convergence results improve upon the best known complexities, and lead to a better understanding of policy optimization in competitive Markov games.
arXiv Detail & Related papers (2022-10-03T16:05:43Z)
- Policy Optimization for Markov Games: Unified Framework and Faster Convergence [81.3266426402464]
We show that the state-wise average policy of this algorithm converges to an approximate Nash equilibrium (NE) of the game.
We extend this algorithm to multi-player general-sum Markov Games and show a $\widetilde{\mathcal{O}}(T^{-1/2})$ convergence rate to Coarse Correlated Equilibria (CCE).
arXiv Detail & Related papers (2022-06-06T14:23:13Z)
- Optimal Correlated Equilibria in General-Sum Extensive-Form Games: Fixed-Parameter Algorithms, Hardness, and Two-Sided Column-Generation [78.48747645545944]
We study the problem of finding optimal correlated equilibria of various sorts in extensive-form games.
We introduce a new algorithm for computing optimal equilibria in all three notions.
arXiv Detail & Related papers (2022-03-14T15:21:18Z)
- Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.