Fugu-MT 論文翻訳(概要): Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

論文の概要: Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

arxiv url: http://arxiv.org/abs/2406.10631v1
Date: Sat, 15 Jun 2024 13:26:17 GMT
ステータス: 翻訳完了
システム内更新日: 2024-06-18 23:33:44.132550
Title: Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms
Title（参考訳）: ゲームにおける学習の最終段階の高速収束には、忘れられたアルゴリズムが必要である
Authors: Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng,
Abstract要約: オンライン学習によるセルフプレイは、大規模な2人プレイのゼロサムゲームを解くための重要な方法の1つだ。我々は,OMWUが支払行列のサイズに対数依存するなど,いくつかの利点があることを示した。我々は、過去のことをすぐに忘れない幅広い種類のアルゴリズムが、すべて同じ問題に悩まされていることを証明している。
参考スコア（独自算出の注目度）: 71.73971094342349
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games, both in theory and practice. Particularly popular algorithms include optimistic multiplicative weights update (OMWU) and optimistic gradient-descent-ascent (OGDA). While both algorithms enjoy $O(1/T)$ ergodic convergence to Nash equilibrium in two-player zero-sum games, OMWU offers several advantages including logarithmic dependence on the size of the payoff matrix and $\widetilde{O}(1/T)$ convergence to coarse correlated equilibria even in general-sum games. However, in terms of last-iterate convergence in two-player zero-sum games, an increasingly popular topic in this area, OGDA guarantees that the duality gap shrinks at a rate of $O(1/\sqrt{T})$, while the best existing last-iterate convergence for OMWU depends on some game-dependent constant that could be arbitrarily large. This begs the question: is this potentially slow last-iterate convergence an inherent disadvantage of OMWU, or is the current analysis too loose? Somewhat surprisingly, we show that the former is true. More generally, we prove that a broad class of algorithms that do not forget the past quickly all suffer the same issue: for any arbitrarily small $\delta>0$, there exists a $2\times 2$ matrix game such that the algorithm admits a constant duality gap even after $1/\delta$ rounds. This class of algorithms includes OMWU and other standard optimistic follow-the-regularized-leader algorithms.
Abstract（参考訳）: オンライン学習によるセルフプレイは、理論と実践の両方において、大規模な2プレイヤーゼロサムゲームを解くための重要な方法の1つである。特に一般的なアルゴリズムには、楽観的乗法重み更新 (OMWU) と楽観的勾配降下 (OGDA) がある。どちらのアルゴリズムも、2つのプレイヤーゼロサムゲームにおけるナッシュ均衡へのエルゴード収束を$O(1/T)$ ergodic convergence としているが、OMWUは、ペイオフ行列のサイズに対する対数依存や、一般サムゲームにおいても粗相関平衡への$\widetilde{O}(1/T)$ convergence などの利点がある。しかし、この領域でますます人気が高まっている2つのプレイヤーゼロサムゲームにおいて、OGDAは双対性ギャップが$O(1/\sqrt{T})$で縮まることを保証している。この潜在的に遅い最終段階の収束は、OMWUの固有の不利なのか、それとも現在の分析があまりに緩いのか? 驚くべきことに、私たちは前者が真実であることを示します。より一般に、過去を早く忘れない幅広いアルゴリズムのクラスは、すべて同じ問題に悩まされていることを証明している:任意の任意に小さい$\delta>0$に対して、そのアルゴリズムが1/\delta$ラウンドの後にも一定の双対性ギャップを許容する2.2\times 2$行列ゲームが存在する。このアルゴリズムには、OMWUや他の標準的な楽観的追従型リーダーアルゴリズムが含まれる。

論文の概要: Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

関連論文リスト