Fugu-MT 論文翻訳(概要): Convergence of Regret Matching in Potential Games and Constrained Optimization

論文の概要: Convergence of Regret Matching in Potential Games and Constrained Optimization

arxiv url: http://arxiv.org/abs/2510.17067v1
Date: Mon, 20 Oct 2025 00:45:47 GMT
ステータス: 翻訳完了
システム内更新日: 2025-10-25 00:56:39.274433
Title: Convergence of Regret Matching in Potential Games and Constrained Optimization
Title（参考訳）: ポテンシャルゲームにおけるレギュレットマッチングの収束と制約付き最適化
Authors: Ioannis Anagnostides, Emanuel Tewolde, Brian Hu Zhang, Ioannis Panageas, Vincent Conitzer, Tuomas Sandholm,
Abstract要約: RM$+$の交互収束は、$O_epsilon (1/epsilon4)$の後に$Epsilon$-KKT点に収束し、それが音で高速な一階数であることを示す。我々の下界は、ポテンシャルゲームにおける粗相関平衡への収束が、ナッシュ平衡への収束よりも指数関数的に速いことを示している。
参考スコア（独自算出の注目度）: 85.55969013318627
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Regret matching (RM} -- and its modern variants -- is a foundational online algorithm that has been at the heart of many AI breakthrough results in solving benchmark zero-sum games, such as poker. Yet, surprisingly little is known so far in theory about its convergence beyond two-player zero-sum games. For example, whether regret matching converges to Nash equilibria in potential games has been an open problem for two decades. Even beyond games, one could try to use RM variants for general constrained optimization problems. Recent empirical evidence suggests that they -- particularly regret matching$^+$ (RM$^+$) -- attain strong performance on benchmark constrained optimization problems, outperforming traditional gradient descent-type algorithms. We show that alternating RM$^+$ converges to an $\epsilon$-KKT point after $O_\epsilon(1/\epsilon^4)$ iterations, establishing for the first time that it is a sound and fast first-order optimizer. Our argument relates the KKT gap to the accumulated regret, two quantities that are entirely disparate in general but interact in an intriguing way in our setting, so much so that when regrets are bounded, our complexity bound improves all the way to $O_\epsilon(1/\epsilon^2)$. From a technical standpoint, while RM$^+$ does not have the usual one-step improvement property in general, we show that it does in a certain region that the algorithm will quickly reach and remain in thereafter. In sharp contrast, our second main result establishes a lower bound: RM, with or without alternation, can take an exponential number of iterations to reach a crude approximate solution even in two-player potential games. This represents the first worst-case separation between RM and RM$^+$. Our lower bound shows that convergence to coarse correlated equilibria in potential games is exponentially faster than convergence to Nash equilibria.
Abstract（参考訳）: レグレトマッチング(RM)は、ポーカーのようなゼロサムゲームのベンチマークを解くための多くのAIブレークスルーの結果の中心にある、基礎的なオンラインアルゴリズムである。しかし、これまでの理論では2人プレイのゼロサムゲームを超えて収束することについては、驚くほどほとんど分かっていない。例えば、潜在的なゲームにおいて、後悔の一致がナッシュ均衡に収束するかどうかは、20年間、未解決の問題であった。ゲームを超えても、一般的な制約付き最適化問題にRMの変種を使おうとすることが出来る。最近の実証的な証拠は、特に後悔する$^+$ (RM$^+$) は、ベンチマーク制約された最適化問題に対して強い性能を示し、従来の勾配降下型アルゴリズムよりも優れていたことを示唆している。 RM$^+$の交互化は、O_\epsilon(1/\epsilon^4)$の反復の後、$\epsilon$-KKTの点に収束し、それが音響的で高速な一階オプティマイザであることを示す。我々の議論は、KKTギャップが蓄積された後悔と、一般的には全く異なるが、我々の設定において興味深い方法で相互作用する2つの量に関係しているので、後悔が束縛されたとき、我々の複雑性境界は$O_\epsilon(1/\epsilon^2)$まで改善される。技術的な見地からすると、RM$^+$は一般的な一段階改善特性を持たないが、アルゴリズムがすぐに到達し、その後に残るような特定の領域で行われることを示す。シャープな対照的に、我々の第2の主結果は低い境界を定めている: RMは、変更の有無に関わらず、指数関数的な回数の反復を要し、2プレーヤのポテンシャルゲームでも粗い近似解に達する。これはRMとRM$^+$の最初の最悪のケース分離である。我々の下界は、ポテンシャルゲームにおける粗相関平衡への収束が、ナッシュ平衡への収束よりも指数関数的に速いことを示している。

論文の概要: Convergence of Regret Matching in Potential Games and Constrained Optimization

関連論文リスト