Fugu-MT Paper Translation (Abstract): On the Optimality of Dilated Entropy and Lower Bounds for Online Learning in Extensive-Form Games

Paper summary: On the Optimality of Dilated Entropy and Lower Bounds for Online Learning in Extensive-Form Games

arxiv url: http://arxiv.org/abs/2410.23398v1
Date: Wed, 30 Oct 2024 19:03:33 GMT
Status: Translation complete
Last updated in system: 2024-11-28 17:07:42.547927
Title: On the Optimality of Dilated Entropy and Lower Bounds for Online Learning in Extensive-Form Games
Title (reference translation): On the Optimality of Dilated Entropy and Lower Bounds in Online Learning
Authors: Zhiyuan Fan, Christian Kroer, Gabriele Farina
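Abstract summary: First-order methods are arguably the most scalable equilibrium-computation algorithms for large extensive-form games. Using them requires choosing a distance-generating function that acts as a regularizer for the strategy space. The paper shows that the weight-one dilated entropy (DilEnt) distance-generating function is optimal up to logarithmic factors.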
Reference score (independently computed attention score): 44.861519860614735
License: http://creativecommons.org/licenses/by/4.0/
Abstract: First-order methods (FOMs) are arguably the most scalable algorithms for equilibrium computation in large extensive-form games. To operationalize these methods, a distance-generating function, acting as a regularizer for the strategy space, must be chosen. The ratio between the strong convexity modulus and the diameter of the regularizer is a key parameter in the analysis of FOMs. A natural question is then: what is the optimal distance-generating function for extensive-form decision spaces? In this paper, we make a number of contributions, ultimately establishing that the weight-one dilated entropy (DilEnt) distance-generating function is optimal up to logarithmic factors. The DilEnt regularizer is notable due to its iterate-equivalence with Kernelized OMWU (KOMWU) -- the algorithm with state-of-the-art dependence on the game tree size in extensive-form games -- when used in conjunction with the online mirror descent (OMD) algorithm. However, the standard analysis for OMD is unable to establish such a result; the only current analysis is by appealing to the iterate equivalence to KOMWU. We close this gap by introducing a pair of primal-dual treeplex norms, which we contend form the natural analytic viewpoint for studying the strong convexity of DilEnt. Using these norm pairs, we recover the diameter-to-strong-convexity ratio that predicts the same performance as KOMWU. Along with a new regret lower bound for online learning in sequence-form strategy spaces, we show that this ratio is nearly optimal. Finally, we showcase our analytic techniques by refining the analysis of Clairvoyant OMD when paired with DilEnt, establishing an $\mathcal{O}(n \log |\mathcal{V}| \log T/T)$ approximation rate to coarse correlated equilibrium in $n$-player games, where $|\mathcal{V}|$ is the number of reduced normal-form strategies of the players, establishing the new state of the art.
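Abstract (reference translation): First-order methods (FOMs) are arguably the most scalable algorithms for equilibrium computation in large extensive-form games. To put these methods into practice, one must choose a distance-generating function that acts as a regularizer for the strategy space. The ratio between the strong convexity modulus and the diameter of the regularizer is a key parameter in the analysis of FOMs. What, then, is the optimal distance-generating function for extensive-form decision spaces? This paper shows that the weight-one dilated entropy (DilEnt) distance-generating function is optimal up to logarithmic factors. The DilEnt regularizer is notable for its iterate equivalence, when used with the online mirror descent (OMD) algorithm, with Kernelized OMWU (KOMWU), the algorithm with state-of-the-art dependence on the game tree size in extensive-form games. However, the standard analysis of OMD cannot establish such a result, and the only current analysis appeals to the iterate equivalence with KOMWU. We close this gap by introducing a pair of primal-dual treeplex norms as the natural analytic viewpoint for studying the strong convexity of DilEnt. Using this pair of norms, we recover a diameter-to-strong-convexity ratio that predicts the same performance as KOMWU. Together with a new regret lower bound for online learning in sequence-form strategy spaces, we show that this ratio is nearly optimal. Finally, we refine the analysis of Clairvoyant OMD when paired with DilEnt, establishing an $\mathcal{O}(n \log |\mathcal{V}| \log T/T)$ approximation rate to coarse correlated equilibrium in $n$-player games, where $|\mathcal{V}|$ is the number of reduced normal-form strategies of the players, which sets a new state of the art.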
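As a rough illustration of the OMD-plus-entropy pairing the abstract refers to, the sketch below runs online mirror descent with the negative-entropy regularizer on a single probability simplex. This is not the paper's DilEnt construction or KOMWU: it assumes the simplest possible decision space (one decision point), where a weight-one dilated entropy reduces to ordinary negative entropy and the OMD step has the multiplicative-weights closed form. The function names, the random loss sequence, and the step size are all illustrative assumptions.

```python
import math
import random


def omd_entropy_step(x, loss, eta):
    """One OMD step on the simplex with the negative-entropy regularizer.

    Closed form (multiplicative weights):
    x_new[i] is proportional to x[i] * exp(-eta * loss[i]).
    """
    # Shifting by the minimum loss improves numerical stability and
    # cancels out when the weights are normalized.
    shift = min(loss)
    weights = [xi * math.exp(-eta * (li - shift)) for xi, li in zip(x, loss)]
    total = sum(weights)
    return [w / total for w in weights]


def run_omd(losses, eta):
    """Play OMD against a fixed loss sequence; return (final iterate, regret)."""
    n = len(losses[0])
    x = [1.0 / n] * n  # the uniform point minimizes negative entropy
    incurred = 0.0
    cumulative = [0.0] * n
    for loss in losses:
        incurred += sum(xi * li for xi, li in zip(x, loss))
        cumulative = [c + li for c, li in zip(cumulative, loss)]
        x = omd_entropy_step(x, loss, eta)
    # Regret against the best fixed action in hindsight.
    return x, incurred - min(cumulative)


# Hypothetical experiment: i.i.d. uniform losses in [0, 1].
rng = random.Random(0)
n_actions, horizon = 8, 2000
losses = [[rng.random() for _ in range(n_actions)] for _ in range(horizon)]

# Step size that balances the textbook bound ln(n)/eta + eta*T/8.
eta = math.sqrt(8.0 * math.log(n_actions) / horizon)
final_x, regret = run_omd(losses, eta)
bound = math.sqrt(horizon * math.log(n_actions) / 2.0)
print(f"regret = {regret:.3f}, textbook bound = {bound:.3f}")
```

The quantity `bound` is the standard simplex guarantee for losses in [0, 1], which grows with the logarithm of the number of actions. The abstract's result concerns the analogous dependence on $\log |\mathcal{V}|$ for sequence-form strategy spaces, which this single-simplex sketch does not attempt to reproduce.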

Related papers