Fugu-MT 論文翻訳(概要): Learning and Computation of $Φ$-Equilibria at the Frontier of Tractability

論文の概要: Learning and Computation of $Φ$-Equilibria at the Frontier of Tractability

arxiv url: http://arxiv.org/abs/2502.18582v1
Date: Tue, 25 Feb 2025 19:08:26 GMT
ステータス: 翻訳完了
システム内更新日: 2025-02-27 15:24:46.285265
Title: Learning and Computation of $Φ$-Equilibria at the Frontier of Tractability
Title（参考訳）: トラクタビリティの最前線における$$-Equilibriaの学習と計算
Authors: Brian Hu Zhang, Ioannis Anagnostides, Emanuel Tewolde, Ratip Emin Berker, Gabriele Farina, Vincent Conitzer, Tuomas Sandholm,
Abstract要約: $Phi$-equilibriaは、オンライン学習とゲーム理論の中心にある、強力で柔軟なフレームワークだ。効率的なオンラインアルゴリズムは、$textpoly(d, k)/epsilon2$ラウンドを使用して、平均$Phi$-regretを最大$epsilon$で生成することを示す。また、オンライン設定において、ほぼ一致した下限を示し、その結果、$Phi$-regretの学習可能性を取得する偏差の族が初めて得られる。
参考スコア（独自算出の注目度）: 85.07238533644636
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: $\Phi$-equilibria -- and the associated notion of $\Phi$-regret -- are a powerful and flexible framework at the heart of online learning and game theory, whereby enriching the set of deviations $\Phi$ begets stronger notions of rationality. Recently, Daskalakis, Farina, Fishelson, Pipis, and Schneider (STOC '24) -- abbreviated as DFFPS -- settled the existence of efficient algorithms when $\Phi$ contains only linear maps under a general, $d$-dimensional convex constraint set $\mathcal{X}$. In this paper, we significantly extend their work by resolving the case where $\Phi$ is $k$-dimensional; degree-$\ell$ polynomials constitute a canonical such example with $k = d^{O(\ell)}$. In particular, positing only oracle access to $\mathcal{X}$, we obtain two main positive results: i) a $\text{poly}(n, d, k, \text{log}(1/\epsilon))$-time algorithm for computing $\epsilon$-approximate $\Phi$-equilibria in $n$-player multilinear games, and ii) an efficient online algorithm that incurs average $\Phi$-regret at most $\epsilon$ using $\text{poly}(d, k)/\epsilon^2$ rounds. We also show nearly matching lower bounds in the online learning setting, thereby obtaining for the first time a family of deviations that captures the learnability of $\Phi$-regret. From a technical standpoint, we extend the framework of DFFPS from linear maps to the more challenging case of maps with polynomial dimension. At the heart of our approach is a polynomial-time algorithm for computing an expected fixed point of any $\phi : \mathcal{X} \to \mathcal{X}$ based on the ellipsoid against hope (EAH) algorithm of Papadimitriou and Roughgarden (JACM '08). In particular, our algorithm for computing $\Phi$-equilibria is based on executing EAH in a nested fashion -- each step of EAH itself being implemented by invoking a separate call to EAH.
Abstract（参考訳）: $\Phi$-equilibria -- and the associated concept of $\Phi$-regret -- はオンライン学習とゲーム理論の中心にある強力で柔軟なフレームワークであり、逸脱の集合を豊かにする$\Phi$は合理性の概念を強くする。最近、Daskalakis, Farina, Fishelson, Pipis, Schneider (STOC '24) -- DFFPSと略される -- は、DFFPSと呼ばれる効率的なアルゴリズムの存在を解決した。本稿では、$\Phi$ が $k$-次元である場合、次数-$\ell$ 多項式が $k = d^{O(\ell)}$ であるような標準的な例を構成する場合を解くことで、それらの作業を大幅に拡張する。特に、$\mathcal{X}$へのオラクルアクセスのみを仮定すると、2つの主な正の結果が得られる。 i)$\text{poly}(n, d, k, \text{log}(1/\epsilon))$-time algorithm for computing $\epsilon$-approximate $\Phi$-equilibria in $n$-player multilinear games, and and i) 平均$\Phi$-regret at most $\epsilon$ using $\text{poly}(d, k)/\epsilon^2$ rounds。また、オンライン学習環境において、ほぼ一致した下限を示し、これにより、$\Phi$-regretの学習可能性を取得する偏差の家族を初めて得ることができる。技術的な観点からは、DFFPSの枠組みを線型写像から多項式次元の写像のより困難な場合へと拡張する。我々のアプローチの核心は、Papadimitriou と Roughgarden (JACM '08) のエルプソイド対希望 (EAH) アルゴリズムに基づく任意の$\phi : \mathcal{X} \to \mathcal{X}$の期待定点を計算する多項式時間アルゴリズムである。特に、$\Phi$-equilibriaの計算アルゴリズムは、EAHをネストした方法で実行することに基づいています。

関連論文リスト

Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
線形スケッチを用いた行列とベクトルノルムの残差誤差推定問題について検討する。これは、前作とほぼ同じスケッチサイズと精度で、経験的にかなり有利であることを示す。また、スパースリカバリ問題に対して$Omega(k2/pn1-2/p)$低いバウンダリを示し、これは$mathrmpoly(log n)$ factorまで厳密である。
論文参考訳（メタデータ） (2024-08-16T02:33:07Z)
Sample-Efficient Linear Regression with Self-Selection Bias [7.605563562103568]
未知のインデックス設定における自己選択バイアスを伴う線形回帰の問題を考察する。我々は,$mathbfw_1,ldots,mathbfw_kinを復元する,新しい,ほぼ最適なサンプル効率($k$)アルゴリズムを提案する。このアルゴリズムは雑音の仮定をかなり緩めることに成功し、従って関連する最大線形回帰の設定にも成功している。
論文参考訳（メタデータ） (2024-02-22T02:20:24Z)
A quasi-polynomial time algorithm for Multi-Dimensional Scaling via LP hierarchies [34.7582575446942]
多次元スケーリング(MDS)は、低次元ユークリッド空間に$n$ポイントの計量を埋め込む方法のファミリーである。準多項式依存のMDSに対する最初の近似アルゴリズムは$Delta$である。
論文参考訳（メタデータ） (2023-11-29T17:42:05Z)
A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions [44.655316553524855]
我々は,$d$次元ユークリッド領域あるいは単純領域上で$max_iin[n] f_i(x) を最小化するアルゴリズムを設計する。それぞれの$f_i$が1ドルLipschitzと1ドルSmoothのとき、我々の手法は$epsilon-approximateの解を計算する。
論文参考訳（メタデータ） (2023-11-17T22:07:18Z)
Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
正方形損失に関して、標準的なガウス分布の下での$k$ReLU活性化の線形結合をPAC学習する問題をmathbbRd$で検討する。本研究の主な成果は,この学習課題に対して,サンプルおよび計算複雑性が$(dk/epsilon)O(k)$で,epsilon>0$が目標精度である。
論文参考訳（メタデータ） (2023-07-24T14:37:22Z)
Memory-Constrained Algorithms for Convex Optimization via Recursive Cutting-Planes [23.94542304111204]
勾配降下法と切断平面法の間に正のトレードオフを与えるアルゴリズムの第一級は、$epsilonleq 1/sqrt d$である。規則$epsilon leq d-Omega(d)$では、$p=d$のアルゴリズムが情報理論の最適メモリ利用を実現し、勾配降下のオラクル-複雑度を改善する。
論文参考訳（メタデータ） (2023-06-16T17:00:51Z)
Online Sub-Sampling for Reinforcement Learning with General Function Approximation [111.01990889581243]
本稿では,RLアルゴリズムによって収集されたデータポイントの情報取得量を測定する,効率的なオンラインサブサンプリングフレームワークを確立する。複雑性バウンド関数クラスを持つ値ベースのメソッドの場合、$proptooperatornamepolylog(K)$ timesに対してのみポリシーを更新する必要がある。少なくとも$Omega(K)$倍のポリシーを更新する既存のアプローチとは対照的に、当社のアプローチはポリシーの解決における最適化コールの数を劇的に削減します。
論文参考訳（メタデータ） (2021-06-14T07:36:25Z)
Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
我々は,バンディットフィードバックを用いてオンライン学習を学習する。 learnerは、コスト/リワード関数が"pseudo-1d"構造を許可するゼロ次オラクルのみにアクセスできる。我々は、$T$がラウンドの数である任意のアルゴリズムの後悔のために$min(sqrtdT、T3/4)$の下限を示しています。ランダム化オンライングラデーション下降とカーネル化指数重み法を組み合わせた新しいアルゴリズムsbcalgを提案し,疑似-1d構造を効果的に活用する。
論文参考訳（メタデータ） (2021-02-15T08:16:51Z)
Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models [56.98280399449707]
我々は、s$ of cardinality $m = (k/epsilon)o_d(k1/d)$ に対して $epsilon$-cover が存在することを示す。構造的結果に基づいて,いくつかの基本的高次元確率モデル隠れ変数の学習アルゴリズムを改良した。
論文参考訳（メタデータ） (2020-12-14T18:14:08Z)
Near-Optimal Reinforcement Learning with Self-Play [50.29853537456737]
我々は,直接の監督なしに自己対決で最適な政策を学習するセルフプレイアルゴリズムに焦点をあてる。本稿では,サンプル複雑性を$tildemathcalO(SAB)$,サンプル複雑性を$tildemathcalO(S(A+B)$とする新しいemphNash Vラーニングアルゴリズムを提案する。
論文参考訳（メタデータ） (2020-06-22T05:00:13Z)
An Improved Cutting Plane Method for Convex Optimization, Convex-Concave Games and its Applications [28.54236118020831]
本稿では,最適な$O(n log (kappa))$オアラーク評価を用いた新しい切削平面アルゴリズムを提案する。また、評価毎に$n2$の時間を改善できないため、実行時間が最適であることを示す。
論文参考訳（メタデータ） (2020-04-08T20:56:40Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。