Fugu-MT 論文翻訳(概要): Logistic Regression Regret: What's the Catch?

論文の概要: Logistic Regression Regret: What's the Catch?

arxiv url: http://arxiv.org/abs/2002.02950v2
Date: Wed, 19 Feb 2020 15:27:53 GMT
ステータス: 翻訳完了
システム内更新日: 2023-01-03 03:59:28.529511
Title: Logistic Regression Regret: What's the Catch?
Title（参考訳）: Logistic Regression Regret: キャッチは何?
Authors: Gil I. Shamir
Abstract要約: パラメータに対する$L_infty$制約の下で、対数的後悔を伴う下界を導出する。 L$の制約については、十分に大きな$d$の場合、後悔は$d$で線形であるが、$T$で対数化されなくなることが示されている。
参考スコア（独自算出の注目度）: 3.7311680121118345
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We address the problem of the achievable regret rates with online logistic regression. We derive lower bounds with logarithmic regret under $L_1$, $L_2$, and $L_\infty$ constraints on the parameter values. The bounds are dominated by $d/2 \log T$, where $T$ is the horizon and $d$ is the dimensionality of the parameter space. We show their achievability for $d=o(T^{1/3})$ in all these cases with Bayesian methods, that achieve them up to a $d/2 \log d$ term. Interesting different behaviors are shown for larger dimensionality. Specifically, on the negative side, if $d = \Omega(\sqrt{T})$, any algorithm is guaranteed regret of $\Omega(d \log T)$ (greater than $\Omega(\sqrt{T})$) under $L_\infty$ constraints on the parameters (and the example features). On the positive side, under $L_1$ constraints on the parameters, there exist algorithms that can achieve regret that is sub-linear in $d$ for the asymptotically larger values of $d$. For $L_2$ constraints, it is shown that for large enough $d$, the regret remains linear in $d$ but no longer logarithmic in $T$. Adapting the redundancy-capacity theorem from information theory, we demonstrate a principled methodology based on grids of parameters to derive lower bounds. Grids are also utilized to derive some upper bounds. Our results strengthen results by Kakade and Ng (2005) and Foster et al. (2018) for upper bounds for this problem, introduce novel lower bounds, and adapt a methodology that can be used to obtain such bounds for other related problems. They also give a novel characterization of the asymptotic behavior when the dimension of the parameter space is allowed to grow with $T$. They additionally establish connections to the information theory literature, demonstrating that the actual regret for logistic regression depends on the richness of the parameter class, where even within this problem, richer classes lead to greater regret.
Abstract（参考訳）: オンラインロジスティック回帰による達成可能な後悔率の問題に対処する。パラメータ値に対する$L_1$,$L_2$,$L_\infty$制約の下で、対数的後悔を伴う下界を導出する。境界は $d/2 \log t$ で支配され、ここで $t$ は地平線、$d$ はパラメータ空間の次元である。これらすべてのケースにおいて、ベイズ法で$d=o(t^{1/3})$の達成可能性を示し、最大$d/2 \log d$項を達成する。興味深い異なる行動は、より大きな次元性を示す。具体的には、負の面において、$d = \Omega(\sqrt{T})$ならば、任意のアルゴリズムは、パラメータ(および例の特徴)に対する制約で$\Omega(\sqrt{T})$よりも大きい)$\Omega(\sqrt{T})$の後悔が保証される。正の面では、パラメータの$L_1$制約の下で、漸近的に大きい$d$に対して$d$のサブ線形である後悔を達成できるアルゴリズムが存在する。 L_2$制約の場合、十分大きな$d$の場合、後悔は$d$で線形であるが、$T$で対数化されなくなる。情報理論から冗長性・キャパシティの定理を適応させることで,パラメータの格子に基づく原理的手法を導出する。格子は上界を導出するためにも用いられる。この問題に対する上限について,Kakade and Ng (2005) と Foster et al. (2018) による結果を強化し,新しい下位境界を導入し,他の関連する問題に対してそのような境界を求める手法を適用した。また、パラメータ空間の次元が$t$で成長することを許されたとき、漸近的挙動の新たな特徴付けを与える。さらに彼らは情報理論文献とのつながりを確立し、ロジスティック回帰に対する実際の後悔はパラメータクラスの豊かさに依存することを示した。

関連論文リスト

Online Inverse Linear Optimization: Improved Regret Bound, Robustness to Suboptimality, and Toward Tight Regret Analysis [25.50155563108198]
本稿では,学習者が時間変化の可能な行動群とエージェントの最適な行動群の両方を観察するオンライン学習問題について検討する。我々は、以前の$O(n4ln T)$の限界を$n3$の係数で改善した$O(nln T)$後悔境界を得る。
論文参考訳（メタデータ） (2025-01-24T09:19:15Z)
Reinforcement Learning and Regret Bounds for Admission Control [0.08192907805418585]
強化学習アルゴリズムの期待された後悔は、未計算の戻り値に対して$Omegaleft(sqrtDXATright)$で制限される。 UCRL2にインスパイアされたアルゴリズムを提案し、その問題の構造を用いて、有限サーバの場合、$O(Slog T + sqrtmT log T)$で期待される全後悔を上位に制限する。
論文参考訳（メタデータ） (2024-06-07T09:09:14Z)
Batched Stochastic Bandit for Nondegenerate Functions [8.015503209312786]
本稿では,非退化関数に対するバッチ帯域学習問題について検討する。本稿では,非退化関数に対するバッチバンドイット問題をほぼ最適に解くアルゴリズムを提案する。
論文参考訳（メタデータ） (2024-05-09T12:50:16Z)
LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits [38.41164102066483]
本研究では、独立かつ同一に分散したコンテキストを持つ線形文脈帯域問題について考察する。提案アルゴリズムは、Tsallisエントロピーを持つFollow-The-Regularized-Leaderに基づいており、$alpha$-textual-Con (LC)-Tsallis-INFと呼ばれている。
論文参考訳（メタデータ） (2024-03-05T18:59:47Z)
Optimal Sketching Bounds for Sparse Linear Regression [116.30196615349226]
我々は、$ell_p$ノルムや広範なヒンジ様損失関数のクラスから、様々な損失関数の下で、$k$スパース線形回帰の難読スケッチを研究する。スパース$ell$varepsレグレッションの場合、$Theta(klog(d/k)/varepsilon2)$ rowsでスケッチの上に曖昧な分布が存在し、これは定数要素に固執することを示している。また、$O(mu2 klog(mun d/varepsilon)/varのスケッチも示します。
論文参考訳（メタデータ） (2023-04-05T07:24:19Z)
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes [62.90204655228324]
我々は,後期マルコフ決定過程(LMDP)における強化学習(RL)の文脈を考慮した後悔の最小化について検討した。我々は,モデル最適化と値最適化の両手法でインスタンス化できる,新しいモデルベースアルゴリズムフレームワークを設計する。
論文参考訳（メタデータ） (2022-10-20T21:32:01Z)
On learning Whittle index policy for restless bandits with scalable regret [1.0152838128195465]
強化学習は、優れたリソース割り当てとスケジューリングポリシーを学ぶための魅力的なアプローチである。ほとんどのRLアルゴリズムの累積後悔は$tilde O(mathsfS sqrtmathsfA T)$、$mathsfS$は状態空間のサイズである。本稿では,このような問題に対するモデルベースRLアルゴリズムを提案する。
論文参考訳（メタデータ） (2022-02-07T19:07:02Z)
Minimal Expected Regret in Linear Quadratic Control [79.81807680370677]
オンライン学習アルゴリズムを考案し、その期待された後悔を保証します。当時のこの後悔は、$A$と$B$が未知の場合、$widetildeO((d_u+d_x)sqrtd_xT)$によって上界(i)となる。
論文参考訳（メタデータ） (2021-09-29T14:07:21Z)
Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
対数要因まで $(varepsilon,delta)$-differently private の最適過剰人口損失は $sqrtlog(d)/n + sqrtd/varepsilon n.$ です。損失関数がさらなる滑らかさの仮定を満たすとき、余剰損失は$sqrtlog(d)/n + (log(d)/varepsilon n)2/3で上界(対数因子まで)であることが示される。
論文参考訳（メタデータ） (2021-03-02T06:53:44Z)
Logarithmic Regret for Reinforcement Learning with Linear Function Approximation [99.59319332864129]
最近提案された2つの線形MDP仮定で対数的後悔が達成可能であることを示す。我々の知る限り、これらは線型関数近似を持つRLに対する最初の対数的後悔境界である。
論文参考訳（メタデータ） (2020-11-23T17:25:00Z)
$Q$-learning with Logarithmic Regret [60.24952657636464]
楽観的な$Q$は$mathcalOleft(fracSAcdot mathrmpolyleft(Hright)Delta_minlogleft(SATright)right)$ cumulative regret bound, where $S$ is the number of state, $A$ is the number of action, $H$ is the planning horizon, $T$ is the total number of steps, $Delta_min$ is the least sub-Optitimality gap。
論文参考訳（メタデータ） (2020-06-16T13:01:33Z)
Naive Exploration is Optimal for Online LQR [49.681825576239355]
最適後悔尺度は$widetildeTheta(sqrtd_mathbfu2 d_mathbfx T)$で、$T$は時間ステップの数、$d_mathbfu$は入力空間の次元、$d_mathbfx$はシステム状態の次元である。我々の下界は、かつての$mathrmpoly(logT)$-regretアルゴリズムの可能性を排除する。
論文参考訳（メタデータ） (2020-01-27T03:44:54Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。