Fugu-MT 論文翻訳(概要): Exponential Weights Algorithms for Selective Learning

論文の概要: Exponential Weights Algorithms for Selective Learning

arxiv url: http://arxiv.org/abs/2106.15662v1
Date: Tue, 29 Jun 2021 18:14:01 GMT
ステータス: 翻訳完了
システム内更新日: 2021-07-01 12:33:38.788194
Title: Exponential Weights Algorithms for Selective Learning
Title（参考訳）: 選択学習のための指数重みアルゴリズム
Authors: Mingda Qiao, Gregory Valiant
Abstract要約: 本稿では,Qiao と Valiant が導入した選択学習問題について考察する。学習者が引き起こす余剰リスクは、これらの$w$のデータポイントに対する$hatellの損失の平均値の差として定義される。また,長さ$w$の予測ウィンドウが選択された場合,学習者の判断は最新の$w$のデータポイントにのみ依存する。
参考スコア（独自算出の注目度）: 39.334633118161285
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the selective learning problem introduced by Qiao and Valiant (2019), in which the learner observes $n$ labeled data points one at a time. At a time of its choosing, the learner selects a window length $w$ and a model $\hat\ell$ from the model class $\mathcal{L}$, and then labels the next $w$ data points using $\hat\ell$. The excess risk incurred by the learner is defined as the difference between the average loss of $\hat\ell$ over those $w$ data points and the smallest possible average loss among all models in $\mathcal{L}$ over those $w$ data points. We give an improved algorithm, termed the hybrid exponential weights algorithm, that achieves an expected excess risk of $O((\log\log|\mathcal{L}| + \log\log n)/\log n)$. This result gives a doubly exponential improvement in the dependence on $|\mathcal{L}|$ over the best known bound of $O(\sqrt{|\mathcal{L}|/\log n})$. We complement the positive result with an almost matching lower bound, which suggests the worst-case optimality of the algorithm. We also study a more restrictive family of learning algorithms that are bounded-recall in the sense that when a prediction window of length $w$ is chosen, the learner's decision only depends on the most recent $w$ data points. We analyze an exponential weights variant of the ERM algorithm in Qiao and Valiant (2019). This new algorithm achieves an expected excess risk of $O(\sqrt{\log |\mathcal{L}|/\log n})$, which is shown to be nearly optimal among all bounded-recall learners. Our analysis builds on a generalized version of the selective mean prediction problem in Drucker (2013); Qiao and Valiant (2019), which may be of independent interest.
Abstract（参考訳）: Qiao と Valiant が導入した選択学習問題 (2019) について検討し, 学習者はラベル付きデータポイントを1度に1ドルずつ観察する。選択した時点で、学習者は、モデルクラス $\mathcal{l}$ からウィンドウ長 $w$ とモデル $\hat\ell$ を選択し、次の$w$ データポイントを $\hat\ell$ でラベル付けする。学習者が引き起こす余剰リスクは、これらの$w$データポイントに対する$\hat\ell$の平均損失と、これらの$w$データポイントに対する$\mathcal{L}$における全てのモデルにおける最小の平均損失との差として定義される。ハイブリッド指数重みアルゴリズム (hybrid exponential weights algorithm) と呼ばれる改良アルゴリズムは、$o((\log\log|\mathcal{l}| + \log\log n)/\log n)$ の余剰リスクを達成する。この結果は、最もよく知られた$O(\sqrt{|\mathcal{L}|/\log n})$に対する$|\mathcal{L}|$への依存を2倍指数的に改善する。正の結果をほぼ一致する下限で補うため、アルゴリズムの最悪の最適性が示唆される。また,長さ$w$の予測ウィンドウが選択された場合,学習者の判断は最新の$w$データポイントにのみ依存するという意味で,より限定的な学習アルゴリズムについても検討した。我々は,Qiao and Valiant (2019)におけるERMアルゴリズムの指数重み変量解析を行った。この新しいアルゴリズムは、期待過剰なリスクである$O(\sqrt{\log |\mathcal{L}|/\log n})$を達成し、すべての有界リコール学習者の間でほぼ最適であることが示されている。我々の分析は、Drucker (2013), Qiao and Valiant (2019) における選択平均予測問題の一般化版に基づく。

関連論文リスト

Improved Robust Estimation for Erdős-Rényi Graphs: The Sparse Regime and Optimal Breakdown Point [3.793609515750114]
我々は、ErdHos-R'enyiランダムグラフのエッジ密度を強く推定する問題を、$G(n, dcirc/n)$で検討する。我々のアルゴリズムは2乗の総和階層に基づいている。
論文参考訳（メタデータ） (2025-03-05T21:45:17Z)
A New Rejection Sampling Approach to $k$-$\mathtt{means}$++ With Improved Trade-Offs [0.12289361708127876]
単純かつ効果的なリジェクションサンプリングに基づくアプローチで,$k$-$mathttmeans$++ を高速化する。最初のメソッドは $tildeO(mathttnnz (mathcalX) + beta k2d)$ で実行されます。第2の手法は,計算コストと解品質の新たなトレードオフを示す。
論文参考訳（メタデータ） (2025-02-04T08:05:34Z)
Iterative thresholding for non-linear learning in the strong $\varepsilon$-contamination model [3.309767076331365]
閾値降下を用いた単一ニューロンモデル学習のための近似境界を導出する。線形回帰問題も研究し、$sigma(mathbfx) = mathbfx$ となる。
論文参考訳（メタデータ） (2024-09-05T16:59:56Z)
Inverting the Leverage Score Gradient: An Efficient Approximate Newton Method [10.742859956268655]
本稿では,レバレッジスコア勾配から固有モデルパラメータを復元することを目的とする。具体的には、レバレッジスコア勾配の逆転を$g(x)$として精査する。
論文参考訳（メタデータ） (2024-08-21T01:39:42Z)
Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
正方形損失に関して、標準的なガウス分布の下での$k$ReLU活性化の線形結合をPAC学習する問題をmathbbRd$で検討する。本研究の主な成果は,この学習課題に対して,サンプルおよび計算複雑性が$(dk/epsilon)O(k)$で,epsilon>0$が目標精度である。
論文参考訳（メタデータ） (2023-07-24T14:37:22Z)
Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation [25.60689712525918]
本稿では,遷移確率と報酬関数が線形な線形関数近似を用いた強化学習について検討する。本稿では,新たなアルゴリズムLSVI-UCB$+$を提案し,$H$がエピソード長,$d$が特徴次元,$T$がステップ数である場合に,$widetildeO(HdsqrtT)$ regretboundを実現する。
論文参考訳（メタデータ） (2022-06-23T06:04:21Z)
Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
対数要因まで $(varepsilon,delta)$-differently private の最適過剰人口損失は $sqrtlog(d)/n + sqrtd/varepsilon n.$ です。損失関数がさらなる滑らかさの仮定を満たすとき、余剰損失は$sqrtlog(d)/n + (log(d)/varepsilon n)2/3で上界(対数因子まで)であることが示される。
論文参考訳（メタデータ） (2021-03-02T06:53:44Z)
Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes [91.38793800392108]
本稿では,マルコフ決定過程(MDP)の遷移確率核が線形混合モデルである線形関数近似による強化学習について検討する。上記の線形混合 MDP に対して$textUCRL-VTR+$ という線形関数近似を用いた計算効率の良い新しいアルゴリズムを提案する。我々の知る限り、これらは線形関数近似を持つRLのための計算効率が良く、ほぼ最小のアルゴリズムである。
論文参考訳（メタデータ） (2020-12-15T18:56:46Z)
Active Local Learning [22.823171570933397]
クエリポイント$x$とラベルなしのトレーニングセット$S$へのアクティブアクセスを与えられた場合、H$でほぼ最適な$hの予測$h(x)$を出力します。特にラベルクエリの数は$H$の複雑さとは無関係でなければならないし、関数$h$は$x$とは無関係に明確に定義されるべきである。これはまた、すぐに距離推定のアルゴリズムを示唆している: ほぼ最適の$hを実際に学習するのに必要となる多くのラベルから$opt(H)$を推定する。
論文参考訳（メタデータ） (2020-08-31T05:39:35Z)
Adapting $k$-means algorithms for outliers [1.9290392443571387]
我々は、$k$-means問題に対して、複数の単純なサンプリングベースのアルゴリズムを、外れ値を持つ設定に適応する方法を示す。我々のアルゴリズムは、目的関数に対して$O(varepsilon)$-approximationを行いながら、$(varepsilon)z$outliersを出力する。
論文参考訳（メタデータ） (2020-07-02T14:14:33Z)
$Q$-learning with Logarithmic Regret [60.24952657636464]
楽観的な$Q$は$mathcalOleft(fracSAcdot mathrmpolyleft(Hright)Delta_minlogleft(SATright)right)$ cumulative regret bound, where $S$ is the number of state, $A$ is the number of action, $H$ is the planning horizon, $T$ is the total number of steps, $Delta_min$ is the least sub-Optitimality gap。
論文参考訳（メタデータ） (2020-06-16T13:01:33Z)
Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity [94.37110094442136]
本稿では,決定論的システムにおける関数近似を用いたQ$学習の問題について検討する。もし$delta = Oleft(rho/sqrtdim_Eright)$なら、$Oleft(dim_Eright)$を使って最適なポリシーを見つけることができる。
論文参考訳（メタデータ） (2020-02-17T18:41:49Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。