Fugu-MT 論文翻訳(概要): Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models

論文の概要: Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models

arxiv url: http://arxiv.org/abs/2305.10633v1
Date: Thu, 18 May 2023 01:10:11 GMT
ステータス: 翻訳完了
システム内更新日: 2023-05-19 17:40:29.587090
Title: Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models
Title（参考訳）: 景観の平滑化がSGDのシグナルを高める: 単一指標モデル学習のための最適サンプル複雑度
Authors: Alex Damian, Eshaan Nichani, Rong Ge, Jason D. Lee
Abstract要約: 1つのインデックスモデル $sigma(wstar cdot x)$ を、$d$次元の等方的ガウス分布に関して学習するタスクに焦点を当てる。スムーズな損失のオンラインSGDは、$n gtrsim dkstar/2$サンプルで$wstar$を学習する。
参考スコア（独自算出の注目度）: 43.83997656986799
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We focus on the task of learning a single index model $\sigma(w^\star \cdot x)$ with respect to the isotropic Gaussian distribution in $d$ dimensions. Prior work has shown that the sample complexity of learning $w^\star$ is governed by the information exponent $k^\star$ of the link function $\sigma$, which is defined as the index of the first nonzero Hermite coefficient of $\sigma$. Ben Arous et al. (2021) showed that $n \gtrsim d^{k^\star-1}$ samples suffice for learning $w^\star$ and that this is tight for online SGD. However, the CSQ lower bound for gradient based methods only shows that $n \gtrsim d^{k^\star/2}$ samples are necessary. In this work, we close the gap between the upper and lower bounds by showing that online SGD on a smoothed loss learns $w^\star$ with $n \gtrsim d^{k^\star/2}$ samples. We also draw connections to statistical analyses of tensor PCA and to the implicit regularization effects of minibatch SGD on empirical losses.
Abstract（参考訳）: 我々は、d$次元における等方ガウス分布に関して、1つの指数モデル$\sigma(w^\star \cdot x)$を学習するタスクに焦点を当てる。先行研究により、$w^\star$ の学習のサンプル複雑性は、リンク関数 $\sigma$ の情報指数 $k^\star$ によって制御されていることが示され、これは最初の非零ヘルマイト係数 $\sigma$ の指標として定義される。 Ben Arousら (2021) は、$n \gtrsim d^{k^\star-1}$サンプルが$w^\star$を学習するのに十分であることを示した。しかし、勾配に基づく手法のcsq下限は、$n \gtrsim d^{k^\star/2}$ のサンプルのみを示す。本研究では,平滑化損失に対するオンラインsgdが$n \gtrsim d^{k^\star/2}$サンプルで$w^\star$を学習することを示すことにより,上界と下界のギャップを閉じる。また、テンソルPCAの統計的解析と、ミニバッチSGDの暗黙の正規化効果を経験的損失に関連付ける。

関連論文リスト

Sample and Computationally Efficient Robust Learning of Gaussian Single-Index Models [37.42736399673992]
シングルインデックスモデル (SIM) は $sigma(mathbfwast cdot mathbfx)$ という形式の関数であり、$sigma: mathbbR to mathbbR$ は既知のリンク関数であり、$mathbfwast$ は隠れ単位ベクトルである。適切な学習者が$L2$-error of $O(mathrmOPT)+epsilon$。
論文参考訳（メタデータ） (2024-11-08T17:10:38Z)
Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
以前の$mathRd上の分布に関する民間推定者は、次元性の呪いに苦しむ。本稿では,サンプルの複雑さが次元依存性を改善したアルゴリズムを提案する。
論文参考訳（メタデータ） (2024-11-01T17:59:53Z)
Sum-of-squares lower bounds for Non-Gaussian Component Analysis [33.80749804695003]
非ガウス成分分析(Non-Gaussian Component Analysis、NGCA)は、高次元データセットにおいて非ガウス方向を求める統計的タスクである。本稿では Sum-of-Squares フレームワークにおける NGCA の複雑さについて考察する。
論文参考訳（メタデータ） (2024-10-28T18:19:13Z)
Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis [45.05072391903122]
情報指数は、オンライン勾配降下のサンプルの複雑さを予測する上で重要な役割を果たす。マルチインデックスモデルでは、最低度のみに焦点を合わせることで、重要な構造の詳細を見逃すことができる。 2次項と高次項の両方を考慮することで、まず2次項から関連する空間を学習できることが示される。
論文参考訳（メタデータ） (2024-10-13T00:14:08Z)
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
単一インデックス対象関数 $f_*(boldsymbolx) = textstylesigma_*left(langleboldsymbolx,boldsymbolthetarangleright)$ の勾配勾配勾配学習問題について検討する。 SGDに基づくアルゴリズムにより最適化された2層ニューラルネットワークは、情報指数に支配されない複雑さで$f_*$を学習する。
論文参考訳（メタデータ） (2024-06-03T17:56:58Z)
Detection of Dense Subhypergraphs by Low-Degree Polynomials [72.4451045270967]
ランダムグラフにおける植込み高密度部分グラフの検出は、基本的な統計的および計算上の問題である。我々は、$Gr(n, n-beta)ハイパーグラフにおいて、植えた$Gr(ngamma, n-alpha)$ subhypergraphの存在を検出することを検討する。平均値の減少に基づく硬さが不明な微妙な対数密度構造を考えると,この結果はグラフの場合$r=2$で既に新しくなっている。
論文参考訳（メタデータ） (2023-04-17T10:38:08Z)
Structure Learning in Graphical Models from Indirect Observations [17.521712510832558]
本稿では、パラメータ法と非パラメトリック法の両方を用いて、Rp$における$p$次元ランダムベクトル$Xのグラフィカル構造を学習する。温和な条件下では、グラフ構造推定器が正しい構造を得ることができることを示す。
論文参考訳（メタデータ） (2022-05-06T19:24:44Z)
On the Power of Multitask Representation Learning in Linear MDP [61.58929164172968]
本稿では,線形マルコフ決定過程(MDP)におけるマルチタスク表現学習の統計的メリットについて分析する。簡単な最小二乗アルゴリズムが $tildeO(H2sqrtfrackappa MathcalC(Phi)2 kappa dNT+frackappa dn) というポリシーを学ぶことを証明した。
論文参考訳（メタデータ） (2021-06-15T11:21:06Z)
Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model [3.749193647980305]
本稿では,一連の状態対応機能を有するマルコフ決定プロセス(MDP)について考察する。モデルに基づくアプローチ(resp.$Q-learning)が、高い確率で$varepsilon$-Optimalポリシーを確実に学習することを示す。
論文参考訳（メタデータ） (2021-05-28T17:49:39Z)
Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity [59.34067736545355]
S$状態、$A$アクション、割引係数$gamma in (0,1)$、近似しきい値$epsilon > 0$の MDP が与えられた場合、$epsilon$-Optimal Policy を学ぶためのモデルなしアルゴリズムを提供する。十分小さな$epsilon$の場合、サンプルの複雑さで改良されたアルゴリズムを示す。
論文参考訳（メタデータ） (2020-06-06T13:34:41Z)
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction [63.41789556777387]
非同期Q-ラーニングはマルコフ決定過程(MDP)の最適行動値関数(またはQ-関数)を学習することを目的としている。 Q-関数の入出力$varepsilon$-正確な推定に必要なサンプルの数は、少なくとも$frac1mu_min (1-gamma)5varepsilon2+ fract_mixmu_min (1-gamma)$の順である。
論文参考訳（メタデータ） (2020-06-04T17:51:00Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。