Fugu-MT 論文翻訳(概要): High Probability Guarantees for Random Reshuffling

論文の概要: High Probability Guarantees for Random Reshuffling

arxiv url: http://arxiv.org/abs/2311.11841v1
Date: Mon, 20 Nov 2023 15:17:20 GMT
ステータス: 翻訳完了
システム内更新日: 2023-11-21 18:13:20.794940
Title: High Probability Guarantees for Random Reshuffling
Title（参考訳）: ランダムリシャッフルのための高い確率保証
Authors: Hengxu Yu, Xiao Li
Abstract要約: 非行列最適化問題に対処するために、ランダムリシャッフル(mathsfRR$)の勾配法を検討する。本研究ではまず,$mathsfRR$sサンプリング手順におけるニューラルネットワークの複雑さについて検討する。そこで我々は,乱数化摂動手順の定常点を含むランダムリシャッフル法(mathsfp$mathsfRR$)を設計する。
参考スコア（独自算出の注目度）: 5.663909018247509
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider the stochastic gradient method with random reshuffling ($\mathsf{RR}$) for tackling smooth nonconvex optimization problems. $\mathsf{RR}$ finds broad applications in practice, notably in training neural networks. In this work, we first investigate the concentration property of $\mathsf{RR}$'s sampling procedure and establish a new high probability sample complexity guarantee for driving the gradient (without expectation) below $\varepsilon$, which effectively characterizes the efficiency of a single $\mathsf{RR}$ execution. Our derived complexity matches the best existing in-expectation one up to a logarithmic term while imposing no additional assumptions nor changing $\mathsf{RR}$'s updating rule. Furthermore, by leveraging our derived high probability descent property and bound on the stochastic error, we propose a simple and computable stopping criterion for $\mathsf{RR}$ (denoted as $\mathsf{RR}$-$\mathsf{sc}$). This criterion is guaranteed to be triggered after a finite number of iterations, and then $\mathsf{RR}$-$\mathsf{sc}$ returns an iterate with its gradient below $\varepsilon$ with high probability. Moreover, building on the proposed stopping criterion, we design a perturbed random reshuffling method ($\mathsf{p}$-$\mathsf{RR}$) that involves an additional randomized perturbation procedure near stationary points. We derive that $\mathsf{p}$-$\mathsf{RR}$ provably escapes strict saddle points and efficiently returns a second-order stationary point with high probability, without making any sub-Gaussian tail-type assumptions on the stochastic gradient errors. Finally, we conduct numerical experiments on neural network training to support our theoretical findings.
Abstract（参考訳）: 滑らかな非凸最適化問題に対処するために,ランダムリシャッフル(\mathsf{RR}$)を用いた確率勾配法を考える。 $\mathsf{rr}$は、ニューラルネットワークのトレーニングにおいて、実際に広く応用されている。本研究はまず,$\mathsf{RR}$のサンプリング手順の濃度特性を調査し,$\varepsilon$以下で勾配を駆動する(期待せずに)新しい高確率サンプル複雑性を保証し,単一の$\mathsf{RR}$の実行効率を効果的に特徴づける。我々の導出した複雑性は、対数項に最も近い既存の不変項と一致するが、追加の仮定や$\mathsf{RR}$の更新規則の変更は含まない。さらに、得られた高確率降下特性を活用し、確率誤差に縛られることにより、$\mathsf{RR}$($\mathsf{RR}$-$\mathsf{sc}$)の単純で計算可能な停止基準を提案する。この基準は有限反復の後にトリガーされることが保証され、次に$\mathsf{RR}$-$\mathsf{sc}$はその勾配が$\varepsilon$より高い確率でイテレートを返す。さらに,提案する停止基準に基づいて,静止点近傍で追加のランダムな摂動手続きを伴う摂動乱数リシャッフリング法(\mathsf{p}$-$\mathsf{rr}$)を設計する。我々は、$\mathsf{p}$-$\mathsf{rr}$ が厳密な鞍点を回避し、確率的勾配誤差のサブガウス的テール型仮定をすることなく、高確率で二階定常点を効率的に返すことを導出する。最後に,ニューラルネットワークトレーニングに関する数値実験を行い,理論的な知見を裏付ける。

関連論文リスト

Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping [21.865728815935665]
重み付き雑音下での最初の収束を提供するが、切断はしない。また、テールインデックス$mathfrakp$が事前に不明な場合には、最初の$mathcalO(Tfrac1-mathfrakp3mathfrakp-2)$収束率も設定する。
論文参考訳（メタデータ） (2024-12-27T08:46:46Z)
Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
近年の研究では、再生カーネルヒルベルト空間(RKHS)がニューラルネットワークによる関数のモデル化に適した空間ではないことが示されている。本稿では,有界ノルムを持つオーバーパラメータ化された2層ニューラルネットワークに適した関数空間について検討する。
論文参考訳（メタデータ） (2024-04-29T15:04:07Z)
Mirror Descent Algorithms with Nearly Dimension-Independent Rates for Differentially-Private Stochastic Saddle-Point Problems [6.431793114484429]
多面体設定における微分プライベートなサドル点の問題を解くために、$sqrtlog(d)/sqrtn + log(d)/[nvarepsilon]2/5$を提案する。我々のアルゴリズムは、一定の成功率で$sqrtlog(d)/sqrtn + log(d)/[nvarepsilon]2/5$に達することを示す。
論文参考訳（メタデータ） (2024-03-05T12:28:00Z)
A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization [6.314057999212246]
ランダムリシャッフル技術は、ニューラルネットワークのような大規模アプリケーションで使用される。本稿では,ノルムPRRが生成するランダムリシャッフル型反復が線形設定に収束することを示す。最後に,提案手法に適用可能な最終収束率を導出する。
論文参考訳（メタデータ） (2023-12-02T07:12:00Z)
Fast Minimization of Expected Logarithmic Loss via Stochastic Dual Averaging [8.990961435218544]
本稿では,対数障壁を用いたB$-sample双対平均化法を提案する。 Poisson逆問題に対して、我々のアルゴリズムは$smashtildeO(d3/varepsilon2)$ timeで$varepsilon$解を得る。量子状態トモグラフィーの最大線量推定を計算するとき、我々のアルゴリズムは $smashtildeO(d3/varepsilon2)$ time で $varepsilon$-optimal Solution を得る。
論文参考訳（メタデータ） (2023-11-05T03:33:44Z)
Near Optimal Heteroscedastic Regression with Symbiotic Learning [29.16456701187538]
我々は不連続線形回帰の問題を考察する。正則ノルムにおいて$mathbfw*$を$tildeOleft(|mathbff*|2cdot left(frac1n + left(dnright)2right)$の誤差まで推定し、一致する下界を証明できる。
論文参考訳（メタデータ） (2023-06-25T16:32:00Z)
Convergence of Alternating Gradient Descent for Matrix Factorization [5.439020425819001]
非対称行列分解対象に一定のステップサイズを施した交互勾配降下(AGD)について検討した。階数-r$行列 $mathbfA in mathbbRm times n$, smoothness $C$ in the complexity $T$ to be a absolute constant。
論文参考訳（メタデータ） (2023-05-11T16:07:47Z)
Near-Optimal Cryptographic Hardness of Agnostically Learning Halfspaces and ReLU Regression under Gaussian Marginals [43.0867217287089]
ガウス分布の下で半空間を不可知的に学習する作業について検討する。この課題に対して,ほぼ最適計算硬度を証明した。
論文参考訳（メタデータ） (2023-02-13T16:46:23Z)
Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
モノトン活性化に対する $mathbfxmapstosigma(mathbfwcdotmathbfx)$ の関数について検討する。学習者の目標は仮説ベクトル $mathbfw$ that $F(mathbbw)=C, epsilon$ を高い確率で出力することである。
論文参考訳（メタデータ） (2022-06-17T17:55:43Z)
Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
滑らかな凸凹凸結合型サドル点問題, $min_mathbfxmax_mathbfyF(mathbfx) + H(mathbfx,mathbfy)$ を考える。漸進的勾配指数(AG-EG)降下指数アルゴリズムについて述べる。
論文参考訳（メタデータ） (2022-06-17T06:10:20Z)
On Submodular Contextual Bandits [92.45432756301231]
作用が基底集合の部分集合であり、平均報酬が未知の単調部分モジュラ函数によってモデル化されるような文脈的包帯の問題を考える。 Inverse Gap Weighting 戦略により,提案アルゴリズムは推定関数の局所的最適度を効率よくランダム化することを示す。
論文参考訳（メタデータ） (2021-12-03T21:42:33Z)
Randomized Exploration for Reinforcement Learning with General Value Function Approximation [122.70803181751135]
本稿では,ランダム化最小二乗値反復(RLSVI)アルゴリズムに着想を得たモデルレス強化学習アルゴリズムを提案する。提案アルゴリズムは,スカラーノイズを用いたトレーニングデータを簡易に摂動させることにより,探索を促進する。我々はこの理論を、既知の困難な探査課題にまたがる実証的な評価で補完する。
論文参考訳（メタデータ） (2021-06-15T02:23:07Z)
Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
学習者が生成モデル$Y = langle X,w* rangle + epsilon$から$n$のサンプルにアクセスできるような高次元頑健な線形回帰問題について検討する。 i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance, (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
論文参考訳（メタデータ） (2020-07-16T06:44:44Z)
Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations [54.42518331209581]
私たちは発見するアルゴリズムを見つけます。 epsilon$-approximate stationary point ($|nabla F(x)|le epsilon$) using $(epsilon,gamma)$surimateランダムランダムポイント。ここでの私たちの下限は、ノイズのないケースでも新規です。
論文参考訳（メタデータ） (2020-06-24T04:41:43Z)
Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
期待される正方形損失から、最も適合した単一ニューロンを学習することの問題点を考察する。 ReLUアクティベーションでは、我々の人口リスク保証は$O(mathsfOPT1/2)+epsilon$である。 ReLUアクティベーションでは、我々の人口リスク保証は$O(mathsfOPT1/2)+epsilon$である。
論文参考訳（メタデータ） (2020-05-29T07:20:35Z)
Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
非滑らかで非滑らかな関数の定常点を見つけるための最初の非漸近解析を提供する。特に、アダマール半微分可能函数(おそらく非滑らか関数の最大のクラス)について研究する。
論文参考訳（メタデータ） (2020-02-10T23:23:04Z)
Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems [36.645753881826955]
本稿では,分散を利用してより効率的に推定できるRecurEnti Ascent(SREDA)という新しい手法を提案する。この方法はこの問題でよく知られている。
論文参考訳（メタデータ） (2020-01-11T09:05:03Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。