Fugu-MT 論文翻訳(概要): Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods

論文の概要: Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods

arxiv url: http://arxiv.org/abs/2312.08531v2
Date: Tue, 12 Mar 2024 03:53:46 GMT
ステータス: 翻訳完了
システム内更新日: 2024-03-14 01:03:09.376709
Title: Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods
Title（参考訳）: 確率勾配法の最終Iterate Convergenceの再検討
Authors: Zijian Liu, Zhengyuan Zhou
Abstract要約: グラディエント・Descent(SGD)アルゴリズムは、実際の性能が良く、理論的な理解が欠如していることから、人々の関心を喚起している。有限収束がより広い合成最適化や非ユークリッドノルムに証明可能な拡張が可能かどうかはまだ不明である。
参考スコア（独自算出の注目度）: 25.831462008050387
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the past several years, the last-iterate convergence of the Stochastic Gradient Descent (SGD) algorithm has triggered people's interest due to its good performance in practice but lack of theoretical understanding. For Lipschitz convex functions, different works have established the optimal $O(\log(1/\delta)\log T/\sqrt{T})$ or $O(\sqrt{\log(1/\delta)/T})$ high-probability convergence rates for the final iterate, where $T$ is the time horizon and $\delta$ is the failure probability. However, to prove these bounds, all the existing works are either limited to compact domains or require almost surely bounded noises. It is natural to ask whether the last iterate of SGD can still guarantee the optimal convergence rate but without these two restrictive assumptions. Besides this important question, there are still lots of theoretical problems lacking an answer. For example, compared with the last-iterate convergence of SGD for non-smooth problems, only few results for smooth optimization have yet been developed. Additionally, the existing results are all limited to a non-composite objective and the standard Euclidean norm. It still remains unclear whether the last-iterate convergence can be provably extended to wider composite optimization and non-Euclidean norms. In this work, to address the issues mentioned above, we revisit the last-iterate convergence of stochastic gradient methods and provide the first unified way to prove the convergence rates both in expectation and in high probability to accommodate general domains, composite objectives, non-Euclidean norms, Lipschitz conditions, smoothness, and (strong) convexity simultaneously. Additionally, we extend our analysis to obtain the last-iterate convergence under heavy-tailed noises.
Abstract（参考訳）: 過去数年間、SGD(Stochastic Gradient Descent)アルゴリズムの最後の定点収束は、その実践上の優れたパフォーマンスと理論的理解の欠如から人々の関心を惹き付けてきた。リプシッツ凸函数に対しては、最適な$O(\log(1/\delta)\log T/\sqrt{T})$または$O(\sqrt{\log(1/\delta)/T})$最終イテレートに対する高確率収束率を確立している。しかし、これらの境界を証明するために、既存のすべての作品はコンパクト領域に限定されるか、ほぼ確実に有界ノイズを必要とする。最後の反復 SGD が最適収束率を保証できるかどうかを問うことは自然であるが、これら2つの制限的な仮定が存在しない。この重要な質問に加えて、答えが欠けている理論的な問題がまだたくさんある。例えば、非滑らかな問題に対するSGDの最終定値収束と比較して、スムーズな最適化の結果はまだ少ない。さらに、既存の結果は、すべて非合成目的と標準ユークリッドノルムに制限されている。ラストイテレート収束がより広い合成最適化と非ユークリッドノルムに拡張できるかどうかはまだ不明である。本稿では,上記の問題に対処するために,確率勾配法のラストイテレート収束を再検討し,一般領域,複合目的,非ユークリッドノルム,リプシッツ条件,滑らかさ,(強い)凸性に対応するための期待値と高い確率の両方において収束率を同時に証明する最初の統一的手法を提供する。さらに,重み付き雑音下でのラストイテレート収束を得るために解析を拡張した。

関連論文リスト

Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
構造密度の重み付き雑音によるクリップ最適化問題を考察する。勾配が有限の順序モーメントを持つとき、$mathcalO(K-(alpha - 1)/alpha)$よりも高速な収束率が得られることを示す。得られた推定値が無視可能なバイアスと制御可能な分散を持つことを示す。
論文参考訳（メタデータ） (2023-11-07T17:39:17Z)
Stochastic Nonsmooth Convex Optimization with Heavy-Tailed Noises: High-Probability Bound, In-Expectation Rate and Initial Distance Adaptation [22.758674468435302]
重尾雑音系では、勾配と真の速度の差は有限の$p-thモーメントを持つと仮定される。本稿では,重み付き雑音を用いた非平滑凸最適化の包括的解析を行う。
論文参考訳（メタデータ） (2023-03-22T03:05:28Z)
High Probability Convergence of Stochastic Gradient Methods [15.829413808059124]
最適解への初期距離に依存する有界収束を示す。 AdaGrad-Normのハイバウンドが得られることを示す。
論文参考訳（メタデータ） (2023-02-28T18:42:11Z)
High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance [59.211456992422136]
制約の少ない仮定の下で高確率収束結果のアルゴリズムを提案する。これらの結果は、標準機能クラスに適合しない問題を最適化するために検討された手法の使用を正当化する。
論文参考訳（メタデータ） (2023-02-02T10:37:23Z)
Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
滑らかな凸凹凸結合型サドル点問題, $min_mathbfxmax_mathbfyF(mathbfx) + H(mathbfx,mathbfy)$ を考える。漸進的勾配指数(AG-EG)降下指数アルゴリズムについて述べる。
論文参考訳（メタデータ） (2022-06-17T06:10:20Z)
The Convergence Rate of SGD's Final Iterate: Analysis on Dimension Dependence [2.512827436728378]
Gradient Descent (SGD) は最適化において最も単純で一般的な手法の一つである。定次元設定におけるSGDの最終点収束を特徴付ける方法を示す。
論文参考訳（メタデータ） (2021-06-28T11:51:04Z)
High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails [55.561406656549686]
我々は、勾配推定が末尾を持つ可能性のある一階アルゴリズムを用いたヒルベルト非最適化を考える。本研究では, 勾配, 運動量, 正規化勾配勾配の収束を高確率臨界点に収束させることと, 円滑な損失に対する最もよく知られた繰り返しを示す。
論文参考訳（メタデータ） (2021-06-28T00:17:01Z)
Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance [14.06947898164194]
ヘビーテールは様々なシナリオで勾配降下 (sgd) で現れる。 SGDの収束保証は、潜在的に無限のばらつきを持つ状態依存性および重尾ノイズ下で提供します。その結果,SGDは無限に分散した重尾雑音下であっても,地球最適値に収束できることが示された。
論文参考訳（メタデータ） (2021-02-20T13:45:11Z)
Last iterate convergence of SGD for Least-Squares in the Interpolation regime [19.05750582096579]
基本最小二乗構成におけるノイズレスモデルについて検討する。最適予測器が完全に入力に適合すると仮定し、$langletheta_*, phi(X) rangle = Y$, ここで$phi(X)$は無限次元の非線型特徴写像を表す。
論文参考訳（メタデータ） (2021-02-05T14:02:20Z)
ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm [71.13558000599839]
第一次アルゴリズムを用いて,厳密な凸と滑らかな非制約最適化問題の解法について検討する。我々は,過去の勾配を平均化し,実装が容易な小説「Recursive One-Over-T SGD」を考案した。有限サンプル, 漸近感覚, 感覚の両面において, 最先端の性能を同時に達成できることを実証する。
論文参考訳（メタデータ） (2020-08-28T14:46:56Z)
A Simple Convergence Proof of Adam and Adagrad [74.24716715922759]
我々はAdam Adagradと$O(d(N)/st)$アルゴリズムの収束の証明を示す。 Adamはデフォルトパラメータで使用する場合と同じ収束$O(d(N)/st)$で収束する。
論文参考訳（メタデータ） (2020-03-05T01:56:17Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。