Fugu-MT 論文翻訳(概要): Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(\epsilon^{-7/4})$ Complexity

論文の概要: Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(\epsilon^{-7/4})$ Complexity

arxiv url: http://arxiv.org/abs/2201.11411v1
Date: Thu, 27 Jan 2022 10:04:04 GMT
ステータス: 翻訳完了
システム内更新日: 2022-01-28 21:38:55.727333
Title: Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(\epsilon^{-7/4})$ Complexity
Title（参考訳）: 再開された非凸型加速勾配降下:$o(\epsilon^{-7/4})$複雑性の多対数因子
Authors: Huan Li and Zhouchen Lin
Abstract要約: 単純な加速勾配降下(AGD)は、$O(epsilon-7/4)$グラデーション計算において、$epsilon$-approximate 1次定常点を求める。我々の複雑性はいかなる多相因子も隠さないので、$Ologfrac1epsilon)$ factorによって最先端の要素を改善する。
参考スコア（独自算出の注目度）: 70.65867695317633
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper studies the accelerated gradient descent for general nonconvex problems under the gradient Lipschitz and Hessian Lipschitz assumptions. We establish that a simple restarted accelerated gradient descent (AGD) finds an $\epsilon$-approximate first-order stationary point in $O(\epsilon^{-7/4})$ gradient computations with simple proofs. Our complexity does not hide any polylogarithmic factors, and thus it improves over the state-of-the-art one by the $O(\log\frac{1}{\epsilon})$ factor. Our simple algorithm only consists of Nesterov's classical AGD and a restart mechanism, and it does not need the negative curvature exploitation or the optimization of regularized surrogate functions. Technically, our simple proof does not invoke the analysis for the strongly convex AGD, which is crucial to remove the $O(\log\frac{1}{\epsilon})$ factor.
Abstract（参考訳）: 本稿では、勾配リプシッツおよびヘッセンリプシッツ仮定の下での一般非凸問題に対する加速勾配勾配について検討する。単純な再スタート型加速勾配降下 (agd) は、簡単な証明で$o(\epsilon^{-7/4})$勾配計算において、$\epsilon$-approximate 1-order stationary pointを求める。複雑性は多対数因子を隠蔽しないので、o(\log\frac{1}{\epsilon})$ファクタによって最先端の因子よりも改善します。我々の単純なアルゴリズムは、ネステロフの古典的なagdと再起動機構のみで構成されており、正則化サーロゲート関数の負の曲率利用や最適化は不要である。技術的には、我々の単純な証明は、$O(\log\frac{1}{\epsilon})$ factor を除去することが不可欠である強い凸 AGD の解析を引き起こさない。

関連論文リスト

Improved Complexity for Smooth Nonconvex Optimization: A Two-Level Online Learning Approach with Quasi-Newton Methods [18.47705532817026]
本研究では,スムーズな関数を持つ$epsilon$firstorder stationary point (FOSP) を求める問題について検討する。本稿では,このオンライン学習問題を解決するために,新しい楽観的な準勾配近似法を提案する。
論文参考訳（メタデータ） (2024-12-03T05:20:05Z)
Generalization Bounds for Gradient Methods via Discrete and Continuous Prior [8.76346911214414]
次数$O(frac1n + fracL2nsum_t=1T(gamma_t/varepsilon_t)2)$の新たな高確率一般化境界を示す。また、あるSGDの変種に対する新しい境界を得ることもできる。
論文参考訳（メタデータ） (2022-05-27T07:23:01Z)
ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally! [6.170898159041277]
ProxSkipは、スムーズな(f$)関数と高価な非滑らかな(psi$)関数の和を最小化する驚くほどシンプルで、証明可能な効率のよい方法である。我々の主な動機は、勾配演算子の評価が、すべてのデバイスで独立して局所的なGDステップを取ることに対応する、連合学習にある。
論文参考訳（メタデータ） (2022-02-18T18:56:05Z)
Escape saddle points by a simple gradient-descent based algorithm [6.7298812735467095]
非最適化において、サドルポイントのエスケープは中心的な研究トピックである。滑らかな関数 $fcolonmathbbRntomathbbR$ に対して、$epsilonO(log n)2/epsilon4)$ を出力する。技術的に、我々の主な貢献は勾配のみを用いたロバストなヘッセンパワー手法の実装である。
論文参考訳（メタデータ） (2021-11-28T07:32:54Z)
Faster Perturbed Stochastic Gradient Methods for Finding Local Minima [92.99933928528797]
局所最小値を求めるための高速な摂動勾配フレームワークであるtttPullbackを提案する。 SARAH/SP や STORM のような勾配推定器を用いたプルバックは $(epsilon, epsilon_H)$approximate local minima を $tilde O(epsilon-3 + H-6)$ 内で見つけることができる。我々のフレームワークの中核となる考え方は、勾配評価の平均運動を制御するステップサイズのプルバック方式である。
論文参考訳（メタデータ） (2021-10-25T07:20:05Z)
Correcting Momentum with Second-order Information [50.992629498861724]
最適積に$O(epsilon)$epsilon点を求める非臨界最適化のための新しいアルゴリズムを開発した。我々は、さまざまな大規模ディープラーニングベンチマークとアーキテクチャで結果を検証する。
論文参考訳（メタデータ） (2021-03-04T19:01:20Z)
Gradient Free Minimax Optimization: Variance Reduction and Faster Convergence [120.9336529957224]
本稿では、勾配のないミニマックス最適化問題の大きさを非強設定で表現する。本稿では,新しいゼロ階分散還元降下アルゴリズムが,クエリの複雑さを最もよく表すことを示す。
論文参考訳（メタデータ） (2020-06-16T17:55:46Z)
Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
非滑らかで非滑らかな関数の定常点を見つけるための最初の非漸近解析を提供する。特に、アダマール半微分可能函数(おそらく非滑らか関数の最大のクラス)について研究する。
論文参考訳（メタデータ） (2020-02-10T23:23:04Z)
Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
適応アルゴリズムは勾配の歴史を用いて勾配を更新し、深層ニューラルネットワークのトレーニングにおいてユビキタスである。本稿では,非コンケーブ最小値問題に対するOptimisticOAアルゴリズムの変種を解析する。実験の結果,適応型GAN非適応勾配アルゴリズムは経験的に観測可能であることがわかった。
論文参考訳（メタデータ） (2019-12-26T22:10:10Z)
Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond [25.845034951660544]
我々は,一階述語法では改善できないことを示した。我々は、$$quasar のアログ関数を計算する変種加速降下を開発する。ファーストオーダーの方法では改善できません。
論文参考訳（メタデータ） (2019-06-27T22:39:35Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。