Fugu-MT Paper Translation (Abstract): Complexity Lower Bounds of Adaptive Gradient Algorithms for Non-convex Stochastic Optimization under Relaxed Smoothness

Paper overview: Complexity Lower Bounds of Adaptive Gradient Algorithms for Non-convex Stochastic Optimization under Relaxed Smoothness

arxiv url: http://arxiv.org/abs/2505.04599v1
Date: Wed, 07 May 2025 17:40:12 GMT
Status: Translation complete
Last updated in system: 2025-05-08 19:07:36.173354
Title: Complexity Lower Bounds of Adaptive Gradient Algorithms for Non-convex Stochastic Optimization under Relaxed Smoothness
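Title (reference translation): Complexity lower bounds of adaptive gradient algorithms for non-convex stochastic optimization under relaxed smoothness conditions
Authors: Michael Crawshaw, Mingrui Liu
Abstract summary: Recent results in non-convex stochastic optimization show that popular adaptive algorithms converge. However, the convergence complexity is of higher order in problem parameters such as the smoothness constants.
Reference score (site-computed attention score): 14.98493572536424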
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent results in non-convex stochastic optimization demonstrate the convergence of popular adaptive algorithms (e.g., AdaGrad) under the $(L_0, L_1)$-smoothness condition, but the rate of convergence is a higher-order polynomial in terms of problem parameters like the smoothness constants. The complexity guaranteed by such algorithms to find an $\epsilon$-stationary point may be significantly larger than the optimal complexity of $\Theta \left( \Delta L \sigma^2 \epsilon^{-4} \right)$ achieved by SGD in the $L$-smooth setting, where $\Delta$ is the initial optimality gap, $\sigma^2$ is the variance of stochastic gradient. However, it is currently not known whether these higher-order dependencies can be tightened. To answer this question, we investigate complexity lower bounds for several adaptive optimization algorithms in the $(L_0, L_1)$-smooth setting, with a focus on the dependence in terms of problem parameters $\Delta, L_0, L_1$. We provide complexity bounds for three variations of AdaGrad, which show at least a quadratic dependence on problem parameters $\Delta, L_0, L_1$. Notably, we show that the decorrelated variant of AdaGrad-Norm requires at least $\Omega \left( \Delta^2 L_1^2 \sigma^2 \epsilon^{-4} \right)$ stochastic gradient queries to find an $\epsilon$-stationary point. We also provide a lower bound for SGD with a broad class of adaptive stepsizes. Our results show that, for certain adaptive algorithms, the $(L_0, L_1)$-smooth setting is fundamentally more difficult than the standard smooth setting, in terms of the initial optimality gap and the smoothness constants.
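For reference, the quantities named in the abstract can be written out as follows. This is a sketch under stated assumptions: the Hessian-based form of $(L_0, L_1)$-smoothness below is a commonly used definition for twice-differentiable $f$, and the abstract does not say which form the paper adopts. The expectation in the stationarity criterion is likewise an assumed convention. The last line only divides the two bounds quoted in the abstract.

```latex
% (L_0, L_1)-smoothness (assumed Hessian-based form):
\|\nabla^2 f(x)\| \le L_0 + L_1 \|\nabla f(x)\| \quad \text{for all } x
% Setting L_1 = 0 recovers the standard L-smooth condition with L = L_0.

% epsilon-stationary point (assumed convention):
\mathbb{E}\,\|\nabla f(x)\| \le \epsilon

% Ratio of the two bounds quoted in the abstract:
\frac{\Delta^2 L_1^2 \sigma^2 \epsilon^{-4}}{\Delta L \sigma^2 \epsilon^{-4}}
  = \frac{\Delta L_1^2}{L}
```

The ratio is only a rough comparison, since the lower bound applies to a specific algorithm in the $(L_0, L_1)$-smooth setting while the $\Theta$ bound is for SGD in the $L$-smooth setting.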
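Abstract (reference translation): Recent results in non-convex stochastic optimization show that popular adaptive algorithms (e.g., AdaGrad) converge under the $(L_0, L_1)$-smoothness condition, but the convergence rate is a higher-order polynomial in problem parameters such as the smoothness constants. The complexity that such algorithms guarantee for finding an $\epsilon$-stationary point may be much larger than the optimal complexity of $\Theta \left( \Delta L \sigma^2 \epsilon^{-4} \right)$. However, it is not yet known whether these higher-order dependencies can be tightened. To address this question, we study complexity lower bounds for several adaptive optimization algorithms in the $(L_0, L_1)$-smooth setting, focusing on the dependence on the problem parameters $\Delta, L_0, L_1$. We provide complexity bounds for three variants of AdaGrad, which show at least a quadratic dependence on the problem parameters $\Delta, L_0, L_1$. In particular, the decorrelated variant of AdaGrad-Norm requires at least $\Omega \left( \Delta^2 L_1^2 \sigma^2 \epsilon^{-4} \right)$ stochastic gradient queries to find an $\epsilon$-stationary point. We also provide a lower bound for SGD with a broad class of adaptive stepsizes. These results show that, for certain adaptive algorithms, the $(L_0, L_1)$-smooth setting is fundamentally harder than the standard smooth setting in terms of the initial optimality gap and the smoothness constants.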
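To make the algorithms named in the abstract concrete, here is a minimal Python sketch of AdaGrad-Norm, which uses a single scalar stepsize built from accumulated squared gradient norms. Everything here is illustrative and not taken from the paper: the function names, the toy objective $f(x) = \sum_i \cosh(x_i)$ (which satisfies the Hessian-based $(L_0, L_1)$ condition with $L_0 = L_1 = 1$), the Gaussian noise model, and the default parameters are all assumptions. The `decorrelated` flag implements one possible reading of "decorrelated", in which the stepsize is built from an independent stochastic gradient sample. The paper's exact definition may differ.

```python
import math
import random


def noisy_grad(x, rng, sigma=0.1):
    """Stochastic gradient of the toy objective f(x) = sum_i cosh(x_i).

    Hypothetical noise model: exact gradient sinh(x_i) plus Gaussian noise.
    """
    return [math.sinh(xi) + sigma * rng.gauss(0.0, 1.0) for xi in x]


def adagrad_norm(x0, eta=1.0, b0=1.0, steps=2000, decorrelated=False, seed=0):
    """AdaGrad-Norm sketch: x <- x - (eta / b_t) * g_t, with b_t^2 accumulated.

    If decorrelated=True, the accumulator uses a second, independent sample
    (ASSUMPTION: one reading of "decorrelated"; not confirmed by the abstract).
    """
    rng = random.Random(seed)
    x = list(x0)
    b_sq = b0 ** 2
    for _ in range(steps):
        g = noisy_grad(x, rng)  # sample used as the update direction
        g_step = noisy_grad(x, rng) if decorrelated else g
        b_sq += sum(gi * gi for gi in g_step)  # accumulate squared norm
        step = eta / math.sqrt(b_sq)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def grad_norm(x):
    """Norm of the exact gradient of the toy objective at x."""
    return math.sqrt(sum(math.sinh(xi) ** 2 for xi in x))


if __name__ == "__main__":
    start = [2.0, -1.0]
    print("initial gradient norm:", grad_norm(start))
    print("standard:", grad_norm(adagrad_norm(start)))
    print("decorrelated:", grad_norm(adagrad_norm(start, decorrelated=True)))
```

On this easy toy problem both variants drive the gradient norm down. The lower bounds in the abstract concern worst-case problem instances, which this sketch does not attempt to reproduce.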

Related papers

Accelerated Stochastic Min-Max Optimization Based on Bias-corrected Momentum [30.01198677588252]
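First-order algorithms require at least $\mathcal{O}(\varepsilon^{-4})$ complexity to find an $\varepsilon$-stationary point. This paper proposes a new momentum-based algorithm that achieves more efficient complexity. The effectiveness of the method is validated with logistic regression on real-world datasets.
Paper reference translation (metadata) (2024-06-18T20:14:52Z)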
An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization [37.300102993926046]
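We study the complexity of producing stationary points of Lipschitz objectives that are neither smooth nor convex. Our analysis is simple yet powerful and rests on the Goldstein-subdifferential set, which makes it possible to use recent advances in nonsmooth nonconvex optimization.
Paper reference translation (metadata) (2023-07-10T11:56:04Z)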
An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue Optimization Problems [76.2042837251496]
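We introduce two oblivious stochastic mirror descent algorithms based on a complementary composite setting. Notably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function. We demonstrate the efficiency and robustness of our methods on large-scale semidefinite programs.
Paper reference translation (metadata) (2023-06-30T08:34:29Z)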
ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
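We propose a tool for stochastic convex optimization (SCO): the Reweighted Query (ReSQue) estimator. We develop algorithms that achieve state-of-the-art complexities in the parallel and private settings of SCO.
Paper reference translation (metadata) (2023-01-01T18:51:29Z)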
Best Policy Identification in Linear MDPs [70.57916977441262]
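We study the best policy identification problem in discounted linear Markov decision processes in the fixed-confidence setting under a generative model. The lower bound, given as the solution of an intricate non-convex optimization program, is used as the starting point for devising such algorithms.
Paper reference translation (metadata) (2022-08-11T04:12:50Z)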
Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity [121.83085611327654]
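We consider an objective of the form $r := p + q$, where $r$ is $\mu$-strongly convex, under similarity. We propose a method for solving such problems in terms of communication between agents and local calls at a master. The proposed method is much sharper than the $\mathcal{O}(\sqrt{L_q/\mu})$ method.
Paper reference translation (metadata) (2022-05-30T14:28:02Z)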
Optimal Algorithms for Stochastic Multi-Level Compositional Optimization [46.77664277596764]
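We address multi-level compositional optimization, where the objective function is a composition of multiple functions. We also show that an adaptive multi-level variance reduction method (SMVR) achieves the same complexity but converges faster in practice.
Paper reference translation (metadata) (2022-02-15T16:02:32Z)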
A Projection-free Algorithm for Constrained Stochastic Multi-level Composition Optimization [12.096252285460814]
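We propose a projection-free conditional-gradient-type algorithm for composition optimization. The numbers of oracle calls and linear-minimization oracle calls required by the proposed algorithm are $\mathcal{O}_T(\epsilon^{-2})$ and $\mathcal{O}_T(\epsilon^{-3})$, respectively.
Paper reference translation (metadata) (2022-02-09T06:05:38Z)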
Oracle Complexity in Nonsmooth Nonconvex Optimization [49.088972349825085]
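It is well known that, for smooth and bounded objectives, oracle-based methods can find approximately stationary points. This paper demonstrates a trade-off between optimization complexity and the dimension dependence of smoothing.
Paper reference translation (metadata) (2021-04-14T10:42:45Z)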
The Complexity of Nonconvex-Strongly-Concave Minimax Optimization [43.07732143522183]
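This paper establishes the complexity of finding approximate stationary points of nonconvex-strongly-concave (NC-SC) smooth minimax problems. The approach deploys strongly-concave subproblems in both the general and the averaged settings. In the finite-sum setting, the proposed algorithm has a nearly tight dependence on the condition number.
Paper reference translation (metadata) (2021-03-29T18:53:57Z)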
A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic [142.1492359556374]
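Bilevel optimization is a class of problems that exhibit a two-level structure. We propose a two-timescale stochastic approximation (TTSA) algorithm for tackling such bilevel problems. We show that a two-timescale natural actor-critic policy optimization algorithm arises as a special case of the TTSA framework.
Paper reference translation (metadata) (2020-07-10T05:20:02Z)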
Gradient Free Minimax Optimization: Variance Reduction and Faster Convergence [120.9336529957224]
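This paper addresses gradient-free minimax optimization in the nonconvex-strongly-concave setting. We show that a novel zeroth-order variance-reduced descent algorithm achieves the best known query complexity.
Paper reference translation (metadata) (2020-06-16T17:55:46Z)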

The related papers list is generated automatically from the titles and abstracts of papers on this site.

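This page shows information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).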