Fugu-MT 論文翻訳(概要): A Retrospective Approximation Approach for Smooth Stochastic Optimization

論文の概要: A Retrospective Approximation Approach for Smooth Stochastic Optimization

arxiv url: http://arxiv.org/abs/2103.04392v3
Date: Wed, 6 Mar 2024 23:17:04 GMT
ステータス: 翻訳完了
システム内更新日: 2024-03-08 18:41:34.504113
Title: A Retrospective Approximation Approach for Smooth Stochastic Optimization
Title（参考訳）: Smooth Stochastic Optimizationに対する反省的近似法
Authors: David Newton, Raghu Bollapragada, Raghu Pasupathy, Nung Kwan Yip
Abstract要約: グラディエント(グラディエント、英: Gradient、SG)とは、最適化(SO)問題をスムーズ(ノンフィクション)な目標値で解くための補足的反復手法である。
参考スコア（独自算出の注目度）: 0.2867517731896504
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Stochastic Gradient (SG) is the defacto iterative technique to solve stochastic optimization (SO) problems with a smooth (non-convex) objective $f$ and a stochastic first-order oracle. SG's attractiveness is due in part to its simplicity of executing a single step along the negative subsampled gradient direction to update the incumbent iterate. In this paper, we question SG's choice of executing a single step as opposed to multiple steps between subsample updates. Our investigation leads naturally to generalizing SG into Retrospective Approximation (RA) where, during each iteration, a "deterministic solver" executes possibly multiple steps on a subsampled deterministic problem and stops when further solving is deemed unnecessary from the standpoint of statistical efficiency. RA thus rigorizes what is appealing for implementation -- during each iteration, "plug in" a solver, e.g., L-BFGS line search or Newton-CG, as is, and solve only to the extent necessary. We develop a complete theory using relative error of the observed gradients as the principal object, demonstrating that almost sure and $L_1$ consistency of RA are preserved under especially weak conditions when sample sizes are increased at appropriate rates. We also characterize the iteration and oracle complexity (for linear and sub-linear solvers) of RA, and identify a practical termination criterion leading to optimal complexity rates. To subsume non-convex $f$, we present a certain "random central limit theorem" that incorporates the effect of curvature across all first-order critical points, demonstrating that the asymptotic behavior is described by a certain mixture of normals. The message from our numerical experiments is that the ability of RA to incorporate existing second-order deterministic solvers in a strategic manner might be important from the standpoint of dispensing with hyper-parameter tuning.
Abstract（参考訳）: 確率勾配(Stochastic Gradient, SG)は、確率的最適化(SO)問題を滑らか(非凸)な目標$f$と確率的一階オラクルで解くためのデファクト反復手法である。 SGの魅力は、既存の繰り返しを更新するために、負のサブサンプル勾配方向に沿って単一のステップを実行することの単純さにある。本稿では,サブサンプル更新間の複数のステップに対して,SGが単一ステップを実行するという選択を疑問視する。本研究は, sg を回帰近似 (ra) に一般化し, 各反復の間, サブサンプリング決定論的問題に対して複数のステップを実行し, さらなる解法を統計的効率の観点から不要と考えると停止する。したがって、RAは実装にアピールするものを厳格化します -- 各イテレーションの間、L-BFGSラインサーチやNewton-CGのようなソルバを「プラグイン」し、必要な範囲でのみ解決します。観測された勾配の相対誤差を主対象とする完全理論を開発し、サンプルサイズが適切な速度で増加すると、RAのほぼ確実かつ$L_1$一貫性が特に弱い条件下で維持されることを示した。また、RAの繰り返しとオラクルの複雑性(線形および線形の解法)を特徴付けるとともに、最適複雑性率につながる実用的な終了基準を同定する。非凸 $f$ を仮定するために、一階の臨界点全体の曲率の効果を組み込んだある種の「ランダム中心極限定理」を示し、漸近的挙動が正規の混合によって記述されることを示す。数値実験から得られたメッセージは、RAが既存の2階決定論的解法を戦略的に組み込む能力は、超パラメータチューニングを伴わない点から重要である。

関連論文リスト

A Piecewise Lyapunov Analysis of Sub-quadratic SGD: Applications to Robust and Quantile Regression [22.917692982875025]
一階微分しか持たない関数を$f$で扱える新しいLyapunov関数を導入する。一般の減少段数と定数段数の下で有限時間モーメント境界を導出する。我々の結果は、特にオンライン統計手法に広く応用されている。
論文参考訳（メタデータ） (2025-04-11T00:20:37Z)
New Lower Bounds for Stochastic Non-Convex Optimization through Divergence Composition [11.530542389959347]
L-凸性(QC)、擬似成長(SmoothQG)、制限不等式(RSI)などの非次元設定における一階最適化の基本的限界を示す。
論文参考訳（メタデータ） (2025-02-19T19:21:00Z)
A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
そこで本研究では,未知の信号の復元を課題とする,ロバストな位相探索問題を提案する。提案するオラクルは、単純な勾配ステップと外れ値を用いて、計算学的スペクトル降下を回避している。
論文参考訳（メタデータ） (2024-09-07T06:37:23Z)
Accelerated stochastic approximation with state-dependent noise [7.4648480208501455]
勾配観測における2次雑音に対する一般仮定の下での滑らかな凸最適化問題を考察する。このような問題は、統計学におけるよく知られた一般化された線形回帰問題において、様々な応用において自然に発生する。 SAGDとSGEは、適切な条件下で、最適収束率を達成することを示す。
論文参考訳（メタデータ） (2023-07-04T06:06:10Z)
Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching [55.28394191394675]
等式制約付き非線形非IBS最適化問題に対する適応的不正確なニュートン法を開発した。ベンチマーク非線形問題,LVMのデータによる制約付きロジスティック回帰,PDE制約問題において,本手法の優れた性能を示す。
論文参考訳（メタデータ） (2023-05-28T06:33:37Z)
Stochastic Mirror Descent for Large-Scale Sparse Recovery [13.500750042707407]
本稿では,2次近似の高次元スパースパラメータの統計的推定への応用について論じる。提案アルゴリズムは, 回帰器分布の弱い仮定の下で, 推定誤差の最適収束を実現する。
論文参考訳（メタデータ） (2022-10-23T23:23:23Z)
On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging [96.13485146617322]
本稿では, ステップサイズが一定であるSEG法の解析を行い, 良好な収束をもたらす手法のバリエーションを示す。平均化で拡張した場合、SEGはナッシュ平衡に確実に収束し、スケジュールされた再起動手順を組み込むことで、その速度が確実に加速されることを証明した。
論文参考訳（メタデータ） (2021-06-30T17:51:36Z)
High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
アルゴリズムが高い確率で小さな客観的残差を与えることを理論的に保証することが不可欠である。非滑らか凸最適化の既存の方法は、信頼度に依存した複雑性境界を持つ。そこで我々は,勾配クリッピングを伴う2つの手法に対して,新たなステップサイズルールを提案する。
論文参考訳（メタデータ） (2021-06-10T17:54:21Z)
Stochastic Compositional Gradient Descent under Compositional constraints [13.170519806372075]
目的関数と制約関数が凸であり,関数の合成として表される制約最適化問題について検討する。この問題は、公平な分類/回帰とキューシステムの設計に生じる。提案手法は最適かつ実現可能な解をほぼ確実に見つけることが保証されている。
論文参考訳（メタデータ） (2020-12-17T05:38:37Z)
Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance [0.0]
本稿では,重球法,ネステロフ加速勾配法(S-NAG),広く使用されているアダム法など,勾配勾配勾配のいくつかの変種を統一する一般最適化手法について検討する。この回避は、非自明な常微分方程式のノイズ離散化として研究される。
論文参考訳（メタデータ） (2020-12-07T19:14:49Z)
ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm [71.13558000599839]
第一次アルゴリズムを用いて,厳密な凸と滑らかな非制約最適化問題の解法について検討する。我々は,過去の勾配を平均化し,実装が容易な小説「Recursive One-Over-T SGD」を考案した。有限サンプル, 漸近感覚, 感覚の両面において, 最先端の性能を同時に達成できることを実証する。
論文参考訳（メタデータ） (2020-08-28T14:46:56Z)
Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems [120.21685755278509]
本研究は,ステップサイズの減衰が正確な収束に必要であるという事実と,一定のステップサイズがエラーまでの時間でより速く学習するという事実のバランスをとることを目的とする。ステップサイズのミニバッチを最初から修正するのではなく,パラメータを適応的に進化させることを提案する。
論文参考訳（メタデータ） (2020-07-02T16:02:02Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。