Fugu-MT paper translation (abstract): Stochastic Zeroth order Descent with Structured Directions

Paper abstract: Stochastic Zeroth order Descent with Structured Directions

arxiv url: http://arxiv.org/abs/2206.05124v3
Date: Tue, 08 Oct 2024 11:37:51 GMT
Status: translation complete
Updated in system: 2024-12-06 00:10:28.434631
Title: Stochastic Zeroth order Descent with Structured Directions
Title (reference translation): Stochastic zeroth-order descent with structured directions
Authors: Marco Rando, Cesare Molinari, Silvia Villa, Lorenzo Rosasco
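Abstract summary: We introduce and analyze Structured Stochastic Zeroth order Descent (S-SZD), a finite difference approach that approximates a stochastic gradient on a set of $l\leq d$ directions, where $d$ is the dimension of the ambient space. For smooth convex functions, we prove almost sure convergence and a convergence rate on the function values of the form $O( (d/l) k^{-c})$ for every $c<1/2$.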
Reference score (independently computed attention level): 10.604744518360464
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce and analyze Structured Stochastic Zeroth order Descent (S-SZD), a finite difference approach that approximates a stochastic gradient on a set of $l\leq d$ orthogonal directions, where $d$ is the dimension of the ambient space. These directions are randomly chosen and may change at each step. For smooth convex functions we prove almost sure convergence of the iterates and a convergence rate on the function values of the form $O( (d/l) k^{-c})$ for every $c<1/2$, which is arbitrarily close to the one of Stochastic Gradient Descent (SGD) in terms of number of iterations. Our bound shows the benefits of using $l$ multiple directions instead of one. For non-convex functions satisfying the Polyak-Łojasiewicz condition, we establish the first convergence rates for stochastic structured zeroth order algorithms under such an assumption. We corroborate our theoretical findings with numerical simulations where the assumptions are satisfied and on the real-world problem of hyper-parameter optimization in machine learning, achieving competitive practical performance.
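Abstract (reference translation): We introduce and analyze Structured Stochastic Zeroth Order Descent (S-SZD), a finite difference method that approximates a stochastic gradient on a set of $l\leq d$ orthogonal directions, where $d$ is the dimension of the ambient space. These directions are selected at random and may be changed at each step. For smooth convex functions, we prove almost sure convergence of the iterates, together with a convergence rate on the function values of the form $O( (d/l) k^{-c})$ for every $c<1/2$, which is arbitrarily close to that of Stochastic Gradient Descent (SGD) in terms of the number of iterations. Our bound shows the advantage of using $l$ directions rather than one. For non-convex functions satisfying the Polyak-Łojasiewicz condition, we establish the first convergence rates for stochastic structured zeroth-order algorithms under such an assumption. We corroborate our theoretical findings with numerical simulations in which the assumptions hold and on the real-world problem of hyperparameter optimization in machine learning, achieving competitive practical performance.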
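
The abstract describes a finite-difference gradient surrogate built from $l\leq d$ random orthogonal directions that are redrawn at each step. The following is a minimal illustrative sketch of that idea, not the authors' implementation. Several choices are assumptions made only for illustration: drawing the directions from the QR factorization of a Gaussian matrix, the $d/l$ scaling of the surrogate, the step-size and discretization schedules (`alpha0`, `theta`, `h0`), and all function names. For simplicity the objective `f` here is deterministic, whereas the paper treats a stochastic objective.

```python
import numpy as np


def random_orthonormal_directions(d, l, rng):
    """Draw a d x l matrix whose columns are random orthonormal directions."""
    # Reduced QR of a Gaussian matrix yields l orthonormal columns in R^d.
    q, _ = np.linalg.qr(rng.standard_normal((d, l)))
    return q


def structured_fd_gradient(f, x, directions, h):
    """Forward finite-difference gradient surrogate along the given directions."""
    d, l = directions.shape
    fx = f(x)
    g = np.zeros(d)
    for i in range(l):
        p = directions[:, i]
        # Directional finite difference along p, mapped back onto p.
        g += (f(x + h * p) - fx) / h * p
    # Assumed scaling: d / l compensates for probing only l of the d dimensions.
    return (d / l) * g


def s_szd_sketch(f, x0, l, steps, rng, alpha0=0.25, theta=0.51, h0=1e-3):
    """Run `steps` descent iterations, redrawing the directions at every step."""
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    for k in range(steps):
        directions = random_orthonormal_directions(d, l, rng)
        alpha_k = alpha0 / (k + 1) ** theta  # decaying step size (assumed schedule)
        h_k = h0 / (k + 1)  # shrinking discretization (assumed schedule)
        x = x - alpha_k * structured_fd_gradient(f, x, directions, h_k)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    quadratic = lambda x: 0.5 * float(x @ x)  # smooth convex test objective
    x_start = np.ones(10)
    x_final = s_szd_sketch(quadratic, x_start, l=5, steps=300, rng=rng)
    print("initial value:", quadratic(x_start))
    print("final value:", quadratic(x_final))
```

Each iteration of this sketch costs `l + 1` function evaluations. Setting `l = d` probes a full orthonormal basis, so the surrogate reduces to an ordinary forward-difference gradient in a rotated coordinate system, while `l = 1` uses a single random direction.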

Related papers

Stochastic First-Order Methods with Non-smooth and Non-Euclidean Proximal Terms for Nonconvex High-Dimensional Stochastic Optimization [2.0657831823662574]
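When the problem is nonconvex, the sample complexity of first-order methods may depend linearly on the problem dimension, which is undesirable. Our algorithms use distance functions as proximal terms, with a sample complexity of $\mathcal{O}((\log d)/\epsilon^4)$. We show that DISFOM with variance reduction can sharpen this bound.
Paper reference translation (metadata) (2024-06-27T18:38:42Z)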
Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient [40.22217106270146]
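Variance reduction techniques are designed to reduce sampling variance and thereby improve the convergence rates of first-order (FO) and zeroth-order (ZO) methods. In composite optimization problems, ZO methods encounter an additional variance, called the coordinate-wise variance, which stems from the random gradient estimate. This paper proposes the ZPDVR method.
Paper reference translation (metadata) (2024-05-28T02:27:53Z)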
On Convergence of Incremental Gradient for Non-Convex Smooth Functions [63.51187646914962]
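In machine learning and network optimization, algorithms such as shuffle SGD are popular because they minimize the number of cache misses. This paper studies the convergence properties of SGD algorithms with arbitrary data ordering.
Paper reference translation (metadata) (2023-05-30T17:47:27Z)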
Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization [49.58290066287418]
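To reduce the complexity of compositional problems, we propose a new method named Multi-block-Single-probe Variance Reduced (MSVR). Our results improve on prior ones in several respects, including the order of the sample complexity and the dependence on strong convexity.
Paper reference translation (metadata) (2022-07-18T12:03:26Z)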
Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
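We consider the smooth convex-concave bilinearly-coupled saddle-point problem $\min_{\mathbf{x}}\max_{\mathbf{y}} F(\mathbf{x}) + H(\mathbf{x},\mathbf{y})$. We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm.
Paper reference translation (metadata) (2022-06-17T06:10:20Z)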
Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization [50.83356836818667]
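Stochastic gradient Langevin dynamics is one of the most fundamental algorithms for solving non-convex optimization problems. This paper presents two variants of this type, namely variance-reduced Langevin dynamics and recursive gradient Langevin dynamics.
Paper reference translation (metadata) (2022-03-30T11:39:00Z)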
A Projection-free Algorithm for Constrained Stochastic Multi-level Composition Optimization [12.096252285460814]
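We propose a projection-free conditional gradient-type algorithm for composition optimization. The numbers of calls to the stochastic oracle and to the linear minimization oracle required by the proposed algorithm are $\mathcal{O}_T(\epsilon^{-2})$ and $\mathcal{O}_T(\epsilon^{-3})$, respectively.
Paper reference translation (metadata) (2022-02-09T06:05:38Z)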
Convergence Analysis of Nonconvex Distributed Stochastic Zeroth-order Coordinate Method [3.860616339202303]
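This paper studies the distributed nonconvex optimization problem of minimizing a global cost function formed by the sum of $n$ local cost functions. The agents solve the problem with a zeroth-order (ZO) coordinate method.
Paper reference translation (metadata) (2021-03-24T03:07:46Z)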
On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems [75.58134963501094]
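This paper analyzes the trajectories of stochastic gradient descent (SGD). We show that SGD avoids strict saddle points/manifolds with probability $1$ for the step-size policies considered.
Paper reference translation (metadata) (2020-06-19T14:11:26Z)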
S-ADDOPT: Decentralized stochastic first-order optimization over directed graphs [16.96562173221624]
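We study decentralized convex optimization for minimizing a sum of smooth cost functions that are distributed over the nodes of a directed network. In particular, we propose the S-ADDOPT algorithm, which assumes a first-order oracle at each node. For decaying step sizes $\mathcal{O}(1/k)$, we show that S-ADDOPT reaches the exact solution at a rate of $\mathcal{O}(1/k)$ and that its convergence is network-independent.
Paper reference translation (metadata) (2020-05-15T21:14:22Z)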
Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
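We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions. In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
Paper reference translation (metadata) (2020-02-10T23:23:04Z)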

The related papers list is generated automatically from the titles and abstracts of the papers on this site.
