Fugu-MT 論文翻訳(概要): On the equivalence of different adaptive batch size selection strategies for stochastic gradient descent methods

論文の概要: On the equivalence of different adaptive batch size selection strategies for stochastic gradient descent methods

arxiv url: http://arxiv.org/abs/2109.10933v1
Date: Wed, 22 Sep 2021 18:01:15 GMT
ステータス: 翻訳完了
システム内更新日: 2021-09-24 15:11:18.509839
Title: On the equivalence of different adaptive batch size selection strategies for stochastic gradient descent methods
Title（参考訳）: 確率勾配降下法における適応バッチサイズ選択戦略の等価性について
Authors: Luis Espath, Sebastian Krumscheid, Ra\'ul Tempone, Pedro Vilanova
Abstract要約: 本研究では, 標準検定と内積/直交検定は, グラディエント・Descent(SGD)法に付随する収束率の点で等価であることを示す。また、内部積/直交性テストは、ベストケースシナリオにおける通常のテストと同じくらい安価であることを示す。
参考スコア（独自算出の注目度）: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this study, we demonstrate that the norm test and inner product/orthogonality test presented in \cite{Bol18} are equivalent in terms of the convergence rates associated with Stochastic Gradient Descent (SGD) methods if $\epsilon^2=\theta^2+\nu^2$ with specific choices of $\theta$ and $\nu$. Here, $\epsilon$ controls the relative statistical error of the norm of the gradient while $\theta$ and $\nu$ control the relative statistical error of the gradient in the direction of the gradient and in the direction orthogonal to the gradient, respectively. Furthermore, we demonstrate that the inner product/orthogonality test can be as inexpensive as the norm test in the best case scenario if $\theta$ and $\nu$ are optimally selected, but the inner product/orthogonality test will never be more computationally affordable than the norm test if $\epsilon^2=\theta^2+\nu^2$. Finally, we present two stochastic optimization problems to illustrate our results.
Abstract（参考訳）: 本研究では,\epsilon^2=\theta^2+\nu^2}\,\theta$ および $\nu$ の特定の選択をした場合の確率的勾配降下 (sgd) 法に関連する収束率の観点から,ノルム検定と内積/直交性試験が等価であることを示す。ここで、$\epsilon$は勾配のノルムの相対統計誤差を制御し、$\theta$と$\nu$は勾配の方向と勾配の直交方向の相対統計誤差をそれぞれ制御する。さらに,もし$\theta$ と $\nu$ が最適に選択されれば,内積/オルトゴナリティテストは最善のケースではノルムテストと同じくらい安価になるが,内積/オルトゴナリティテストは$\epsilon^2=\theta^2+\nu^2$なら計算的に安くなることはない。最後に,2つの確率的最適化問題を提案する。

関連論文リスト

Generalized Gradient Norm Clipping & Non-Euclidean $(L_0,L_1)$-Smoothness [51.302674884611335]
本研究は、急勾配と条件勾配のアプローチを組み合わせることでノルムクリッピングを一般化するハイブリッド非ユークリッド最適化手法を提案する。本稿では、ディープラーニングのためのアルゴリズムのインスタンス化について論じ、画像分類と言語モデリングにおけるそれらの特性を実証する。
論文参考訳（メタデータ） (2025-06-02T17:34:29Z)
Convergence Rate Analysis of LION [54.28350823319057]
LION は、勾配カルシュ=クーン=T (sqrtdK-)$で測定された $cal(sqrtdK-)$ の反復を収束する。従来のSGDと比較して,LIONは損失が小さく,性能も高いことを示す。
論文参考訳（メタデータ） (2024-11-12T11:30:53Z)
DIFF2: Differential Private Optimization via Gradient Differences for Nonconvex Distributed Learning [58.79085525115987]
以前の研究でよく知られたユーティリティ境界は$widetilde O(d2/3/(nvarepsilon_mathrmDP)4/3)$である。本稿では,差分プライベートフレームワークを構築した mphDIFF2 (DIFFerential private via DIFFs) という新しい差分プライベートフレームワークを提案する。大域的な降下を持つ$mphDIFF2は$widetilde O(d2/3/(nvarepsilon_mathrmDP)4/3の効用を達成する
論文参考訳（メタデータ） (2023-02-08T05:19:01Z)
Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
滑らかな凸凹凸結合型サドル点問題, $min_mathbfxmax_mathbfyF(mathbfx) + H(mathbfx,mathbfy)$ を考える。漸進的勾配指数(AG-EG)降下指数アルゴリズムについて述べる。
論文参考訳（メタデータ） (2022-06-17T06:10:20Z)
Stochastic Zeroth order Descent with Structured Directions [10.604744518360464]
我々は, 有限差分法であるStructured Zeroth Order Descent (SSZD)を導入・解析し, 集合 $lleq d 方向の勾配を近似し, $d は周囲空間の次元である。凸凸に対して、すべての$c1/2$に対して$O( (d/l) k-c1/2$)$ 上の関数の収束はほぼ確実に証明する。
論文参考訳（メタデータ） (2022-06-10T14:00:06Z)
Generalization Bounds for Gradient Methods via Discrete and Continuous Prior [8.76346911214414]
次数$O(frac1n + fracL2nsum_t=1T(gamma_t/varepsilon_t)2)$の新たな高確率一般化境界を示す。また、あるSGDの変種に対する新しい境界を得ることもできる。
論文参考訳（メタデータ） (2022-05-27T07:23:01Z)
Provably Efficient Convergence of Primal-Dual Actor-Critic with Nonlinear Function Approximation [15.319335698574932]
The first efficient convergence result with primal-dual actor-critic with a convergence of $mathcalOleft ascent(Nright)Nright)$ under Polyian sample。 Open GymAI連続制御タスクの結果。
論文参考訳（メタデータ） (2022-02-28T15:16:23Z)
A first-order primal-dual method with adaptivity to local smoothness [64.62056765216386]
凸凹対象 $min_x max_y f(x) + langle Ax, yrangle - g*(y)$, ここで、$f$ は局所リプシッツ勾配を持つ凸関数であり、$g$ は凸かつ非滑らかである。主勾配ステップと2段ステップを交互に交互に行うCondat-Vuアルゴリズムの適応バージョンを提案する。
論文参考訳（メタデータ） (2021-10-28T14:19:30Z)
High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails [55.561406656549686]
我々は、勾配推定が末尾を持つ可能性のある一階アルゴリズムを用いたヒルベルト非最適化を考える。本研究では, 勾配, 運動量, 正規化勾配勾配の収束を高確率臨界点に収束させることと, 円滑な損失に対する最もよく知られた繰り返しを示す。
論文参考訳（メタデータ） (2021-06-28T00:17:01Z)
A Variance Controlled Stochastic Method with Biased Estimation for Faster Non-convex Optimization [0.0]
減少勾配(SVRG)の性能を向上させるために, 分散制御勾配(VCSG)という新しい手法を提案する。ラムダ$はVCSGで導入され、SVRGによる分散の過剰還元を避ける。 $mathcalO(min1/epsilon3/2,n1/4/epsilon)$ 勾配評価の数。
論文参考訳（メタデータ） (2021-02-19T12:22:56Z)
A Random Matrix Analysis of Random Fourier Features: Beyond the Gaussian Kernel, a Precise Phase Transition, and the Corresponding Double Descent [85.77233010209368]
本稿では、データサンプルの数が$n$である現実的な環境で、ランダムフーリエ(RFF)回帰の正確さを特徴付けます。この分析はまた、大きな$n,p,N$のトレーニングとテスト回帰エラーの正確な推定も提供する。
論文参考訳（メタデータ） (2020-06-09T02:05:40Z)
Stochastic gradient-free descents [8.663453034925363]
本稿では,最適化問題の解法として,モーメント付き勾配法と加速勾配を提案する。本研究では,これらの手法の収束挙動を平均分散フレームワークを用いて解析する。
論文参考訳（メタデータ） (2019-12-31T13:56:36Z)
Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
適応アルゴリズムは勾配の歴史を用いて勾配を更新し、深層ニューラルネットワークのトレーニングにおいてユビキタスである。本稿では,非コンケーブ最小値問題に対するOptimisticOAアルゴリズムの変種を解析する。実験の結果,適応型GAN非適応勾配アルゴリズムは経験的に観測可能であることがわかった。
論文参考訳（メタデータ） (2019-12-26T22:10:10Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。