Fugu-MT 論文翻訳(概要): Uniform Generalization Bound on Time and Inverse Temperature for Gradient Descent Algorithm and its Application to Analysis of Simulated Annealing

論文の概要: Uniform Generalization Bound on Time and Inverse Temperature for Gradient Descent Algorithm and its Application to Analysis of Simulated Annealing

arxiv url: http://arxiv.org/abs/2205.12959v1
Date: Wed, 25 May 2022 01:21:27 GMT
ステータス: 翻訳完了
システム内更新日: 2022-05-27 13:56:26.002897
Title: Uniform Generalization Bound on Time and Inverse Temperature for Gradient Descent Algorithm and its Application to Analysis of Simulated Annealing
Title（参考訳）: 勾配降下アルゴリズムのための時間と逆温度に束縛された一様一般化とシミュレートアニーリング解析への応用
Authors: Keisuke Suzuki
Abstract要約: 勾配ランゲヴィン力学の時間と逆温度に縛られる新しい一様一般化を提案する。応用一般化として、有効性の評価を示す。
参考スコア（独自算出の注目度）: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we propose a novel uniform generalization bound on the time and inverse temperature for stochastic gradient Langevin dynamics (SGLD) in a non-convex setting. While previous works derive their generalization bounds by uniform stability, we use Rademacher complexity to make our generalization bound independent of the time and inverse temperature. Using Rademacher complexity, we can reduce the problem to derive a generalization bound on the whole space to that on a bounded region and therefore can remove the effect of the time and inverse temperature from our generalization bound. As an application of our generalization bound, an evaluation on the effectiveness of the simulated annealing in a non-convex setting is also described. For the sample size $n$ and time $s$, we derive evaluations with orders $\sqrt{n^{-1} \log (n+1)}$ and $|(\log)^4(s)|^{-1}$, respectively. Here, $(\log)^4$ denotes the $4$ times composition of the logarithmic function.
Abstract（参考訳）: 本稿では,非凸環境における確率勾配ランジュバンダイナミクス(sgld)の時間と逆温度に束縛された新しい一様一般化を提案する。以前の研究は、その一般化境界は均一な安定性によって導かれるが、Rademacher複雑性を用いて、一般化を時間と逆温度に依存しないものにしている。ラデマッハ複雑性を用いることで、空間全体に束縛された一般化から有界領域上の一般化を導出する問題を低減でき、したがって我々の一般化境界から時間と逆温度の影響を取り除くことができる。この一般化の適用例として、非凸設定におけるシミュレーションアニーリングの有効性について評価する。サンプルサイズ $n$ と time $s$ に対して、それぞれ$\sqrt{n^{-1} \log (n+1)}$ と $|(\log)^4(s)|^{-1}$ で評価を導出する。ここで、$(\log)^4$は対数関数の4ドル倍の構成を表す。

関連論文リスト

Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation [22.652143194356864]
ステップサイズが一定となる勾配勾配(SGD)アルゴリズムを用いて, 強い凸と滑らかな問題を解く問題に対処する。得られた推定子の平均二乗誤差を$n$の反復数に対して拡張する。我々は、この鎖が定義された重み付きワッサーシュタイン半計量に関して幾何学的にエルゴード的であることを確証する。
論文参考訳（メタデータ） (2024-10-07T15:02:48Z)
Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent [63.43247232708004]
非同期で実行される勾配降下は、大規模機械学習モデルのトレーニングにおいて重要な役割を果たす。既存の一般化誤差境界は悲観的であり、非同期遅延と一般化の相関を明らかにすることはできない。我々の理論的結果は、非同期遅延は遅延SGDアルゴリズムの一般化誤差を低減することを示唆している。
論文参考訳（メタデータ） (2023-08-18T10:00:27Z)
Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization [50.83356836818667]
勾配ランゲヴィン・ダイナミクスは非エプス最適化問題を解くための最も基本的なアルゴリズムの1つである。本稿では、このタイプの2つの変種、すなわち、分散還元ランジュバンダイナミクスと再帰勾配ランジュバンダイナミクスを示す。
論文参考訳（メタデータ） (2022-03-30T11:39:00Z)
Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems [98.34292831923335]
オンライン相関解析の問題から,emphStochastic Scaled-Gradient Descent (SSD)アルゴリズムを提案する。我々はこれらのアイデアをオンライン相関解析に適用し、局所収束率を正規性に比例した最適な1時間スケールのアルゴリズムを初めて導いた。
論文参考訳（メタデータ） (2021-12-29T18:46:52Z)
Mean-Square Analysis with An Application to Optimal Dimension Dependence of Langevin Monte Carlo [60.785586069299356]
この研究は、2-ワッサーシュタイン距離におけるサンプリング誤差の非同相解析のための一般的な枠組みを提供する。我々の理論解析は数値実験によってさらに検証される。
論文参考訳（メタデータ） (2021-09-08T18:00:05Z)
High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails [55.561406656549686]
我々は、勾配推定が末尾を持つ可能性のある一階アルゴリズムを用いたヒルベルト非最適化を考える。本研究では, 勾配, 運動量, 正規化勾配勾配の収束を高確率臨界点に収束させることと, 円滑な損失に対する最もよく知られた繰り返しを示す。
論文参考訳（メタデータ） (2021-06-28T00:17:01Z)
Simulated annealing from continuum to discretization: a convergence analysis via the Eyring--Kramers law [10.406659081400354]
連続時間アニーリング$(X_t;, t ge 0)$とその離散化$(x_k;, k =0,1, ldots)$の収束率について検討する。我々は、テール確率 $mathbbP(f(X_t) > min f +delta)$ (resp. $mathP(f(x_k) > min f +delta)$) が時間内に崩壊することを証明する(累積ステップサイズでは、resp. $mathP(f(x_k) > min f +delta)$)。
論文参考訳（メタデータ） (2021-02-03T23:45:39Z)
Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling [110.88857917726276]
我々は,非log-concaveとなる分布のクラスからサンプリングするために,勾配ランゲヴィンダイナミクス(SGLD)の新たな収束解析を行う。我々のアプローチの核心は、補助的時間反転型マルコフ連鎖を用いたSGLDのコンダクタンス解析である。
論文参考訳（メタデータ） (2020-10-19T15:23:18Z)
Learning Complexity of Simulated Annealing [23.678288517578764]
本研究では,温度変化の基準,すなわち冷却スケジュールについて検討する。サンプルの複雑さとシミュレーションの複雑さの両面から肯定的な結果が得られる。本稿では,学習問題に対する証明可能な保証付き時間アルゴリズムを提案する。
論文参考訳（メタデータ） (2020-03-06T00:49:03Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。