Fugu-MT 論文翻訳(概要): Gradient Descent and the Power Method: Exploiting their connection to find the leftmost eigen-pair and escape saddle points

論文の概要: Gradient Descent and the Power Method: Exploiting their connection to find the leftmost eigen-pair and escape saddle points

arxiv url: http://arxiv.org/abs/2211.00866v1
Date: Wed, 2 Nov 2022 04:23:31 GMT
ステータス: 翻訳完了
システム内更新日: 2022-11-03 14:42:29.759527
Title: Gradient Descent and the Power Method: Exploiting their connection to find the leftmost eigen-pair and escape saddle points
Title（参考訳）: 勾配降下とパワー法:その接続を利用して最左端の固有ペアを見つけ、サドルポイントから脱出する
Authors: Rachael Tappenden and Martin Tak\'a\v{c}
Abstract要約: この研究は、(おそらく非)二次函数に一定のステップサイズでグラディエント(GD)を適用することは、勾配上でパワーメソッド(PM)を実行することと等価であることを示している。
参考スコア（独自算出の注目度）: 0.456877715768796
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This work shows that applying Gradient Descent (GD) with a fixed step size to minimize a (possibly nonconvex) quadratic function is equivalent to running the Power Method (PM) on the gradients. The connection between GD with a fixed step size and the PM, both with and without fixed momentum, is thus established. Consequently, valuable eigen-information is available via GD. Recent examples show that GD with a fixed step size, applied to locally quadratic nonconvex functions, can take exponential time to escape saddle points (Simon S. Du, Chi Jin, Jason D. Lee, Michael I. Jordan, Aarti Singh, and Barnabas Poczos: "Gradient descent can take exponential time to escape saddle points"; S. Paternain, A. Mokhtari, and A. Ribeiro: "A newton-based method for nonconvex optimization with fast evasion of saddle points"). Here, those examples are revisited and it is shown that eigenvalue information was missing, so that the examples may not provide a complete picture of the potential practical behaviour of GD. Thus, ongoing investigation of the behaviour of GD on nonconvex functions, possibly with an \emph{adaptive} or \emph{variable} step size, is warranted. It is shown that, in the special case of a quadratic in $R^2$, if an eigenvalue is known, then GD with a fixed step size will converge in two iterations, and a complete eigen-decomposition is available. By considering the dynamics of the gradients and iterates, new step size strategies are proposed to improve the practical performance of GD. Several numerical examples are presented, which demonstrate the advantages of exploiting the GD--PM connection.
Abstract（参考訳）: この研究は、(おそらく非凸な)二次関数を最小化するために、一定のステップサイズで勾配降下(gd)を適用することは、勾配上でパワー法(pm)を実行することと同値であることを示している。これにより、GDとPMとの接続は、固定運動量と非固定運動量の両方で確立される。そのため、貴重な固有情報もgd経由で入手できる。最近の例では、局所二次非凸関数に適用される固定ステップサイズのgdは、サドル点を脱出するのに指数関数的に時間がかかる(simon s. du, chi jin, jason d. lee, michael i. jordan, aarti singh, barnabas poczos: "gradient descend can takes exponential time to escape saddle points"; s. paternain, a. mokhtari, a. ribeiro: "a newton-based method for nonconvex optimization with fast evasion of saddle point")。ここで、これらの例は再検討され、固有値情報が欠落していることが示され、gdの潜在的な実用的振る舞いの完全な図示が得られない。したがって、非凸函数上の gd の挙動(おそらくは \emph{adaptive} または \emph{variable} ステップサイズ)の現在進行中の調査が保証される。 r^2$ の特別な場合において、固有値が知られている場合、固定されたステップサイズを持つ gd は2回のイテレーションで収束し、完全な固有分解が可能であることが示されている。グラデーションとイテレートのダイナミクスを考慮することで,gdの実用性を向上させるための新しいステップサイズ戦略が提案されている。 GD-PM接続を利用する利点を示すいくつかの数値例を示す。

関連論文リスト

Effective Dimension Aware Fractional-Order Stochastic Gradient Descent for Convex Optimization Problems [2.5971517743176915]
データ駆動方式で分数指数を適応する2SED分数次勾配Descent (2SEDFOSGD)を提案する。理論的には、この手法は、na"ive fractional SGD"で観察されるスラグや不安定な振る舞いを伴わない分数記憶の利点を保っている。
論文参考訳（メタデータ） (2025-03-17T22:57:37Z)
Gradient Descent Methods for Regularized Optimization [0.6624754673303327]
勾配降下法(GD法)は、微分可能な対象関数の数値最適化に使用される主要な手法の1つである。 GDのより効果的なバージョンは、近位勾配降下と呼ばれ、ソフトスレッディング(Soft-thresholding)と呼ばれる技術を用いて、イテレーション更新をゼロに縮小する。本稿では, 可変ステップサイズを組み込んだ近位GD法のための新しいアルゴリズムを提案する。
論文参考訳（メタデータ） (2024-12-28T10:54:15Z)
Bridging Discrete and Backpropagation: Straight-Through and Beyond [62.46558842476455]
本稿では,離散潜在変数の生成に関わるパラメータの勾配を近似する新しい手法を提案する。本稿では,Hunの手法とODEを解くための2次数値法を統合することで,2次精度を実現するReinMaxを提案する。
論文参考訳（メタデータ） (2023-04-17T20:59:49Z)
Graph Signal Sampling for Inductive One-Bit Matrix Completion: a Closed-form Solution [112.3443939502313]
グラフ信号解析と処理の利点を享受する統合グラフ信号サンプリングフレームワークを提案する。キーとなる考え方は、各ユーザのアイテムのレーティングをアイテムイットグラフの頂点上の関数(信号)に変換することである。オンライン設定では、グラフフーリエ領域における連続ランダムガウス雑音を考慮したベイズ拡張(BGS-IMC)を開発する。
論文参考訳（メタデータ） (2023-02-08T08:17:43Z)
Convergence of Batch Stochastic Gradient Descent Methods with Approximate Gradients and/or Noisy Measurements: Theory and Computational Results [0.9900482274337404]
BSGD(Block Gradient Descent)と呼ばれる非常に一般的な定式化を用いた凸最適化の研究我々は近似理論に基づいて,BSGDが世界最小値に収束する条件を確立する。近似勾配を用いると、BSGDは収束し、運動量に基づく手法は分岐できることを示す。
論文参考訳（メタデータ） (2022-09-12T16:23:15Z)
Formal guarantees for heuristic optimization algorithms used in machine learning [6.978625807687497]
グラディエント・Descent(SGD)とその変種は、大規模最適化機械学習(ML)問題において支配的な手法となっている。本稿では,いくつかの凸最適化手法の形式的保証と改良アルゴリズムの提案を行う。
論文参考訳（メタデータ） (2022-07-31T19:41:22Z)
High-dimensional limit theorems for SGD: Effective dynamics and critical scaling [6.950316788263433]
我々は、勾配降下(SGD)の要約統計の軌跡に対する極限定理を証明する。下記の有効弾道力学が人口減少の勾配流と一致するステップサイズにおける重要なスケーリング体制を示す。この実効力学の固定点について、対応する拡散極限は極めて複雑であり、さらに退化することもある。
論文参考訳（メタデータ） (2022-06-08T17:42:18Z)
Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression [122.70478935214128]
勾配降下(SGD)は、多くのディープラーニングアプリケーションでよく一般化されている。本稿では, 崩壊段階のSGDの最終反復リスク境界に関する問題依存解析を行う。
論文参考訳（メタデータ） (2021-10-12T17:49:54Z)
Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate [105.62979485062756]
本稿では,中等度学習におけるSGDの特定の正規化効果を特徴付けることを試みる。 SGDはデータ行列の大きな固有値方向に沿って収束し、GDは小さな固有値方向に沿って収束することを示す。
論文参考訳（メタデータ） (2020-11-04T21:07:52Z)
A Unified Analysis of First-Order Methods for Smooth Games via Integral Quadratic Constraints [10.578409461429626]
本研究では、滑らかで強可変なゲームやイテレーションのための一階法に積分二次的制約理論を適用する。我々は、負の運動量法(NM)に対して、既知の下界と一致する複雑性$mathcalO(kappa1.5)$で、初めて大域収束率を与える。一段階のメモリを持つアルゴリズムでは,バッチ毎に1回だけ勾配を問合せすれば,高速化は不可能であることを示す。
論文参考訳（メタデータ） (2020-09-23T20:02:00Z)
On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems [75.58134963501094]
本稿では,勾配降下(SGD)の軌跡を解析する。我々はSGDが厳格なステップサイズポリシーのために1ドルでサドルポイント/マニフォールドを避けることを示す。
論文参考訳（メタデータ） (2020-06-19T14:11:26Z)
Carath\'eodory Sampling for Stochastic Gradient Descent [79.55586575988292]
本稿では,Tchakaloff と Carath'eodory の古典的な結果から着想を得た手法を提案する。我々は、測定値の低減を行う降下ステップを適応的に選択する。これをBlock Coordinate Descentと組み合わせることで、測定の削減を極めて安価に行えるようにします。
論文参考訳（メタデータ） (2020-06-02T17:52:59Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。