Fugu-MT Paper Translation (Abstract): An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes

Paper overview: An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes

arxiv url: http://arxiv.org/abs/2407.04358v1
Date: Fri, 5 Jul 2024 08:53:06 GMT
Status: Translation complete
Last updated in system: 2024-07-08 14:00:02.030018
Title: An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes
Authors: Antonio Orvieto, Lin Xiao
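Abstract summary: In machine learning applications, each loss function is non-negative and can be expressed as the composition of a square and its real-valued square root. This paper shows how to apply the Gauss-Newton or Levenberg-Marquardt method to minimize the average of smooth, non-negative functions.
Reference score (attention level computed by this site): 17.804065824245402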
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We consider the problem of minimizing the average of a large number of smooth but possibly non-convex functions. In the context of most machine learning applications, each loss function is non-negative and thus can be expressed as the composition of a square and its real-valued square root. This reformulation allows us to apply the Gauss-Newton method, or the Levenberg-Marquardt method when adding a quadratic regularization. The resulting algorithm, while being computationally as efficient as the vanilla stochastic gradient method, is highly adaptive and can automatically warm up and decay the effective stepsize while tracking the non-negative loss landscape. We provide a tight convergence analysis, leveraging new techniques, in the stochastic convex and non-convex settings. In particular, in the convex case, the method does not require access to the gradient Lipschitz constant for convergence, and is guaranteed to never diverge. The convergence rates and empirical evaluations compare favorably to the classical (stochastic) gradient method as well as to several other adaptive methods.
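The reformulation described in the abstract can be made concrete with a small sketch. Writing a non-negative loss as f = r^2 with r = sqrt(f), linearizing r around the current point, and adding a quadratic penalty ||p||^2 / (2 * sigma) on the step p gives, after a short calculation, a gradient step with stepsize gamma = sigma / (1 + sigma * ||grad f(x)||^2 / (2 * f(x))). This formula is derived here from the abstract's description and is an assumption about the method's exact form, not a quotation from the paper. The names `ngn_stepsize`, `run_ngn_sgd`, and `sigma`, the least-squares losses, and the toy data are all hypothetical choices made for illustration.

```python
import random


def ngn_stepsize(loss, grad_sq_norm, sigma):
    """Stepsize from a regularized Gauss-Newton model of a non-negative loss.

    Assumed form: write f = r**2 with r = sqrt(f), linearize r, and add
    the penalty ||p||**2 / (2 * sigma) on the step p.
    """
    if loss <= 0.0:
        # A smooth non-negative f is at its minimum here, so its gradient
        # is zero and the step length does not matter.
        return 0.0
    return sigma / (1.0 + sigma * grad_sq_norm / (2.0 * loss))


def sample_loss_and_grad(x, a, b):
    """Loss 0.5 * (a.x - b)**2 of one sample, and its gradient."""
    residual = sum(ai * xi for ai, xi in zip(a, x)) - b
    return 0.5 * residual ** 2, [residual * ai for ai in a]


def run_ngn_sgd(data, x0, sigma, steps, seed=0):
    """Stochastic gradient steps using the stepsize above (illustrative)."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        a, b = rng.choice(data)  # draw one sample uniformly at random
        loss, grad = sample_loss_and_grad(x, a, b)
        gamma = ngn_stepsize(loss, sum(g * g for g in grad), sigma)
        x = [xi - gamma * gi for xi, gi in zip(x, grad)]
    return x


def average_loss(x, data):
    """Average of the per-sample losses at x."""
    return sum(sample_loss_and_grad(x, a, b)[0] for a, b in data) / len(data)


# Toy data (hypothetical): every sample is fit exactly by x = (1, -2).
data = [
    ((1.0, 0.0), 1.0),
    ((0.0, 1.0), -2.0),
    ((1.0, 1.0), -1.0),
    ((2.0, -1.0), 4.0),
]
x_final = run_ngn_sgd(data, x0=(0.0, 0.0), sigma=100.0, steps=500)
print(x_final, average_loss(x_final, data))
```

Under this assumed form the stepsize never exceeds `sigma`, and it shrinks automatically when the squared gradient norm is large relative to the loss, so a deliberately large `sigma` such as the one above does not need manual tuning on this toy problem.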

Related papers

Symmetric Rank-One Quasi-Newton Methods for Deep Learning Using Cubic Regularization [0.5120567378386615]
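First-order descent methods and other first-order variants, such as Adam and AdaGrad, are commonly used in deep learning. However, these methods do not exploit curvature information. Quasi-Newton methods reuse previously computed Hessian approximations.
Reference translation of paper (metadata) (2025-02-17T20:20:11Z)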
Fast Unconstrained Optimization via Hessian Averaging and Adaptive Gradient Sampling Methods [0.3222802562733786]
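We consider minimizing finite-sum and expectation objective functions via subsampled Newton methods based on Hessian averaging. These methods are inexact and incur a fixed Hessian approximation cost. This paper presents new analysis techniques and discusses challenges for their practical implementation.
Reference translation of paper (metadata) (2024-08-14T03:27:48Z)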
A randomized algorithm for nonconvex minimization with inexact evaluations and complexity guarantees [7.08249229857925]
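We consider minimizing a smooth nonconvex function with inexact oracle access to the gradient and Hessian. A novel feature of the proposed method is that, when an approximate direction of negative curvature is chosen, its sense is chosen to be positive or negative with equal probability.
Reference translation of paper (metadata) (2023-10-28T22:57:56Z)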
Convex and Non-convex Optimization Under Generalized Smoothness [69.69521650503431]
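Analyses of convex and non-convex optimization methods often require a Lipschitz gradient, which restricts the scope of the analysis. Recent work generalizes this setting through a non-uniform smoothness condition.
Reference translation of paper (metadata) (2023-06-02T04:21:59Z)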
Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation Constrained Optimization [88.0031283949404]
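Many real-world problems have complicated nonconvex functional constraints and use a large number of data points. The proposed method outperforms an existing method that has the previously best-known result.
Reference translation of paper (metadata) (2022-12-19T14:48:54Z)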
Convergence of First-Order Methods for Constrained Nonconvex Optimization with Dependent Data [7.513100214864646]
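We show a worst-case convergence rate of $\tilde{O}(t^{-1/4})$ and a complexity of $\tilde{O}(\varepsilon^{-4})$, measured via the Moreau envelope, for smooth nonconvex optimization. We obtain the first online nonnegative matrix factorization algorithms for dependent data, based on projected gradient methods with adaptive step sizes and an optimal rate of convergence.
Reference translation of paper (metadata) (2022-03-29T17:59:10Z)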
Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
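Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation. Differentiability is a desirable property because it admits the possibility of optimizing the marginal likelihood as an objective. We propose a differentiable algorithm by abandoning the Metropolis-Hastings steps, which further unlocks mini-batch computation.
Reference translation of paper (metadata) (2021-07-21T17:10:14Z)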
Constrained and Composite Optimization via Adaptive Sampling Methods [3.4219044933964944]
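The motivation for this paper is to develop an adaptive sampling method for solving constrained optimization problems. The method proposed in this paper is a proximal gradient method that can also be applied to the composite optimization problem min f(x) + h(x), where h is convex (but not necessarily differentiable).
Reference translation of paper (metadata) (2020-12-31T02:50:39Z)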
Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
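This work revisits zeroth-order (ZO) optimization, which does not require first-order information. We show that, with a graceful design of coordinate importance sampling, the ZO optimization method is efficient in terms of both complexity and function query cost.
Reference translation of paper (metadata) (2020-12-21T17:29:58Z)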
Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
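Adaptive algorithms perform updates using the history of gradients and are ubiquitous in training deep neural networks. This paper analyzes a variant of the OptimisticOA algorithm for non-concave min-max problems. Experiments show that, in GAN training, the difference between adaptive and non-adaptive gradient algorithms can be observed empirically.
Reference translation of paper (metadata) (2019-12-26T22:10:10Z)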
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization [80.03647903934723]
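We prove that adaptive gradient methods converge in expectation. Our analysis sheds light on adaptive gradient methods for optimizing nonconvex objectives.
Reference translation of paper (metadata) (2018-08-16T20:25:28Z)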

The related papers list is generated automatically from the titles and abstracts of papers on this site.
