Fugu-MT Paper Translation (Abstract): ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization

Paper abstract: ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization

arxiv url: http://arxiv.org/abs/2006.07065v1
Date: Fri, 12 Jun 2020 10:39:25 GMT
Status: Translation complete
Last updated in system: 2022-11-22 04:45:31.738351
Title: ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization
Title (reference translation): ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization
Authors: Xunpeng Huang, Runxin Xu, Hao Zhou, Zhe Wang, Zhengyang Liu and Lei Li
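Abstract summary: Stochastic gradient descent (SGD) remains the most widely used optimization method despite its slow convergence. Adaptive methods have attracted growing attention from the optimization and machine learning communities. Taking the best of both worlds is the most exciting and challenging question in the field of optimization for machine learning.
Reference score (independently computed attention score): 27.11997724023977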
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Due to its simplicity and outstanding ability to generalize, stochastic gradient descent (SGD) is still the most widely used optimization method despite its slow convergence. Meanwhile, adaptive methods have attracted rising attention of optimization and machine learning communities, both for the leverage of life-long information and for the profound and fundamental mathematical theory. Taking the best of both worlds is the most exciting and challenging question in the field of optimization for machine learning. Along this line, we revisited existing adaptive gradient methods from a novel perspective, refreshing understanding of second moments. Our new perspective empowers us to attach the properties of second moments to the first moment iteration, and to propose a novel first moment optimizer, Angle-Calibrated Moment method (ACMo). Our theoretical results show that ACMo is able to achieve the same convergence rate as mainstream adaptive methods. Furthermore, extensive experiments on CV and NLP tasks demonstrate that ACMo has a comparable convergence to SOTA Adam-type optimizers, and gains a better generalization performance in most cases.
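Abstract (reference translation): Owing to its simplicity and generalization ability, stochastic gradient descent (SGD) is the most widely used optimization method despite its slow convergence. Meanwhile, adaptive methods have attracted the attention of the optimization and machine learning communities, both for their use of life-long information and for their profound and fundamental mathematical theory. Taking the best of both worlds is the most exciting and difficult question in the field of optimization for machine learning. In this work, we revisit existing adaptive gradient methods from a new perspective and deepen the understanding of second moments. The new perspective lets us attach the properties of second moments to the first-moment iteration and propose a new first-moment optimizer, the Angle-Calibrated Moment method (ACMo). Theoretical results show that ACMo achieves the same convergence rate as mainstream adaptive methods. Furthermore, extensive experiments on CV and NLP tasks show that ACMo converges comparably to SOTA Adam-type optimizers and attains better generalization performance in most cases.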
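For readers unfamiliar with the terms "first moment" and "second moment" used in the abstract, the following is a minimal sketch of two standard updates on a scalar parameter: SGD with momentum (first moment only, in its exponential-moving-average form) and Adam (first and second moments). It is not ACMo, whose update rule is not given on this page. The function names and default hyperparameters are illustrative assumptions.

```python
import math


def sgd_momentum_step(theta, grad, m, lr=0.1, beta=0.9):
    """One SGD-with-momentum step (EMA form): tracks only a first moment m."""
    m = beta * m + (1.0 - beta) * grad   # running average of gradients
    theta = theta - lr * m               # step size scales with gradient size
    return theta, m


def adam_step(theta, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: first moment m, second moment v, step counter t >= 1."""
    m = beta1 * m + (1.0 - beta1) * grad          # first moment (mean of g)
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second moment (mean of g^2)
    m_hat = m / (1.0 - beta1 ** t)                # bias corrections for the
    v_hat = v / (1.0 - beta2 ** t)                # zero-initialised averages
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


if __name__ == "__main__":
    # Compare the first step from theta = 5.0 under a small and a large gradient.
    for g in (10.0, 1000.0):
        theta_sgd, _ = sgd_momentum_step(5.0, g, m=0.0)
        theta_adam, _, _ = adam_step(5.0, g, m=0.0, v=0.0, t=1)
        print(f"grad={g}: sgd -> {theta_sgd}, adam -> {theta_adam}")
```

On the first step, dividing by the square root of the second moment makes Adam move by roughly `lr` regardless of the gradient's scale, while the momentum step grows in proportion to the gradient. This normalizing effect is the kind of second-moment property that the abstract says ACMo attaches to a first-moment iteration.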

Related papers

Revisiting the Initial Steps in Adaptive Gradient Descent Optimization [6.468625143772815]
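Adaptive gradient optimization methods such as Adam are widely used to train deep neural networks across a variety of machine learning tasks. These methods often suffer from suboptimal generalization compared with stochastic gradient descent (SGD) and exhibit instability. The second-order moment estimate is initialized with non-zero values.
Paper reference translation (metadata) (2024-12-03T04:28:14Z)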
A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
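We address general meta-learning problems that are widespread in modern deep learning. These problems are often formulated as Bi-Level Optimizations (BLO). We introduce a new perspective by transforming a given BLO problem into a stochastic optimization in which the inner loss function becomes a smooth distribution and the outer loss becomes the expected loss over the inner distribution.
Paper reference translation (metadata) (2024-10-14T12:10:06Z)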
ODE-based Learning to Optimize [28.380622776436905]
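We propose a comprehensive framework that integrates the inertial system with Hessian-driven damping equation (ISHD). We formulate a novel learning to optimize (L2O) problem that aims to minimize the stopping time subject to convergence and stability conditions. The framework is validated empirically through extensive numerical experiments on a diverse set of optimization problems.
Paper reference translation (metadata) (2024-06-04T06:39:45Z)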
ELRA: Exponential learning rate adaption gradient descent optimization method [83.88591755871734]
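We propose a fast (exponential rate), ab initio (hyperparameter-free) gradient-based adaptation method. The main idea of the method is to adapt $\alpha$ through situational awareness. It can be applied to problems of any dimension n and scales only linearly.
Paper reference translation (metadata) (2023-09-12T14:36:13Z)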
Adapting Stepsizes by Momentumized Gradients Improves Optimization and Generalization [89.66571637204012]
AdaMomentum on vision, and achieves state-of-the-art results on other tasks including language processing.
Paper reference translation (metadata) (2021-06-22T03:13:23Z)
SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients [99.13839450032408]
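It is desirable to design a universal framework for adaptive algorithms that solve general problems. In particular, our framework provides support for adaptive methods under the nonconvex setting.
Paper reference translation (metadata) (2021-06-15T15:16:28Z)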
A Discrete Variational Derivation of Accelerated Methods in Optimization [68.8204255655161]
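We introduce a variational approach from which different optimization methods can be derived. We derive two families of optimization methods in one-to-one correspondence. The preservation of symplecticity of autonomous systems occurs here only on the fibers.
Paper reference translation (metadata) (2021-06-04T20:21:53Z)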
Leveraging Non-uniformity in First-order Non-convex Optimization [93.6817946818977]
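Non-uniform refinement of objective functions leads to Non-uniform Smoothness (NS) and the Non-uniform Łojasiewicz inequality (NL). The new definitions inspire new geometric first-order methods that converge to global optimality faster than the classical $\Omega(1/t^2)$ lower bounds.
Paper reference translation (metadata) (2021-05-13T04:23:07Z)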
Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate in Gradient Descent [20.47598828422897]
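We propose Meta-Regularization, a novel approach to the adaptive choice of the learning rate in first-order descent methods. The method modifies the objective function by adding a regularization term and casts the parameter updates as a joint process.
Paper reference translation (metadata) (2021-04-12T13:13:34Z)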
Acceleration Methods [57.202881673406324]
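We first introduce two acceleration methods using quadratic optimization problems. We discuss momentum methods in detail, starting with the seminal work of Nesterov. We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates.
Paper reference translation (metadata) (2021-01-23T17:58:25Z)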
Adaptive Gradient Methods for Constrained Convex Optimization and Variational Inequalities [32.51470158863247]
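AdaACSA and AdaAGD+ are accelerated methods for constrained convex optimization. We complement them with a simpler algorithm, AdaGrad+, which enjoys the same features and achieves the standard non-accelerated convergence rate.
Paper reference translation (metadata) (2020-07-17T09:10:21Z)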
Adaptive Gradient Methods Can Be Provably Faster than SGD after Finite Epochs [25.158203665218164]
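We show that adaptive gradient methods are faster than random-shuffling SGD after finite epochs. To the best of our knowledge, this is the first result showing that adaptive gradient methods are faster than SGD after finite epochs.
Paper reference translation (metadata) (2020-06-12T09:39:47Z)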
Adaptive First-and Zeroth-order Methods for Weakly Convex Stochastic Optimization Problems [12.010310883787911]
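We analyze a new family of adaptive subgradient methods for solving an important class of weakly convex (possibly nonsmooth) optimization problems. Experimental results show that the proposed algorithms empirically outperform zeroth-order gradient descent and its design variants.
Paper reference translation (metadata) (2020-05-19T07:44:52Z)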

The related papers list is generated automatically from the titles and abstracts of papers on this site.

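This page shows information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and assumes no responsibility for any consequences arising from the use of this site (including all information and translations).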