Fugu-MT Paper Translation (Abstract): Towards Understanding Adam Convergence on Highly Degenerate Polynomials

Paper summary: Towards Understanding Adam Convergence on Highly Degenerate Polynomials

arxiv url: http://arxiv.org/abs/2603.09581v1
Date: Tue, 10 Mar 2026 12:30:20 GMT
Status: Translation complete
Last updated in system: 2026-03-11 15:25:24.302996
Title: Towards Understanding Adam Convergence on Highly Degenerate Polynomials
Title (reference translation): Toward an Understanding of Adam's Convergence on Highly Degenerate Polynomials
Authors: Zhiwei Bai, Jiajie Zhao, Zhangchen Zhou, Zhi-Qin John Xu, Yaoyu Zhang
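Abstract summary: We study the "natural" auto-convergence properties of Adam. We identify a class of highly degenerate polynomials on which Adam converges automatically without additional schedulers. We prove that Adam achieves local linear convergence on these degenerate functions, significantly outperforming the sub-linear convergence of Gradient Descent and Momentum.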
Reference score (independently calculated attention score): 12.224244942795695
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adam is a widely used optimization algorithm in deep learning, yet the specific class of objective functions where it exhibits inherent advantages remains underexplored. Unlike prior studies requiring external schedulers and $β_2$ near 1 for convergence, this work investigates the "natural" auto-convergence properties of Adam. We identify a class of highly degenerate polynomials where Adam converges automatically without additional schedulers. Specifically, we derive theoretical conditions for local asymptotic stability on degenerate polynomials and demonstrate strong alignment between theoretical bounds and experimental results. We prove that Adam achieves local linear convergence on these degenerate functions, significantly outperforming the sub-linear convergence of Gradient Descent and Momentum. This acceleration stems from a decoupling mechanism between the second moment $v_t$ and squared gradient $g_t^2$, which exponentially amplifies the effective learning rate. Finally, we characterize Adam's hyperparameter phase diagram, identifying three distinct behavioral regimes: stable convergence, spikes, and SignGD-like oscillation.
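Abstract (reference translation): Adam is an optimization algorithm widely used in deep learning, but the specific class of objective functions on which it shows an inherent advantage remains underexplored. Unlike earlier studies, which require external schedulers and $β_2$ close to 1 for convergence, this work studies the "natural" auto-convergence properties of Adam. We identify a class of highly degenerate polynomials on which Adam converges automatically without extra schedulers. Specifically, we derive theoretical conditions for local asymptotic stability on degenerate polynomials and show strong agreement between the theoretical bounds and experimental results. We prove that Adam achieves local linear convergence on these degenerate functions, significantly outperforming the sub-linear convergence of Gradient Descent and Momentum. This acceleration comes from a decoupling mechanism between the second moment $v_t$ and the squared gradient $g_t^2$, which exponentially amplifies the effective learning rate. Finally, we characterize Adam's hyperparameter phase diagram and identify three distinct behavioral regimes: stable convergence, spikes, and SignGD-like oscillation.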
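
The comparison described in the abstract can be explored with a small experiment. The following is a minimal sketch, not the authors' code: it assumes f(x) = x**4 as an illustrative polynomial with a degenerate minimum (the abstract does not give the exact function class), uses the standard Adam update with bias correction and a constant learning rate, and picks hyperparameters purely for illustration. The function names `grad_quartic`, `gd_trajectory`, and `adam_trajectory` are hypothetical.

```python
import math


def grad_quartic(x):
    """Gradient of the assumed example objective f(x) = x**4.

    The second derivative vanishes at the minimizer x = 0, so the
    minimum is degenerate. This choice is an assumption for illustration.
    """
    return 4.0 * x ** 3


def gd_trajectory(grad, x0, lr, steps):
    """Plain gradient descent; returns the iterates [x_0, ..., x_steps]."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
        xs.append(x)
    return xs


def adam_trajectory(grad, x0, lr, beta1, beta2, eps, steps):
    """Standard Adam with bias correction and no learning-rate scheduler.

    Returns the iterates [x_0, ..., x_steps].
    """
    xs = [x0]
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g        # first-moment estimate m_t
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment estimate v_t
        m_hat = m / (1.0 - beta1 ** t)           # bias-corrected m_t
        v_hat = v / (1.0 - beta2 ** t)           # bias-corrected v_t
        x = x - lr * m_hat / (math.sqrt(v_hat) + eps)
        xs.append(x)
    return xs


if __name__ == "__main__":
    steps = 1000
    # Hyperparameters below are illustrative assumptions, not the paper's.
    gd = gd_trajectory(grad_quartic, 1.0, 0.01, steps)
    adam = adam_trajectory(grad_quartic, 1.0, 0.01, 0.9, 0.95, 1e-8, steps)
    print(f"GD   |x_T| = {abs(gd[-1]):.3e}")
    print(f"Adam |x_T| = {abs(adam[-1]):.3e}")
```

Varying `beta1`, `beta2`, and `lr` in this sketch is one way to probe the three regimes the abstract names (stable convergence, spikes, and SignGD-like oscillation). Which regime a given setting falls into is determined by the conditions derived in the paper, which this sketch does not reproduce.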
