Fugu-MT Paper Translation (Abstract): ODE approximation for the Adam algorithm: General and overparametrized setting

Paper abstract: ODE approximation for the Adam algorithm: General and overparametrized setting

arxiv url: http://arxiv.org/abs/2511.04622v1
Date: Thu, 06 Nov 2025 18:15:41 GMT
Status: translation complete
Last updated in system: 2025-11-07 20:17:53.554057
Title: ODE approximation for the Adam algorithm: General and overparametrized setting
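Title (reference translation): ODE approximation for the Adam algorithm: general and overparametrized setting
Authors: Steffen Dereich, Arnulf Jentzen, Sebastian Kassing
Abstract summary: We show that the Adam algorithm is an asymptotic pseudo-trajectory of the flow of a particular vector field. We also show that, in a neighborhood of the global minima, the objective function serves as a Lyapunov function for the flow induced by the Adam vector field.
Reference score (independently computed attention metric): 2.765561545873517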
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Adam optimizer is currently presumably the most popular optimization method in deep learning. In this article we develop an ODE based method to study the Adam optimizer in a fast-slow scaling regime. For fixed momentum parameters and vanishing step-sizes, we show that the Adam algorithm is an asymptotic pseudo-trajectory of the flow of a particular vector field, which is referred to as the Adam vector field. Leveraging properties of asymptotic pseudo-trajectories, we establish convergence results for the Adam algorithm. In particular, in a very general setting we show that if the Adam algorithm converges, then the limit must be a zero of the Adam vector field, rather than a local minimizer or critical point of the objective function. In contrast, in the overparametrized empirical risk minimization setting, the Adam algorithm is able to locally find the set of minima. Specifically, we show that in a neighborhood of the global minima, the objective function serves as a Lyapunov function for the flow induced by the Adam vector field. As a consequence, if the Adam algorithm enters a neighborhood of the global minima infinitely often, it converges to the set of global minima.
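Abstract (reference translation): The Adam optimizer is presumably the most popular optimization method in deep learning at present. In this article, we develop an ODE-based method for studying the Adam optimizer in a fast-slow scaling regime. For fixed momentum parameters and vanishing step sizes, we show that the Adam algorithm is an asymptotic pseudo-trajectory of the flow of a particular vector field, which we call the Adam vector field. Using properties of asymptotic pseudo-trajectories, we establish convergence results for the Adam algorithm. In particular, in a very general setting, we show that if the Adam algorithm converges, its limit must be a zero of the Adam vector field rather than a local minimizer or critical point of the objective function. By contrast, in the overparametrized empirical risk minimization setting, the Adam algorithm can locally find the set of minima. Specifically, we show that in a neighborhood of the global minima, the objective function serves as a Lyapunov function for the flow induced by the Adam vector field. Consequently, if the Adam algorithm enters a neighborhood of the global minima infinitely often, it converges to the set of global minima.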
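For readers who want a concrete picture of the regime the abstract describes (fixed momentum parameters, vanishing step sizes), the following is a minimal sketch of the commonly used Adam update on a one-dimensional quadratic. It is an illustration only, not the construction analyzed in the paper: the bias-corrected update form, the step-size schedule `gamma0 / sqrt(n)`, the toy objective `(x - 3)^2`, and all names (`adam_minimize`, `gamma0`, `theta_final`) are assumptions introduced here, and the paper's Adam vector field is not reproduced.

```python
import math


def adam_minimize(grad, theta0, steps=20000, beta1=0.9, beta2=0.999,
                  eps=1e-8, gamma0=0.1):
    """Scalar Adam with fixed momentum parameters (beta1, beta2) and
    vanishing step sizes gamma_n = gamma0 / sqrt(n).

    The schedule and the bias-corrected update are illustrative
    assumptions, not taken from the paper summarized above.
    """
    theta, m, v = theta0, 0.0, 0.0
    for n in range(1, steps + 1):
        g = grad(theta)
        # Exponential moving averages of the gradient and its square.
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction for the zero initialization of m and v.
        m_hat = m / (1.0 - beta1 ** n)
        v_hat = v / (1.0 - beta2 ** n)
        # Step size that tends to zero as n grows.
        gamma = gamma0 / math.sqrt(n)
        theta -= gamma * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Hypothetical toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
theta_final = adam_minimize(lambda x: 2.0 * (x - 3.0), theta0=0.0)
print(theta_final)
```

In this deterministic toy problem the unique minimizer is `x = 3`, so the iterate is expected to end up near it. The sketch does not exhibit the more subtle behavior in the general setting, where, according to the abstract, a limit need only be a zero of the Adam vector field.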


List of related papers