Fugu-MT paper translation (abstract): Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods

Paper abstract: Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods

arxiv url: http://arxiv.org/abs/2510.10777v1
Date: Sun, 12 Oct 2025 19:39:41 GMT
Status: translation complete
Last updated in system: 2025-10-14 18:06:30.103827
Title: Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods
Title (reference translation): Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods
Authors: Andrey Veprikov, Arman Bolatov, Samuel Horváth, Aleksandr Beznosikov, Martin Takáč, Slavomir Hanzely
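Abstract summary: This paper proposes a unified framework that generalizes steepest descent, quasi-Newton, and adaptive methods through the novel notion of preconditioned matrix norms. Within this framework, it provides the first systematic treatment of affine and scale invariance in the matrix-parameterized setting. It introduces two new methods, $\texttt{MuAdam}$ and $\texttt{MuAdam-SANIA}$, which combine the spectral geometry of Muon with Adam-style preconditioning.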
Reference score (independently computed attention score): 50.070182958880146
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Optimization lies at the core of modern deep learning, yet existing methods often face a fundamental trade-off between adapting to problem geometry and leveraging curvature utilization. Steepest descent algorithms adapt to different geometries through norm choices but remain strictly first-order, whereas quasi-Newton and adaptive optimizers incorporate curvature information but are restricted to Frobenius geometry, limiting their applicability across diverse architectures. In this work, we propose a unified framework generalizing steepest descent, quasi-Newton methods, and adaptive methods through the novel notion of preconditioned matrix norms. This abstraction reveals that widely used optimizers such as SGD and Adam, as well as more advanced approaches like Muon and KL-Shampoo, and recent hybrids including SOAP and SPlus, all emerge as special cases of the same principle. Within this framework, we provide the first systematic treatment of affine and scale invariance in the matrix-parameterized setting, establishing necessary and sufficient conditions under generalized norms. Building on this foundation, we introduce two new methods, $\texttt{MuAdam}$ and $\texttt{MuAdam-SANIA}$, which combine the spectral geometry of Muon with Adam-style preconditioning. Our experiments demonstrate that these optimizers are competitive with, and in some cases outperform, existing state-of-the-art methods. Our code is available at https://github.com/brain-lab-research/LIB/tree/quasi_descent
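Abstract (reference translation): Optimization is at the core of modern deep learning, yet existing methods often face a fundamental trade-off between adapting to problem geometry and exploiting curvature information. Steepest descent algorithms adapt to different geometries through the choice of norm but remain strictly first-order, whereas quasi-Newton and adaptive optimizers incorporate curvature information but are restricted to Frobenius geometry, which limits their applicability across diverse architectures. This work proposes a unified framework that generalizes steepest descent, quasi-Newton, and adaptive methods through the notion of preconditioned matrix norms. This abstraction shows that widely used optimizers such as SGD and Adam, more advanced approaches such as Muon and KL-Shampoo, and recent hybrids such as SOAP and SPlus all emerge as special cases of the same principle. Within this framework, the authors provide the first systematic treatment of affine and scale invariance in the matrix-parameterized setting, establishing necessary and sufficient conditions under generalized norms. Building on this foundation, they introduce two new methods, $\texttt{MuAdam}$ and $\texttt{MuAdam-SANIA}$, which combine the spectral geometry of Muon with Adam-style preconditioning. Experiments show that these optimizers are competitive with, and in some cases outperform, existing state-of-the-art methods. The code is available at https://github.com/brain-lab-research/LIB/tree/quasi_descent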
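Illustrative note: the abstract states that steepest descent adapts to different geometries through the choice of norm. The minimal sketch below shows that idea for three standard norms, using the well-known fact that the steepest descent direction is the minimizer of the inner product with the gradient over the unit ball of the norm. It is not taken from the paper or its repository, and it does not implement preconditioned norms, MuAdam, or MuAdam-SANIA. The function name `steepest_descent_direction` and the norm labels are hypothetical, chosen only for this example.

```python
import numpy as np


def steepest_descent_direction(grad, norm="frobenius"):
    """Return a minimizer of <grad, D> over the unit ball of the chosen norm.

    Hypothetical helper for illustration only; not the paper's API.
    """
    if norm == "frobenius":
        # Unit Frobenius ball: the normalized negative gradient.
        return -grad / (np.linalg.norm(grad) + 1e-12)
    if norm == "spectral":
        # Unit spectral ball: negative orthogonal factor U V^T of the
        # reduced SVD grad = U S V^T (the geometry associated with Muon).
        u, _, vt = np.linalg.svd(grad, full_matrices=False)
        return -u @ vt
    if norm == "max":
        # Unit ball of the elementwise max norm: sign descent.
        return -np.sign(grad)
    raise ValueError(f"unknown norm: {norm}")


# Toy gradient with singular values 3 and 1.
G = np.array([[3.0, 0.0], [0.0, 1.0]])
for name in ("frobenius", "spectral", "max"):
    D = steepest_descent_direction(G, name)
    print(name, float(np.sum(G * D)))
```

For each norm, the printed inner product equals minus the corresponding dual norm of the gradient, which is one way to see that the same gradient yields different update directions under different geometries.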


Related papers