Fugu-MT paper translation (abstract): Pseudospectral Bounds for Transient Amplification in Coupled Gradient Descent

Paper summary: Pseudospectral Bounds for Transient Amplification in Coupled Gradient Descent

arxiv url: http://arxiv.org/abs/2606.04031v1
Date: Mon, 01 Jun 2026 20:42:04 GMT
Status: Translation complete
Last updated in system: 2026-06-04 20:44:18.250844
Title: Pseudospectral Bounds for Transient Amplification in Coupled Gradient Descent
Title (reference translation): Pseudospectral Bounds for Transient Amplification in Coupled Gradient Descent
Authors: Ahanaf Hasan Ariq
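Abstract summary: We develop a pseudospectral theory for block-triangular Jacobians. We show that the Kreiss constant satisfies $K(J) \leq 2/(1-γ) + \|C\|/(4(1-γ))$ when the diagonal blocks are symmetric with spectral radii at most $γ< 1$. Experiments on linear-quadratic problems, IQC-based comparisons, and neural-network training confirm the theory.
Reference score (independently computed attention score): 0.0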
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Coupled gradient descent--where the update of one parameter block depends on another--underlies bilevel optimization, two-time-scale stochastic approximation, and adversarial training. When the coupled Jacobian is block-triangular, asymptotic stability is governed by the spectral radii of the diagonal blocks, yet transient amplification before convergence can be arbitrarily large due to non-normality. We develop a sharp pseudospectral theory for such block-triangular Jacobians, proving that the Kreiss constant satisfies $K(J) \leq 2/(1-γ) + \|C\|/(4(1-γ))$ when the diagonal blocks are symmetric with spectral radii at most $γ< 1$, and we establish matching minimax lower bounds. We characterize the critical coupling threshold for spectral instability and extend the analysis to nearly self-referential systems via a Neumann-series perturbation framework. As a consequence, we obtain a finite-horizon iteration-complexity bound of $O(K(J)^2 \log(1/δ))$ for stochastic coupled descent. Framed as scaling laws for non-stationary two-time-scale optimization, our results expose a non-asymptotic, instance-dependent regime of high-dimensional learning dynamics that is invisible to spectral-radius analysis. Experiments on linear-quadratic problems, IQC-based comparisons, and neural-network training confirm the theory.
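Abstract (reference translation): Coupled gradient descent, in which the update of one parameter block depends on another, underlies bilevel optimization, two-time-scale stochastic approximation, and adversarial training. When the coupled Jacobian is block-triangular, asymptotic stability is governed by the spectral radii of the diagonal blocks, yet transient amplification before convergence can be arbitrarily large because of non-normality. We develop a sharp pseudospectral theory for such block-triangular Jacobians and show that the Kreiss constant satisfies $K(J) \leq 2/(1-γ) + \|C\|/(4(1-γ))$ when the diagonal blocks are symmetric with spectral radii at most $γ< 1$. We characterize the critical coupling threshold for spectral instability and extend the analysis to nearly self-referential systems via a Neumann-series perturbation framework. As a result, we obtain a finite-horizon iteration-complexity bound of $O(K(J)^2 \log(1/δ))$ for stochastic coupled descent. Framed as scaling laws for non-stationary two-time-scale optimization, the results reveal a non-asymptotic, instance-dependent regime of high-dimensional learning dynamics that is invisible to spectral-radius analysis. Experiments on linear-quadratic problems, IQC-based comparisons, and neural-network training confirm the theory.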
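The gap between asymptotic stability and transient amplification described in the abstract can be illustrated numerically. The sketch below is not taken from the paper: the toy instance (blocks $A = B = γI$, coupling $C = cI$, the lower-triangular orientation, and the values of `gamma`, `c`, `n`) is hypothetical, and it assumes the standard discrete-time definition $K(J) = \sup_{|z|>1} (|z|-1)\,\|(zI-J)^{-1}\|_2$, which may differ in detail from the paper's. The grid search only gives a lower estimate of $K(J)$, so comparing it with the bound quoted in the abstract is a sanity check on one instance, not a verification of the theorem.

```python
import numpy as np

# Hypothetical toy instance (not from the paper): J = [[A, 0], [C, B]]
# with symmetric diagonal blocks A = B = gamma * I and coupling C = c * I.
gamma, c, n = 0.9, 5.0, 2
A = gamma * np.eye(n)
B = gamma * np.eye(n)
C = c * np.eye(n)
J = np.block([[A, np.zeros((n, n))], [C, B]])

# The eigenvalues of a block-triangular matrix are those of its diagonal
# blocks, so the spectral radius of J ignores the coupling C entirely.
spectral_radius = max(np.abs(np.linalg.eigvalsh(A)).max(),
                      np.abs(np.linalg.eigvalsh(B)).max())


def transient_peak(M, steps=200):
    """Return max over k = 0..steps of the spectral norm of M**k."""
    P = np.eye(M.shape[0])
    peak = 1.0  # ||M^0|| = 1
    for _ in range(steps):
        P = P @ M
        peak = max(peak, np.linalg.norm(P, 2))
    return peak


def kreiss_grid_estimate(M, n_radii=200, n_angles=91, r_max=3.0):
    """Grid estimate of sup_{|z|>1} (|z|-1) * ||(zI - M)^{-1}||_2.

    A finite grid can only underestimate the supremum, so this is a
    lower estimate of the Kreiss constant. M is assumed real, so it is
    enough to scan the upper half plane.
    """
    eye = np.eye(M.shape[0])
    best = 0.0
    for r in np.linspace(1.001, r_max, n_radii):
        for theta in np.linspace(0.0, np.pi, n_angles):
            z = r * np.exp(1j * theta)
            # ||(zI - M)^{-1}||_2 equals 1 / (smallest singular value of zI - M).
            sigma_min = np.linalg.svd(z * eye - M, compute_uv=False)[-1]
            best = max(best, (r - 1.0) / sigma_min)
    return best


peak = transient_peak(J)
kreiss_est = kreiss_grid_estimate(J)
# Right-hand side of the bound quoted in the abstract, evaluated on this instance.
abstract_bound = 2.0 / (1.0 - gamma) + np.linalg.norm(C, 2) / (4.0 * (1.0 - gamma))

print(f"spectral radius         : {spectral_radius:.3f}")
print(f"max_k ||J^k||_2         : {peak:.3f}")
print(f"Kreiss estimate (grid)  : {kreiss_est:.3f}")
print(f"bound from the abstract : {abstract_bound:.3f}")
```

On this instance the spectral radius stays below 1 while the peak of $\|J^k\|_2$ is far above 1 and grows with the coupling `c`, which is the effect spectral-radius analysis cannot see. The Kreiss matrix theorem, $K(J) \leq \sup_k \|J^k\| \leq e\,N\,K(J)$ for an $N \times N$ matrix, explains why the grid estimate sits below the observed peak.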

Related papers

Regularized Online RLHF with Generalized Bilinear Preferences [68.44113000390544]
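We consider the problem of contextual online RLHF with general preferences. We use a generalized bilinear preference model to capture preferences via low-rank skew-symmetric matrices. We show that the dual gap of the greedy policy is bounded by the square of the estimation error.
Paper reference translation (metadata) (2026-02-26T15:27:53Z)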
Euclidean Distance Matrix Completion via Asymmetric Projected Gradient Descent [25.846262685970164]
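We propose and analyze a gradient-type algorithm based on the Burer-Monteiro factorization that reconstructs a point-set configuration from partial Euclidean distance measurements.
Paper reference translation (metadata) (2025-04-28T07:13:23Z)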
KPZ scaling from the Krylov space [83.88591755871734]
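Superdiffusion exhibiting Kardar-Parisi-Zhang (KPZ) scaling in real-time correlators and autocorrelators has recently been reported. Inspired by these results, we study KPZ scaling of correlation functions in the Krylov operator basis.
Paper reference translation (metadata) (2024-06-04T20:57:59Z)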
Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
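This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training. We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape, and we show how linear interpolation can help by leveraging the theory of nonexpansive operators.
Paper reference translation (metadata) (2023-10-20T12:45:12Z)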
Cyclic Block Coordinate Descent With Variance Reduction for Composite Nonconvex Optimization [26.218670461973705]
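We propose methods for solving these problems with non-asymptotic gradient-norm guarantees. Our results demonstrate the effectiveness of the cyclic variance-reduction scheme in training deep neural networks.
Paper reference translation (metadata) (2022-12-09T19:17:39Z)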
Mean-Square Analysis with An Application to Optimal Dimension Dependence of Langevin Monte Carlo [60.785586069299356]
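This work provides a general framework for the non-asymptotic analysis of sampling error in 2-Wasserstein distance. Our theoretical analysis is further validated by numerical experiments.
Paper reference translation (metadata) (2021-09-08T18:00:05Z)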
Last iterate convergence of SGD for Least-Squares in the Interpolation regime [19.05750582096579]
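We study the noiseless model in the basic least-squares setup. We assume that the optimal predictor fits the inputs perfectly, $\langle \theta_*, \phi(X) \rangle = Y$, where $\phi(X)$ denotes an infinite-dimensional nonlinear feature map.
Paper reference translation (metadata) (2021-02-05T14:02:20Z)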
A Unified Analysis of First-Order Methods for Smooth Games via Integral Quadratic Constraints [10.578409461429626]
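We apply the theory of integral quadratic constraints to first-order methods for smooth and strongly monotone games and iterations. For the negative momentum method (NM), we give the first global convergence rate, with complexity $\mathcal{O}(\kappa^{1.5})$, which matches the known lower bound. We show that acceleration is impossible for algorithms with one step of memory if the gradient is queried only once per batch.
Paper reference translation (metadata) (2020-09-23T20:02:00Z)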

The related papers list is generated automatically from the titles and abstracts of the papers on this site.
