Fugu-MT Paper Translation (Abstract): Accelerated Gradient Methods with Biased Gradient Estimates: Risk Sensitivity, High-Probability Guarantees, and Large Deviation Bounds

Paper summary: Accelerated Gradient Methods with Biased Gradient Estimates: Risk Sensitivity, High-Probability Guarantees, and Large Deviation Bounds

arxiv url: http://arxiv.org/abs/2509.13628v2
Date: Fri, 19 Sep 2025 13:19:22 GMT
Status: translation complete
Last updated in system: 2025-09-22 12:06:46.397264
Title: Accelerated Gradient Methods with Biased Gradient Estimates: Risk Sensitivity, High-Probability Guarantees, and Large Deviation Bounds
Title (reference translation): Accelerated Gradient Methods with Biased Gradient Estimates: Risk Sensitivity, High-Probability Guarantees, and Large Deviation Bounds
Authors: Mert Gürbüzbalaban, Yasa Syed, Necdet Serhat Aybat
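Abstract summary: This work studies trade-offs between convergence rate and robustness to gradient errors in the context of first-order methods. Under potentially biased sub-Gaussian gradient errors, it derives non-asymptotic bounds on a finite-time analogue of the risk-sensitive index (RSI). For smooth strongly convex functions, it also observes an analogous trade-off between RSI and convergence-rate bounds.
Reference score (independently computed attention score): 12.025550076793396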
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study trade-offs between convergence rate and robustness to gradient errors in the context of first-order methods. Our focus is on generalized momentum methods (GMMs)--a broad class that includes Nesterov's accelerated gradient, heavy-ball, and gradient descent methods--for minimizing smooth strongly convex objectives. We allow stochastic gradient errors that may be adversarial and biased, and quantify robustness of these methods to gradient errors via the risk-sensitive index (RSI) from robust control theory. For quadratic objectives with i.i.d. Gaussian noise, we give closed form expressions for RSI in terms of solutions to 2x2 matrix Riccati equations, revealing a Pareto frontier between RSI and convergence rate over the choice of step-size and momentum parameters. We then prove a large-deviation principle for time-averaged suboptimality in the large iteration limit and show that the rate function is, up to a scaling, the convex conjugate of the RSI function. We further show that the rate function and RSI are linked to the $H_\infty$-norm--a measure of robustness to the worst-case deterministic gradient errors--so that stronger worst-case robustness (smaller $H_\infty$-norm) leads to sharper decay of the tail probabilities for the average suboptimality. Beyond quadratics, under potentially biased sub-Gaussian gradient errors, we derive non-asymptotic bounds on a finite-time analogue of the RSI, yielding finite-time high-probability guarantees and non-asymptotic large-deviation bounds for the averaged iterates. In the case of smooth strongly convex functions, we also observe an analogous trade-off between RSI and convergence-rate bounds. To our knowledge, these are the first non-asymptotic guarantees for GMMs with biased gradients and the first risk-sensitive analysis of GMMs. Finally, we provide numerical experiments on a robust regression problem to illustrate our results.
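Abstract (reference translation): We study trade-offs between convergence rate and robustness to gradient errors in the context of first-order methods. Our focus is on generalized momentum methods (GMMs), a broad class that includes Nesterov's accelerated gradient, heavy-ball, and gradient descent methods, for minimizing smooth strongly convex objectives. We allow stochastic gradient errors that may be adversarial and biased, and we quantify the robustness of these methods to such errors via the risk-sensitive index (RSI) from robust control theory. For quadratic objectives with i.i.d. Gaussian noise, we give closed-form expressions for the RSI in terms of solutions to 2x2 matrix Riccati equations, revealing a Pareto frontier between RSI and convergence rate over the choice of step-size and momentum parameters. We then prove a large-deviation principle for the time-averaged suboptimality in the large-iteration limit and show that the rate function is, up to a scaling, the convex conjugate of the RSI function. We further show that the rate function and the RSI are linked to the $H_\infty$-norm, a measure of robustness to worst-case deterministic gradient errors, so that stronger worst-case robustness (a smaller $H_\infty$-norm) leads to sharper decay of the tail probabilities of the average suboptimality. Beyond quadratics, under potentially biased sub-Gaussian gradient errors, we derive non-asymptotic bounds on a finite-time analogue of the RSI, yielding finite-time high-probability guarantees and non-asymptotic large-deviation bounds for the averaged iterates. For smooth strongly convex functions, we also observe an analogous trade-off between RSI and convergence-rate bounds. To our knowledge, these are the first non-asymptotic guarantees for GMMs with biased gradients and the first risk-sensitive analysis of GMMs. Finally, we report numerical experiments on a robust regression problem to illustrate the results.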
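The abstract names the GMM family but does not state its update rule. As a rough illustration only, the sketch below assumes a three-parameter form common in the momentum-methods literature (step size `alpha`, momentum `beta`, extrapolation `gamma`) and runs it on a quadratic objective with a biased, Gaussian-perturbed gradient. The function names `run_gmm` and `classical_parameters`, the error model, and the textbook parameter choices are all assumptions made for this example, not taken from the paper.

```python
import numpy as np


def run_gmm(Q, alpha, beta, gamma, bias, sigma, num_iters, seed=0):
    """Run an assumed generalized momentum method on f(x) = 0.5 * x^T Q x.

    Assumed update (not stated in the abstract):
        y_k     = x_k + gamma * (x_k - x_{k-1})
        x_{k+1} = x_k - alpha * g(y_k) + beta * (x_k - x_{k-1})
    with inexact gradient g(y) = Q y + bias + sigma * w_k, w_k ~ N(0, I).
    Returns the suboptimality f(x_k) - f* (here f* = 0) at each new iterate.
    """
    rng = np.random.default_rng(seed)
    d = Q.shape[0]
    x_prev = np.ones(d)  # hypothetical starting point
    x = np.ones(d)
    subopt = np.empty(num_iters)
    for k in range(num_iters):
        y = x + gamma * (x - x_prev)                      # extrapolated point
        grad = Q @ y + bias + sigma * rng.standard_normal(d)  # biased, noisy gradient
        x_prev, x = x, x - alpha * grad + beta * (x - x_prev)
        subopt[k] = 0.5 * x @ Q @ x
    return subopt


def classical_parameters(mu, L):
    """Textbook (alpha, beta, gamma) choices for a mu-strongly convex, L-smooth quadratic."""
    kappa = L / mu
    rho = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
    return {
        "gd": (1.0 / L, 0.0, 0.0),    # gradient descent: no momentum
        "nag": (1.0 / L, rho, rho),   # Nesterov: beta == gamma
        "hb": (4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2, rho ** 2, 0.0),  # heavy-ball: gamma == 0
    }


if __name__ == "__main__":
    Q = np.diag([1.0, 10.0])  # mu = 1, L = 10
    bias = np.array([0.1, 0.1])
    for name, (a, b, g) in classical_parameters(1.0, 10.0).items():
        s = run_gmm(Q, a, b, g, bias, sigma=0.5, num_iters=2000)
        print(f"{name}: time-averaged suboptimality = {s.mean():.4f}")
```

Comparing the time-averaged suboptimality across the three presets gives an informal feel for the rate-versus-robustness trade-off the abstract describes. It does not compute the RSI or any of the paper's bounds.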

Related papers