Fugu-MT 論文翻訳(概要): Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression

論文の概要: Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression

arxiv url: http://arxiv.org/abs/2110.06198v1
Date: Tue, 12 Oct 2021 17:49:54 GMT
ステータス: 翻訳完了
システム内更新日: 2021-10-13 13:41:27.566304
Title: Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression
Title（参考訳）: 過パラメータ線形回帰のための崩壊段階を有するsgdの最後の反復的リスク境界
Authors: Jingfeng Wu and Difan Zou and Vladimir Braverman and Quanquan Gu and Sham M. Kakade
Abstract要約: 勾配降下(SGD)は、多くのディープラーニングアプリケーションでよく一般化されている。本稿では, 崩壊段階のSGDの最終反復リスク境界に関する問題依存解析を行う。
参考スコア（独自算出の注目度）: 122.70478935214128
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Stochastic gradient descent (SGD) has been demonstrated to generalize well in many deep learning applications. In practice, one often runs SGD with a geometrically decaying stepsize, i.e., a constant initial stepsize followed by multiple geometric stepsize decay, and uses the last iterate as the output. This kind of SGD is known to be nearly minimax optimal for classical finite-dimensional linear regression problems (Ge et al., 2019), and provably outperforms SGD with polynomially decaying stepsize in terms of the statistical minimax rates. However, a sharp analysis for the last iterate of SGD with decaying step size in the overparameterized setting is still open. In this paper, we provide problem-dependent analysis on the last iterate risk bounds of SGD with decaying stepsize, for (overparameterized) linear regression problems. In particular, for SGD with geometrically decaying stepsize (or tail geometrically decaying stepsize), we prove nearly matching upper and lower bounds on the excess risk. Our results demonstrate the generalization ability of SGD for a wide class of overparameterized problems, and can recover the minimax optimal results up to logarithmic factors in the classical regime. Moreover, we provide an excess risk lower bound for SGD with polynomially decaying stepsize and illustrate the advantage of geometrically decaying stepsize in an instance-wise manner, which complements the minimax rate comparison made in previous work.
Abstract（参考訳）: 確率勾配降下(SGD)は、多くのディープラーニングアプリケーションでよく一般化することが示されている。実際には、しばしばsgdを幾何的に減衰するステップ、すなわち定数初期ステップ、そして複数の幾何学的ステップで実行し、最後のイテレートを出力として使用する。この種のSGDは古典的有限次元線形回帰問題(Ge et al., 2019)に最適に近い最小値であることが知られ(Ge et al., 2019)、統計的ミニマックス率の観点からは多項式減衰段数でSGDを確実に上回る。しかし、過パラメータ化条件におけるステップサイズが減衰した最後のSGDの急激な解析は未解決のままである。本稿では,線形回帰問題に対して,sgdの崩壊段階化を伴う最後の反復的リスク境界に関する問題依存分析を行う。特に、幾何学的に崩壊するステップズ(またはテールに崩壊するステップズ)を持つsgdでは、過剰なリスクの上限が上界と下界とほぼ一致することが証明される。以上の結果から,SGDの超パラメータ化問題に対する一般化能力を実証し,古典的状態の対数的要因まで最小値の最適値を復元できることを示した。さらに, sgd に対して, 多項式減衰ステップ化を伴う過大なリスク下限を与え, 先行研究における最小速度比較を補完する, インスタンス分割による幾何減衰ステップ化の利点を明らかにした。

関連論文リスト

Stochastic Gradient Descent in Non-Convex Problems: Asymptotic Convergence with Relaxed Step-Size via Stopping Time Methods [13.677904140815386]
Gradient Descent (SGD) は機械学習の研究で広く使われている。本稿では,より緩やかなステップサイズ条件下でのSGDの収束解析法を提案する。
論文参考訳（メタデータ） (2025-04-17T02:56:20Z)
A Piecewise Lyapunov Analysis of Sub-quadratic SGD: Applications to Robust and Quantile Regression [22.917692982875025]
一階微分しか持たない関数を$f$で扱える新しいLyapunov関数を導入する。一般の減少段数と定数段数の下で有限時間モーメント境界を導出する。我々の結果は、特にオンライン統計手法に広く応用されている。
論文参考訳（メタデータ） (2025-04-11T00:20:37Z)
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation [22.652143194356864]
ステップサイズが一定となる勾配勾配(SGD)アルゴリズムを用いて, 強い凸と滑らかな問題を解く問題に対処する。得られた推定子の平均二乗誤差を$n$の反復数に対して拡張する。我々は、この鎖が定義された重み付きワッサーシュタイン半計量に関して幾何学的にエルゴード的であることを確証する。
論文参考訳（メタデータ） (2024-10-07T15:02:48Z)
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability [69.01076284478151]
機械学習の最適化において、勾配降下(GD)はしばしば安定性の端(EoS)で動く本稿では,EoS系における線形分離可能なデータに対するロジスティック回帰のための定数段差GDの収束と暗黙バイアスについて検討する。
論文参考訳（メタデータ） (2023-05-19T16:24:47Z)
The Benefits of Implicit Regularization from SGD in Least Squares Problems [116.85246178212616]
勾配降下(SGD)はアルゴリズム正則化効果が強い。我々は、(正規化されていない)平均SGDで得られる暗黙の正則化とリッジ回帰の明示的な正則化の比較を行う。
論文参考訳（メタデータ） (2021-08-10T09:56:47Z)
Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
帰納バイアスは経験的に過剰フィットを防げる中心的存在ですこの研究は、この問題を最も基本的な設定として考慮している: 線形回帰に対する定数ステップサイズ SGD。我々は、(正規化されていない)SGDで得られるアルゴリズム正則化と、通常の最小二乗よりも多くの顕著な違いを反映する。
論文参考訳（メタデータ） (2021-03-23T17:15:53Z)
SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality [15.640534097470923]
本稿では,サンプル数と寸法がともに大きい場合の勾配降下(SGD)のダイナミクスを解析するための新しい枠組みを提案する。この新たな枠組みを用いて, ランダムデータを用いた最小二乗問題におけるSGDの力学が, 標本および次元限界において決定論的になることを示す。
論文参考訳（メタデータ） (2021-02-08T18:00:13Z)
Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems [120.21685755278509]
本研究は,ステップサイズの減衰が正確な収束に必要であるという事実と,一定のステップサイズがエラーまでの時間でより速く学習するという事実のバランスをとることを目的とする。ステップサイズのミニバッチを最初から修正するのではなく,パラメータを適応的に進化させることを提案する。
論文参考訳（メタデータ） (2020-07-02T16:02:02Z)
On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems [75.58134963501094]
本稿では,勾配降下(SGD)の軌跡を解析する。我々はSGDが厳格なステップサイズポリシーのために1ドルでサドルポイント/マニフォールドを避けることを示す。
論文参考訳（メタデータ） (2020-06-19T14:11:26Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。