Fugu-MT 論文翻訳(概要): Tight Risk Bounds for Gradient Descent on Separable Data

論文の概要: Tight Risk Bounds for Gradient Descent on Separable Data

arxiv url: http://arxiv.org/abs/2303.01135v1
Date: Thu, 2 Mar 2023 10:31:58 GMT
ステータス: 翻訳完了
システム内更新日: 2023-03-03 14:55:57.303365
Title: Tight Risk Bounds for Gradient Descent on Separable Data
Title（参考訳）: 分離データに基づく勾配沈み込みの厳密なリスク境界
Authors: Matan Schliserman and Tomer Koren
Abstract要約: 分離線形分類に適用した非正規化勾配法の一般化特性について検討した。リスク低い境界は、この文脈で最初のものであり、与えられたテール崩壊率に対する上限の厳密性を確立する。
参考スコア（独自算出の注目度）: 33.593203156666746
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the generalization properties of unregularized gradient methods applied to separable linear classification -- a setting that has received considerable attention since the pioneering work of Soudry et al. (2018). We establish tight upper and lower (population) risk bounds for gradient descent in this setting, for any smooth loss function, expressed in terms of its tail decay rate. Our bounds take the form $\Theta(r_{\ell,T}^2 / \gamma^2 T + r_{\ell,T}^2 / \gamma^2 n)$, where $T$ is the number of gradient steps, $n$ is size of the training set, $\gamma$ is the data margin, and $r_{\ell,T}$ is a complexity term that depends on the (tail decay rate) of the loss function (and on $T$). Our upper bound matches the best known upper bounds due to Shamir (2021); Schliserman and Koren (2022), while extending their applicability to virtually any smooth loss function and relaxing technical assumptions they impose. Our risk lower bounds are the first in this context and establish the tightness of our upper bounds for any given tail decay rate and in all parameter regimes. The proof technique used to show these results is also markedly simpler compared to previous work, and is straightforward to extend to other gradient methods; we illustrate this by providing analogous results for Stochastic Gradient Descent.
Abstract（参考訳）: 分離可能な線形分類に適用した非正規化勾配法の一般化特性について検討し,sudry et al. (2018) の先駆的研究から注目されている。本設定では, 尾の減衰率で表される任意の滑らかな損失関数に対して, 勾配降下に対する密な上下(人口)リスク境界を確立する。私たちの境界は $\Theta(r_{\ell,T}^2 / \gamma^2 T + r_{\ell,T}^2 / \gamma^2 n)$, where $T$ is the number of gradient steps, $n$ is size of the training set, $\gamma$ is the data margin, $r_{\ell,T}$ は損失関数(および$T$)の(テール崩壊率)に依存する複雑性項である。私たちの上界は、shamir (2021)、schliserman and koren (2022) による最もよく知られた上界と一致し、その適用範囲は事実上どんな滑らかな損失関数にも拡張され、それらが課す技術的仮定も緩和される。我々のリスク低い境界は、この文脈で最初のものであり、与えられたテール崩壊率と全てのパラメーターレシエーションに対する上限の厳密性を確立する。これらの結果を示すために用いられる証明手法は、以前の研究に比べて著しく単純であり、他の勾配法にも容易に拡張できる。

関連論文リスト

Multiclass Loss Geometry Matters for Generalization of Gradient Descent in Separable Classification [27.33243506775655]
分離線形分類のための非正規化手法の一般化性能について検討する。収束速度は損失テンプレートの幾何に大きく影響されていることを示す。
論文参考訳（メタデータ） (2025-05-28T13:39:14Z)
Differential Private Stochastic Optimization with Heavy-tailed Data: Towards Optimal Rates [15.27596975662702]
重み付き勾配を用いたDP最適化の最適速度を達成するアルゴリズムについて検討する。その結果,DP下での凸最適化の理論的限界が達成可能であることを示す。
論文参考訳（メタデータ） (2024-08-19T11:07:05Z)
Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path [80.60592344361073]
線形混合遷移カーネルを用いた最短経路(SSP)問題について検討する。エージェントは繰り返し環境と対話し、累積コストを最小化しながら特定の目標状態に到達する。既存の作業は、イテレーションコスト関数の厳密な下限や、最適ポリシーに対する期待長の上限を仮定することが多い。
論文参考訳（メタデータ） (2024-02-14T07:52:00Z)
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods [22.052459124774504]
本研究では,未調整のSGDに対する適応的手法により,スムーズさと情報優位性で問題を緩和することを示す。この結果から, 指数関数依存性が欠如している場合, 未修正SGDに対する適応手法の理論的正当性について検討した。
論文参考訳（メタデータ） (2023-05-21T14:40:43Z)
Lower Generalization Bounds for GD and SGD in Smooth Stochastic Convex Optimization [9.019243171993553]
トレーニングステップ$T$とStep-size$eta$は、滑らかな凸最適化(SCO)問題の認定に影響を与える可能性がある。まず、グラディエントDescent(GD)とグラディエントDescent(SGD)の厳密な過剰リスク低境界を提供する。近年の作業は、より良い速度で達成できるが、トレーニング時間が長い場合には改善が減少する。
論文参考訳（メタデータ） (2023-03-19T20:24:33Z)
Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
滑らかな凸凹凸結合型サドル点問題, $min_mathbfxmax_mathbfyF(mathbfx) + H(mathbfx,mathbfy)$ を考える。漸進的勾配指数(AG-EG)降下指数アルゴリズムについて述べる。
論文参考訳（メタデータ） (2022-06-17T06:10:20Z)
High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails [55.561406656549686]
我々は、勾配推定が末尾を持つ可能性のある一階アルゴリズムを用いたヒルベルト非最適化を考える。本研究では, 勾配, 運動量, 正規化勾配勾配の収束を高確率臨界点に収束させることと, 円滑な損失に対する最もよく知られた繰り返しを示す。
論文参考訳（メタデータ） (2021-06-28T00:17:01Z)
Tight Second-Order Certificates for Randomized Smoothing [106.06908242424481]
また、ガウス的ランダムな滑らか化のための普遍曲率的境界が存在することを示す。この新たな証明書の正確性を証明することに加えて、SoS証明書は実現可能であり、したがって厳密であることを示す。
論文参考訳（メタデータ） (2020-10-20T18:03:45Z)
Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins [92.7662890047311]
勾配降下は、分類誤差$tilde O(mathsfOPT1/2) + varepsilon$ in $mathrmpoly(d,1/varepsilon)$ time and sample complexity.
論文参考訳（メタデータ） (2020-10-01T16:48:33Z)
Gradient Methods Never Overfit On Separable Data [31.714928102950584]
標準勾配法は分離可能なデータに過度に適合しないことを示す。データセットに対するマージン違反数の非漸近的境界を示す。
論文参考訳（メタデータ） (2020-06-30T18:01:46Z)
Carath\'eodory Sampling for Stochastic Gradient Descent [79.55586575988292]
本稿では,Tchakaloff と Carath'eodory の古典的な結果から着想を得た手法を提案する。我々は、測定値の低減を行う降下ステップを適応的に選択する。これをBlock Coordinate Descentと組み合わせることで、測定の削減を極めて安価に行えるようにします。
論文参考訳（メタデータ） (2020-06-02T17:52:59Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。