Fugu-MT Paper Translation (Abstract): Stochastic Weakly Convex Optimization Under Heavy-Tailed Noises

Paper overview: Stochastic Weakly Convex Optimization Under Heavy-Tailed Noises

arxiv url: http://arxiv.org/abs/2507.13283v1
Date: Thu, 17 Jul 2025 16:48:45 GMT
Status: Translation complete
System update date: 2025-07-18 20:10:24.578579
Title: Stochastic Weakly Convex Optimization Under Heavy-Tailed Noises
Title (reference translation): Stochastic Weakly Convex Optimization Under Heavy-Tailed Noise
Authors: Tianxi Zhu, Yi Xu, Xiangyang Ji
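Abstract summary: This paper focuses on two types of gradient noise: sub-Weibull noise and $p$-BCM noise. Under these two noise assumptions, the in-expectation and high-probability convergence of SFOMs has been studied in the contexts of convex optimization and smooth optimization.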
Reference score (independently computed attention score): 55.43924214633558
License: http://creativecommons.org/licenses/by/4.0/
Abstract: An increasing number of studies have focused on stochastic first-order methods (SFOMs) under heavy-tailed gradient noises, which have been observed in the training of practical deep learning models. In this paper, we focus on two types of gradient noises: one is sub-Weibull noise, and the other is noise under the assumption that it has a bounded $p$-th central moment ($p$-BCM) with $p\in (1, 2]$. The latter is more challenging due to the occurrence of infinite variance when $p\in (1, 2)$. Under these two gradient noise assumptions, the in-expectation and high-probability convergence of SFOMs has been extensively studied in the contexts of convex optimization and standard smooth optimization. However, for weakly convex objectives (a class that includes all Lipschitz-continuous convex objectives and smooth objectives), our understanding of the in-expectation and high-probability convergence of SFOMs under these two types of noises remains incomplete. We investigate the high-probability convergence of the vanilla stochastic subgradient descent (SsGD) method under sub-Weibull noises, as well as the high-probability and in-expectation convergence of clipped SsGD under the $p$-BCM noises. Both analyses are conducted in the context of weakly convex optimization. For weakly convex objectives that may be non-convex and non-smooth, our results demonstrate that the theoretical dependence of vanilla SsGD on the failure probability and number of iterations under sub-Weibull noises does not degrade compared to the case of smooth objectives. Under $p$-BCM noises, our findings indicate that the non-smoothness and non-convexity of weakly convex objectives do not impact the theoretical dependence of clipped SGD on the failure probability relative to the smooth case; however, the sample complexity we derive is worse than a well-known lower bound for smooth optimization.
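For readers unfamiliar with the terms in the abstract, the block below writes out the forms these notions commonly take in the literature. It is a sketch under stated assumptions: the symbols $\rho$, $\sigma$, $K$, $\theta$, $\eta_t$, and $\tau_t$ do not appear in the abstract, and the paper's own definitions and parametrizations may differ.

```latex
% Assumed notation (not from the abstract): f objective, g(x) stochastic
% subgradient, \rho weak-convexity modulus, \sigma noise level,
% K and \theta sub-Weibull scale and tail parameters,
% \eta_t step size, \tau_t clipping threshold.

% Weak convexity: f becomes convex after adding a quadratic.
f \text{ is } \rho\text{-weakly convex}
  \iff x \mapsto f(x) + \tfrac{\rho}{2}\|x\|^2 \text{ is convex.}

% Bounded p-th central moment (p-BCM), p \in (1, 2]:
\mathbb{E}[g(x)] \in \partial f(x), \qquad
\mathbb{E}\bigl[\|g(x) - \mathbb{E}[g(x)]\|^{p}\bigr] \le \sigma^{p}.

% Sub-Weibull noise \xi = g(x) - \mathbb{E}[g(x)] with tail parameter \theta > 0:
\mathbb{E}\Bigl[\exp\bigl((\|\xi\|/K)^{1/\theta}\bigr)\Bigr] \le 2.

% Clipped stochastic subgradient step:
x_{t+1} = x_t - \eta_t \,\min\Bigl\{1, \frac{\tau_t}{\|g_t\|}\Bigr\}\, g_t .
```

In the sub-Weibull form above, $\theta = 1/2$ corresponds to sub-Gaussian tails and $\theta = 1$ to sub-exponential tails, with larger $\theta$ giving heavier tails.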
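Abstract (reference translation): A growing number of studies focus on stochastic first-order methods (SFOMs) under heavy-tailed gradient noise, which has been observed when training practical deep learning models. This paper focuses on two types of gradient noise: one is sub-Weibull noise, and the other is noise assumed to have a bounded $p$-th central moment ($p$-BCM) with $p\in (1, 2]$. The latter is more difficult because the variance is infinite when $p\in (1, 2)$. Under these two gradient noise assumptions, the in-expectation and high-probability convergence of SFOMs has been widely studied in the contexts of convex optimization and standard smooth optimization. However, for weakly convex objectives, a class that includes all Lipschitz-continuous convex objectives and smooth objectives, the understanding of the in-expectation and high-probability convergence of SFOMs under these two types of noise is incomplete. We study the high-probability convergence of vanilla stochastic subgradient descent (SsGD) under sub-Weibull noise, and the high-probability and in-expectation convergence of clipped SsGD under $p$-BCM noise. Both analyses are carried out in the context of weakly convex optimization. For weakly convex objectives that may be non-convex and non-smooth, we show that the theoretical dependence of vanilla SsGD on the failure probability and the number of iterations under sub-Weibull noise does not degrade relative to smooth objectives. Under $p$-BCM noise, the non-smoothness and non-convexity of weakly convex objectives do not affect the theoretical dependence of clipped SGD on the failure probability compared with the smooth case, but the sample complexity we derive is worse than a well-known lower bound for smooth optimization.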
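The clipped SsGD method named in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy objective f(x) = sum_i |x_i^2 - 1| (weakly convex, non-smooth, non-convex), Student-t noise with 1.5 degrees of freedom (infinite variance, finite $p$-th moment for $p < 1.5$), the eta / sqrt(t) step size, and the constant clipping threshold tau.

```python
import numpy as np


def clip(g, tau):
    """Rescale g so that its Euclidean norm is at most tau."""
    norm = np.linalg.norm(g)
    return g if norm <= tau else g * (tau / norm)


def subgrad(x):
    """One subgradient of the toy objective f(x) = sum_i |x_i^2 - 1|."""
    return np.sign(x**2 - 1.0) * 2.0 * x


def clipped_ssgd(x0, steps, eta, tau, df=1.5, seed=0):
    """Run clipped stochastic subgradient descent on the toy objective.

    The step size eta / sqrt(t) and the constant threshold tau are
    illustrative choices, not the schedules analysed in the paper.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for t in range(1, steps + 1):
        # Heavy-tailed, zero-mean noise: Student-t with df in (1, 2)
        # has a finite mean but infinite variance.
        noise = rng.standard_t(df, size=x.shape)
        g = subgrad(x) + noise
        x = x - (eta / np.sqrt(t)) * clip(g, tau)
    return x


if __name__ == "__main__":
    x_final = clipped_ssgd(np.array([2.0, -0.5]), steps=5000, eta=0.1, tau=5.0)
    print(x_final)
```

Because each update moves the iterate by at most eta * tau / sqrt(t), a single extreme noise draw cannot throw the iterate arbitrarily far, which is the intuition behind using clipping when the variance is infinite.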

Related papers