Fugu-MT 論文翻訳(概要): Decentralized Stochastic Nonconvex Optimization under the Relaxed Smoothness

論文の概要: Decentralized Stochastic Nonconvex Optimization under the Relaxed Smoothness

arxiv url: http://arxiv.org/abs/2509.08726v2
Date: Fri, 26 Sep 2025 05:08:27 GMT
ステータス: 翻訳完了
システム内更新日: 2025-09-29 18:47:02.690645
Title: Decentralized Stochastic Nonconvex Optimization under the Relaxed Smoothness
Title（参考訳）: リラクシド・スムースネス下における分散確率的非凸最適化
Authors: Luo Luo, Xue Cui, Tingkai Jia, Cheng Chen,
Abstract要約: 分散正規化勾配勾配(DNS)という新しいアルゴリズムを提案する。 DNSは各ローカルエージェントで$bold$ilonポイントを達成することができる。以上より, $mathcal O(m-1(L_fsigma2Delta_fepsilon-4 + sigma2epsilon-2 + L_f-1)$ per agent。
参考スコア（独自算出の注目度）: 21.090579632247707
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper studies decentralized optimization problem $f(\mathbf{x})=\frac{1}{m}\sum_{i=1}^m f_i(\mathbf{x})$, where each local function has the form of $f_i(\mathbf{x}) = {\mathbb E}\left[F(\mathbf{x};{\boldsymbol \xi}_i)\right]$ which is $(L_0,L_1)$-smooth but possibly nonconvex and the random variable ${\boldsymbol \xi}_i$ follows distribution ${\mathcal D}_i$. We propose a novel algorithm called decentralized normalized stochastic gradient descent (DNSGD), which can achieve an $\epsilon$-stationary point at each local agent. We present a new framework for analyzing decentralized first-order methods in the relaxed smooth setting, based on the Lyapunov function related to the product of the gradient norm and the consensus error. We show the upper bounds on the sample complexity of ${\mathcal O}(m^{-1}(L_f\sigma^2\Delta_f\epsilon^{-4} + \sigma^2\epsilon^{-2} + L_f^{-2}L_1^3\sigma^2\Delta_f\epsilon^{-1} + L_f^{-2}L_1^2\sigma^2))$ per agent and the communication complexity of $\tilde{\mathcal O}((L_f\epsilon^{-2} + L_1\epsilon^{-1})\gamma^{-1/2}\Delta_f)$, where $L_f=L_0 +L_1\zeta$, $\sigma^2$ is the variance of the stochastic gradient, $\Delta_f$ is the initial optimal function value gap, $\gamma$ is the spectral gap of the network, and $\zeta$ is the degree of the gradient dissimilarity. In the special case of $L_1=0$, the above results (nearly) match the lower bounds of decentralized stochastic nonconvex optimization under the standard smoothness. We also conduct numerical experiments to show the empirical superiority of our method.
Abstract（参考訳）: 本稿では、分散最適化問題 $f(\mathbf{x})=\frac{1}{m}\sum_{i=1}^m f_i(\mathbf{x})$, ここで各局所関数は、$f_i(\mathbf{x}) = {\mathbb E}\left[F(\mathbf{x};{\boldsymbol \xi}_i)\right]$が$(L_0,L_1)$-smoothであるが、おそらくは非凸であり、確率変数 ${\boldsymbol \xi}_i$ は${\mathcal D}_i$ に従う。本稿では,分散正規化確率勾配降下法(DNSGD)と呼ばれる新しいアルゴリズムを提案する。本稿では、勾配ノルムの積とコンセンサス誤差に関連するリャプノフ関数に基づいて、緩和されたスムーズな設定で分散化された一階法を解析するための新しい枠組みを提案する。我々は、${\mathcal O}(L_f\sigma^2\Delta_f\epsilon^{-4} + \sigma^2\epsilon^{-2} + L_f^{-2}L_1^3\Delta_f\epsilon^{-1} + L_f^{-2}L_1^2\sigma^2)$のサンプル複雑性と$\tilde{\mathcal O}((L_f\epsilon^{-2} + L_1\epsilon^{-1})\gamma^{-1/2}\Delta_f)$$L_f=L_0 + L_1\zeta$$=$2\sigma^2\Delta_f} + L_f^{-2} + L_f^{-2}, L_f^{-2}L_f^{-2}L_1\epsilon^{-1})$の通信複雑性を示す。 L_1=0$の特別の場合、上記の結果は(ほぼ)標準滑らかさの下での分散確率的非凸最適化の下限と一致する。また,本手法の実証的優位性を示す数値実験を行った。

関連論文リスト

Near-Optimal Convergence of Accelerated Gradient Methods under Generalized and $(L_0, L_1)$-Smoothness [57.93371273485736]
我々は、最近提案された$ell$-smoothness条件$|nabla2f(x)|| le ellleft(||nabla f(x)||right),$$$L$-smoothnessと$(L_0,L_1)$-smoothnessを一般化する関数を持つ凸最適化問題の一階法について検討する。
論文参考訳（メタデータ） (2025-08-09T08:28:06Z)
Proving the Limited Scalability of Centralized Distributed Optimization via a New Lower Bound Construction [57.93371273485736]
我々は、すべての労働者が同一の分布にアクセスする均質な(すなわちd.d.)場合であっても、すべての労働者が非バイアス付き境界 LDeltaepsilon2,$$$$$ のポリ対数的により良いポリ対数を求める集中型分散学習環境を考える。
論文参考訳（メタデータ） (2025-06-30T13:27:39Z)
Near-Optimal Distributed Minimax Optimization under the Second-Order Similarity [22.615156512223763]
本研究では,有限サム構造を目的とする分散楽観的すべり(SVOGS)法を提案する。我々は$mathcal O(delta D2/varepsilon)$、$mathcal O(n+sqrtndelta D2/varepsilon)$、$tildemathcal O(n+sqrtndelta+L)D2/varepsilon)$の局所呼び出しを証明している。
論文参考訳（メタデータ） (2024-05-25T08:34:49Z)
Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization [20.54801745090522]
我々は、mathbbRd f(x) 三角形q mathbbE_xi [Fxi]$inf(x)$ Lipschitz における $min_x という形式の問題を考察する。最近提案された勾配なし法は、少なくとも$mathcalO(L4 d3/2 epsilon-4 + Goldstein L d3/2 delta-1 epsilon-4)$ 0次複雑性を必要とする。
論文参考訳（メタデータ） (2023-01-16T13:33:37Z)
ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
本稿では,BFG凸最適化(SCO: Reweighted Query (ReSQue) 推定ツールを提案する。我々はSCOの並列およびプライベート設定における最先端の複雑さを実現するアルゴリズムを開発した。
論文参考訳（メタデータ） (2023-01-01T18:51:29Z)
On the Complexity of Decentralized Smooth Nonconvex Finite-Sum Optimization [21.334985032433778]
分散最適化問題 $min_bf xinmathbb Rd f(bf x)triq frac1msum_i=1m f_i(bf x)triq frac1nsum_j=1n。
論文参考訳（メタデータ） (2022-10-25T11:37:11Z)
Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
モノトン活性化に対する $mathbfxmapstosigma(mathbfwcdotmathbfx)$ の関数について検討する。学習者の目標は仮説ベクトル $mathbfw$ that $F(mathbbw)=C, epsilon$ を高い確率で出力することである。
論文参考訳（メタデータ） (2022-06-17T17:55:43Z)
Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity [121.83085611327654]
積 $r:=p + q$, ここで$r$は$mu$-strong convex類似性である。エージェントの通信やローカルコールにマスターされた問題を解決する方法を提案する。提案手法は$mathcalO(sqrtL_q/mu)$法よりもはるかにシャープである。
論文参考訳（メタデータ） (2022-05-30T14:28:02Z)
Decentralized Stochastic Variance Reduced Extragradient Method [25.21457349137344]
本稿では,$min_xmax_y fx,y triqfrac1msumi=1m f_i triqfrac1msumi=1m f_i triqfrac1msumi=1m f_i triqfrac1msumi=1m f_i triqfrac1msumiの分散凸-凹極小最適化問題を考察する。
論文参考訳（メタデータ） (2022-02-01T16:06:20Z)
A first-order primal-dual method with adaptivity to local smoothness [64.62056765216386]
凸凹対象 $min_x max_y f(x) + langle Ax, yrangle - g*(y)$, ここで、$f$ は局所リプシッツ勾配を持つ凸関数であり、$g$ は凸かつ非滑らかである。主勾配ステップと2段ステップを交互に交互に行うCondat-Vuアルゴリズムの適応バージョンを提案する。
論文参考訳（メタデータ） (2021-10-28T14:19:30Z)
Random matrices in service of ML footprint: ternary random features with no performance loss [55.30329197651178]
我々は、$bf K$ の固有スペクトルが$bf w$ の i.d. 成分の分布とは独立であることを示す。 3次ランダム特徴(TRF)と呼ばれる新しいランダム手法を提案する。提案したランダムな特徴の計算には乗算が不要であり、古典的なランダムな特徴に比べてストレージに$b$のコストがかかる。
論文参考訳（メタデータ） (2021-10-05T09:33:49Z)
Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
期待される正方形損失から、最も適合した単一ニューロンを学習することの問題点を考察する。 ReLUアクティベーションでは、我々の人口リスク保証は$O(mathsfOPT1/2)+epsilon$である。 ReLUアクティベーションでは、我々の人口リスク保証は$O(mathsfOPT1/2)+epsilon$である。
論文参考訳（メタデータ） (2020-05-29T07:20:35Z)
Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems [36.645753881826955]
本稿では,分散を利用してより効率的に推定できるRecurEnti Ascent(SREDA)という新しい手法を提案する。この方法はこの問題でよく知られている。
論文参考訳（メタデータ） (2020-01-11T09:05:03Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。