Fugu-MT 論文翻訳(概要): Efficient Sign-Based Optimization: Accelerating Convergence via Variance Reduction

論文の概要: Efficient Sign-Based Optimization: Accelerating Convergence via Variance Reduction

arxiv url: http://arxiv.org/abs/2406.00489v2
Date: Wed, 23 Oct 2024 14:42:35 GMT
ステータス: 翻訳完了
システム内更新日: 2024-11-28 17:07:33.149112
Title: Efficient Sign-Based Optimization: Accelerating Convergence via Variance Reduction
Title（参考訳）: 効率的な符号ベース最適化:可変化による収束の高速化
Authors: Wei Jiang, Sifan Yang, Wenhao Yang, Lijun Zhang,
Abstract要約: それぞれ$mathcalO(d1/2T-1/2 + dn-1/2)$と$mathcalO(d1/4T-1/4)$の収束率を改善する2つの新しいアルゴリズムを導入する。提案手法の有効性を検証した数値実験を行った。
参考スコア（独自算出の注目度）: 16.82220229840038
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Sign stochastic gradient descent (signSGD) is a communication-efficient method that transmits only the sign of stochastic gradients for parameter updating. Existing literature has demonstrated that signSGD can achieve a convergence rate of $\mathcal{O}(d^{1/2}T^{-1/4})$, where $d$ represents the dimension and $T$ is the iteration number. In this paper, we improve this convergence rate to $\mathcal{O}(d^{1/2}T^{-1/3})$ by introducing the Sign-based Stochastic Variance Reduction (SSVR) method, which employs variance reduction estimators to track gradients and leverages their signs to update. For finite-sum problems, our method can be further enhanced to achieve a convergence rate of $\mathcal{O}(m^{1/4}d^{1/2}T^{-1/2})$, where $m$ denotes the number of component functions. Furthermore, we investigate the heterogeneous majority vote in distributed settings and introduce two novel algorithms that attain improved convergence rates of $\mathcal{O}(d^{1/2}T^{-1/2} + dn^{-1/2})$ and $\mathcal{O}(d^{1/4}T^{-1/4})$ respectively, outperforming the previous results of $\mathcal{O}(dT^{-1/4} + dn^{-1/2})$ and $\mathcal{O}(d^{3/8}T^{-1/8})$, where $n$ represents the number of nodes. Numerical experiments across different tasks validate the effectiveness of our proposed methods.
Abstract（参考訳）: 符号確率勾配降下 (signSGD) は、パラメータ更新のための確率勾配の符号のみを送信する通信効率のよい方法である。既存の文献では、符号SGDは$\mathcal{O}(d^{1/2}T^{-1/4})$の収束率を達成でき、$d$は次元を表し、$T$は反復数である。本稿では、この収束率を$\mathcal{O}(d^{1/2}T^{-1/3})$に改善し、SSVR(Sign-based Stochastic Variance Reduction)法を導入する。有限サム問題に対しては、m$は成分関数の数を表す$\mathcal{O}(m^{1/4}d^{1/2}T^{-1/2})$の収束率を達成するためにさらに拡張することができる。さらに、分散環境での不均一な多数決を調査し、$\mathcal{O}(d^{1/2}T^{-1/2} + dn^{-1/2})$と$\mathcal{O}(d^{1/4}T^{-1/4})$の収束率を改善する2つの新しいアルゴリズムを導入し、$\mathcal{O}(dT^{-1/4} + dn^{-1/2})$と$\mathcal{O}(d^{3/8}T^{-1/8})$の前の結果をそれぞれ上回り、$n$はノード数を表す。提案手法の有効性を検証した数値実験を行った。

関連論文リスト

Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping [21.865728815935665]
重み付き雑音下での最初の収束を提供するが、切断はしない。また、テールインデックス$mathfrakp$が事前に不明な場合には、最初の$mathcalO(Tfrac1-mathfrakp3mathfrakp-2)$収束率も設定する。
論文参考訳（メタデータ） (2024-12-27T08:46:46Z)
Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
複素勾配問題に対する2つの分散化ZO推定器を提案する。我々は、現在最先端の機能複雑性を$mathcalOleft(minfracdn1/2epsilon2, fracdepsilon3right)$から$tildecalOleft(fracdepsilon2right)$に改善する。
論文参考訳（メタデータ） (2024-10-03T15:04:01Z)
Accelerated Variance-Reduced Forward-Reflected Methods for Root-Finding Problems [8.0153031008486]
そこで本研究では,Nesterovの高速前方反射法と分散還元法を新たに提案し,根絶問題の解法を提案する。我々のアルゴリズムは単ループであり、ルートフィリング問題に特化して設計された非バイアス分散還元推定器の新たなファミリーを利用する。
論文参考訳（メタデータ） (2024-06-04T15:23:29Z)
On the $O(\frac{\sqrt{d}}{T^{1/4}})$ Convergence Rate of RMSProp and Its Momentum Extension Measured by $\ell_1$ Norm [59.65871549878937]
本稿では、RMSPropとその運動量拡張を考察し、$frac1Tsum_k=1Tの収束速度を確立する。我々の収束率は、次元$d$を除くすべての係数に関して下界と一致する。収束率は$frac1Tsum_k=1Tと類似していると考えられる。
論文参考訳（メタデータ） (2024-02-01T07:21:32Z)
Convergence of a Normal Map-based Prox-SGD Method under the KL Inequality [0.0]
我々は、$symbol$k$収束問題に対して、新しいマップベースのアルゴリズム(mathsfnorMtext-mathsfSGD$)を提案する。
論文参考訳（メタデータ） (2023-05-10T01:12:11Z)
High Probability Convergence of Stochastic Gradient Methods [15.829413808059124]
最適解への初期距離に依存する有界収束を示す。 AdaGrad-Normのハイバウンドが得られることを示す。
論文参考訳（メタデータ） (2023-02-28T18:42:11Z)
Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning [77.22019100456595]
通信周波数の異なる分散計算作業者のトレーニングアルゴリズムを示す。本研究では,より厳密な収束率を$mathcalO!!(sigma2-2_avg!)とする。また,不均一性の項は,作業者の平均遅延によっても影響されることを示した。
論文参考訳（メタデータ） (2022-06-16T17:10:57Z)
A New Framework for Variance-Reduced Hamiltonian Monte Carlo [88.84622104944503]
分散還元型ハミルトン・モンテカルロ法 (HMC) の新たなフレームワークを提案し,$L$-smooth および $m$-strongly log-concave 分布からサンプリングする。本研究では,SAGA法やSVRG法をベースとした非バイアス勾配推定器を用いて,バッチサイズを小さくすることで,高い勾配効率が得られることを示す。総合的および実世界のベンチマークデータによる実験結果から、我々の新しいフレームワークは、完全な勾配と勾配HMCアプローチを著しく上回っていることが示された。
論文参考訳（メタデータ） (2021-02-09T02:44:24Z)
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction [63.41789556777387]
非同期Q-ラーニングはマルコフ決定過程(MDP)の最適行動値関数(またはQ-関数)を学習することを目的としている。 Q-関数の入出力$varepsilon$-正確な推定に必要なサンプルの数は、少なくとも$frac1mu_min (1-gamma)5varepsilon2+ fract_mixmu_min (1-gamma)$の順である。
論文参考訳（メタデータ） (2020-06-04T17:51:00Z)
On the Minimax Optimality of the EM Algorithm for Learning Two-Component Mixed Linear Regression [30.09238179576628]
信号-雑音比(SNR)の全ての条件下での2成分混合線形回帰学習のためのEMアルゴリズムの収束率について検討する。 EMアルゴリズムは,すべてのSNR条件下で,最小限のサンプル複雑性を実現する。
論文参考訳（メタデータ） (2020-06-04T00:56:07Z)
A Simple Convergence Proof of Adam and Adagrad [74.24716715922759]
我々はAdam Adagradと$O(d(N)/st)$アルゴリズムの収束の証明を示す。 Adamはデフォルトパラメータで使用する場合と同じ収束$O(d(N)/st)$で収束する。
論文参考訳（メタデータ） (2020-03-05T01:56:17Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。