Fugu-MT Paper Translation (Summary): SYNTHESIS: A Semi-Asynchronous Path-Integrated Stochastic Gradient Method for Distributed Learning in Computing Clusters

Paper summary: SYNTHESIS: A Semi-Asynchronous Path-Integrated Stochastic Gradient Method for Distributed Learning in Computing Clusters

arxiv url: http://arxiv.org/abs/2208.08425v1
Date: Wed, 17 Aug 2022 17:42:33 GMT
Status: Translation complete
Last updated in system: 2022-08-18 13:08:33.956305
Title: SYNTHESIS: A Semi-Asynchronous Path-Integrated Stochastic Gradient Method for Distributed Learning in Computing Clusters
Title (reference translation): A Semi-Asynchronous Path-Integrated Stochastic Gradient Method for Distributed Learning in Computing Clusters
Authors: Zhuqing Liu, Xin Zhang, Jia Liu
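Abstract summary: SYNTHESIS (semi-asynchronous path-integrated stochastic gradient search) was developed to overcome the limitations of synchronous and asynchronous distributed learning algorithms. The SYNTHESIS algorithms have $O(\sqrt{N}\epsilon^{-2}(\Delta+1)+N)$ and $O(\sqrt{N}\epsilon^{-2}(\Delta+1) d+N)$ computational complexities for achieving an $\epsilon$-stationary point in non-convex learning under distributed and shared memory architectures, respectively.
Reference score (independently computed attention score): 7.968142741470549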
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: To increase the training speed of distributed learning, recent years have witnessed a significant amount of interest in developing both synchronous and asynchronous distributed stochastic variance-reduced optimization methods. However, all existing synchronous and asynchronous distributed training algorithms suffer from various limitations in either convergence speed or implementation complexity. This motivates us to propose an algorithm called SYNTHESIS (semi-asynchronous path-integrated stochastic gradient search), which leverages the special structure of the variance-reduction framework to overcome the limitations of both synchronous and asynchronous distributed learning algorithms, while retaining their salient features. We consider two implementations of SYNTHESIS under distributed and shared memory architectures. We show that our SYNTHESIS algorithms have $O(\sqrt{N}\epsilon^{-2}(\Delta+1)+N)$ and $O(\sqrt{N}\epsilon^{-2}(\Delta+1) d+N)$ computational complexities for achieving an $\epsilon$-stationary point in non-convex learning under distributed and shared memory architectures, respectively, where $N$ denotes the total number of training samples and $\Delta$ represents the maximum delay of the workers. Moreover, we investigate the generalization performance of SYNTHESIS by establishing algorithmic stability bounds for quadratic strongly convex and non-convex optimization. We further conduct extensive numerical experiments to verify our theoretical findings.
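To get a feel for how the two complexity expressions in the abstract scale, the following is a minimal Python sketch that evaluates them numerically. It is an illustration only, not code from the paper: the function name `synthesis_complexity` and its parameter names are hypothetical, the hidden big-O constant is assumed to be 1, and the factor $d$, which the abstract does not define, is assumed here to be a positive problem dimension.

```python
import math


def synthesis_complexity(n_samples, epsilon, max_delay, dim=None):
    """Evaluate sqrt(N) * eps^-2 * (Delta + 1) [* d] + N with the big-O constant set to 1.

    dim=None gives the distributed-memory expression from the abstract;
    passing dim gives the shared-memory expression with the extra factor d
    (assumed here to be a problem dimension; the abstract does not define it).
    """
    if n_samples <= 0 or epsilon <= 0 or max_delay < 0:
        raise ValueError("need n_samples > 0, epsilon > 0, max_delay >= 0")
    if dim is not None and dim <= 0:
        raise ValueError("dim must be positive when given")

    # Leading term: sqrt(N) * epsilon^{-2} * (Delta + 1)
    leading = math.sqrt(n_samples) * epsilon ** -2 * (max_delay + 1)
    if dim is not None:
        leading *= dim  # shared-memory variant carries the extra factor d

    # Additive N term appears in both expressions
    return leading + n_samples


if __name__ == "__main__":
    # Doubling (Delta + 1) doubles only the leading term, not the additive N term.
    for delay in (0, 1, 3):
        print(delay, synthesis_complexity(10_000, 0.5, delay))
```

Because constants are dropped, the returned numbers are only meaningful relative to each other, for example when comparing how the bound grows as the maximum delay increases.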
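Abstract (reference translation): In recent years, there has been considerable interest in developing synchronous and asynchronous distributed stochastic variance-reduced optimization methods to speed up distributed training. However, existing synchronous and asynchronous distributed training algorithms all suffer from limitations in either convergence speed or implementation complexity. We propose SYNTHESIS (semi-asynchronous path-integrated stochastic gradient search), an algorithm that exploits the special structure of the variance-reduction framework to overcome the limitations of both synchronous and asynchronous distributed learning algorithms. We consider two implementations, under distributed and shared memory architectures. The computational complexities for achieving an $\epsilon$-stationary point in non-convex learning under distributed and shared memory architectures are $O(\sqrt{N}\epsilon^{-2}(\Delta+1)+N)$ and $O(\sqrt{N}\epsilon^{-2}(\Delta+1) d+N)$, respectively, where $N$ is the total number of training samples and $\Delta$ is the maximum delay of the workers. We also investigate the generalization performance of SYNTHESIS by establishing algorithmic stability bounds for quadratic strongly convex and non-convex optimization, and we conduct extensive numerical experiments to verify the theoretical results.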

Related papers

Freya PAGE: First Optimal Time Complexity for Large-Scale Nonconvex Finite-Sum Optimization with Heterogeneous Asynchronous Computations [92.1840862558718]
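In practical distributed systems, workers are typically not homogeneous and have widely varying processing times. This paper proposes Freya, a new parallel method for handling arbitrarily slow computations. We show that Freya offers significantly improved complexity guarantees compared with prior methods.
Paper reference translation (metadata) (2024-05-24T13:33:30Z)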
Asynchronous Distributed Optimization with Delay-free Parameters [9.062164411594175]
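This paper develops asynchronous versions of two distributed algorithms, Prox-DGD and DGD-ATC, for solving consensus optimization problems over undirected networks. In contrast to alternative algorithms, our algorithms can converge to the fixed-point set of their synchronous counterparts using delay-independent step sizes.
Paper reference translation (metadata) (2023-12-11T16:33:38Z)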
AsGrad: A Sharp Unified Analysis of Asynchronous-SGD Algorithms [45.90015262911875]
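We analyze asynchronous-type algorithms for distributed SGD in the heterogeneous setting. As a by-product of our analysis, we also show guarantees for gradient-type algorithms such as SGD with random reshuffling.
Paper reference translation (metadata) (2023-10-31T13:44:53Z)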
Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
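Trust-region (TR) methods and adaptive regularization using cubics (ARC) have been proven to have very appealing theoretical properties. We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, the gradient, and function values.
Paper reference translation (metadata) (2023-10-18T10:29:58Z)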
Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
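We design a new class of first-order algorithms by drawing on the connection between first-order algorithms for constrained optimization and non-smooth systems. A key property of these algorithms is that constraints are expressed in terms of velocities rather than positions.
Paper reference translation (metadata) (2023-02-01T08:50:48Z)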
Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning [77.22019100456595]
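This work considers training algorithms for distributed computing workers with varying communication frequencies. We establish a tighter convergence rate of $\mathcal{O}(\sigma^2\epsilon^{-2} + \tau_{avg}\epsilon^{-1})$. We also show that the heterogeneity term is affected by the average delays of the workers.
Paper reference translation (metadata) (2022-06-16T17:10:57Z)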
DASHA: Distributed Nonconvex Optimization with Communication Compression, Optimal Oracle Complexity, and No Client Synchronization [77.34726150561087]
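We develop and analyze DASHA, a new method for distributed optimization problems. Unlike MARINA, the new DASHA and DASHA-MVR send only compressed vectors and do not synchronize the nodes, which makes them more practical for learning.
Paper reference translation (metadata) (2022-02-02T20:10:40Z)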
Asynchronous Iterations in Optimization: New Sequence Results and Sharper Algorithmic Guarantees [10.984101749941471]
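We introduce new convergence results for asynchronous iterations that appear in the analysis of parallel and distributed optimization algorithms. The results are easy to apply and give explicit estimates of how the degree of asynchrony affects the convergence rate of the iterates.
Paper reference translation (metadata) (2021-09-09T19:08:56Z)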
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
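We propose two single-timescale, single-loop algorithms that require only one data point per step. Our results are expressed in the form of simultaneous primal-side and dual-side convergence.
Paper reference translation (metadata) (2020-08-23T20:36:49Z)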

The related papers list is generated automatically from the titles and abstracts of papers on this site.

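The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).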