Fugu-MT 論文翻訳(概要): Multi-Timescale Gradient Sliding for Distributed Optimization

論文の概要: Multi-Timescale Gradient Sliding for Distributed Optimization

arxiv url: http://arxiv.org/abs/2506.15387v1
Date: Wed, 18 Jun 2025 12:03:34 GMT
ステータス: 翻訳完了
システム内更新日: 2025-06-19 19:35:51.652426
Title: Multi-Timescale Gradient Sliding for Distributed Optimization
Title（参考訳）: 分散最適化のためのマルチ時間勾配スライディング
Authors: Junhui Zhang, Patrick Jaillet,
Abstract要約: 凸・非平滑・分散最適化問題に対する2つの一階法を提案する。我々のMT-GSとATT-GSは、通信ラウンドを減らすために、(局所的な)目的間の類似性を利用することができる。これら3つの望ましい特徴はブロック分解可能な原始双対の定式化によって達成される。
参考スコア（独自算出の注目度）: 23.311605203774388
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose two first-order methods for convex, non-smooth, distributed optimization problems, hereafter called Multi-Timescale Gradient Sliding (MT-GS) and its accelerated variant (AMT-GS). Our MT-GS and AMT-GS can take advantage of similarities between (local) objectives to reduce the communication rounds, are flexible so that different subsets (of agents) can communicate at different, user-picked rates, and are fully deterministic. These three desirable features are achieved through a block-decomposable primal-dual formulation, and a multi-timescale variant of the sliding method introduced in Lan et al. (2020), Lan (2016), where different dual blocks are updated at potentially different rates. To find an $\epsilon$-suboptimal solution, the complexities of our algorithms achieve optimal dependency on $\epsilon$: MT-GS needs $O(\overline{r}A/\epsilon)$ communication rounds and $O(\overline{r}/\epsilon^2)$ subgradient steps for Lipchitz objectives, and AMT-GS needs $O(\overline{r}A/\sqrt{\epsilon\mu})$ communication rounds and $O(\overline{r}/(\epsilon\mu))$ subgradient steps if the objectives are also $\mu$-strongly convex. Here, $\overline{r}$ measures the ``average rate of updates'' for dual blocks, and $A$ measures similarities between (subgradients of) local functions. In addition, the linear dependency of communication rounds on $A$ is optimal (Arjevani and Shamir 2015), thereby providing a positive answer to the open question whether such dependency is achievable for non-smooth objectives (Arjevani and Shamir 2015).
Abstract（参考訳）: 本稿では,マルチ時間勾配スライディング (MT-GS) と加速変種 (AMT-GS) という,凸・非平滑・分散最適化問題に対する2つの一階法を提案する。我々のMT-GSとAMT-GSは、通信ラウンドを減らすために(局所的な)目的の類似性を生かし、異なるサブセット(エージェントの)が異なるユーザ選択率で通信できるように柔軟であり、完全に決定論的である。これらの3つの望ましい特徴は、ブロック分解可能な原始双対の定式化と、Lan et al (2020), Lan (2016)で導入されたスライディング法のマルチスケール変種によって達成され、異なる2つのブロックが潜在的に異なる速度で更新される。 MT-GS need $O(\overline{r}A/\epsilon)$ communication rounds and $O(\overline{r}/\epsilon^2)$ subgradient steps for Lipchitz objectives and AMT-GS need $O(\overline{r}A/\sqrt{\epsilon\mu})$ communication rounds and $O(\overline{r}/(\epsilon\mu)$ subgradient steps if the objectives is $\mu-strong convex。ここで、$\overline{r}$は二重ブロックの ``average rate of updates'' を測り、$A$は局所関数の(下位の)類似度を測る。さらに、$A$上の通信ラウンドの線形依存性は最適である(ArjevaniとShamir 2015)。

関連論文リスト

Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency [52.60557300927007]
離散部分モジュラー問題を連続的に最適化するために,$textbfMA-OSMA$アルゴリズムを提案する。また、一様分布を混合することによりKLの発散を効果的に活用する、プロジェクションフリーな$textbfMA-OSEA$アルゴリズムも導入する。我々のアルゴリズムは最先端OSGアルゴリズムによって提供される$(frac11+c)$-approximationを大幅に改善する。
論文参考訳（メタデータ） (2025-02-07T15:57:56Z)
MGDA Converges under Generalized Smoothness, Provably [27.87166415148172]
多目的最適化(MOO)はマルチタスク学習など様々な分野で注目を集めている。最近の研究は、理論解析を伴う効果的なアルゴリズムを提供しているが、それらは標準の$L$-smoothあるいは有界勾配仮定によって制限されている。一般化された$ell$-smooth損失関数のより一般的で現実的なクラスについて研究し、$ell$は勾配ノルムの一般非減少関数である。
論文参考訳（メタデータ） (2024-05-29T18:36:59Z)
Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
本稿では,円滑なベルマン作用素を持つ連続空間マルコフ決定過程(MDP)の一般クラスにおいて,$varepsilon$-optimal Policyを学習する問題を考察する。我々のソリューションの鍵となるのは、調和解析のアイデアに基づく新しい射影技術である。我々の結果は、連続空間 MDP における2つの人気と矛盾する視点のギャップを埋めるものである。
論文参考訳（メタデータ） (2024-05-10T09:58:47Z)
An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue Optimization Problems [76.2042837251496]
相補的な合成条件に基づく2つの難解なミラー降下アルゴリズムを導入する。注目すべきは、どちらのアルゴリズムも、目的関数のリプシッツ定数や滑らかさに関する事前の知識なしで機能する。本稿では,大規模半確定プログラム上での手法の効率性とロバスト性を示す。
論文参考訳（メタデータ） (2023-06-30T08:34:29Z)
ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
本稿では,BFG凸最適化(SCO: Reweighted Query (ReSQue) 推定ツールを提案する。我々はSCOの並列およびプライベート設定における最先端の複雑さを実現するアルゴリズムを開発した。
論文参考訳（メタデータ） (2023-01-01T18:51:29Z)
Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning [77.22019100456595]
通信周波数の異なる分散計算作業者のトレーニングアルゴリズムを示す。本研究では,より厳密な収束率を$mathcalO!!(sigma2-2_avg!)とする。また,不均一性の項は,作業者の平均遅延によっても影響されることを示した。
論文参考訳（メタデータ） (2022-06-16T17:10:57Z)
Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity [121.83085611327654]
積 $r:=p + q$, ここで$r$は$mu$-strong convex類似性である。エージェントの通信やローカルコールにマスターされた問題を解決する方法を提案する。提案手法は$mathcalO(sqrtL_q/mu)$法よりもはるかにシャープである。
論文参考訳（メタデータ） (2022-05-30T14:28:02Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。