Fugu-MT 論文翻訳(概要): Accelerating Byzantine-Robust Distributed Learning with Compressed Communication via Double Momentum and Variance Reduction

論文の概要: Accelerating Byzantine-Robust Distributed Learning with Compressed Communication via Double Momentum and Variance Reduction

arxiv url: http://arxiv.org/abs/2603.15144v1
Date: Mon, 16 Mar 2026 11:39:07 GMT
ステータス: 翻訳完了
システム内更新日: 2026-03-17 18:28:58.179227
Title: Accelerating Byzantine-Robust Distributed Learning with Compressed Communication via Double Momentum and Variance Reduction
Title（参考訳）: 二重モーメントと分散化による圧縮通信によるビザンチン-ロバスト分散学習の高速化
Authors: Yanghao Li, Changxin Liu, Yuhao Yi,
Abstract要約: Byz-DM21はByzantine-robustと通信効率のよい分散学習アルゴリズムである。我々の重要な革新は、二重モーメント機構に基づく新しい勾配推定器である。 Byz-VR-DM21は各ノードの局所的な分散を減少させ、ランダム近似からの分散を段階的に除去する。
参考スコア（独自算出の注目度）: 15.213037403428082
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In collaborative and distributed learning, Byzantine robustness reflects a major facet of optimization algorithms. Such distributed algorithms are often accompanied by transmitting a large number of parameters, so communication compression is essential for an effective solution. In this paper, we propose Byz-DM21, a novel Byzantine-robust and communication-efficient stochastic distributed learning algorithm. Our key innovation is a novel gradient estimator based on a double-momentum mechanism, integrating recent advancements in error feedback techniques. Using this estimator, we design both standard and accelerated algorithms that eliminate the need for large batch sizes while maintaining robustness against Byzantine workers. We prove that the Byz-DM21 algorithm has a smaller neighborhood size and converges to $\varepsilon$-stationary points in $\mathcal{O}(\varepsilon^{-4})$ iterations. To further enhance efficiency, we introduce a distributed variant called Byz-VR-DM21, which incorporates local variance reduction at each node to progressively eliminate variance from random approximations. We show that Byz-VR-DM21 provably converges to $\varepsilon$-stationary points in $\mathcal{O}(\varepsilon^{-3 })$ iterations. Additionally, we extend our results to the case where the functions satisfy the Polyak-Łojasiewicz condition. Finally, numerical experiments demonstrate the effectiveness of the proposed method.
Abstract（参考訳）: 協調学習と分散学習では、ビザンチンの堅牢性は最適化アルゴリズムの大きな側面を反映している。このような分散アルゴリズムは、しばしば多数のパラメータを伝達するので、効果的な解には通信圧縮が不可欠である。本稿では,Byzantine-robustと通信効率の高い確率的分散学習アルゴリズムであるByz-DM21を提案する。我々の重要な革新は、二重モーメント機構に基づく新しい勾配推定器であり、最近のエラーフィードバック技術の統合である。この推定器を用いて、ビザンティン労働者に対するロバスト性を維持しつつ、大規模なバッチサイズの必要性を排除し、標準的なアルゴリズムと加速アルゴリズムの両方を設計する。 Byz-DM21アルゴリズムはより小さい近傍サイズを持ち、$\mathcal{O}(\varepsilon^{-4})$反復において$\varepsilon$-定常点に収束する。効率をさらに高めるために,各ノードにおける局所的な分散の低減を取り入れ,ランダム近似からの分散を段階的に除去するByz-VR-DM21という分散変種を導入する。 Byz-VR-DM21 は $\mathcal{O}(\varepsilon^{-3 })$ iterations において $\varepsilon$-stationary points に確実に収束することを示す。さらに、関数がポリアック・ジョジャシエヴィチ条件を満たす場合まで結果を拡張する。最後に,提案手法の有効性を示す数値実験を行った。

関連論文リスト

Sample Complexity of Distributionally Robust Average-Reward Reinforcement Learning [10.708457894356563]
ほぼ最適サンプル複雑性を実現するアルゴリズムを2つ提案する。両アルゴリズムが最適なポリシを推定するために,$widetildeOleft(|mathbfS||mathbfA| t_mathrmmix2varepsilon-2right)のサンプル複雑性が得られることを証明した。これはDR平均逆強化学習における最初の有限サンプル収束保証である。
論文参考訳（メタデータ） (2025-05-15T06:42:25Z)
Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency [52.60557300927007]
離散部分モジュラー問題を連続的に最適化するために,$textbfMA-OSMA$アルゴリズムを提案する。また、一様分布を混合することによりKLの発散を効果的に活用する、プロジェクションフリーな$textbfMA-OSEA$アルゴリズムも導入する。我々のアルゴリズムは最先端OSGアルゴリズムによって提供される$(frac11+c)$-approximationを大幅に改善する。
論文参考訳（メタデータ） (2025-02-07T15:57:56Z)
Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient [40.22217106270146]
ばらつき低減技術はサンプリングのばらつきを低減し、一階法(FO)とゼロ階法(ZO)の収束率を向上するように設計されている。複合最適化問題において、ZO法は、ランダム推定から導かれる座標ワイド分散と呼ばれる追加の分散に遭遇する。本稿では,ZPDVR法とZPDVR法を提案する。
論文参考訳（メタデータ） (2024-05-28T02:27:53Z)
Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
本稿では,円滑なベルマン作用素を持つ連続空間マルコフ決定過程(MDP)の一般クラスにおいて,$varepsilon$-optimal Policyを学習する問題を考察する。我々のソリューションの鍵となるのは、調和解析のアイデアに基づく新しい射影技術である。我々の結果は、連続空間 MDP における2つの人気と矛盾する視点のギャップを埋めるものである。
論文参考訳（メタデータ） (2024-05-10T09:58:47Z)
Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates [9.965047642855074]
ビザンチンの堅牢性は、特定の分散最適化問題に対するアルゴリズムの重要な特徴である。収束率を向上した新しいビザンチンロバスト圧縮法を提案する。また,通信圧縮誤差フィードバックを用いたByzantine-robust法を開発した。
論文参考訳（メタデータ） (2023-10-15T11:22:34Z)
Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
高精度リコール曲線(AUPRC)に基づく領域の最適化について検討し,不均衡なタスクに広く利用されている。我々は、$O (1/epsilon4)$のより優れた反復による、$epsilon$定常解を見つけるための新しい運動量法を開発する。また,O(1/epsilon4)$と同じ複雑さを持つ適応手法の新たなファミリを設計し,実際により高速な収束を享受する。
論文参考訳（メタデータ） (2021-07-02T16:21:52Z)
Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
敵対的レジリエントな分散最適化。機械は独立して勾配を計算し協力することができます私達のアルゴリズムは新しい集中の技術およびサンプル複雑性に基づいています。それは非常に実用的です:それはないときすべての前の方法の性能を改善します。セッティングマシンがあります。
論文参考訳（メタデータ） (2020-12-28T17:19:32Z)
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
本稿では,各ステップごとに1つのデータポイントしか必要としない2つの単一スケールシングルループアルゴリズムを提案する。本研究の結果は, 同時一次および二重側収束の形で表される。
論文参考訳（メタデータ） (2020-08-23T20:36:49Z)
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
本稿では,ニューラルネットワークを用いた大規模AUCのための分散変数について検討する。我々のモデルは通信ラウンドをはるかに少なくし、理論上はまだ多くの通信ラウンドを必要としています。いくつかのデータセットに対する実験は、我々の理論の有効性を示し、我々の理論を裏付けるものである。
論文参考訳（メタデータ） (2020-05-05T18:08:23Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。