Fugu-MT Paper Translation (Abstract): An All-Reduce Compatible Top-K Compressor for Communication-Efficient Distributed Learning

Paper abstract: An All-Reduce Compatible Top-K Compressor for Communication-Efficient Distributed Learning

arxiv url: http://arxiv.org/abs/2510.26709v3
Date: Tue, 04 Nov 2025 07:21:19 GMT
Status: Translation complete
Last updated in system: 2025-11-05 14:27:17.378172
Title: An All-Reduce Compatible Top-K Compressor for Communication-Efficient Distributed Learning
Title (reference translation): An All-Reduce Compatible Top-K Compressor for Communication-Efficient Distributed Learning
Authors: Chuyan Chen, Chenyang Ma, Zhangxin Li, Yutong He, Yanjie Dong, Kun Yuan
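Abstract summary: The gradient compressor Rand-$K$ discards structural information and performs poorly in practice. Top-$K$ preserves informative entries but loses the contraction property and requires costly All-Gather operations. ARC-Top-$K$ aligns sparsity patterns across nodes using a lightweight sketch of the gradient, enabling index-free All-Reduce.
Reference score (independently computed attention score): 13.41238196525377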
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Communication remains a central bottleneck in large-scale distributed machine learning, and gradient sparsification has emerged as a promising strategy to alleviate this challenge. However, existing gradient compressors face notable limitations: Rand-$K$ discards structural information and performs poorly in practice, while Top-$K$ preserves informative entries but loses the contraction property and requires costly All-Gather operations. In this paper, we propose ARC-Top-$K$, an All-Reduce-Compatible Top-$K$ compressor that aligns sparsity patterns across nodes using a lightweight sketch of the gradient, enabling index-free All-Reduce while preserving globally significant information. ARC-Top-$K$ is provably contractive and, when combined with momentum error feedback (EF21M), achieves linear speedup and sharper convergence rates than the original EF21M under standard assumptions. Empirically, ARC-Top-$K$ matches the accuracy of Top-$K$ while reducing wall-clock training time by up to 60.7%, offering an efficient and scalable solution that combines the robustness of Rand-$K$ with the strong performance of Top-$K$.
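Abstract (reference translation): Communication remains a central bottleneck in large-scale distributed machine learning, and gradient sparsification has emerged as a promising strategy for alleviating it. Rand-$K$ discards structural information and performs poorly in practice, while Top-$K$ preserves informative entries but loses the contraction property and requires costly All-Gather operations. This paper proposes ARC-Top-$K$, an All-Reduce-Compatible Top-$K$ compressor that aligns sparsity patterns across nodes using a lightweight sketch of the gradient, achieving index-free All-Reduce while preserving globally significant information. ARC-Top-$K$ is provably contractive and, combined with momentum error feedback (EF21M), achieves linear speedup and sharper convergence rates than the original EF21M under standard assumptions. Empirically, ARC-Top-$K$ matches the accuracy of Top-$K$ while reducing wall-clock training time by up to 60.7%, offering an efficient and scalable solution that combines the robustness of Rand-$K$ with the strong performance of Top-$K$.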
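
The idea of aligning sparsity patterns so that values can be summed without exchanging indices can be illustrated with a small simulation. The sketch below is not the paper's algorithm: the abstract does not say how the gradient sketch is built, so a per-block squared norm is assumed here purely for illustration, and every function name (`top_k_indices`, `all_reduce_sum`, `block_sketch`, `aligned_sparse_all_reduce`) is hypothetical. Nodes are simulated as plain Python lists rather than real distributed processes.

```python
def top_k_indices(vec, k):
    """Return the indices of the k largest-magnitude entries, in ascending order."""
    ranked = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    return sorted(ranked[:k])


def all_reduce_sum(vectors):
    """Simulate All-Reduce: element-wise sum of equal-length vectors, one per node."""
    return [sum(column) for column in zip(*vectors)]


def block_sketch(grad, block):
    """ASSUMED sketch (not from the paper): squared L2 norm of each contiguous block."""
    return [sum(x * x for x in grad[i:i + block]) for i in range(0, len(grad), block)]


def aligned_sparse_all_reduce(grads, num_blocks, block):
    """Agree on one shared index set from summed sketches, then sum values only."""
    dim = len(grads[0])
    # Step 1: All-Reduce the small sketches (dim / block numbers per node).
    global_sketch = all_reduce_sum([block_sketch(g, block) for g in grads])
    # Step 2: every node derives the same block selection from the same sketch.
    chosen = top_k_indices(global_sketch, num_blocks)
    shared_idx = [j for b in chosen
                  for j in range(b * block, min((b + 1) * block, dim))]
    # Step 3: All-Reduce the selected values; no indices are transmitted.
    summed = all_reduce_sum([[g[j] for j in shared_idx] for g in grads])
    averaged = [0.0] * dim
    for j, value in zip(shared_idx, summed):
        averaged[j] = value / len(grads)
    return shared_idx, averaged


# Two simulated nodes with six-dimensional gradients.
grads = [
    [5.0, 0.0, 0.5, 0.5, 0.0, 1.0],
    [0.0, 1.0, 0.5, 0.0, 4.0, 0.0],
]

# Plain per-node Top-1 picks different indices on each node, so the sparse
# vectors cannot be summed position by position without sending indices.
print([top_k_indices(g, 1) for g in grads])  # [[0], [4]]

# With a shared selection, both nodes send values for the same positions.
print(aligned_sparse_all_reduce(grads, num_blocks=2, block=2))
# ([0, 1, 4, 5], [2.5, 0.5, 0.0, 0.0, 2.0, 0.5])
```

In this toy setup the per-node Top-1 selections disagree (index 0 versus index 4), which is why plain Top-$K$ needs each node's indices to be gathered. Once all nodes derive the same index set from a summed sketch, the selected values line up and a plain sum suffices.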

Related papers