Fugu-MT 論文翻訳(概要): Taming Fat-Tailed ("Heavier-Tailed'' with Potentially Infinite Variance) Noise in Federated Learning

論文の概要: Taming Fat-Tailed ("Heavier-Tailed'' with Potentially Infinite Variance) Noise in Federated Learning

arxiv url: http://arxiv.org/abs/2210.00690v1
Date: Mon, 3 Oct 2022 02:40:32 GMT
ステータス: 翻訳完了
システム内更新日: 2022-10-04 14:34:31.868475
Title: Taming Fat-Tailed ("Heavier-Tailed'' with Potentially Infinite Variance) Noise in Federated Learning
Title（参考訳）: フェデレーション学習における"heavier-tailed"と"neil infinite variance"の共振音
Authors: Haibo Yang, Peiwen Qiu, Jia Liu
Abstract要約: FLアルゴリズムの収束解析に関するほとんどの既存の研究において重要な仮定は、一階情報のノイズは有限分散であるということである。今のところ、ノイズの少ないFL系に対して収束アルゴリズムを適用できるかどうかは不明だ。
参考スコア（独自算出の注目度）: 7.425402784031433
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A key assumption in most existing works on FL algorithms' convergence analysis is that the noise in stochastic first-order information has a finite variance. Although this assumption covers all light-tailed (i.e., sub-exponential) and some heavy-tailed noise distributions (e.g., log-normal, Weibull, and some Pareto distributions), it fails for many fat-tailed noise distributions (i.e., ``heavier-tailed'' with potentially infinite variance) that have been empirically observed in the FL literature. To date, it remains unclear whether one can design convergent algorithms for FL systems that experience fat-tailed noise. This motivates us to fill this gap in this paper by proposing an algorithmic framework called FAT-Clipping (\ul{f}ederated \ul{a}veraging with \ul{t}wo-sided learning rates and \ul{clipping}), which contains two variants: FAT-Clipping per-round (FAT-Clipping-PR) and FAT-Clipping per-iteration (FAT-Clipping-PI). Specifically, for the largest $\alpha \in (1,2]$ such that the fat-tailed noise in FL still has a bounded $\alpha$-moment, we show that both variants achieve $\mathcal{O}((mT)^{\frac{2-\alpha}{\alpha}})$ and $\mathcal{O}((mT)^{\frac{1-\alpha}{3\alpha-2}})$ convergence rates in the strongly-convex and general non-convex settings, respectively, where $m$ and $T$ are the numbers of clients and communication rounds. Moreover, at the expense of more clipping operations compared to FAT-Clipping-PR, FAT-Clipping-PI further enjoys a linear speedup effect with respect to the number of local updates at each client and being lower-bound-matching (i.e., order-optimal). Collectively, our results advance the understanding of designing efficient algorithms for FL systems that exhibit fat-tailed first-order oracle information.
Abstract（参考訳）: FLアルゴリズムの収束解析に関するほとんどの既存の研究において鍵となる仮定は、確率的一階情報のノイズは有限分散であるということである。この仮定は、すべての光尾(サブ指数)と重い尾のノイズ分布(対数正規分布、ワイブル分布、パレート分布など)をカバーしているが、FL文献で実証的に観察された多くの脂肪尾のノイズ分布(すなわち、潜在的に無限のばらつきを持つ'heavier-tailed')に対して失敗する。今のところ、脂肪尾ノイズを経験するFLシステムの収束アルゴリズムを設計できるかどうかは不明だ。これは、FAT-Clipping per-round (FAT-Clipping-PR)とFAT-Clipping per-iteration (FAT-Clipping-PI)の2つの変種を含む、FAT-Clipping (\ul{f}ederated \ul{a}veraging with \ul{t}wo-sided learning rate and \ul{clipping})と呼ばれるアルゴリズムフレームワークを提案することによって、このギャップを埋める動機となっている。具体的には、FLの太字ノイズがまだ有界な$\alpha$-momentを持つ最大の$\alpha \in (1,2]$に対して、両方の変種が$\mathcal{O}((mT)^{\frac{2-\alpha}{\alpha}})$と$\mathcal{O}((mT)^{\frac{1-\alpha}{3\alpha-2}})$の収束率をそれぞれ強凸および一般の非凸設定で達成し、$m$と$T$がクライアントおよび通信ラウンドの数値であることを示す。さらに,fat-clipping-prよりもクリッピング操作が多くなると,fat-clipping-piは各クライアントの局所更新数に対して線形速度アップ効果を享受し,低バウンドマッチング(すなわち順序最適化)となる。総じて,fat-tailed first-order oracle 情報を示す fl システムの効率的なアルゴリズム設計の理解を深めた。

関連論文リスト

Stochastic Weakly Convex Optimization Under Heavy-Tailed Noises [55.43924214633558]
本稿では,サブワイブルノイズとSsBCノイズの2種類のノイズに着目した。これら2つのノイズ仮定の下では、凸最適化と滑らかな最適化の文脈において、SFOMの不規則および高確率収束が研究されている。
論文参考訳（メタデータ） (2025-07-17T16:48:45Z)
Sign Operator for Coping with Heavy-Tailed Noise in Non-Convex Optimization: High Probability Bounds Under $(L_0, L_1)$-Smoothness [74.18546828528298]
SignSGD with Majority Votingは,Kappakappakappa-1right,Kappakappakappa-1right,Kappakappakappa-1right,Kappakappakappa-1right,Kappaka ppakappa-1right,Kappakappakappa-1right,Kappakappakappa-1right,Kappakappakappa-1right,Kappakappakappa -1right,Kappakappakappa-1right,Kappakappakappa-1right,Kappakappappapa-1right,Kappaを用いて,複雑性の全範囲で堅牢に動作することを示す。
論文参考訳（メタデータ） (2025-02-11T19:54:11Z)
Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping [21.865728815935665]
重み付き雑音下での最初の収束を提供するが、切断はしない。また、テールインデックス$mathfrakp$が事前に不明な場合には、最初の$mathcalO(Tfrac1-mathfrakp3mathfrakp-2)$収束率も設定する。
論文参考訳（メタデータ） (2024-12-27T08:46:46Z)
Theoretical limits of descending $\ell_0$ sparse-regression ML algorithms [0.0]
本研究では,emphmaximum-likelihood (ML)デコーディングの性能解析プログラムを開発した。 ML性能パラメータの鍵となるのは、残留エンフェロ平均二乗誤差(textbfRMSE$)を発見し、いわゆるエンフェロ遷移(PT)現象を示す。 Fl RDTの具体的実装と実用的妥当性は、典型的には、基礎となる数値評価のサイズのセットを実行する能力に依存している。
論文参考訳（メタデータ） (2024-10-10T06:33:41Z)
Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
構造密度の重み付き雑音によるクリップ最適化問題を考察する。勾配が有限の順序モーメントを持つとき、$mathcalO(K-(alpha - 1)/alpha)$よりも高速な収束率が得られることを示す。得られた推定値が無視可能なバイアスと制御可能な分散を持つことを示す。
論文参考訳（メタデータ） (2023-11-07T17:39:17Z)
Distributed Extra-gradient with Optimal Complexity and Communication Guarantees [60.571030754252824]
複数のプロセッサ/ワーカー/クライアントがローカルなデュアルベクトルにアクセス可能なマルチGPU設定において、モノトン変分不等式(VI)問題を考察する。モノトーンVI問題に対するデファクトアルゴリズムであるExtra-gradientは、通信効率が良くないように設計されている。そこで本稿では,VI の解法に適した非バイアスで適応的な圧縮手法である量子化一般化外部勾配 (Q-GenX) を提案する。
論文参考訳（メタデータ） (2023-08-17T21:15:04Z)
Random quantum circuits transform local noise into global white noise [118.18170052022323]
低忠実度状態におけるノイズランダム量子回路の測定結果の分布について検討する。十分に弱くユニタリな局所雑音に対して、一般的なノイズ回路インスタンスの出力分布$p_textnoisy$間の相関(線形クロスエントロピーベンチマークで測定)は指数関数的に減少する。ノイズが不整合であれば、出力分布は、正確に同じ速度で均一分布の$p_textunif$に近づく。
論文参考訳（メタデータ） (2021-11-29T19:26:28Z)
STEM: A Stochastic Two-Sided Momentum Algorithm Achieving Near-Optimal Sample and Communication Complexities for Federated Learning [58.6792963686231]
フェデレートラーニング(FL)とは、複数のワーカノード(WN)がローカルデータを用いてジョイントモデルを構築するパラダイムを指す。 WNの最小更新方向、最初のミニバッチサイズ、ローカル更新頻度をどうやって選択するかは明らかになっていない。局所的な更新頻度と局所的なミニサイズとの間にはトレードオフ曲線があることを示し、上記の複雑さを維持できることを示す。
論文参考訳（メタデータ） (2021-06-19T06:13:45Z)
CFedAvg: Achieving Efficient Communication and Fast Convergence in Non-IID Federated Learning [8.702106020664612]
フェデレートラーニング(Federated Learning, FL)は、多くの労働者がトレーニングデータを共有せずにモデルを共同で学習する分散ラーニングパラダイムである。 FLでは、ディープラーニング(ディープ)学習モデルと帯域幅接続によって高い通信コストが発生する可能性がある。本研究では,非バイアスのSNR制約圧縮機を用いたFL用分散通信データセットCFedAvgを紹介する。
論文参考訳（メタデータ） (2021-06-14T04:27:19Z)
Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
textttFedGLOMOは最初の(一階)FLtexttFedGLOMOアルゴリズムです。クライアントとサーバ間の通信においても,我々のアルゴリズムは確実に最適である。
論文参考訳（メタデータ） (2020-12-07T21:05:31Z)
Learning Halfspaces with Tsybakov Noise [50.659479930171585]
テュバコフ雑音の存在下でのハーフスペースの学習可能性について検討する。真半空間に関して誤分類誤差$epsilon$を達成するアルゴリズムを与える。
論文参考訳（メタデータ） (2020-06-11T14:25:02Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。