Fugu-MT 論文翻訳(概要): Compressed and distributed least-squares regression: convergence rates with applications to Federated Learning

論文の概要: Compressed and distributed least-squares regression: convergence rates with applications to Federated Learning

arxiv url: http://arxiv.org/abs/2308.01358v1
Date: Wed, 2 Aug 2023 18:02:00 GMT
ステータス: 翻訳完了
システム内更新日: 2023-08-04 16:17:03.945733
Title: Compressed and distributed least-squares regression: convergence rates with applications to Federated Learning
Title（参考訳）: 圧縮分散最小二乗回帰--フェデレート学習への応用との収束率
Authors: Constantin Philippenko and Aymeric Dieuleveut
Abstract要約: 機械学習の勾配アルゴリズムに対する圧縮の影響について検討する。いくつかの非バイアス圧縮演算子間の収束率の差を強調した。我々はその結果を連合学習の事例にまで拡張する。
参考スコア（独自算出の注目度）: 9.31522898261934
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we investigate the impact of compression on stochastic gradient algorithms for machine learning, a technique widely used in distributed and federated learning. We underline differences in terms of convergence rates between several unbiased compression operators, that all satisfy the same condition on their variance, thus going beyond the classical worst-case analysis. To do so, we focus on the case of least-squares regression (LSR) and analyze a general stochastic approximation algorithm for minimizing quadratic functions relying on a random field. We consider weak assumptions on the random field, tailored to the analysis (specifically, expected H\"older regularity), and on the noise covariance, enabling the analysis of various randomizing mechanisms, including compression. We then extend our results to the case of federated learning. More formally, we highlight the impact on the convergence of the covariance $\mathfrak{C}_{\mathrm{ania}}$ of the additive noise induced by the algorithm. We demonstrate despite the non-regularity of the stochastic field, that the limit variance term scales with $\mathrm{Tr}(\mathfrak{C}_{\mathrm{ania}} H^{-1})/K$ (where $H$ is the Hessian of the optimization problem and $K$ the number of iterations) generalizing the rate for the vanilla LSR case where it is $\sigma^2 \mathrm{Tr}(H H^{-1}) / K = \sigma^2 d / K$ (Bach and Moulines, 2013). Then, we analyze the dependency of $\mathfrak{C}_{\mathrm{ania}}$ on the compression strategy and ultimately its impact on convergence, first in the centralized case, then in two heterogeneous FL frameworks.
Abstract（参考訳）: 本稿では,分散学習やフェデレーション学習において広く用いられている機械学習の確率勾配アルゴリズムに対する圧縮の影響について検討する。いくつかの非バイアス圧縮演算子間の収束率の差は、その分散に関して同じ条件を満たすため、古典的な最悪のケース解析を超越する。そのため、最小二乗回帰(LSR)の場合に着目し、ランダム場に依存する二次関数を最小化するための一般確率近似アルゴリズムを解析する。本研究では, 圧縮を含む様々なランダム化機構の解析を可能にするために, ランダム場に対する弱い仮定(具体的には, 予測されたH\"古い正則性)とノイズ共分散について考察する。そして、その結果を連合学習のケースにまで拡張します。より正式には、アルゴリズムによって誘導される付加雑音の共分散$\mathfrak{C}_{\mathrm{ania}}$の収束への影響を強調する。確率場の非正則性にもかかわらず、極限分散項は$\mathrm{Tr}(\mathfrak{C}_{\mathrm{ania}} H^{-1})/K$(ここでは$H$は最適化問題のヘシアンであり、反復数として$K$はバニラ LSR の場合の速度を$\sigma^2 \mathrm{Tr}(H H^{-1}) / K = \sigma^2 d / K$(バッハとムーライン)で表す。次に, 圧縮戦略に対する$\mathfrak{c}_{\mathrm{ania}}$の依存性を解析し, 最終的に収束に与える影響について考察した。

関連論文リスト

Learning general Gaussian mixtures with efficient score matching [16.06356123715737]
我々は、$d$次元で$k$ガウシアンの混合を学習する問題を研究する。我々は、下層の混合成分について分離を前提としない。我々は、ターゲット混合物から$dmathrmpoly(k/varepsilon)$サンプルを抽出し、サンプル-ポリノミカル時間で実行し、サンプリング器を構築するアルゴリズムを提案する。
論文参考訳（メタデータ） (2024-04-29T17:30:36Z)
Rate Analysis of Coupled Distributed Stochastic Approximation for Misspecified Optimization [0.552480439325792]
パラメトリックな特徴を持つ不完全な情報を持つ分散最適化問題として$n$のエージェントを考える。本稿では,各エージェントが未知パラメータの現在の信念を更新する分散近似アルゴリズムを提案する。アルゴリズムの性能に影響を与える因子を定量的に解析し、決定変数の平均二乗誤差が$mathcalO(frac1nk)+mathcalOleft(frac1sqrtn (1-rho_w)right)frac1k1.5で有界であることを証明する。
論文参考訳（メタデータ） (2024-04-21T14:18:49Z)
Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
構造密度の重み付き雑音によるクリップ最適化問題を考察する。勾配が有限の順序モーメントを持つとき、$mathcalO(K-(alpha - 1)/alpha)$よりも高速な収束率が得られることを示す。得られた推定値が無視可能なバイアスと制御可能な分散を持つことを示す。
論文参考訳（メタデータ） (2023-11-07T17:39:17Z)
Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization [49.58290066287418]
構成問題の複雑さを軽減するために,MSVR (Multi-block-probe Variance Reduced) という新しい手法を提案する。本研究の結果は, 試料の複雑さの順序や強靭性への依存など, 様々な面で先行して改善された。
論文参考訳（メタデータ） (2022-07-18T12:03:26Z)
Randomized Coordinate Subgradient Method for Nonsmooth Composite Optimization [11.017632675093628]
非滑らかな問題に対処するコーディネート型劣階法は、リプシッツ型仮定の性質のセットのため、比較的過小評価されている。
論文参考訳（メタデータ） (2022-06-30T02:17:11Z)
Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
滑らかな凸凹凸結合型サドル点問題, $min_mathbfxmax_mathbfyF(mathbfx) + H(mathbfx,mathbfy)$ を考える。漸進的勾配指数(AG-EG)降下指数アルゴリズムについて述べる。
論文参考訳（メタデータ） (2022-06-17T06:10:20Z)
An Improved Analysis of Gradient Tracking for Decentralized Machine Learning [34.144764431505486]
トレーニングデータが$n$エージェントに分散されるネットワーク上での分散機械学習を検討する。エージェントの共通の目標は、すべての局所損失関数の平均を最小化するモデルを見つけることである。ノイズのない場合、$p$を$mathcalO(p-1)$から$mathcalO(p-1)$に改善します。
論文参考訳（メタデータ） (2022-02-08T12:58:14Z)
Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance [14.06947898164194]
ヘビーテールは様々なシナリオで勾配降下 (sgd) で現れる。 SGDの収束保証は、潜在的に無限のばらつきを持つ状態依存性および重尾ノイズ下で提供します。その結果,SGDは無限に分散した重尾雑音下であっても,地球最適値に収束できることが示された。
論文参考訳（メタデータ） (2021-02-20T13:45:11Z)
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction [63.41789556777387]
非同期Q-ラーニングはマルコフ決定過程(MDP)の最適行動値関数(またはQ-関数)を学習することを目的としている。 Q-関数の入出力$varepsilon$-正確な推定に必要なサンプルの数は、少なくとも$frac1mu_min (1-gamma)5varepsilon2+ fract_mixmu_min (1-gamma)$の順である。
論文参考訳（メタデータ） (2020-06-04T17:51:00Z)
Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
本研究では,高次元ガウス混合系の対向ロバスト条件下での効率的な学習性について検討する。理論的に最適に近い誤り証明である$tildeO(epsilon)$の情報を、$epsilon$-corrupted $k$-mixtureで学習するアルゴリズムを提供する。我々の主な技術的貢献は、ガウス混合系からの新しい頑健な識別可能性証明クラスターであり、これは正方形の定度証明システムによって捉えることができる。
論文参考訳（メタデータ） (2020-05-13T16:44:12Z)
Outlier-Robust Clustering of Non-Spherical Mixtures [5.863264019032882]
統計的に分離されたd-次元ガウスアン(k-GMM)の混合をクラスタリングするための最初のアウトリー・ローバストアルゴリズムを与える。この結果は、$d$次元単位球面上の均一分布の任意のアフィン変換のクラスタリング混合に拡張される。
論文参考訳（メタデータ） (2020-05-06T17:24:27Z)
On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration [115.1954841020189]
The inequality and non-asymptotic properties of approximation procedure with Polyak-Ruppert averaging。一定のステップサイズと無限大となる反復数を持つ平均的反復数に対する中心極限定理(CLT)を証明する。
論文参考訳（メタデータ） (2020-04-09T17:54:18Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。