Fugu-MT paper translation (abstract): Information-Theoretic Bounds for Sparse Covariance Estimation in the Vertical-Split Distributed Model

Paper abstract: Information-Theoretic Bounds for Sparse Covariance Estimation in the Vertical-Split Distributed Model

arxiv url: http://arxiv.org/abs/2606.07124v1
Date: Fri, 05 Jun 2026 10:25:12 GMT
Status: translation complete
System update date: 2026-06-08 14:33:29.691655
Title: Information-Theoretic Bounds for Sparse Covariance Estimation in the Vertical-Split Distributed Model
Title (reference translation): Information-Theoretic Bounds for Sparse Covariance Estimation in the Vertical-Split Distributed Model
Authors: Jing Yee Tan, Guangyue Han
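Abstract summary: We show that imposing elementwise $s$-sparsity on the cross-covariance $C_{21}$ can reduce the required communication and sample complexity. We construct a matching scheme, based on covering-net quantization and entry-wise hard thresholding, that attains the $s$-sparse lower bound up to polylogarithmic factors.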
Reference score (independently computed attention score): 10.026496861838448
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study the minimax estimation error for distributed covariance matrix estimation in the vertical-split (feature-split) setting, where two agents each observe different coordinates of $m$ i.i.d. sub-Gaussian samples and communicate a limited number of bits to a central server. While Rahmani et al. [2025] established nearly tight bounds for dense (unstructured) cross-covariance matrices, we investigate whether imposing elementwise $s$-sparsity on the cross-covariance $C_{21}$ can reduce the required communication and sample complexity. In contrast to the horizontal-split setting, where Braverman et al. [2016] showed that sparsity does not reduce communication cost for mean estimation, we prove that sparsity does help for cross-covariance estimation in the vertical split. Specifically, we establish minimax lower bounds showing that the communication budget per agent scales as $B_k = Ω(σ^4 d_k\, s' \log(d_1 d_2/s')/\varepsilon^2)$ and the sample complexity for cross-covariance estimation as $m = Ω(σ^4\, s' \log(d_1 d_2/s')/\varepsilon^2)$, where $s' = s \wedge d_{\min}$. For the $1$-sparse case, this yields an exponential improvement from $d_1 d_2$ to $\log(d_1 d_2)$ compared to the dense rate. Our lower bounds are established via Fano's method with an explicit sparse packing using a Varshamov--Gilbert-type argument for signed partial permutation matrices combined with the Conditional Strong Data Processing Inequality of Rahmani et al. [2025]. We show the bounds are tight with a matching achievable scheme, based on covering-net quantization and entry-wise hard thresholding, that attains the $s$-sparse lower bound up to polylogarithmic factors.
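Abstract (reference translation): We study the minimax estimation error of distributed covariance matrix estimation in the vertical-split (feature-split) setting, in which two agents each observe different coordinates of $m$ i.i.d. sub-Gaussian samples and send a limited number of bits to a central server. Rahmani et al. [2025] established nearly tight bounds for dense (unstructured) cross-covariance matrices; we investigate whether imposing elementwise $s$-sparsity on the cross-covariance $C_{21}$ can reduce the required communication and sample complexity. In contrast to the horizontal-split setting, where Braverman et al. [2016] showed that sparsity does not reduce the communication cost of mean estimation, we prove that sparsity does help cross-covariance estimation in the vertical split. Specifically, we establish minimax lower bounds showing that the per-agent communication budget scales as $B_k = Ω(σ^4 d_k\, s' \log(d_1 d_2/s')/\varepsilon^2)$ and the sample complexity of cross-covariance estimation as $m = Ω(σ^4\, s' \log(d_1 d_2/s')/\varepsilon^2)$, where $s' = s \wedge d_{\min}$. In the $1$-sparse case, this gives an exponential improvement from $d_1 d_2$ to $\log(d_1 d_2)$ over the dense rate. The lower bounds are established via Fano's method, using an explicit sparse packing built from a Varshamov-Gilbert-type argument for signed partial permutation matrices, combined with the Conditional Strong Data Processing Inequality of Rahmani et al. [2025]. We show that the bounds are tight via a matching achievable scheme, based on covering-net quantization and entry-wise hard thresholding, that attains the $s$-sparse lower bound up to polylogarithmic factors.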
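
Illustration (not from the paper): the abstract names entry-wise hard thresholding as one ingredient of the achievable scheme. The sketch below shows only that ingredient, in a centralized setting with no quantization or communication constraint. The function name `hard_threshold_cross_cov`, the threshold `tau`, the dimensions, and the synthetic data are all assumptions chosen for illustration; the paper's actual scheme and threshold are not given on this page.

```python
import random


def hard_threshold_cross_cov(x1, x2, tau):
    """Entry-wise hard thresholding of the empirical cross-covariance.

    x1: list of m samples, each a list of d1 coordinates (agent 1's view).
    x2: list of m samples, each a list of d2 coordinates (agent 2's view).
    tau: threshold; a hypothetical tuning parameter, not taken from the paper.
    Returns a d2 x d1 estimate of C_21 = E[x2 x1^T], assuming zero-mean data.
    """
    m = len(x1)
    d1, d2 = len(x1[0]), len(x2[0])
    est = [[0.0] * d1 for _ in range(d2)]
    for i in range(d2):
        for j in range(d1):
            # Empirical cross-covariance entry (i, j).
            c = sum(x2[t][i] * x1[t][j] for t in range(m)) / m
            # Keep the entry only if its magnitude reaches the threshold.
            est[i][j] = c if abs(c) >= tau else 0.0
    return est


# Synthetic 1-sparse example: only C_21[1][2] = 0.6 is nonzero.
rng = random.Random(0)
m, d1, d2 = 5000, 4, 3
x1 = [[rng.gauss(0, 1) for _ in range(d1)] for _ in range(m)]
x2 = [[rng.gauss(0, 1) for _ in range(d2)] for _ in range(m)]
for t in range(m):
    # Correlate coordinate 1 of agent 2 with coordinate 2 of agent 1,
    # keeping unit variance (0.6^2 + 0.8^2 = 1).
    x2[t][1] = 0.6 * x1[t][2] + 0.8 * x2[t][1]

est = hard_threshold_cross_cov(x1, x2, tau=0.15)
support = [(i, j) for i in range(d2) for j in range(d1) if est[i][j] != 0.0]
print(support, est[1][2])
```

Here the threshold sits well above the sampling noise of the null entries (on the order of $1/\sqrt{m}$) and well below the true nonzero entry, so thresholding is expected to recover the single nonzero position.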

Related papers