Paper overview: Spectral properties of sample covariance matrices arising from random
matrices with independent non identically distributed columns
- arxiv url: http://arxiv.org/abs/2109.02644v1
- Date: Mon, 6 Sep 2021 14:21:43 GMT
- Title: Spectral properties of sample covariance matrices arising from random
matrices with independent non identically distributed columns
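- Title (reference translation): Spectral properties of sample covariance matrices arising from random matrices with independent, non-identically distributed columns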
- Authors: Cosme Louart and Romain Couillet
- Abstract summary: The functionals $\text{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T - zI_p)^{-1}$ and $A\in \mathcal M_{p}$ deterministic, have a standard deviation of order $O(\|A\|_* / \sqrt n)$.
Here, we show that $\|\mathbb E[R(z)] - \tilde R(z)\|_F \leq O(1/\sqrt n)$.
- Reference score (independently computed attention score): 50.053491972003656
- Abstract: Given a random matrix $X= (x_1,\ldots, x_n)\in \mathcal M_{p,n}$ with
independent columns and satisfying concentration of measure hypotheses and a
parameter $z$ whose distance to the spectrum of $\frac{1}{n} XX^T$ should not
depend on $p,n$, it was previously shown that the functionals
$\text{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T- zI_p)^{-1}$ and $A\in
\mathcal M_{p}$ deterministic, have a standard deviation of order $O(\|A\|_* /
\sqrt n)$. Here, we show that $\|\mathbb E[R(z)] - \tilde R(z)\|_F \leq
O(1/\sqrt n)$, where $\tilde R(z)$ is a deterministic matrix depending only on
$z$ and on the means and covariances of the column vectors $x_1,\ldots, x_n$
(that do not have to be identically distributed). This estimation is key to
providing accurate fluctuation rates of functionals of $X$ of interest (mostly
related to its spectral properties) and is proved thanks to the introduction of
a semi-metric $d_s$ defined on the set $\mathcal D_n(\mathbb H)$ of diagonal
matrices with complex entries and positive imaginary part and satisfying, for
all $D,D' \in \mathcal D_n(\mathbb H)$: $d_s(D,D') = \max_{i\in[n]} |D_i -
D_i'|/ (\Im(D_i) \Im(D_i'))^{1/2}$. Possibly most importantly, the underlying
concentration of measure assumption on the columns of $X$ finds an extremely
natural ground for application in modern statistical machine learning
algorithms where non-linear Lipschitz mappings and high number of classes form
the base ingredients.
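- Abstract (reference translation): Given a random matrix $X= (x_1,\ldots, x_n)\in \mathcal M_{p,n}$ with independent columns satisfying concentration of measure hypotheses, and a parameter $z$ whose distance to the spectrum of $\frac{1}{n} XX^T$ does not depend on $p,n$, it was shown that the functionals $\text{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T- zI_p)^{-1}$ and $A\in \mathcal M_{p}$, have a standard deviation of order $O(\|A\|_* / \sqrt n)$.
Here, we show that $\|\mathbb E[R(z)] - \tilde R(z)\|_F \leq O(1/\sqrt n)$, where $\tilde R(z)$ is a deterministic matrix depending only on $z$ and on the means and covariances of the column vectors $x_1,\ldots, x_n$.
This estimate is key to providing accurate fluctuation rates of functionals of $X$ (mostly related to its spectral properties), and it is proved by introducing a semi-metric $d_s$ defined on the set $\mathcal D_n(\mathbb H)$ of diagonal matrices with complex entries and positive imaginary part, satisfying for all $D,D' \in \mathcal D_n(\mathbb H)$: $d_s(D,D') = \max_{i\in[n]} |D_i - D_i'|/ (\Im(D_i) \Im(D_i'))^{1/2}$.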
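
The objects named in the abstract can be explored numerically. The sketch below is illustrative only and is not the authors' code: the function names (`resolvent`, `d_s`, `trace_functional_std`) are hypothetical, and the data model (Gaussian columns, where column $x_i$ has covariance `scales[i]**2` times the identity) is an assumption chosen as one simple case of independent, non identically distributed columns. It computes $R(z)$, evaluates the semi-metric $d_s$ on diagonals given as vectors, and estimates the standard deviation of $\text{tr}(AR(z))$ by Monte Carlo for $A = I_p/p$, whose nuclear norm is 1. The deterministic equivalent $\tilde R(z)$ is not computed, since the abstract does not give its definition.

```python
import numpy as np


def resolvent(X, z):
    """Return R(z) = (X X^T / n - z I_p)^{-1} for a p x n data matrix X."""
    p, n = X.shape
    return np.linalg.inv(X @ X.T / n - z * np.eye(p))


def d_s(D, D_prime):
    """Semi-metric d_s(D, D') = max_i |D_i - D_i'| / sqrt(Im(D_i) Im(D_i')).

    D and D_prime are the diagonals (1-D arrays) of two diagonal matrices
    whose entries all have positive imaginary part.
    """
    D = np.asarray(D, dtype=complex)
    D_prime = np.asarray(D_prime, dtype=complex)
    if np.any(D.imag <= 0) or np.any(D_prime.imag <= 0):
        raise ValueError("all diagonal entries must have positive imaginary part")
    return float(np.max(np.abs(D - D_prime) / np.sqrt(D.imag * D_prime.imag)))


def trace_functional_std(p, n, z, scales, trials, rng):
    """Monte Carlo estimate of the standard deviation of tr(A R(z)), A = I_p / p.

    Assumption (not from the paper): columns are independent Gaussian vectors,
    column i having covariance scales[i]**2 * I_p.
    """
    A = np.eye(p) / p
    values = []
    for _ in range(trials):
        # Broadcasting multiplies column i of the Gaussian matrix by scales[i].
        X = rng.standard_normal((p, n)) * scales
        values.append(np.trace(A @ resolvent(X, z)))
    return float(np.std(values))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # z = -1 stays at distance at least 1 from the spectrum of X X^T / n,
    # which lies in [0, infinity), whatever p and n are.
    for n in (50, 200, 800):
        p = n // 2
        scales = np.linspace(0.5, 1.5, n)
        print(n, trace_functional_std(p, n, -1.0, scales, trials=30, rng=rng))
    print(d_s([1j, 2 + 1j], [2j, 2 + 3j]))
```

Running the script prints the estimated standard deviation for a few values of $n$ with $p/n$ held fixed, which can be compared against the $O(\|A\|_* / \sqrt n)$ order stated in the abstract. Since that statement is an upper bound, the Gaussian case used here may well concentrate faster.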
