Fugu-MT Paper Translation (Abstract): Oja's Algorithm for Streaming Sparse PCA

Paper summary: Oja's Algorithm for Streaming Sparse PCA

arxiv url: http://arxiv.org/abs/2402.07240v5
Date: Sun, 10 Nov 2024 04:33:34 GMT
Status: Translation complete
Last updated in system: 2024-11-28 17:07:30.724783
Title: Oja's Algorithm for Streaming Sparse PCA
Title (reference translation): Oja's Streaming Algorithm for Sparse PCA
Authors: Syamantak Kumar, Purnamrita Sarkar
Abstract summary: Oja's algorithm for streaming Principal Component Analysis (PCA) on $n$ data points in a $d$-dimensional space achieves the same sin-squared error $O(r_{\mathsf{eff}}/n)$ as the offline algorithm in $O(d)$ space and $O(nd)$ time. We show that a simple single-pass procedure that thresholds the output of Oja's algorithm can achieve the minimax error bound under some regularity conditions in $O(d)$ space and $O(nd)$ time.
Reference score (independently computed attention score): 7.059472280274011
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Oja's algorithm for Streaming Principal Component Analysis (PCA) for $n$ data-points in a $d$ dimensional space achieves the same sin-squared error $O(r_{\mathsf{eff}}/n)$ as the offline algorithm in $O(d)$ space and $O(nd)$ time and a single pass through the datapoints. Here $r_{\mathsf{eff}}$ is the effective rank (ratio of the trace and the principal eigenvalue of the population covariance matrix $\Sigma$). Under this computational budget, we consider the problem of sparse PCA, where the principal eigenvector of $\Sigma$ is $s$-sparse, and $r_{\mathsf{eff}}$ can be large. In this setting, to our knowledge, \textit{there are no known single-pass algorithms} that achieve the minimax error bound in $O(d)$ space and $O(nd)$ time without either requiring strong initialization conditions or assuming further structure (e.g., spiked) of the covariance matrix. We show that a simple single-pass procedure that thresholds the output of Oja's algorithm (the Oja vector) can achieve the minimax error bound under some regularity conditions in $O(d)$ space and $O(nd)$ time. We present a nontrivial and novel analysis of the entries of the unnormalized Oja vector, which involves the projection of a product of independent random matrices on a random initial vector. This is completely different from previous analyses of Oja's algorithm and matrix products, which have been done when the $r_{\mathsf{eff}}$ is bounded.
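Abstract (reference translation): Oja's algorithm for streaming Principal Component Analysis (PCA) on $n$ data points in a $d$-dimensional space achieves the same sin-squared error $O(r_{\mathsf{eff}}/n)$ as the offline algorithm, using $O(d)$ space, $O(nd)$ time, and a single pass over the data points. Here $r_{\mathsf{eff}}$ is the effective rank (the ratio of the trace to the principal eigenvalue of the population covariance matrix $\Sigma$). Under this computational budget, we consider the sparse PCA problem, in which the principal eigenvector of $\Sigma$ is $s$-sparse and $r_{\mathsf{eff}}$ can be large. In this setting, to our knowledge, there is no known single-pass algorithm that achieves the minimax error bound in $O(d)$ space and $O(nd)$ time without either requiring strong initialization conditions or assuming further structure (e.g., spiked) of the covariance matrix. We show that a simple single-pass procedure that thresholds the output of Oja's algorithm (the Oja vector) can achieve the minimax error bound under some regularity conditions in $O(d)$ space and $O(nd)$ time. We give a nontrivial and novel analysis of the entries of the unnormalized Oja vector, which involves the projection of a product of independent random matrices onto a random initial vector. This is completely different from previous analyses of Oja's algorithm and matrix products, which were carried out for the case where $r_{\mathsf{eff}}$ is bounded.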
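
Illustrative sketch (not from the paper): the abstract describes running Oja's algorithm in a single pass and then thresholding the resulting Oja vector. The Python code below is a minimal sketch of that idea under several assumptions. The abstract does not state the thresholding rule, so keeping the $s$ largest-magnitude entries is used here as a stand-in. The synthetic data uses a spiked covariance $\Sigma = I + \theta v v^\top$ purely for convenience, although the paper does not require that structure. The step size `eta`, the function names, and all parameter values are hypothetical choices for the demonstration. The sin-squared error is computed as $1 - (\hat{v}^\top v)^2$ for unit vectors.

```python
import numpy as np


def oja_vector(X, eta, rng):
    """One pass of Oja's update over the rows of X, keeping only O(d) state."""
    d = X.shape[1]
    w = rng.standard_normal(d)          # random initial vector
    w /= np.linalg.norm(w)
    for x in X:
        w += eta * x * (x @ w)          # w <- (I + eta * x x^T) w
        w /= np.linalg.norm(w)          # renormalise for numerical stability
    return w


def threshold_top_s(w, s):
    """Assumed rule: keep the s largest-magnitude entries, zero the rest."""
    keep = np.argpartition(np.abs(w), -s)[-s:]
    out = np.zeros_like(w)
    out[keep] = w[keep]
    return out / np.linalg.norm(out)


def sin_squared_error(v_hat, v):
    """sin^2 of the angle between two unit vectors."""
    return 1.0 - float(v_hat @ v) ** 2


# Synthetic demo (assumed setup): s-sparse principal eigenvector v,
# covariance I + theta * v v^T. X is materialised only for convenience;
# the update itself touches one row at a time.
rng = np.random.default_rng(0)
n, d, s, theta = 20_000, 200, 5, 4.0
v = np.zeros(d)
v[:s] = 1.0 / np.sqrt(s)
X = rng.standard_normal((n, d)) + np.sqrt(theta) * rng.standard_normal((n, 1)) * v

eta = np.log(n) / (n * theta)           # illustrative step size, not the paper's
w_oja = oja_vector(X, eta, rng)
v_sparse = threshold_top_s(w_oja, s)

err_oja = sin_squared_error(w_oja, v)
err_sparse = sin_squared_error(v_sparse, v)
print(f"sin^2 error, Oja vector:       {err_oja:.4f}")
print(f"sin^2 error, thresholded (s={s}): {err_sparse:.4f}")
```

In this toy setup the thresholded vector discards the noise accumulated on the coordinates outside the support, which is the intuition behind the improvement the abstract claims when $r_{\mathsf{eff}}$ is large. The paper's actual procedure and regularity conditions may differ from this sketch.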
