Sparse PCA with Oracle Property
- URL: http://arxiv.org/abs/2312.16793v1
- Date: Thu, 28 Dec 2023 02:52:54 GMT
- Title: Sparse PCA with Oracle Property
- Authors: Quanquan Gu and Zhaoran Wang and Han Liu
- Abstract summary: We propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations.
We prove that another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA.
- Score: 115.72363972222622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the estimation of the $k$-dimensional sparse
principal subspace of covariance matrix $\Sigma$ in the high-dimensional
setting. We aim to recover the oracle principal subspace solution, i.e., the
principal subspace estimator obtained assuming the true support is known a
priori. To this end, we propose a family of estimators based on the
semidefinite relaxation of sparse PCA with novel regularizations. In
particular, under a weak assumption on the magnitude of the population
projection matrix, one estimator within this family exactly recovers the true
support with high probability, has exact rank-$k$, and attains a $\sqrt{s/n}$
statistical rate of convergence with $s$ being the subspace sparsity level and
$n$ the sample size. Compared to existing support recovery results for sparse
PCA, our approach does not hinge on the spiked covariance model or the limited
correlation condition. As a complement to the first estimator that enjoys the
oracle property, we prove that another estimator within the family achieves a
sharper statistical rate of convergence than the standard semidefinite
relaxation of sparse PCA, even when the previous assumption on the magnitude of
the projection matrix is violated. We validate the theoretical results by
numerical experiments on synthetic datasets.
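The oracle estimator described in the abstract, i.e., the principal subspace estimator computed as if the true support were known, can be illustrated numerically. The following is a minimal sketch under a toy sparse model of our own (all dimensions, variable names, and the data-generating model are ours, not the paper's); it compares the estimated projection matrix to the truth, whose error the paper bounds at the $\sqrt{s/n}$ rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (ours, not the paper's): d-dimensional data whose
# k leading principal directions are supported on the first s coordinates.
d, s, k, n = 50, 5, 2, 2000
U = np.zeros((d, k))
U[:s, :] = np.linalg.qr(rng.standard_normal((s, k)))[0]    # sparse loadings
Sigma = np.eye(d) + U @ np.diag([4.0, 2.0]) @ U.T          # population covariance

X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
S_hat = X.T @ X / n                                        # sample covariance

def oracle_subspace(S, support, k):
    """Oracle principal subspace estimator: eigendecompose the sample
    covariance restricted to the true support, then embed back into R^d."""
    sub = S[np.ix_(support, support)]
    w, V = np.linalg.eigh(sub)
    top = V[:, np.argsort(w)[::-1][:k]]
    U_hat = np.zeros((S.shape[0], k))
    U_hat[support, :] = top
    return U_hat

support = np.arange(s)                  # assumed known a priori (the oracle)
U_hat = oracle_subspace(S_hat, support, k)

# Projection-matrix estimation error in Frobenius norm.
err = np.linalg.norm(U_hat @ U_hat.T - U @ U.T, "fro")
print(f"projection error: {err:.3f}")
```

With $s = 5$ and $n = 2000$, the error is on the order of $\sqrt{s/n} = 0.05$ up to constants, and the estimator is exactly supported on the true coordinates by construction.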
Related papers
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
- Black-Box $k$-to-$1$-PCA Reductions: Theory and Applications [19.714951004096996]
We analyze black-box deflation methods as a framework for designing $k$-PCA algorithms.
Our main contribution is significantly sharper bounds on the approximation parameter degradation of deflation methods for $k$-PCA.
We apply our framework to obtain state-of-the-art $k$-PCA algorithms robust to dataset contamination.
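The deflation framework analyzed in this paper can be sketched in a few lines: call a black-box 1-PCA routine, project out the direction it returns, and repeat $k$ times. Below, plain power iteration stands in for the 1-PCA oracle; the toy matrix and all names are ours, not the paper's.

```python
import numpy as np

def top_eigvec(A, iters=200):
    """1-PCA 'oracle' via power iteration (placeholder for any
    black-box top-eigenvector routine)."""
    v = np.ones(A.shape[0]) / np.sqrt(A.shape[0])
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)
    return v

def k_pca_by_deflation(A, k):
    """k-to-1-PCA reduction: call the 1-PCA oracle, deflate, repeat."""
    A = A.copy()
    vecs = []
    for _ in range(k):
        v = top_eigvec(A)
        vecs.append(v)
        A -= np.outer(v, v) * (v @ A @ v)   # remove the found direction
    return np.column_stack(vecs)

rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8))
A = M @ M.T                       # symmetric PSD test matrix
V = k_pca_by_deflation(A, k=3)
print(np.round(V.T @ V, 3))       # approximately the 3x3 identity
```

The paper's contribution is precisely to quantify how errors made by the inner 1-PCA oracle accumulate across these deflation steps.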
arXiv Detail & Related papers (2024-03-06T18:07:20Z)
- Entrywise Recovery Guarantees for Sparse PCA via Sparsistent Algorithms [9.112172220055431]
We provide entrywise $\ell_{2,\infty}$ bounds for Sparse PCA under a general high-dimensional subgaussian design.
Our results hold for any algorithm that selects the correct support with high probability, i.e., any sparsistent algorithm.
arXiv Detail & Related papers (2022-02-08T18:50:35Z)
- Support Recovery with Stochastic Gates: Theory and Application for Linear Models [9.644417971611908]
We analyze the problem of simultaneous support recovery and estimation of the coefficient vector ($\beta^*$) in a linear model with independent and identically distributed Normal errors.
We show that, under suitable conditions on the design, the dimension, and the sparsity of $\beta^*$, the STG-based estimator converges to the true data-generating coefficient vector and also detects its support set with high probability.
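Support recovery in this linear-model setting can be illustrated with a simple stand-in: an $\ell_1$-penalized least-squares fit solved by ISTA, followed by thresholding. This is our own sketch under an i.i.d. Gaussian design, not the STG estimator itself; all names and constants are ours.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical instance (ours): i.i.d. Gaussian design, sparse beta*,
# i.i.d. Normal errors; recover the support via l1-penalized least squares.
n, d, s = 200, 50, 4
beta_star = np.zeros(d)
beta_star[:s] = [3.0, -2.0, 1.5, 2.5]
X = rng.standard_normal((n, d))
y = X @ beta_star + 0.5 * rng.standard_normal(n)

lam = 0.1                            # roughly sigma * sqrt(2 log d / n)
L = np.linalg.norm(X, 2) ** 2 / n    # Lipschitz constant of the gradient
beta = np.zeros(d)
for _ in range(2000):
    grad = X.T @ (X @ beta - y) / n  # gradient of (1/2n)||y - X beta||^2
    b = beta - grad / L
    beta = np.sign(b) * np.maximum(np.abs(b) - lam / L, 0.0)  # soft threshold

support_hat = np.flatnonzero(np.abs(beta) > 0.1)
print("estimated support:", support_hat)
```

Because the smallest nonzero coefficient (1.5) is well above the noise level, the thresholded solution recovers the true support with high probability in this regime.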
arXiv Detail & Related papers (2021-10-29T17:59:43Z)
- Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (MRP).
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
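The estimation target here, the value function of a discounted Markov reward process, can be illustrated with plain tabular TD(0) on a toy chain; this is our own stand-in for exposition, not the paper's kernel-based method, and the 3-state MRP below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical 3-state Markov reward process (ours, not the paper's).
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.9

# Exact value function of the discounted MRP: V = (I - gamma P)^{-1} r.
V_true = np.linalg.solve(np.eye(3) - gamma * P, r)

# TD(0): follow the chain and take stochastic fixed-point steps.
V = np.zeros(3)
s = 0
for t in range(100_000):
    s_next = rng.choice(3, p=P[s])
    alpha = 0.5 / (1.0 + t / 1000.0)     # decaying step size
    V[s] += alpha * (r[s] + gamma * V[s_next] - V[s])
    s = s_next

print(np.round(V, 2), np.round(V_true, 2))
```

Kernel-based TD methods generalize this update to continuous state spaces by representing $V$ in an RKHS, which is where the spectrum of the kernel operator enters the error bounds discussed above.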
arXiv Detail & Related papers (2021-09-24T14:48:20Z)
- SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z)
- $\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator [95.71091446753414]
We propose to use a nearest-neighbor-based $\gamma$-divergence estimator as a data discrepancy measure.
Our method achieves significantly higher robustness than existing discrepancy measures.
arXiv Detail & Related papers (2020-06-13T06:09:27Z)
- Low-Rank Matrix Estimation From Rank-One Projections by Unlifted Convex Optimization [9.492903649862761]
We study an estimator with a convex formulation for the recovery of low-rank matrices from rank-one projections.
We show that under both models the estimator succeeds, with high probability, if the number of measurements exceeds $r^2(d_1+d_2)$, up to logarithmic factors.
arXiv Detail & Related papers (2020-04-06T14:57:54Z)
- Distributed Estimation for Principal Component Analysis: an Enlarged Eigenspace Analysis [45.829683377074524]
This paper studies distributed estimation for a fundamental statistical machine learning problem: principal component analysis (PCA).
We propose a novel multi-round algorithm for constructing top-$L$-dim eigenspace for distributed data.
Our algorithm takes advantage of shift-and-invert preconditioning and convex optimization.
arXiv Detail & Related papers (2020-04-05T22:28:08Z)
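The distributed PCA setting can be sketched with a simple one-round baseline: each machine sends its local sample covariance, the center averages them and eigendecomposes once. This is our own illustrative baseline, not the paper's multi-round shift-and-invert algorithm; all names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setting (ours): data split across m machines; each sends
# only its local d x d sample covariance. One-round averaging baseline,
# not the paper's multi-round shift-and-invert scheme.
d, L, m, n_local = 20, 3, 4, 1000
Sigma = np.diag(np.concatenate([[10.0, 8.0, 6.0], np.ones(d - 3)]))

local_covs = []
for _ in range(m):
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n_local)
    local_covs.append(X.T @ X / n_local)

S_avg = np.mean(local_covs, axis=0)       # one communication round
w, V = np.linalg.eigh(S_avg)
top_L = V[:, np.argsort(w)[::-1][:L]]     # estimated top-L-dim eigenspace
print(top_L.shape)
```

The paper's multi-round algorithm improves on such one-shot averaging precisely when local sample sizes are small relative to the eigengap, which is what its enlarged-eigenspace analysis quantifies.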
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.