Deflated HeteroPCA: Overcoming the curse of ill-conditioning in heteroskedastic PCA
- URL: http://arxiv.org/abs/2303.06198v2
- Date: Sun, 13 Oct 2024 02:58:36 GMT
- Title: Deflated HeteroPCA: Overcoming the curse of ill-conditioning in heteroskedastic PCA
- Authors: Yuchen Zhou, Yuxin Chen,
- Abstract summary: This paper is concerned with estimating the column subspace of a low-rank matrix $boldsymbolXstar in mathbbRn_1times n$ from contaminated data.
How to obtain optimal statistical accuracy while accommodating the widest range of signal-to-noise ratios (SNRs) becomes challenging.
- Score: 17.75853665586128
- License:
- Abstract: This paper is concerned with estimating the column subspace of a low-rank matrix $\boldsymbol{X}^\star \in \mathbb{R}^{n_1\times n_2}$ from contaminated data. How to obtain optimal statistical accuracy while accommodating the widest range of signal-to-noise ratios (SNRs) becomes particularly challenging in the presence of heteroskedastic noise and unbalanced dimensionality (i.e., $n_2\gg n_1$). While the state-of-the-art algorithm $\textsf{HeteroPCA}$ emerges as a powerful solution for solving this problem, it suffers from "the curse of ill-conditioning," namely, its performance degrades as the condition number of $\boldsymbol{X}^\star$ grows. In order to overcome this critical issue without compromising the range of allowable SNRs, we propose a novel algorithm, called $\textsf{Deflated-HeteroPCA}$, that achieves near-optimal and condition-number-free theoretical guarantees in terms of both $\ell_2$ and $\ell_{2,\infty}$ statistical accuracy. The proposed algorithm divides the spectrum of $\boldsymbol{X}^\star$ into well-conditioned and mutually well-separated subblocks, and applies $\textsf{HeteroPCA}$ to conquer each subblock successively. Further, an application of our algorithm and theory to two canonical examples -- the factor model and tensor PCA -- leads to remarkable improvement for each application.
Related papers
- Fair Submodular Cover [18.37610521373708]
We present the study of Fair Submodular Cover (FSC), where given a ground set $U$, a monotone submodular function $f:2UtomathbbR_ge 0$, a threshold $tau$.
We first introduce discrete algorithms for FSC that achieve a bicriteria approximation ratio of $(frac1epsilon, 1-O(epsilon))$.
We then present a continuous algorithm that achieves a $(frac1epsilon, 1-O(epsilon))$-
arXiv Detail & Related papers (2024-07-05T18:37:09Z) - Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems [61.85150061213987]
We study the generalized low-rank matrix bandit problem, proposed in citelu2021low under the Generalized Linear Model (GLM) framework.
To overcome the computational infeasibility and theoretical restrain of existing algorithms, we first propose the G-ESTT framework.
We show that G-ESTT can achieve the $tildeO(sqrt(d_1+d_2)3/2Mr3/2T)$ bound of regret while G-ESTS can achineve the $tildeO
arXiv Detail & Related papers (2024-01-14T14:14:19Z) - A Combinatorial Approach to Robust PCA [18.740048806623037]
We study the problem of recovering Gaussian data under adversarial corruptions.
We assume that the Gaussian noises lie in an unknown $k$-dimensional subspace $U subseteq mathbbRd$, and $s$ randomly chosen coordinates of each data point fall into the control of an adversary.
Our main result is an efficient algorithm that, when $ks2 = O(d)$, recovers every single data point up to a nearly-optimal error of $tilde O(ks/d)$ in expectation.
arXiv Detail & Related papers (2023-11-28T01:49:51Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $Omega(d1/2/(maxp, epsilon)2)$.
Our lower bound suggests that this quadratic dependence on $1/epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best identification in discounted linear Markov+Delta Decision in the fixed confidence setting under a generative model.
The lower bound as the solution of an intricate non- optimization program can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - DP-PCA: Statistically Optimal and Differentially Private PCA [44.22319983246645]
DP-PCA is a single-pass algorithm that overcomes both limitations.
For sub-Gaussian data, we provide nearly optimal statistical error rates even for $n=tilde O(d)$.
arXiv Detail & Related papers (2022-05-27T02:02:17Z) - Sharper Utility Bounds for Differentially Private Models [20.246768861331276]
First $mathcalObig(fracsqrtpnepsilonbig)$ high probability excess population risk bound for differentially private algorithms.
New proposed algorithm m-NGP improves the performance of the differentially private model over real datasets.
arXiv Detail & Related papers (2022-04-22T07:03:13Z) - The Complexity of Sparse Tensor PCA [1.90365714903665]
For any $1 leq leq k$, our algorithms recover the sparse vector for signal-to-noise ratio $lambda geq tildemathcalO (sqrtt cdot (k/t)p/2)$ in time.
Even in the restricted case of PCA, known algorithms only recover the sparse vectors for $lambda geq tildemathcalO(k cdot r) while our algorithms require $lambda ge
arXiv Detail & Related papers (2021-06-11T10:57:00Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Sharp Statistical Guarantees for Adversarially Robust Gaussian
Classification [54.22421582955454]
We provide the first result of the optimal minimax guarantees for the excess risk for adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable
Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.