Leveraged Matrix Completion with Noise
- URL: http://arxiv.org/abs/2011.05885v2
- Date: Mon, 14 Aug 2023 10:15:17 GMT
- Title: Leveraged Matrix Completion with Noise
- Authors: Xinjian Huang and Weiwei Liu and Bo Du and Dacheng Tao
- Abstract summary: We show that we can provably recover an unknown $n \times n$ matrix of rank $r$ from just about $\mathcal{O}(nr\log^2(n))$ entries.
Our proofs are supported by a novel approach that phrases sufficient optimality conditions based on the Golfing Scheme.
- Score: 84.20092979053119
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Completing low-rank matrices from subsampled measurements has received much
attention in the past decade. Existing works indicate that
$\mathcal{O}(nr\log^2(n))$ observed entries are required to guarantee the
completion of an $n \times n$ noisy matrix of rank $r$ with high probability,
under two quite restrictive assumptions: (1) the underlying matrix must be
incoherent; (2) observations are sampled uniformly at random. This
restrictiveness stems in part from ignoring the leverage score and the oracle
information of each element. In this paper, we employ leverage scores to
characterize the importance of each element and significantly relax the
assumptions to: (1) no structural assumptions beyond low rank are imposed on
the underlying matrix; (2) the probability that an element is observed depends
on its importance via its leverage score. Under these assumptions, instead of
uniform sampling, we devise a non-uniform/biased sampling procedure that
reveals the ``importance'' of each observed element.
Our proofs are supported by a novel approach that phrases sufficient optimality
conditions based on the Golfing Scheme, which may be of independent interest to
the wider community. Our theoretical findings show that we can provably recover
an unknown $n \times n$ matrix of rank $r$ from just about
$\mathcal{O}(nr\log^2(n))$ entries, even when the observed entries are
corrupted with a small amount of noise. The empirical results align closely
with our theory.
Related papers
- Optimal level set estimation for non-parametric tournament and crowdsourcing problems [49.75262185577198]
Motivated by crowdsourcing, we consider a problem where we partially observe the correctness of the answers of $n$ experts on $d$ questions.
In this paper, we assume that the matrix $M$ containing the probability that expert $i$ answers question $j$ correctly is bi-isotonic up to a permutation of its rows and columns.
We construct an efficient algorithm that turns out to be minimax optimal for this classification problem.
arXiv Detail & Related papers (2024-08-27T18:28:31Z)
- Entrywise error bounds for low-rank approximations of kernel matrices [55.524284152242096]
We derive entrywise error bounds for low-rank approximations of kernel matrices obtained using the truncated eigen-decomposition.
A key technical innovation is a delocalisation result for the eigenvectors of the kernel matrix corresponding to small eigenvalues.
We validate our theory with an empirical study of a collection of synthetic and real-world datasets.
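For context, a generic sketch (my illustration, not the paper's procedure) of the truncated eigendecomposition and a direct check of the entrywise error:

```python
import numpy as np

def truncated_eig_approx(K, k):
    """Rank-k approximation of a symmetric PSD kernel matrix via top-k eigenpairs."""
    w, V = np.linalg.eigh(K)        # eigenvalues in ascending order
    Vk, wk = V[:, -k:], w[-k:]      # keep the top-k eigenpairs
    return (Vk * wk) @ Vk.T

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq)               # Gaussian kernel matrix
K20 = truncated_eig_approx(K, 20)
print(np.abs(K - K20).max())        # entrywise (max-norm) error
```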
arXiv Detail & Related papers (2024-05-23T12:26:25Z)
- One-sided Matrix Completion from Two Observations Per Row [95.87811229292056]
We propose a natural algorithm that involves imputing the missing values of the matrix $X^T X$.
We evaluate our algorithm on one-sided recovery of synthetic data and low-coverage genome sequencing.
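A minimal sketch of the Gram-matrix idea (my paraphrase of the general approach, not the authors' exact algorithm; the rescaled co-observation estimator below is an assumption for illustration): entries of $X^T X$ with at least one row observing both coordinates can be estimated directly, and the remaining entries are the ones to impute.

```python
import numpy as np

def estimate_gram(X_obs, mask):
    """Entrywise estimate of X^T X from partial observations.

    X_obs: (n, d) array with unobserved entries set to 0
    mask:  (n, d) boolean array of observed positions
    """
    n, d = X_obs.shape
    counts = mask.T.astype(float) @ mask.astype(float)  # rows co-observing (j, j')
    sums = X_obs.T @ X_obs        # sums of X_ij * X_ij' over co-observed rows
    seen = counts > 0
    G = np.full((d, d), np.nan)   # NaN marks entries left for imputation
    G[seen] = n * sums[seen] / counts[seen]
    return G, seen
```

The NaN entries of $G$ would then be filled by a low-rank completion step, which is where the paper's analysis lives.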
arXiv Detail & Related papers (2023-06-06T22:35:16Z)
- Robust Matrix Completion with Heavy-tailed Noise [0.5837881923712392]
This paper studies low-rank matrix completion in the presence of heavy-tailed, possibly asymmetric noise.
We adopt an adaptive Huber loss to accommodate heavy-tailed noise, which is robust against large and possibly asymmetric errors.
We prove that under merely a second moment condition on the error, the Euclidean error falls geometrically fast until achieving a minimax-optimal statistical estimation error.
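For reference, the Huber loss used as the data-fitting term is quadratic for small residuals and linear for large ones (a standard definition; the paper's contribution is choosing the threshold $\tau$ adaptively, which this sketch leaves as a fixed parameter):

```python
import numpy as np

def huber_loss(residual, tau):
    """Quadratic for |r| <= tau, linear beyond: robust to heavy-tailed errors."""
    r = np.abs(residual)
    return np.where(r <= tau, 0.5 * r ** 2, tau * (r - 0.5 * tau))

def huber_grad(residual, tau):
    """Gradient of the Huber loss: the residual, clipped to [-tau, tau]."""
    return np.clip(residual, -tau, tau)
```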
arXiv Detail & Related papers (2022-06-09T04:48:48Z)
- Robust Linear Regression for General Feature Distribution [21.0709900887309]
We investigate robust linear regression where data may be contaminated by an oblivious adversary.
We do not necessarily assume that the features are centered.
If the features are centered, we can obtain a standard convergence rate.
arXiv Detail & Related papers (2022-02-04T11:22:13Z)
- Robust Linear Predictions: Analyses of Uniform Concentration, Fast Rates and Model Misspecification [16.0817847880416]
We offer a unified framework that includes a broad variety of linear prediction problems on a Hilbert space.
We show that for misspecification level $\epsilon$, these estimators achieve an error rate of $O\left(\max\left\{|\mathcal{O}|^{1/2} n^{-1/2}, |\mathcal{I}|^{1/2} n^{-1}\right\} + \epsilon\right)$, matching the best-known rates in the literature.
arXiv Detail & Related papers (2022-01-06T08:51:08Z)
- Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers [13.244654316770815]
We develop machinery to design efficiently computable and consistent estimators.
For sparse regression, we achieve consistency for optimal sample size $n \gtrsim (k\log d)/\alpha^2$.
In the context of PCA, we attain optimal error guarantees under broad spikiness assumptions on the parameter matrix.
arXiv Detail & Related papers (2021-11-04T15:59:44Z)
- Under-bagging Nearest Neighbors for Imbalanced Classification [63.026765294759876]
We propose an ensemble learning algorithm called \textit{under-bagging} $k$-NN for imbalanced classification problems.
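A rough sketch of the general under-bagging recipe with a $k$-NN base learner (my illustration of the idea, not the paper's exact procedure): each round down-samples every class to the minority-class size, fits $k$-NN, and the rounds vote.

```python
import numpy as np
from collections import Counter

def knn_predict(X_tr, y_tr, X_te, k):
    """Plain k-NN: majority vote over the k nearest training points."""
    d = ((X_te[:, None, :] - X_tr[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d, axis=1)[:, :k]
    return np.array([Counter(y_tr[row]).most_common(1)[0][0] for row in nn])

def under_bagging_knn(X, y, X_te, k=5, B=25, seed=0):
    """B rounds of class-balanced subsampling + k-NN, aggregated by vote."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()                      # minority-class size
    votes = []
    for _ in range(B):
        idx = np.concatenate([
            rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
            for c in classes
        ])
        votes.append(knn_predict(X[idx], y[idx], X_te, k))
    votes = np.stack(votes)                   # shape (B, n_test)
    return np.array([Counter(col).most_common(1)[0][0] for col in votes.T])
```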
arXiv Detail & Related papers (2021-09-01T14:10:38Z)
- Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification [54.22421582955454]
We provide the first optimal minimax guarantees for the excess risk of adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z)
- Tackling small eigen-gaps: Fine-grained eigenvector estimation and inference under heteroscedastic noise [28.637772416856194]
Two fundamental challenges arise in eigenvector estimation and inference for a low-rank matrix from noisy observations.
We propose estimation and uncertainty quantification procedures for an unknown eigenvector.
We establish optimal procedures to construct confidence intervals for the unknown eigenvalues.
arXiv Detail & Related papers (2020-01-14T04:26:10Z)