Leveraged Matrix Completion with Noise
- URL: http://arxiv.org/abs/2011.05885v2
- Date: Mon, 14 Aug 2023 10:15:17 GMT
- Title: Leveraged Matrix Completion with Noise
- Authors: Xinjian Huang and Weiwei Liu and Bo Du and Dacheng Tao
- Abstract summary: We show that we can provably recover an unknown $n \times n$ matrix of rank $r$ from just about $\mathcal{O}(nr\log^2(n))$ entries.
Our proofs are supported by a novel approach that phrases sufficient optimality conditions based on the Golfing Scheme.
- Score: 84.20092979053119
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Completing low-rank matrices from subsampled measurements has received much
attention in the past decade. Existing works indicate that
$\mathcal{O}(nr\log^2(n))$ observed entries are required to guarantee the
completion of an $n \times n$ noisy matrix of rank $r$ with high probability,
under two quite restrictive assumptions: (1) the underlying matrix must be
incoherent; (2) observations are sampled uniformly at random. This
restrictiveness stems in part from ignoring the leverage score and the oracle
information of each element. In this paper, we employ leverage scores to
characterize the importance of each element and significantly relax the
assumptions to: (1) no structural assumptions beyond low rank are imposed on
the underlying matrix; (2) the probability that an element is observed depends
on its importance via its leverage score. Under these assumptions, instead of
uniform sampling, we devise a non-uniform/biased sampling procedure that
reveals the ``importance'' of each observed element.
Our proofs are supported by a novel approach that phrases sufficient optimality
conditions based on the Golfing Scheme, which may be of independent interest to
the wider community. Our theoretical findings show that we can provably recover
an unknown $n \times n$ matrix of rank $r$ from just about
$\mathcal{O}(nr\log^2(n))$ entries, even when the observed entries are
corrupted with a small amount of noise. The empirical results align closely
with our theory.
Related papers
- Optimal level set estimation for non-parametric tournament and crowdsourcing problems [49.75262185577198]
Motivated by crowdsourcing, we consider a problem where we partially observe the correctness of the answers of $n$ experts on $d$ questions.
In this paper, we assume that the matrix $M$ containing the probability that expert $i$ answers question $j$ correctly is bi-isotonic up to a permutation of its rows and columns.
We construct an efficient algorithm that turns out to be minimax optimal for this classification problem.
arXiv Detail & Related papers (2024-08-27T18:28:31Z)
- Entrywise error bounds for low-rank approximations of kernel matrices [55.524284152242096]
We derive entrywise error bounds for low-rank approximations of kernel matrices obtained using the truncated eigen-decomposition.
A key technical innovation is a delocalisation result for the eigenvectors of the kernel matrix corresponding to small eigenvalues.
We validate our theory with an empirical study of a collection of synthetic and real-world datasets.
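For context, a generic sketch (my illustration, not the paper's procedure) of the truncated eigendecomposition and a direct check of the entrywise error:

```python
import numpy as np

def truncated_eig_approx(K, k):
    """Rank-k approximation of a symmetric PSD kernel matrix via top-k eigenpairs."""
    w, V = np.linalg.eigh(K)        # eigenvalues in ascending order
    Vk, wk = V[:, -k:], w[-k:]      # keep the top-k eigenpairs
    return (Vk * wk) @ Vk.T

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq)               # Gaussian kernel matrix
K20 = truncated_eig_approx(K, 20)
print(np.abs(K - K20).max())        # entrywise (max-norm) error
```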
arXiv Detail & Related papers (2024-05-23T12:26:25Z)
- One-sided Matrix Completion from Two Observations Per Row [95.87811229292056]
We propose a natural algorithm that involves imputing the missing values of the matrix $X^T X$.
We evaluate our algorithm on one-sided recovery of synthetic data and low-coverage genome sequencing.
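A minimal sketch of the Gram-matrix idea (my paraphrase of the general approach, not the authors' exact algorithm; the rescaled co-observation estimator below is an assumption for illustration): entries of $X^T X$ with at least one row observing both coordinates can be estimated directly, and the remaining entries are the ones to impute.

```python
import numpy as np

def estimate_gram(X_obs, mask):
    """Entrywise estimate of X^T X from partial observations.

    X_obs: (n, d) array with unobserved entries set to 0
    mask:  (n, d) boolean array of observed positions
    """
    n, d = X_obs.shape
    counts = mask.T.astype(float) @ mask.astype(float)  # rows co-observing (j, j')
    sums = X_obs.T @ X_obs        # sums of X_ij * X_ij' over co-observed rows
    seen = counts > 0
    G = np.full((d, d), np.nan)   # NaN marks entries left for imputation
    G[seen] = n * sums[seen] / counts[seen]
    return G, seen
```

The NaN entries of $G$ would then be filled by a low-rank completion step, which is where the paper's analysis lives.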
arXiv Detail & Related papers (2023-06-06T22:35:16Z)
- Robust Matrix Completion with Heavy-tailed Noise [0.5837881923712392]
This paper studies low-rank matrix completion in the presence of heavy-tailed, possibly asymmetric noise.
We adopt an adaptive Huber loss to accommodate heavy-tailed noise, which is robust against large and possibly asymmetric errors.
We prove that under merely a second moment condition on the error, the Euclidean error falls geometrically fast until achieving a minimax-optimal statistical estimation error.
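For reference, the Huber loss used as the data-fitting term is quadratic for small residuals and linear for large ones (a standard definition; the paper's contribution is choosing the threshold $\tau$ adaptively, which this sketch leaves as a fixed parameter):

```python
import numpy as np

def huber_loss(residual, tau):
    """Quadratic for |r| <= tau, linear beyond: robust to heavy-tailed errors."""
    r = np.abs(residual)
    return np.where(r <= tau, 0.5 * r ** 2, tau * (r - 0.5 * tau))

def huber_grad(residual, tau):
    """Gradient of the Huber loss: the residual, clipped to [-tau, tau]."""
    return np.clip(residual, -tau, tau)
```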
arXiv Detail & Related papers (2022-06-09T04:48:48Z)
- Robust Linear Regression for General Feature Distribution [21.0709900887309]
We investigate robust linear regression where data may be contaminated by an oblivious adversary.
We do not necessarily assume that the features are centered.
If the features are centered, we can obtain a standard convergence rate.
arXiv Detail & Related papers (2022-02-04T11:22:13Z)
- Robust Linear Predictions: Analyses of Uniform Concentration, Fast Rates and Model Misspecification [16.0817847880416]
We offer a unified framework that includes a broad variety of linear prediction problems on a Hilbert space.
We show that for misspecification level $\epsilon$, these estimators achieve an error rate of $O\left(\max\left\{|\mathcal{O}|^{1/2} n^{-1/2}, |\mathcal{I}|^{1/2} n^{-1}\right\} + \epsilon\right)$, matching the best-known rates in the literature.
arXiv Detail & Related papers (2022-01-06T08:51:08Z)
- Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers [13.244654316770815]
We develop machinery to design efficiently computable and consistent estimators.
For sparse regression, we achieve consistency for optimal sample size $n \gtrsim (k\log d)/\alpha^2$.
In the context of PCA, we attain optimal error guarantees under broad spikiness assumptions on the parameter matrix.
arXiv Detail & Related papers (2021-11-04T15:59:44Z)
- Under-bagging Nearest Neighbors for Imbalanced Classification [63.026765294759876]
We propose an ensemble learning algorithm called \textit{under-bagging} $k$-NN for imbalanced classification problems.
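A rough sketch of the general under-bagging recipe with a $k$-NN base learner (my illustration of the idea, not the paper's exact procedure): each round down-samples every class to the minority-class size, fits $k$-NN, and the rounds vote.

```python
import numpy as np
from collections import Counter

def knn_predict(X_tr, y_tr, X_te, k):
    """Plain k-NN: majority vote over the k nearest training points."""
    d = ((X_te[:, None, :] - X_tr[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d, axis=1)[:, :k]
    return np.array([Counter(y_tr[row]).most_common(1)[0][0] for row in nn])

def under_bagging_knn(X, y, X_te, k=5, B=25, seed=0):
    """B rounds of class-balanced subsampling + k-NN, aggregated by vote."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()                      # minority-class size
    votes = []
    for _ in range(B):
        idx = np.concatenate([
            rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
            for c in classes
        ])
        votes.append(knn_predict(X[idx], y[idx], X_te, k))
    votes = np.stack(votes)                   # shape (B, n_test)
    return np.array([Counter(col).most_common(1)[0][0] for col in votes.T])
```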
arXiv Detail & Related papers (2021-09-01T14:10:38Z)
- Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification [54.22421582955454]
We provide the first optimal minimax guarantees for the excess risk of adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z)
- Tackling small eigen-gaps: Fine-grained eigenvector estimation and inference under heteroscedastic noise [28.637772416856194]
Two fundamental challenges arise in eigenvector estimation and inference for a low-rank matrix from noisy observations.
We propose estimation and uncertainty quantification procedures for an unknown eigenvector.
We establish optimal procedures to construct confidence intervals for the unknown eigenvalues.
arXiv Detail & Related papers (2020-01-14T04:26:10Z)