One-sided Matrix Completion from Two Observations Per Row
- URL: http://arxiv.org/abs/2306.04049v1
- Date: Tue, 6 Jun 2023 22:35:16 GMT
- Title: One-sided Matrix Completion from Two Observations Per Row
- Authors: Steven Cao, Percy Liang, Gregory Valiant
- Abstract summary: We propose a natural algorithm that involves imputing the missing values of the matrix $X^TX$.
We evaluate our algorithm on one-sided recovery of synthetic data and low-coverage genome sequencing.
- Score: 95.87811229292056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given only a few observed entries from a low-rank matrix $X$, matrix
completion is the problem of imputing the missing entries, and it formalizes a
wide range of real-world settings that involve estimating missing data.
However, when there are too few observed entries to complete the matrix, what
other aspects of the underlying matrix can be reliably recovered? We study one
such problem setting, that of "one-sided" matrix completion, where our goal is
to recover the right singular vectors of $X$, even in the regime where
recovering the left singular vectors is impossible, which arises when there are
more rows than columns and very few observations. We propose a natural
algorithm that involves imputing the missing values of the matrix $X^TX$ and
show that even with only two observations per row in $X$, we can provably
recover $X^TX$ as long as we have at least $\Omega(r^2 d \log d)$ rows, where
$r$ is the rank and $d$ is the number of columns. We evaluate our algorithm on
one-sided recovery of synthetic data and low-coverage genome sequencing. In
these settings, our algorithm substantially outperforms standard matrix
completion and a variety of direct factorization methods.
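The idea of working with $X^TX$ rather than $X$ can be illustrated with a small sketch. If row $k$ is observed at columns $i$ and $j$, the product $x_{ki} x_{kj}$ is one sample of the $(i,j)$ entry of $X^TX$, and the squares $x_{ki}^2$, $x_{kj}^2$ are samples of diagonal entries. The NumPy code below is a simplified illustration, not the authors' algorithm: it assumes there are enough rows that every pair of columns is observed, so it averages the observed products entrywise and performs no imputation of unobserved entries. The function names, the synthetic data model, and the parameter values (`n`, `d`, `r`) are all hypothetical choices made for this example.

```python
import numpy as np


def estimate_gram_from_pairs(cols, vals, d):
    """Entrywise estimate of (1/n) X^T X from two observations per row.

    cols: (n, 2) int array, the two distinct observed column indices per row.
    vals: (n, 2) float array, the corresponding observed values.
    """
    sums = np.zeros((d, d))
    counts = np.zeros((d, d))
    i, j = cols[:, 0], cols[:, 1]
    a, b = vals[:, 0], vals[:, 1]
    # Off-diagonal: each row gives one sample of x_i * x_j (stored symmetrically).
    np.add.at(sums, (i, j), a * b)
    np.add.at(sums, (j, i), a * b)
    np.add.at(counts, (i, j), 1)
    np.add.at(counts, (j, i), 1)
    # Diagonal: each observed value gives one sample of x_i ** 2.
    np.add.at(sums, (i, i), a * a)
    np.add.at(sums, (j, j), b * b)
    np.add.at(counts, (i, i), 1)
    np.add.at(counts, (j, j), 1)
    # Entries that were never observed stay at 0 (no imputation in this sketch).
    return sums / np.maximum(counts, 1)


def top_eigenvectors(gram, r):
    """Top-r eigenvectors of a symmetric matrix (eigh sorts ascending)."""
    _, eigvecs = np.linalg.eigh(gram)
    return eigvecs[:, -r:]


def subspace_error(A, B):
    """Spectral norm of the difference between the two projection matrices."""
    return np.linalg.norm(A @ A.T - B @ B.T, 2)


# Synthetic rank-r data (assumed model: Gaussian left factor, orthonormal V).
rng = np.random.default_rng(0)
n, d, r = 100_000, 20, 2
V, _ = np.linalg.qr(rng.standard_normal((d, r)))
X = rng.standard_normal((n, r)) @ V.T

# Observe two distinct, uniformly random columns in each row.
first = rng.integers(0, d, size=n)
second = (first + rng.integers(1, d, size=n)) % d
cols = np.stack([first, second], axis=1)
vals = X[np.arange(n)[:, None], cols]

gram = estimate_gram_from_pairs(cols, vals, d)
V_hat = top_eigenvectors(gram, r)
err = subspace_error(V, V_hat)
print("subspace error:", err)
```

Here the right singular vectors are estimated even though each row of $X$ has far too few observations to be completed on its own. In the harder regime the abstract targets, where many entries of $X^TX$ are never observed, the averaging step would have to be followed by an actual completion step.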
Related papers
- Optimal Query Complexities for Dynamic Trace Estimation [59.032228008383484]
We consider the problem of minimizing the number of matrix-vector queries needed for accurate trace estimation in the dynamic setting where our underlying matrix is changing slowly.
We provide a novel binary tree summation procedure that simultaneously estimates all $m$ traces up to $\epsilon$ error with $\delta$ failure probability.
Our lower bounds (1) give the first tight bounds for Hutchinson's estimator in the matrix-vector product model with Frobenius norm error even in the static setting, and (2) are the first unconditional lower bounds for dynamic trace estimation.
arXiv Detail & Related papers (2022-09-30T04:15:44Z) - On the well-spread property and its relation to linear regression [4.619541348328937]
We show that consistent recovery of the parameter vector in a robust linear regression model is information-theoretically impossible.
We show that it is possible to efficiently certify whether a given $n$-by-$d$ matrix is well-spread if the number of observations is quadratic in the ambient dimension.
arXiv Detail & Related papers (2022-06-16T11:17:44Z) - Low-rank Matrix Recovery With Unknown Correspondence [62.634051913953485]
We show that it is possible to recover $M$ via solving a nuclear norm minimization problem under a proper low-rank condition on $M$, with provable non-asymptotic error bound for the recovery of $M$.
Experiments on simulated data, the MovieLens 100K dataset and Yale B database show that $\text{M}^3\text{O}$ achieves state-of-the-art performance over several baselines and can recover the ground-truth correspondence with high accuracy.
arXiv Detail & Related papers (2021-10-15T09:27:50Z) - Spectral properties of sample covariance matrices arising from random
matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $\text{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T - zI_p)^{-1}$ and $A \in \mathcal{M}_p$ deterministic, have a standard deviation of order $O(\|A\|_* / \sqrt{n})$.
Here, we show that $\|\mathbb{E}[R(z)] - \tilde{R}(z)\|_F
arXiv Detail & Related papers (2021-09-06T14:21:43Z) - Non-PSD Matrix Sketching with Applications to Regression and
Optimization [56.730993511802865]
We present dimensionality reduction methods for non-PSD and "square-roots" matrices.
We show how these techniques can be used for multiple downstream tasks.
arXiv Detail & Related papers (2021-06-16T04:07:48Z) - Unique sparse decomposition of low rank matrices [17.037882881652617]
We find a unique decomposition of a low rank matrix $Y \in \mathbb{R}^{r \times n}$.
We prove that up to some $Y \in \mathbb{R}^{r \times n}$ is a sparse-wise decomposition of $X \in \mathbb{R}^{r \times n}$.
arXiv Detail & Related papers (2021-06-14T20:05:59Z) - Compressed sensing of low-rank plus sparse matrices [3.8073142980733]
This manuscript develops similar guarantees showing that $m \times n$ matrices that can be expressed as the sum of a rank-$r$ matrix and an $s$-sparse matrix can be recovered by computationally tractable methods.
Results are shown for synthetic problems, dynamic-foreground/static-background separation, multispectral imaging, and Robust PCA.
arXiv Detail & Related papers (2020-07-18T15:36:11Z) - Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss [76.02734481158458]
It is known that in the worst case, to obtain a good rank-$k$ approximation to a matrix, one needs an arbitrarily large $n^{\Omega(1)}$ number of columns.
We show that under certain minimal and realistic distributional settings, it is possible to obtain a $(k/\epsilon)$-approximation with a nearly linear running time and poly$(k/\epsilon)+O(k\log n)$ columns.
This is the first algorithm of any kind for achieving a $(k/\epsilon)$-approximation for entrywise
arXiv Detail & Related papers (2020-04-16T22:57:06Z) - Optimal Exact Matrix Completion Under new Parametrization [0.0]
We study the problem of exact completion for an $m \times n$ matrix of rank $r$ with the adaptive sampling method.
We propose matrix completion algorithms that exactly recover the target matrix.
arXiv Detail & Related papers (2020-02-06T18:31:47Z) - Rank $2r$ iterative least squares: efficient recovery of ill-conditioned
low rank matrices from few entries [4.230158563771147]
We present a new, simple and computationally efficient iterative method for low rank matrix completion.
Our algorithm, denoted R2RILS for rank $2r$ iterative least squares, has low memory requirements.
arXiv Detail & Related papers (2020-02-05T16:20:58Z)