Sharp Global Guarantees for Nonconvex Low-Rank Matrix Recovery in the
Overparameterized Regime
- URL: http://arxiv.org/abs/2104.10790v1
- Date: Wed, 21 Apr 2021 23:07:18 GMT
- Title: Sharp Global Guarantees for Nonconvex Low-Rank Matrix Recovery in the
Overparameterized Regime
- Authors: Richard Y. Zhang
- Abstract summary: We prove that it is possible for nonconvex low-rank matrix recovery to contain no spurious local minima.
Without an explicit control over $r^{\star}\le r$, an RIP constant of $\delta<1/2$ is both necessary and sufficient for the exact recovery of a rank-$r$ ground truth.
- Score: 16.422215672356167
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We prove that it is possible for nonconvex low-rank matrix recovery to
contain no spurious local minima when the rank of the unknown ground truth
$r^{\star}<r$ is strictly less than the search rank $r$, and yet for the claim
to be false when $r^{\star}=r$. Under the restricted isometry property (RIP),
we prove, for the general overparameterized regime with $r^{\star}\le r$, that
an RIP constant of $\delta<1/(1+\sqrt{r^{\star}/r})$ is sufficient for the
inexistence of spurious local minima, and that
$\delta<1/(1+1/\sqrt{r+r^{\star}-1})$ is necessary due to existence of
counterexamples. Without an explicit control over $r^{\star}\le r$, an RIP
constant of $\delta<1/2$ is both necessary and sufficient for the exact
recovery of a rank-$r$ ground truth. But if the ground truth is known a priori
to have $r^{\star}=1$, then the sharp RIP threshold for exact recovery is
improved to $\delta<1/(1+1/\sqrt{r})$.
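As a quick numeric illustration of the thresholds stated in the abstract (a minimal sketch; the helper function names below are my own, not from the paper):

```python
# Sketch: evaluate the RIP thresholds quoted in the abstract for a few (r_star, r) pairs.
import math

def sufficient_threshold(r_star: int, r: int) -> float:
    # delta < 1/(1 + sqrt(r_star / r)) suffices to rule out spurious local minima.
    return 1.0 / (1.0 + math.sqrt(r_star / r))

def necessary_threshold(r_star: int, r: int) -> float:
    # delta < 1/(1 + 1/sqrt(r + r_star - 1)) is necessary; counterexamples exist beyond it.
    return 1.0 / (1.0 + 1.0 / math.sqrt(r + r_star - 1))

for r_star, r in [(1, 1), (1, 5), (3, 5), (5, 5)]:
    print(f"r*={r_star}, r={r}: sufficient delta < {sufficient_threshold(r_star, r):.3f}, "
          f"necessary delta < {necessary_threshold(r_star, r):.3f}")

# With r* = r the sufficient threshold equals 1/2, matching the sharp delta < 1/2 result;
# with r* = 1 the sufficient and necessary thresholds coincide at 1/(1 + 1/sqrt(r)).
```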
Related papers
- Low-rank Matrix Bandits with Heavy-tailed Rewards [55.03293214439741]
We study the problem of low-rank matrix bandits with heavy-tailed rewards (LowHTR).
By combining truncation of the observed payoffs with dynamic exploration, we propose a novel algorithm called LOTUS.
arXiv Detail & Related papers (2024-04-26T21:54:31Z) - Convergence of Gradient Descent with Small Initialization for
Unregularized Matrix Completion [21.846732043706318]
We show that vanilla gradient descent provably converges to the ground truth $\mathrm{X}^{\star}$ without requiring any explicit regularization.
Surprisingly, neither the convergence rate nor the final accuracy depends on the overparameterized search rank $r'$; they are governed only by the true rank $r$.
arXiv Detail & Related papers (2024-02-09T19:39:23Z) - Implicit bias of SGD in $L_{2}$-regularized linear DNNs: One-way jumps
from high to low rank [22.850359181621023]
In tasks such as matrix completion, the goal is to converge to the local minimum with the smallest rank that still fits the training data.
We show that with SGD there is always a positive probability of jumping from a higher-rank minimum to a lower-rank one, but the probability of jumping back is zero.
arXiv Detail & Related papers (2023-05-25T13:17:32Z) - On the Optimality of Misspecified Kernel Ridge Regression [13.995944403996566]
We show that KRR is minimax optimal for any $s\in(0,1)$ when $\mathcal{H}$ is a Sobolev RKHS.
arXiv Detail & Related papers (2023-05-12T04:12:12Z) - Near-Minimax-Optimal Risk-Sensitive Reinforcement Learning with CVaR [58.40575099910538]
We study risk-sensitive Reinforcement Learning (RL), focusing on the objective of Conditional Value at Risk (CVaR) with risk tolerance $\tau$.
We show that the minimax CVaR regret rate is $\Omega(\sqrt{\tau^{-1}AK})$, where $A$ is the number of actions and $K$ is the number of episodes.
We show that our algorithm achieves the optimal regret of $\widetilde{O}(\tau^{-1}\sqrt{SAK})$ under a continuity assumption, and in general attains a near-optimal regret.
arXiv Detail & Related papers (2023-02-07T02:22:31Z) - Improved Global Guarantees for the Nonconvex Burer--Monteiro Factorization via Rank Overparameterization [10.787390511207683]
We consider minimizing a twice-differentiable, $L$-smooth, $\mu$-strongly convex objective $\phi$ over an $n\times n$ matrix, with a search rank overparameterized as $r>\frac{1}{4}(L/\mu-1)^{2}r^{\star}$.
Despite nonconvexity, local optimization is guaranteed to globally converge from any initial point to the global optimum; a numeric sketch of this rank condition appears after the list below.
arXiv Detail & Related papers (2022-07-05T03:18:17Z) - Low-Rank Approximation with $1/\epsilon^{1/3}$ Matrix-Vector Products [58.05771390012827]
We study iterative methods based on Krylov subspaces for low-rank approximation under any Schatten-$p$ norm.
Our main result is an algorithm that uses only $\tilde{O}(k/\sqrt{\epsilon})$ matrix-vector products.
arXiv Detail & Related papers (2022-02-10T16:10:41Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound $O(d^{\max\{1,p/2\}}\log^{2} n)$ for loss functions with at most degree $p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - Optimal Spectral Recovery of a Planted Vector in a Subspace [80.02218763267992]
We study efficient estimation and detection of a planted vector $v$ whose $\ell_4$ norm differs from that of a Gaussian vector with the same $\ell_2$ norm.
We show that in the regime $n\rho \gg \sqrt{N}$, any spectral method from a large class (and more generally, any low-degree polynomial of the input) fails to detect the planted vector.
arXiv Detail & Related papers (2021-05-31T16:10:49Z) - The Average-Case Time Complexity of Certifying the Restricted Isometry
Property [66.65353643599899]
In compressed sensing, the restricted isometry property (RIP) on $M \times N$ sensing matrices guarantees efficient reconstruction of sparse vectors.
We investigate the exact average-case time complexity of certifying the RIP for $M \times N$ matrices with i.i.d. $\mathcal{N}(0,1/M)$ entries.
arXiv Detail & Related papers (2020-05-22T16:55:01Z)
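Related to the Burer--Monteiro entry above, here is a hedged numeric sketch of the rank-overparameterization condition $r>\frac{1}{4}(L/\mu-1)^{2}r^{\star}$ (the function name and example values are my own, not from that paper):

```python
# Sketch: smallest integer search rank r satisfying r > (1/4) * (L/mu - 1)**2 * r_star,
# the overparameterization condition quoted in the Burer--Monteiro entry above.
import math

def min_search_rank(L: float, mu: float, r_star: int) -> int:
    bound = 0.25 * (L / mu - 1.0) ** 2 * r_star
    return math.floor(bound) + 1  # strict inequality, so take the next integer

for kappa in [1.0, 2.0, 4.0, 10.0]:
    print(f"L/mu = {kappa:>4}, r_star = 2: need search rank r >= {min_search_rank(kappa, 1.0, 2)}")

# A well-conditioned objective (L/mu near 1) needs essentially no overparameterization;
# the required search rank grows quadratically with the condition number L/mu.
```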