Global Convergence of Sub-gradient Method for Robust Matrix Recovery:
Small Initialization, Noisy Measurements, and Over-parameterization
- URL: http://arxiv.org/abs/2202.08788v1
- Date: Thu, 17 Feb 2022 17:50:04 GMT
- Title: Global Convergence of Sub-gradient Method for Robust Matrix Recovery:
Small Initialization, Noisy Measurements, and Over-parameterization
- Authors: Jianhao Ma and Salar Fattahi
- Abstract summary: The sub-gradient method (SubGM) is used to recover a low-rank matrix from a limited number of measurements.
We show that SubGM converges to the true solution, even under arbitrarily large and arbitrarily dense noise values.
- Score: 4.7464518249313805
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we study the performance of the sub-gradient method (SubGM) on a
natural nonconvex and nonsmooth formulation of low-rank matrix recovery with
$\ell_1$-loss, where the goal is to recover a low-rank matrix from a limited
number of measurements, a subset of which may be grossly corrupted with noise.
We study a scenario where the rank of the true solution is unknown and
over-estimated instead. The over-estimation of the rank gives rise to an
over-parameterized model in which there are more degrees of freedom than
needed. Such over-parameterization may lead to overfitting, or adversely affect
the performance of the algorithm. We prove that a simple SubGM with small
initialization is agnostic to both over-parameterization and noise in the
measurements. In particular, we show that small initialization nullifies the
effect of over-parameterization on the performance of SubGM, leading to an
exponential improvement in its convergence rate. Moreover, we provide the first
unifying framework for analyzing the behavior of SubGM under both outlier and
Gaussian noise models, showing that SubGM converges to the true solution, even
under arbitrarily large and arbitrarily dense noise values, and--perhaps
surprisingly--even if the globally optimal solutions do not correspond to the
ground truth. At the core of our results is a robust variant of restricted
isometry property, called Sign-RIP, which controls the deviation of the
sub-differential of the $\ell_1$-loss from that of an ideal, expected loss. As
a byproduct of our results, we consider a subclass of robust low-rank matrix
recovery with Gaussian measurements, and show that the number of required
samples to guarantee the global convergence of SubGM is independent of the
over-parameterized rank.
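For illustration, below is a minimal, hypothetical Python sketch of SubGM with small initialization on the over-parameterized $\ell_1$-loss factorization $\min_U \frac{1}{m}\sum_i |\langle A_i, UU^\top\rangle - y_i|$ described in the abstract. The symmetric factorization, Gaussian measurement model, outlier fraction, and geometrically decaying step size are assumptions chosen for the demo; this is not the authors' released code or exact experimental setup.

```python
import numpy as np

def subgm_l1_recovery(A, y, r, alpha=1e-3, step0=0.1, decay=0.999, iters=3000, seed=0):
    """Sketch of SubGM for min_U (1/m) * sum_i |<A_i, U U^T> - y_i|.

    A     : (m, n, n) array of measurement matrices A_i
    y     : (m,) measurements, a subset of which may be grossly corrupted
    r     : over-estimated rank (may exceed the true rank)
    alpha : scale of the small random initialization
    The geometrically decaying step size is a common choice for sharp
    nonsmooth problems; it is an assumption here, not the paper's schedule.
    """
    m, n, _ = A.shape
    rng = np.random.default_rng(seed)
    U = alpha * rng.standard_normal((n, r))          # small initialization
    Asym = A + np.transpose(A, (0, 2, 1))            # d/dU <A_i, U U^T> = (A_i + A_i^T) U
    step = step0
    for _ in range(iters):
        residual = np.einsum('kij,ij->k', A, U @ U.T) - y
        signs = np.sign(residual)                    # sub-gradient of |.| at each residual
        G = (np.einsum('k,kij->ij', signs, Asym) @ U) / m
        U -= step * G
        step *= decay                                # geometrically decaying step size
    return U @ U.T


# Illustrative run: Gaussian measurements, 20% gross outliers, rank over-estimated to 5.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, r_true, r_over, m = 20, 2, 5, 800
    U_star = rng.standard_normal((n, r_true))
    X_star = U_star @ U_star.T
    A = rng.standard_normal((m, n, n))
    y = np.einsum('kij,ij->k', A, X_star)
    outliers = rng.random(m) < 0.2                   # arbitrarily large corruption on a subset
    y[outliers] += 50.0 * rng.standard_normal(outliers.sum())
    X_hat = subgm_l1_recovery(A, y, r_over)
    print("relative error:", np.linalg.norm(X_hat - X_star) / np.linalg.norm(X_star))
```

The small scale `alpha` of the random initialization is the knob the abstract highlights: it is what keeps the over-parameterized factor from overfitting the corrupted measurements in this kind of sketch.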
Related papers
- Symmetric Matrix Completion with ReLU Sampling [15.095194065320987]
We study the problem of symmetric positive semi-definite low-rank matrix completion (MC) with entry-dependent sampling.
In particular, we consider rectified linear unit (ReLU) sampling, where only positive entries are observed.
arXiv Detail & Related papers (2024-06-09T15:14:53Z) - Fundamental limits of Non-Linear Low-Rank Matrix Estimation [18.455890316339595]
Bayes-optimal performances are characterized by an equivalent Gaussian model with an effective prior.
We show that to reconstruct the signal accurately, one requires a signal-to-noise ratio growing as $N^{\frac{1}{2}(1-1/k_F)}$, where $k_F$ is the first non-zero Fisher information coefficient of the function.
arXiv Detail & Related papers (2024-03-07T05:26:52Z) - Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled
Gradient Descent, Even with Overparameterization [48.65416821017865]
This chapter introduces a new algorithmic approach, dubbed scaled gradient descent (ScaledGD).
It converges linearly at a constant rate independent of the condition number of the low-rank object.
It maintains the low per-iteration cost of gradient descent for a variety of tasks.
arXiv Detail & Related papers (2023-10-09T21:16:57Z) - Rank Overspecified Robust Matrix Recovery: Subgradient Method and Exact
Recovery [37.05862765171643]
We consider a robust factorization of a low-rank matrix with no prior knowledge on the rank.
We show that our particular design of diminishing step sizes effectively prevents overfitting under overspecified models.
arXiv Detail & Related papers (2021-09-23T05:54:46Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide a small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds that depend on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Implicit Regularization of Sub-Gradient Method in Robust Matrix
Recovery: Don't be Afraid of Outliers [6.320141734801679]
We show that a simple sub-gradient method converges to the true low-rank solution efficiently.
We also build upon a new notion of restricted isometry property, called sign-RIP, to prove the robustness of the method.
arXiv Detail & Related papers (2021-02-05T02:52:00Z) - A Scalable, Adaptive and Sound Nonconvex Regularizer for Low-rank Matrix
Completion [60.52730146391456]
We propose a new, scalable nonconvex low-rank regularizer, the "nuclear Frobenius norm" regularizer, which is adaptive and sound.
It bypasses the computation of singular values and allows fast optimization.
It obtains state-of-the-art recovery performance while being the fastest among existing matrix learning methods.
arXiv Detail & Related papers (2020-08-14T18:47:58Z) - Understanding Implicit Regularization in Over-Parameterized Single Index
Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - Robust Compressed Sensing using Generative Models [98.64228459705859]
In this paper, we propose an algorithm inspired by the Median-of-Means (MOM) estimator.
Our algorithm guarantees recovery for heavy-tailed data, even in the presence of outliers.
arXiv Detail & Related papers (2020-06-16T19:07:41Z) - Approximation Schemes for ReLU Regression [80.33702497406632]
We consider the fundamental problem of ReLU regression.
The goal is to output the best-fitting ReLU with respect to square loss, given draws from some unknown distribution.
arXiv Detail & Related papers (2020-05-26T16:26:17Z)