Fast Global Convergence for Low-rank Matrix Recovery via Riemannian
Gradient Descent with Random Initialization
- URL: http://arxiv.org/abs/2012.15467v1
- Date: Thu, 31 Dec 2020 06:40:43 GMT
- Title: Fast Global Convergence for Low-rank Matrix Recovery via Riemannian
Gradient Descent with Random Initialization
- Authors: Thomas Y. Hou, Zhenzhen Li, Ziyun Zhang
- Abstract summary: We analyze the global behavior for a class of low-rank matrix recovery problems on the Riemannian manifold.
We reveal a previously unknown geometric property of the low-rank matrix manifold, which is the existence of spurious critical points for the simple least squares function.
Our convergence guarantee is nearly optimal and almost dimension-free, which fully explains the numerical observations.
- Score: 3.0263650731957275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a new global analysis framework for a class of
low-rank matrix recovery problems on the Riemannian manifold. We analyze the
global behavior of Riemannian optimization with random initialization. We
use the Riemannian gradient descent algorithm to minimize a least squares loss
function, and study the asymptotic behavior as well as the exact convergence
rate. We reveal a previously unknown geometric property of the low-rank matrix
manifold, which is the existence of spurious critical points for the simple
least squares function on the manifold. We show that under some assumptions,
the Riemannian gradient descent starting from a random initialization with high
probability avoids these spurious critical points and only converges to the
ground truth at a nearly linear convergence rate, i.e.,
$\mathcal{O}(\log(\frac{1}{\epsilon}) + \log(n))$ iterations to
reach an $\epsilon$-accurate solution. We use two applications as examples for
our global analysis. The first one is a rank-1 matrix recovery problem. The
second one is the Gaussian phase retrieval problem. The second example only
satisfies the weak isometry property, but has behavior similar to that of the
first one except for an extra saddle set. Our convergence guarantee is nearly
optimal and almost dimension-free, which fully explains the numerical
observations. The global analysis can be potentially extended to other data
problems with random measurement structures and empirical least squares loss
functions.
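To make the setting concrete, here is a minimal NumPy sketch of Riemannian gradient descent with random initialization for the rank-1 matrix recovery example: the Euclidean gradient of the least squares loss is projected onto the tangent space of the rank-1 manifold and the iterate is pulled back by a truncated-SVD retraction. This is an illustrative sketch, not the authors' implementation; the Gaussian measurement model, problem sizes, and step size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 30, 400  # matrix dimension and number of measurements (illustrative)

# Ground truth rank-1 matrix X* = x x^T and linear measurements y_k = <A_k, X*>.
x_star = rng.standard_normal(n)
x_star /= np.linalg.norm(x_star)
X_star = np.outer(x_star, x_star)
A = rng.standard_normal((m, n, n)) / np.sqrt(m)  # Gaussian measurement matrices
y = np.einsum("kij,ij->k", A, X_star)

def euclidean_grad(X):
    """Gradient of the least squares loss f(X) = 1/2 sum_k (<A_k, X> - y_k)^2."""
    r = np.einsum("kij,ij->k", A, X) - y
    return np.einsum("k,kij->ij", r, A)

def tangent_project(X, G):
    """Project G onto the tangent space of the rank-1 manifold at X = s u v^T."""
    U, s, Vt = np.linalg.svd(X)
    u, v = U[:, :1], Vt[:1].T
    Pu, Pv = u @ u.T, v @ v.T
    return Pu @ G + G @ Pv - Pu @ G @ Pv

def retract_rank1(Y):
    """Retraction onto the rank-1 matrix manifold via the top singular pair."""
    U, s, Vt = np.linalg.svd(Y)
    return s[0] * np.outer(U[:, 0], Vt[0])

# Random initialization on the manifold, then Riemannian gradient descent.
X = retract_rank1(rng.standard_normal((n, n)))
eta = 0.5  # illustrative step size
for _ in range(200):
    X = retract_rank1(X - eta * tangent_project(X, euclidean_grad(X)))

print("relative error:", np.linalg.norm(X - X_star) / np.linalg.norm(X_star))
```

In this overparameterized Gaussian regime the measurement operator is approximately isometric, so the Euclidean gradient is close to $X - X^*$ and the iteration contracts geometrically, consistent with the $\mathcal{O}(\log(\frac{1}{\epsilon}) + \log(n))$ iteration count quoted in the abstract.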
Related papers
- ADMM for Structured Fractional Minimization [7.9047096855236125]
We consider a class of structured fractional problems, where the numerator includes a differentiable function.
We introduce FADMM, the first Alternating Direction Method of Multipliers for this class of problems.
arXiv Detail & Related papers (2024-11-12T02:50:12Z) - Riemannian Optimization for Non-convex Euclidean Distance Geometry with Global Recovery Guarantees [6.422262171968397]
Two algorithms are proposed to solve the Euclidean Distance Geometry problem.
The first algorithm converges linearly to the true solution.
The second algorithm demonstrates strong numerical performance on both synthetic and real data.
arXiv Detail & Related papers (2024-10-08T21:19:22Z) - Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that the methods under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Rank-1 Matrix Completion with Gradient Descent and Small Random
Initialization [15.127728811011245]
We show that the implicit regularization of GD plays a critical role in the analysis.
arXiv Detail & Related papers (2022-12-19T12:05:37Z) - Randomized Coordinate Subgradient Method for Nonsmooth Composite
Optimization [11.017632675093628]
Coordinate-type subgradient methods for addressing nonsmooth problems are relatively underexplored, owing to the restrictive Lipschitz-type assumptions they typically require.
arXiv Detail & Related papers (2022-06-30T02:17:11Z) - Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix
Factorization [49.090785356633695]
We study the asymmetric low-rank factorization problem $\min_{\mathbf{U} \in \mathbb{R}^{m \times d}, \mathbf{V} \in \mathbb{R}^{n \times d}} \frac{1}{2}\|\mathbf{U}\mathbf{V}^\top - \mathbf{\Sigma}\|_F^2$, minimized jointly over the factors $\mathbf{U}$ and $\mathbf{V}$ (a minimal gradient descent sketch for this objective appears after this list).
arXiv Detail & Related papers (2021-06-27T17:25:24Z) - Proximal Gradient Descent-Ascent: Variable Convergence under K{\L}
Geometry [49.65455534654459]
The gradient descent-ascent (GDA) algorithm has been widely applied to solve minimax optimization problems.
This paper fills a gap in the literature by studying the variable convergence of proximal-GDA under the Kurdyka-{\L}ojasiewicz (K{\L}) geometry.
arXiv Detail & Related papers (2021-02-09T05:35:53Z) - ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm [71.13558000599839]
We study the problem of solving strongly convex and smooth unconstrained optimization problems using first-order algorithms.
We devise a novel algorithm, referred to as Recursive One-Over-T SGD (ROOT-SGD), based on an easily implementable recursive averaging of past gradients.
We prove that it simultaneously achieves state-of-the-art performance in both a finite-sample, nonasymptotic sense and an asymptotic sense.
arXiv Detail & Related papers (2020-08-28T14:46:56Z) - Convergence Analysis of Riemannian Stochastic Approximation Schemes [39.32179384256228]
This paper analyzes a class of Riemannian stochastic approximation (SA) schemes for tackling stochastic optimization problems.
We show that the conditions we derive are considerably milder than those in previous works.
We also consider the case where the mean-field function can only be estimated up to a small bias, and/or the case in which the samples are drawn from a Markov chain.
arXiv Detail & Related papers (2020-05-27T11:24:58Z) - Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions for which the chain rule of calculus holds.
arXiv Detail & Related papers (2020-02-10T23:23:04Z)
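For the asymmetric low-rank factorization objective quoted in the list above, the following is a hedged NumPy sketch of plain gradient descent with small random initialization, the setting shared by the factorization and rank-1 completion entries; the dimensions, initialization scale, step size rule, and iteration count are illustrative assumptions, not values from those papers.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, d = 40, 30, 3  # illustrative sizes; Sigma is the given rank-d target

# Rank-d target matrix Sigma = U* V*^T.
Sigma = rng.standard_normal((m, d)) @ rng.standard_normal((d, n))

# Small random initialization of both factors (assumed scale).
U = 1e-3 * rng.standard_normal((m, d))
V = 1e-3 * rng.standard_normal((n, d))

eta = 0.2 / np.linalg.norm(Sigma, 2)  # step size scaled by the spectral norm
for _ in range(3000):
    R = U @ V.T - Sigma  # residual of f(U, V) = 1/2 ||U V^T - Sigma||_F^2
    # Simultaneous gradient steps: grad_U f = R V and grad_V f = R^T U.
    U, V = U - eta * R @ V, V - eta * R.T @ U

print("relative error:", np.linalg.norm(U @ V.T - Sigma) / np.linalg.norm(Sigma))
```

With a sufficiently small initialization the two factors stay approximately balanced along the trajectory, which is one concrete manifestation of the implicit regularization effect that the rank-1 completion summary above highlights.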