Beyond Procrustes: Balancing-Free Gradient Descent for Asymmetric
Low-Rank Matrix Sensing
- URL: http://arxiv.org/abs/2101.05113v1
- Date: Wed, 13 Jan 2021 15:03:52 GMT
- Title: Beyond Procrustes: Balancing-Free Gradient Descent for Asymmetric
Low-Rank Matrix Sensing
- Authors: Cong Ma, Yuanxin Li, Yuejie Chi
- Abstract summary: Low-rank matrix estimation plays a central role in various applications across science and engineering.
Existing approaches rely on adding a regularization term to balance the scale of the two matrix factors.
In this paper, we provide a theoretical justification that this regularization can be safely removed for the matrix sensing problem, which aims to recover a low-rank matrix from a small number of linear measurements.
- Score: 36.96922859748537
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Low-rank matrix estimation plays a central role in various applications
across science and engineering. Recently, nonconvex formulations based on
matrix factorization are provably solved by simple gradient descent algorithms
with strong computational and statistical guarantees. However, when the
low-rank matrices are asymmetric, existing approaches rely on adding a
regularization term to balance the scale of the two matrix factors which in
practice can be removed safely without hurting the performance when initialized
via the spectral method. In this paper, we provide a theoretical justification
to this for the matrix sensing problem, which aims to recover a low-rank matrix
from a small number of linear measurements. As long as the measurement ensemble
satisfies the restricted isometry property, gradient descent -- in conjunction
with spectral initialization -- converges linearly without the need of
explicitly promoting balancedness of the factors; in fact, the factors stay
balanced automatically throughout the execution of the algorithm. Our analysis
is based on analyzing the evolution of a new distance metric that directly
accounts for the ambiguity due to invertible transforms, and might be of
independent interest.
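For concreteness, the following is a minimal NumPy sketch (not the authors' code) of the procedure the abstract describes: spectral initialization followed by plain gradient descent on both factors, with no balancing regularizer such as ||L^T L - R^T R||_F^2. The problem sizes, step size, and iteration count below are illustrative assumptions.

```python
# Balancing-free gradient descent for asymmetric matrix sensing: a minimal sketch.
# Measurements are y_i = <A_i, M_star> with Gaussian A_i; all constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n1, n2, r, m = 50, 40, 3, 1200               # dimensions, rank, number of measurements

# Ground-truth rank-r matrix M_star = L_star @ R_star.T
L_star = rng.standard_normal((n1, r))
R_star = rng.standard_normal((n2, r))
M_star = L_star @ R_star.T

# Gaussian measurement matrices (1/sqrt(m) scaling folded in) and linear measurements
A = rng.standard_normal((m, n1, n2)) / np.sqrt(m)
y = np.einsum('kij,ij->k', A, M_star)

# Spectral initialization: top-r SVD of the backprojection sum_i y_i A_i ~ M_star
Y = np.einsum('k,kij->ij', y, A)
U, s, Vt = np.linalg.svd(Y)
L = U[:, :r] * np.sqrt(s[:r])                # L_0 = U_r Sigma_r^{1/2}
R = Vt[:r].T * np.sqrt(s[:r])                # R_0 = V_r Sigma_r^{1/2}

# Plain gradient descent on f(L, R) = 0.5 * sum_i (<A_i, L R^T> - y_i)^2,
# with no explicit balancing of the two factors.
eta = 0.2 / s[0]                             # heuristic step size ~ 1 / sigma_1(M_star)
for _ in range(400):
    res = np.einsum('kij,ij->k', A, L @ R.T) - y      # measurement residuals
    G = np.einsum('k,kij->ij', res, A)                # backprojected residual matrix
    L, R = L - eta * G @ R, R - eta * G.T @ L         # simultaneous factor updates

print("relative error:", np.linalg.norm(L @ R.T - M_star) / np.linalg.norm(M_star))
```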
Related papers
- Guarantees of a Preconditioned Subgradient Algorithm for Overparameterized Asymmetric Low-rank Matrix Recovery [8.722715843502321]
We focus on a matrix factorization-based approach for robust low-rank and asymmetric matrix recovery from corrupted measurements.
We propose a subgradient algorithm that inherits the merits of preconditioned algorithms, whose rate of convergence does not depend on the condition number of the sought matrix.
arXiv Detail & Related papers (2024-10-22T08:58:44Z)
- Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization [48.65416821017865]
This chapter introduces a new algorithmic approach, dubbed scaled gradient descent (ScaledGD).
It converges linearly at a constant rate independent of the condition number of the low-rank object.
It maintains the low per-iteration cost of gradient descent for a variety of tasks.
arXiv Detail & Related papers (2023-10-09T21:16:57Z)
- Asymmetric matrix sensing by gradient descent with small random initialization [0.8611782340880084]
We study the problem of reconstructing a low-rank matrix from a few linear measurements.
Our key contribution is the introduction of a continuous gradient flow equation that serves as the basis of the analysis.
arXiv Detail & Related papers (2023-09-04T20:23:35Z)
- The Inductive Bias of Flatness Regularization for Deep Matrix Factorization [58.851514333119255]
This work takes the first step toward understanding the inductive bias of the minimum trace of the Hessian solutions in deep linear networks.
We show that for all depths greater than one, with the standard Restricted Isometry Property (RIP) on the measurements, minimizing the trace of the Hessian is approximately equivalent to minimizing the Schatten 1-norm of the corresponding end-to-end matrix parameters (restated schematically below).
arXiv Detail & Related papers (2023-06-22T23:14:57Z)
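Stated schematically (the notation below is assumed for illustration, not quoted from the paper): for a depth-$N$ factorization with end-to-end matrix $W_N \cdots W_1$ interpolating measurements $\mathcal{A}(W_N\cdots W_1) = y$ that satisfy RIP, the claim reads

$$
\arg\min_{\mathcal{A}(W_N\cdots W_1)=y} \operatorname{tr}\,\nabla^2 \mathcal{L}(W_1,\dots,W_N)
\;\approx\;
\arg\min_{\mathcal{A}(W_N\cdots W_1)=y} \big\|W_N\cdots W_1\big\|_{S_1},
\qquad \|M\|_{S_1} = \sum_i \sigma_i(M),
$$

i.e., among interpolating factorizations, the flattest ones (minimum Hessian trace) approximately coincide with those whose end-to-end matrix has minimum Schatten 1-norm (nuclear norm).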
- Implicit Balancing and Regularization: Generalization and Convergence Guarantees for Overparameterized Asymmetric Matrix Sensing [28.77440901439686]
A series of recent papers have begun to generalize this role for non-random Positive Semi-Definite (PSD) matrix sensing problems.
In this paper, we show that the trajectory of gradient descent from a small random initialization moves towards solutions that are both globally optimal and generalize well.
arXiv Detail & Related papers (2023-03-24T19:05:52Z)
- Exact Linear Convergence Rate Analysis for Low-Rank Symmetric Matrix Completion via Gradient Descent [22.851500417035947]
Factorization-based gradient descent is a scalable and efficient algorithm for solving the low-rank matrix completion problem.
We show that gradient descent enjoys linear convergence to the global solution of the problem; a schematic form of the factored objective is given below.
arXiv Detail & Related papers (2021-02-04T03:41:54Z)
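For the symmetric low-rank matrix completion problem in the title, the factored objective typically takes the following form (a generic sketch with assumed notation, not quoted from the paper; $\mathcal{P}_\Omega$ denotes projection onto the observed entries):

$$
f(X) = \tfrac{1}{2}\big\|\mathcal{P}_\Omega\big(XX^\top - M^\star\big)\big\|_F^2,
\qquad X_{t+1} = X_t - \eta\,\nabla f(X_t),
$$

where $X \in \mathbb{R}^{n \times r}$ is the low-rank factor being estimated and $\eta$ is the step size.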
- Robust Low-rank Matrix Completion via an Alternating Manifold Proximal Gradient Continuation Method [47.80060761046752]
Robust low-rank matrix completion (RMC) has been studied extensively for computer vision, signal processing and machine learning applications.
This problem aims to decompose a partially observed matrix into the superposition of a low-rank matrix and a sparse matrix, where the sparse matrix captures the grossly corrupted entries of the matrix.
A widely used approach to tackle RMC is to consider a convex formulation, which minimizes the nuclear norm of the low-rank matrix (to promote low-rankness) plus the l1 norm of the sparse matrix (to promote sparsity); a schematic form of this formulation is given below.
arXiv Detail & Related papers (2020-08-18T04:46:22Z)
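Schematically, that convex formulation reads (standard robust matrix completion notation, assumed here rather than quoted from the paper; $\mathcal{P}_\Omega$ projects onto the observed entries and $\lambda > 0$ is a trade-off parameter):

$$
\min_{L,\,S}\ \|L\|_* + \lambda \|S\|_1
\quad \text{subject to} \quad \mathcal{P}_\Omega(L + S) = \mathcal{P}_\Omega(M),
$$

where $\|L\|_*$ is the nuclear norm (sum of singular values, promoting low rank) and $\|S\|_1$ is the entrywise $\ell_1$ norm (promoting sparsity).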
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled Gradient Descent [34.0533596121548]
Low-rank matrix estimation is a canonical problem that finds numerous applications in signal processing, machine learning and imaging science.
We show that ScaledGD achieves the best of both worlds: its convergence rate is independent of the condition number of the low-rank matrix, while it maintains the low per-iteration cost of gradient descent (the scaled update is sketched below).
Our analysis is also applicable to general loss functions that are restricted strongly convex and smooth over low-rank matrices.
arXiv Detail & Related papers (2020-05-18T17:17:16Z)
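For reference, the scaled gradient descent update described above preconditions each factor's gradient by the Gram matrix of the other factor; a schematic form (step-size placement and notation are assumptions, not quoted from the paper) is

$$
L_{t+1} = L_t - \eta\,\nabla_L f(L_t, R_t)\,\big(R_t^\top R_t\big)^{-1},
\qquad
R_{t+1} = R_t - \eta\,\nabla_R f(L_t, R_t)\,\big(L_t^\top L_t\big)^{-1},
$$

so only two small $r \times r$ inverses are added per iteration, keeping the cost close to that of plain gradient descent while removing the convergence rate's dependence on the condition number.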
- Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees [106.91654068632882]
We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate.
Our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
arXiv Detail & Related papers (2020-03-02T16:40:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.