Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled
Gradient Descent
- URL: http://arxiv.org/abs/2005.08898v4
- Date: Mon, 14 Jun 2021 20:11:50 GMT
- Title: Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled
Gradient Descent
- Authors: Tian Tong, Cong Ma, Yuejie Chi
- Abstract summary: Low-rank matrix estimation is a canonical problem that finds numerous applications in signal processing, machine learning and imaging science.
We show that ScaledGD achieves the best of both worlds: it converges linearly at a rate independent of the condition number of the low-rank matrix, while maintaining the low per-iteration cost of gradient descent.
Our analysis is also applicable to general loss functions that are restricted strongly convex and smooth over low-rank matrices.
- Score: 34.0533596121548
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low-rank matrix estimation is a canonical problem that finds numerous
applications in signal processing, machine learning and imaging science. A
popular approach in practice is to factorize the matrix into two compact
low-rank factors, and then optimize these factors directly via simple iterative
methods such as gradient descent and alternating minimization. Despite
nonconvexity, recent literature has shown that these simple heuristics in
fact achieve linear convergence when initialized properly for a growing number
of problems of interest. However, upon closer examination, existing approaches
can still be computationally expensive especially for ill-conditioned matrices:
the convergence rate of gradient descent depends linearly on the condition
number of the low-rank matrix, while the per-iteration cost of alternating
minimization is often prohibitive for large matrices. The goal of this paper is
to set forth a competitive algorithmic approach dubbed Scaled Gradient Descent
(ScaledGD) which can be viewed as pre-conditioned or diagonally-scaled gradient
descent, where the pre-conditioners are adaptive and iteration-varying with a
minimal computational overhead. With tailored variants for low-rank matrix
sensing, robust principal component analysis and matrix completion, we
theoretically show that ScaledGD achieves the best of both worlds: it converges
linearly at a rate independent of the condition number of the low-rank matrix
similarly to alternating minimization, while maintaining the low per-iteration
cost of gradient descent. Our analysis is also applicable to general loss
functions that are restricted strongly convex and smooth over low-rank
matrices. To the best of our knowledge, ScaledGD is the first algorithm that
provably has such properties over a wide range of low-rank matrix estimation
tasks.
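As a rough illustration of the update described in the abstract, the NumPy sketch below applies the factor-wise preconditioners (R^T R)^{-1} and (L^T L)^{-1} to the gradient steps on the two factors. The least-squares objective, the noise level and the spectral-style initialization are illustrative stand-ins rather than the paper's exact setup.

    import numpy as np

    def scaled_gd(grad_f, L, R, eta=0.5, iters=200):
        # ScaledGD on the factors L (n1 x r) and R (n2 x r): each gradient step is
        # right-preconditioned by the inverse Gram matrix of the other factor.
        for _ in range(iters):
            G = grad_f(L @ R.T)                                # gradient of the loss at X = L R^T
            L_next = L - eta * G @ R @ np.linalg.inv(R.T @ R)
            R_next = R - eta * G.T @ L @ np.linalg.inv(L.T @ L)
            L, R = L_next, R_next
        return L, R

    # Toy usage with a fully observed least-squares loss (illustrative only).
    rng = np.random.default_rng(0)
    n1, n2, r = 60, 50, 3
    M_star = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))
    grad_f = lambda X: X - M_star                              # gradient of 0.5 * ||X - M_star||_F^2
    M_init = M_star + 0.1 * rng.standard_normal((n1, n2))
    U, s, Vt = np.linalg.svd(M_init, full_matrices=False)      # spectral-style initialization
    L0, R0 = U[:, :r] * np.sqrt(s[:r]), Vt[:r].T * np.sqrt(s[:r])
    L_hat, R_hat = scaled_gd(grad_f, L0, R0)
    print(np.linalg.norm(L_hat @ R_hat.T - M_star) / np.linalg.norm(M_star))

The preconditioners cost only an r x r inversion per step, which is the minimal computational overhead the abstract refers to.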
Related papers
- Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) methods and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Spectral Entry-wise Matrix Estimation for Low-Rank Reinforcement
Learning [53.445068584013896]
We study matrix estimation problems arising in reinforcement learning (RL) with low-rank structure.
In low-rank bandits, the matrix to be recovered specifies the expected arm rewards, and for low-rank Markov Decision Processes (MDPs), it may for example characterize the transition kernel of the MDP.
We show that simple spectral-based matrix estimation approaches efficiently recover the singular subspaces of the matrix and exhibit nearly-minimal entry-wise error.
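For orientation, a generic spectral estimator of the kind alluded to above simply truncates the observed matrix to its top-r singular subspaces; this is an illustrative sketch, not the paper's exact procedure or its entry-wise error analysis.

    import numpy as np

    def spectral_subspace_estimate(M_obs, r):
        # Generic spectral estimator: keep the top-r singular subspaces of the
        # observed (noisy or rescaled) matrix and form the corresponding rank-r estimate.
        U, s, Vt = np.linalg.svd(M_obs, full_matrices=False)
        M_hat = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]
        return U[:, :r], Vt[:r].T, M_hat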
arXiv Detail & Related papers (2023-10-10T17:06:41Z) - Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled
Gradient Descent, Even with Overparameterization [48.65416821017865]
This chapter introduces a new algorithmic approach, dubbed scaled gradient descent (ScaledGD).
It converges linearly at a constant rate independent of the condition number of the low-rank object.
It maintains the low per-iteration cost of gradient descent for a variety of tasks.
arXiv Detail & Related papers (2023-10-09T21:16:57Z) - Faster One-Sample Stochastic Conditional Gradient Method for Composite
Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
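As background, a plain (deterministic) conditional gradient step over an l1-ball is sketched below; the paper's one-sample stochastic average gradient estimator and its handling of the non-smooth term are not reproduced here.

    import numpy as np

    def frank_wolfe_l1(grad_f, x, radius=1.0, iters=100):
        # Classic conditional gradient (Frank-Wolfe) over the l1-ball of given radius:
        # the linear minimization oracle returns a signed coordinate vertex.
        for t in range(iters):
            g = grad_f(x)
            s = np.zeros_like(x)
            i = np.argmax(np.abs(g))
            s[i] = -radius * np.sign(g[i])       # vertex minimizing <g, s> over the l1-ball
            gamma = 2.0 / (t + 2.0)              # standard step-size schedule
            x = (1.0 - gamma) * x + gamma * s
        return x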
arXiv Detail & Related papers (2022-02-26T19:10:48Z) - Provable Low Rank Plus Sparse Matrix Separation Via Nonconvex
Regularizers [0.0]
This paper considers a large class of problems where we seek to recover a low-rank matrix and/or sparse vector from some set of measurements.
While methods based on convex estimators suffer from bias or require the rank or sparsity to be known a priori, we use nonconvex regularizers.
We present a novel analysis of the proximal alternating descent algorithm applied to such problems.
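To make the alternating proximal structure concrete, here is a hedged sketch that alternates proximal gradient steps with nonconvex hard-thresholding proximal operators on a simple fully observed model M ≈ L + S; the paper's measurement operator, regularizers and parameter choices are not reproduced.

    import numpy as np

    def prox_rank_hard(Z, tau):
        # Nonconvex low-rank prox: hard-threshold the singular values of Z.
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        s[s <= tau] = 0.0
        return U @ np.diag(s) @ Vt

    def prox_sparse_hard(Z, tau):
        # Nonconvex sparsity prox: hard-threshold the entries of Z.
        return np.where(np.abs(Z) > tau, Z, 0.0)

    def proximal_alternating_descent(M, step=1.0, tau_L=1.0, tau_S=0.5, iters=50):
        # Alternate proximal gradient steps on 0.5 * ||L + S - M||_F^2 with
        # nonconvex penalties promoting low rank in L and sparsity in S.
        L, S = np.zeros_like(M), np.zeros_like(M)
        for _ in range(iters):
            L = prox_rank_hard(L - step * (L + S - M), tau_L)
            S = prox_sparse_hard(S - step * (L + S - M), tau_S)
        return L, S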
arXiv Detail & Related papers (2021-09-26T22:09:42Z) - Exact Linear Convergence Rate Analysis for Low-Rank Symmetric Matrix
Completion via Gradient Descent [22.851500417035947]
Factorization-based gradient descent is a scalable and efficient algorithm for solving low-rank matrix completion in the factored space.
We show that gradient descent enjoys fast linear convergence to the global solution of the problem.
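A minimal sketch of the factorization-based iteration, assuming a symmetric ground truth M ≈ X X^T observed on a symmetric index set; the random initialization and step size are illustrative, not the paper's.

    import numpy as np

    def factored_gd_completion(M_obs, mask, r, eta=0.01, iters=500, seed=0):
        # Gradient descent on f(X) = 0.5 * || mask * (X X^T - M_obs) ||_F^2,
        # where mask is a symmetric 0/1 matrix marking the observed entries.
        rng = np.random.default_rng(seed)
        X = 0.1 * rng.standard_normal((M_obs.shape[0], r))
        for _ in range(iters):
            resid = mask * (X @ X.T - M_obs)   # residual on observed entries (symmetric)
            X = X - eta * 2.0 * resid @ X      # gradient of f when resid is symmetric
        return X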
arXiv Detail & Related papers (2021-02-04T03:41:54Z) - Beyond Procrustes: Balancing-Free Gradient Descent for Asymmetric
Low-Rank Matrix Sensing [36.96922859748537]
Low-rank matrix estimation plays a central role in various applications across science and engineering.
Existing approaches rely on adding a regularization term to balance the scale of the two matrix factors.
In this paper, we provide a theoretical justification for balancing-free gradient descent in recovering a low-rank matrix from a small number of linear measurements.
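A hedged sketch contrasting the two updates, for a generic loss whose gradient at the current estimate L R^T is denoted G; the form of the balancing term follows the common convention in this literature and may differ from the paper's exact formulation.

    import numpy as np

    def gd_step_balanced(L, R, G, eta, lam=0.5):
        # Regularized approach: add the balancing term (lam/4) * ||L^T L - R^T R||_F^2.
        D = L.T @ L - R.T @ R
        return L - eta * (G @ R + lam * L @ D), R - eta * (G.T @ L - lam * R @ D)

    def gd_step_balancing_free(L, R, G, eta):
        # Balancing-free variant: plain gradient descent on the factors, no extra term.
        return L - eta * G @ R, R - eta * G.T @ L

    # G is the gradient of the loss at L @ R.T, e.g. G = A_adj(A(L @ R.T) - y)
    # for linear measurements; A and A_adj are hypothetical placeholders here.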
arXiv Detail & Related papers (2021-01-13T15:03:52Z) - Robust Low-rank Matrix Completion via an Alternating Manifold Proximal
Gradient Continuation Method [47.80060761046752]
Robust low-rank matrix completion (RMC) has been studied extensively for computer vision, signal processing and machine learning applications.
This problem aims to decompose a partially observed matrix into the superposition of a low-rank matrix and a sparse matrix, where the sparse matrix captures the grossly corrupted entries of the matrix.
A widely used approach to tackle RMC is to consider a convex formulation, which minimizes the nuclear norm of the low-rank matrix (to promote low-rankness) and the l1 norm of the sparse matrix (to promote sparsity).
In this paper, motivated by some recent works on low-
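In symbols, the convex formulation described above can be written (one standard form, with P_Omega denoting restriction to the observed entries) as:

    \min_{L,\,S}\ \|L\|_* + \lambda \|S\|_1 \quad \text{subject to} \quad \mathcal{P}_\Omega(L + S) = \mathcal{P}_\Omega(M)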
arXiv Detail & Related papers (2020-08-18T04:46:22Z) - A Scalable, Adaptive and Sound Nonconvex Regularizer for Low-rank Matrix
Completion [60.52730146391456]
We propose a new scalable nonconvex low-rank regularizer, called the "nuclear Frobenius norm" regularizer, which is adaptive and sound.
It bypasses the computation of singular values and allows fast optimization.
It obtains state-of-the-art recovery performance while being the fastest among existing matrix learning methods.
arXiv Detail & Related papers (2020-08-14T18:47:58Z)