Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled
Gradient Descent
- URL: http://arxiv.org/abs/2005.08898v4
- Date: Mon, 14 Jun 2021 20:11:50 GMT
- Title: Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled
Gradient Descent
- Authors: Tian Tong, Cong Ma, Yuejie Chi
- Abstract summary: Low-rank matrix estimation is a canonical problem that finds numerous applications in signal processing, machine learning and imaging science.
We show that ScaledGD achieves the best of both worlds: it converges linearly at a rate independent of the condition number of the low-rank matrix, while maintaining the low per-iteration cost of gradient descent.
Our analysis is also applicable to general loss functions that are restricted strongly convex and smooth over low-rank matrices.
- Score: 34.0533596121548
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low-rank matrix estimation is a canonical problem that finds numerous
applications in signal processing, machine learning and imaging science. A
popular approach in practice is to factorize the matrix into two compact
low-rank factors, and then optimize these factors directly via simple iterative
methods such as gradient descent and alternating minimization. Despite
nonconvexity, recent literature has shown that these simple heuristics in
fact achieve linear convergence when initialized properly for a growing number
of problems of interest. However, upon closer examination, existing approaches
can still be computationally expensive especially for ill-conditioned matrices:
the convergence rate of gradient descent depends linearly on the condition
number of the low-rank matrix, while the per-iteration cost of alternating
minimization is often prohibitive for large matrices. The goal of this paper is
to set forth a competitive algorithmic approach dubbed Scaled Gradient Descent
(ScaledGD) which can be viewed as pre-conditioned or diagonally-scaled gradient
descent, where the pre-conditioners are adaptive and iteration-varying with a
minimal computational overhead. With tailored variants for low-rank matrix
sensing, robust principal component analysis and matrix completion, we
theoretically show that ScaledGD achieves the best of both worlds: it converges
linearly at a rate independent of the condition number of the low-rank matrix
similarly to alternating minimization, while maintaining the low per-iteration
cost of gradient descent. Our analysis is also applicable to general loss
functions that are restricted strongly convex and smooth over low-rank
matrices. To the best of our knowledge, ScaledGD is the first algorithm that
provably has such properties over a wide range of low-rank matrix estimation
tasks.
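As a rough illustration of the update described in the abstract, the NumPy sketch below applies the factor-wise preconditioners (R^T R)^{-1} and (L^T L)^{-1} to the gradient steps on the two factors. The least-squares objective, the noise level and the spectral-style initialization are illustrative stand-ins rather than the paper's exact setup.

    import numpy as np

    def scaled_gd(grad_f, L, R, eta=0.5, iters=200):
        # ScaledGD on the factors L (n1 x r) and R (n2 x r): each gradient step is
        # right-preconditioned by the inverse Gram matrix of the other factor.
        for _ in range(iters):
            G = grad_f(L @ R.T)                                # gradient of the loss at X = L R^T
            L_next = L - eta * G @ R @ np.linalg.inv(R.T @ R)
            R_next = R - eta * G.T @ L @ np.linalg.inv(L.T @ L)
            L, R = L_next, R_next
        return L, R

    # Toy usage with a fully observed least-squares loss (illustrative only).
    rng = np.random.default_rng(0)
    n1, n2, r = 60, 50, 3
    M_star = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))
    grad_f = lambda X: X - M_star                              # gradient of 0.5 * ||X - M_star||_F^2
    M_init = M_star + 0.1 * rng.standard_normal((n1, n2))
    U, s, Vt = np.linalg.svd(M_init, full_matrices=False)      # spectral-style initialization
    L0, R0 = U[:, :r] * np.sqrt(s[:r]), Vt[:r].T * np.sqrt(s[:r])
    L_hat, R_hat = scaled_gd(grad_f, L0, R0)
    print(np.linalg.norm(L_hat @ R_hat.T - M_star) / np.linalg.norm(M_star))

The preconditioners cost only an r x r inversion per step, which is the minimal computational overhead the abstract refers to.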
Related papers
- Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) methods and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Spectral Entry-wise Matrix Estimation for Low-Rank Reinforcement
Learning [53.445068584013896]
We study matrix estimation problems arising in reinforcement learning (RL) with low-rank structure.
In low-rank bandits, the matrix to be recovered specifies the expected arm rewards, and for low-rank Markov Decision Processes (MDPs), it may for example characterize the transition kernel of the MDP.
We show that simple spectral-based matrix estimation approaches efficiently recover the singular subspaces of the matrix and exhibit nearly-minimal entry-wise error.
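For orientation, a generic spectral estimator of the kind alluded to above simply truncates the observed matrix to its top-r singular subspaces; this is an illustrative sketch, not the paper's exact procedure or its entry-wise error analysis.

    import numpy as np

    def spectral_subspace_estimate(M_obs, r):
        # Generic spectral estimator: keep the top-r singular subspaces of the
        # observed (noisy or rescaled) matrix and form the corresponding rank-r estimate.
        U, s, Vt = np.linalg.svd(M_obs, full_matrices=False)
        M_hat = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]
        return U[:, :r], Vt[:r].T, M_hat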
arXiv Detail & Related papers (2023-10-10T17:06:41Z) - Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled
Gradient Descent, Even with Overparameterization [48.65416821017865]
This chapter introduces a new algorithmic approach, dubbed scaled gradient descent (ScaledGD).
It converges linearly at a constant rate independent of the condition number of the low-rank object.
It maintains the low per-iteration cost of gradient descent for a variety of tasks.
arXiv Detail & Related papers (2023-10-09T21:16:57Z) - Faster One-Sample Stochastic Conditional Gradient Method for Composite
Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
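As background, a plain (deterministic) conditional gradient step over an l1-ball is sketched below; the paper's one-sample stochastic average gradient estimator and its handling of the non-smooth term are not reproduced here.

    import numpy as np

    def frank_wolfe_l1(grad_f, x, radius=1.0, iters=100):
        # Classic conditional gradient (Frank-Wolfe) over the l1-ball of given radius:
        # the linear minimization oracle returns a signed coordinate vertex.
        for t in range(iters):
            g = grad_f(x)
            s = np.zeros_like(x)
            i = np.argmax(np.abs(g))
            s[i] = -radius * np.sign(g[i])       # vertex minimizing <g, s> over the l1-ball
            gamma = 2.0 / (t + 2.0)              # standard step-size schedule
            x = (1.0 - gamma) * x + gamma * s
        return x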
arXiv Detail & Related papers (2022-02-26T19:10:48Z) - Provable Low Rank Plus Sparse Matrix Separation Via Nonconvex
Regularizers [0.0]
This paper considers a large class of problems where we seek to recover a low-rank matrix and/or sparse vector from some set of measurements.
While methods based on convex estimators suffer from bias or require the rank or sparsity to be known a priori, we use nonconvex regularizers.
We present a novel analysis of the proximal alternating descent algorithm applied to such problems.
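To make the alternating proximal structure concrete, here is a hedged sketch that alternates proximal gradient steps with nonconvex hard-thresholding proximal operators on a simple fully observed model M ≈ L + S; the paper's measurement operator, regularizers and parameter choices are not reproduced.

    import numpy as np

    def prox_rank_hard(Z, tau):
        # Nonconvex low-rank prox: hard-threshold the singular values of Z.
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        s[s <= tau] = 0.0
        return U @ np.diag(s) @ Vt

    def prox_sparse_hard(Z, tau):
        # Nonconvex sparsity prox: hard-threshold the entries of Z.
        return np.where(np.abs(Z) > tau, Z, 0.0)

    def proximal_alternating_descent(M, step=1.0, tau_L=1.0, tau_S=0.5, iters=50):
        # Alternate proximal gradient steps on 0.5 * ||L + S - M||_F^2 with
        # nonconvex penalties promoting low rank in L and sparsity in S.
        L, S = np.zeros_like(M), np.zeros_like(M)
        for _ in range(iters):
            L = prox_rank_hard(L - step * (L + S - M), tau_L)
            S = prox_sparse_hard(S - step * (L + S - M), tau_S)
        return L, S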
arXiv Detail & Related papers (2021-09-26T22:09:42Z) - Exact Linear Convergence Rate Analysis for Low-Rank Symmetric Matrix
Completion via Gradient Descent [22.851500417035947]
Factorization-based gradient descent is a scalable and efficient algorithm for solving low-rank matrix completion in the factored space.
We show that gradient descent enjoys fast linear convergence to the global solution of the problem.
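A minimal sketch of the factorization-based iteration, assuming a symmetric ground truth M ≈ X X^T observed on a symmetric index set; the random initialization and step size are illustrative, not the paper's.

    import numpy as np

    def factored_gd_completion(M_obs, mask, r, eta=0.01, iters=500, seed=0):
        # Gradient descent on f(X) = 0.5 * || mask * (X X^T - M_obs) ||_F^2,
        # where mask is a symmetric 0/1 matrix marking the observed entries.
        rng = np.random.default_rng(seed)
        X = 0.1 * rng.standard_normal((M_obs.shape[0], r))
        for _ in range(iters):
            resid = mask * (X @ X.T - M_obs)   # residual on observed entries (symmetric)
            X = X - eta * 2.0 * resid @ X      # gradient of f when resid is symmetric
        return X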
arXiv Detail & Related papers (2021-02-04T03:41:54Z) - Beyond Procrustes: Balancing-Free Gradient Descent for Asymmetric
Low-Rank Matrix Sensing [36.96922859748537]
Low-rank matrix estimation plays a central role in various applications across science and engineering.
Existing approaches rely on adding a regularization term to balance the scale of the two matrix factors.
In this paper, we provide a theoretical justification for balancing-free gradient descent in recovering a low-rank matrix from a small number of linear measurements.
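A hedged sketch contrasting the two updates, for a generic loss whose gradient at the current estimate L R^T is denoted G; the form of the balancing term follows the common convention in this literature and may differ from the paper's exact formulation.

    import numpy as np

    def gd_step_balanced(L, R, G, eta, lam=0.5):
        # Regularized approach: add the balancing term (lam/4) * ||L^T L - R^T R||_F^2.
        D = L.T @ L - R.T @ R
        return L - eta * (G @ R + lam * L @ D), R - eta * (G.T @ L - lam * R @ D)

    def gd_step_balancing_free(L, R, G, eta):
        # Balancing-free variant: plain gradient descent on the factors, no extra term.
        return L - eta * G @ R, R - eta * G.T @ L

    # G is the gradient of the loss at L @ R.T, e.g. G = A_adj(A(L @ R.T) - y)
    # for linear measurements; A and A_adj are hypothetical placeholders here.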
arXiv Detail & Related papers (2021-01-13T15:03:52Z) - Robust Low-rank Matrix Completion via an Alternating Manifold Proximal
Gradient Continuation Method [47.80060761046752]
Robust low-rank matrix completion (RMC) has been studied extensively for computer vision, signal processing and machine learning applications.
This problem aims to decompose a partially observed matrix into the superposition of a low-rank matrix and a sparse matrix, where the sparse matrix captures the grossly corrupted entries of the matrix.
A widely used approach to tackle RMC is to consider a convex formulation, which minimizes the nuclear norm of the low-rank matrix (to promote low-rankness) and the l1 norm of the sparse matrix (to promote sparsity).
In this paper, motivated by some recent works on low-
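In symbols, the convex formulation described above can be written (one standard form, with P_Omega denoting restriction to the observed entries) as:

    \min_{L,\,S}\ \|L\|_* + \lambda \|S\|_1 \quad \text{subject to} \quad \mathcal{P}_\Omega(L + S) = \mathcal{P}_\Omega(M)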
arXiv Detail & Related papers (2020-08-18T04:46:22Z) - A Scalable, Adaptive and Sound Nonconvex Regularizer for Low-rank Matrix
Completion [60.52730146391456]
We propose a new scalable nonconvex low-rank regularizer, called the "nuclear Frobenius norm" regularizer, which is adaptive and sound.
It bypasses the computation of singular values and allows fast optimization.
It obtains state-of-the-art recovery performance while being the fastest among existing matrix learning methods.
arXiv Detail & Related papers (2020-08-14T18:47:58Z)