Fast and Accurate Estimation of Low-Rank Matrices from Noisy
Measurements via Preconditioned Non-Convex Gradient Descent
- URL: http://arxiv.org/abs/2305.17224v2
- Date: Wed, 28 Feb 2024 04:14:13 GMT
- Title: Fast and Accurate Estimation of Low-Rank Matrices from Noisy
Measurements via Preconditioned Non-Convex Gradient Descent
- Authors: Gavin Zhang, Hong-Ming Chiu, Richard Y. Zhang
- Abstract summary: Non-convex gradient descent is a common approach for estimating a low-rank $n\times n$ ground truth matrix from noisy measurements.
In this paper, we describe how preconditioning should be done for noisy measurements to accelerate local convergence to minimax optimality.
- Score: 11.74020933567308
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Non-convex gradient descent is a common approach for estimating a low-rank
$n\times n$ ground truth matrix from noisy measurements, because it has
per-iteration costs as low as $O(n)$ time, and is in theory capable of
converging to a minimax optimal estimate. However, the practitioner is often
constrained to just tens to hundreds of iterations, and the slow and/or
inconsistent convergence of non-convex gradient descent can prevent a
high-quality estimate from being obtained. Recently, the technique of
preconditioning was shown to be highly effective at accelerating the local
convergence of non-convex gradient descent when the measurements are noiseless.
In this paper, we describe how preconditioning should be done for noisy
measurements to accelerate local convergence to minimax optimality. For the
symmetric matrix sensing problem, our proposed preconditioned method is
guaranteed to locally converge to minimax error at a linear rate that is immune
to ill-conditioning and/or over-parameterization. Using our proposed
preconditioned method, we perform a 60 megapixel medical image denoising task,
and observe significantly reduced noise levels compared to previous approaches.
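As a quick illustration of the approach described in the abstract, here is a minimal sketch of preconditioned non-convex gradient descent for symmetric matrix sensing. The damped right preconditioner $(X^\top X + \lambda I)^{-1}$, the choice $\lambda \approx \sigma$, the spectral initialization, and the step size are illustrative assumptions, not the authors' exact prescription.

```python
# Hedged sketch: preconditioned non-convex gradient descent for symmetric
# matrix sensing with noisy measurements y_k = <A_k, M*> + noise, M* = Z Z^T.
# The damped preconditioner, step size, and initialization below are
# illustrative assumptions, not the paper's exact prescription.
import numpy as np

rng = np.random.default_rng(0)
n, r, m, sigma = 50, 3, 1500, 0.01            # dimension, rank, #measurements, noise level

Z_star = rng.standard_normal((n, r))
M_star = Z_star @ Z_star.T                     # rank-r symmetric ground truth
A = rng.standard_normal((m, n, n))
A = (A + A.transpose(0, 2, 1)) / 2             # symmetric sensing matrices
y = np.einsum('kij,ij->k', A, M_star) + sigma * rng.standard_normal(m)

def grad(X):
    """Gradient of f(X) = (1/2m) * sum_k (<A_k, X X^T> - y_k)^2 with respect to X."""
    resid = np.einsum('kij,ij->k', A, X @ X.T) - y
    return (2.0 / m) * np.einsum('k,kij->ij', resid, A) @ X

# Spectral initialization (a common heuristic; the local guarantee assumes a good start).
M_hat = np.einsum('k,kij->ij', y, A) / m
w, V = np.linalg.eigh(M_hat)
top = np.argsort(w)[::-1][:r]
X = V[:, top] * np.sqrt(np.maximum(w[top], 1e-8))

eta, lam = 0.5, sigma                          # damping tied to the noise level (assumption)
for _ in range(200):
    P = np.linalg.inv(X.T @ X + lam * np.eye(r))   # damped right preconditioner
    X = X - eta * grad(X) @ P                      # preconditioned gradient step

print("relative error:", np.linalg.norm(X @ X.T - M_star) / np.linalg.norm(M_star))
```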
Related papers
- Using non-convex optimization in quantum process tomography: Factored
gradient descent is tough to beat [11.893324664457552]
We find that our algorithm converges faster and achieves higher fidelities than the state of the art, both in terms of settings and noise tolerance.
arXiv Detail & Related papers (2023-12-03T07:44:17Z) - Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled
Gradient Descent, Even with Overparameterization [48.65416821017865]
This chapter introduces a new algorithmic approach, dubbed scaled gradient descent (ScaledGD); see the update sketch after this list.
It converges linearly at a constant rate independent of the condition number of the low-rank object.
It maintains the low per-iteration cost of gradient descent for a variety of tasks.
arXiv Detail & Related papers (2023-10-09T21:16:57Z) - Convex and Non-convex Optimization Under Generalized Smoothness [69.69521650503431]
Analysis of convex and non-convex optimization methods often requires the Lipschitz continuity of the gradient, which limits the analysis to functions bounded by quadratics.
Recent work generalizes the gradient setting via the non-uniform smoothness condition.
arXiv Detail & Related papers (2023-06-02T04:21:59Z) - Limitations of Information-Theoretic Generalization Bounds for Gradient
Descent Methods in Stochastic Convex Optimization [48.12845778927164]
We consider the prospect of establishing minimax rates for gradient descent in the setting of convex optimization.
We consider a common tactic employed in studying gradient methods, whereby the final iterate is corrupted by Gaussian noise, producing a noisy "surrogate" algorithm.
Our results suggest that new ideas are required to analyze gradient descent using information-theoretic techniques.
arXiv Detail & Related papers (2022-12-27T17:16:48Z) - Computationally Efficient and Statistically Optimal Robust Low-rank
Matrix and Tensor Estimation [15.389011827844572]
Low-rank matrix and tensor estimation under heavy-tailed noise is challenging, both computationally and statistically.
We introduce a novel sub-gradient algorithm (RsGrad) which is not only computationally efficient but also statistically optimal.
arXiv Detail & Related papers (2022-03-02T09:05:15Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z) - An Optimal Multistage Stochastic Gradient Method for Minimax Problems [8.615625517708324]
We study the minimax optimization problem in the smooth and strongly convex-strongly concave setting.
We first analyze the Gradient Descent Ascent (GDA) method with constant stepsize.
We propose a multistage variant of GDA that runs in multiple stages with a particular learning rate decay schedule.
arXiv Detail & Related papers (2020-02-13T18:01:18Z) - Support recovery and sup-norm convergence rates for sparse pivotal
estimation [79.13844065776928]
In high dimensional sparse regression, pivotal estimators are estimators for which the optimal regularization parameter is independent of the noise level.
We show minimax sup-norm convergence rates for non-smoothed and smoothed, single-task and multi-task square-root Lasso-type estimators.
arXiv Detail & Related papers (2020-01-15T16:11:04Z)
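For reference, the preconditioned updates discussed above can be summarized as follows. The first line is the ScaledGD update for the noiseless setting; the second is a damped variant written with a generic parameter $\lambda$, whose precise choice for noisy measurements is what the main paper addresses and is not specified here:

$$X_{t+1} = X_t - \eta\,\nabla f(X_t)\,(X_t^\top X_t)^{-1} \qquad \text{(ScaledGD, noiseless measurements)}$$
$$X_{t+1} = X_t - \eta\,\nabla f(X_t)\,(X_t^\top X_t + \lambda I)^{-1} \qquad \text{(damped preconditioner for noisy measurements)}$$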