Complexity Guarantees for Polyak Steps with Momentum
- URL: http://arxiv.org/abs/2002.00915v2
- Date: Fri, 3 Jul 2020 15:31:22 GMT
- Title: Complexity Guarantees for Polyak Steps with Momentum
- Authors: Mathieu Barré, Adrien Taylor, Alexandre d'Aspremont
- Abstract summary: We study a class of methods, based on Polyak steps, where knowledge of the strong convexity parameter is substituted by that of the optimal value, $f_*$.
We first show slightly improved convergence bounds compared to those previously known for the classical case of simple gradient descent with Polyak steps; we then derive an accelerated gradient method with Polyak steps and momentum, along with convergence guarantees.
- Score: 76.97851351276165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In smooth strongly convex optimization, knowledge of the strong convexity
parameter is critical for obtaining simple methods with accelerated rates. In
this work, we study a class of methods, based on Polyak steps, where this
knowledge is substituted by that of the optimal value, $f_*$. We first show
slightly improved convergence bounds compared to those previously known for the
classical case of simple gradient descent with Polyak steps; we then derive an
accelerated gradient method with Polyak steps and momentum, along with
convergence guarantees.
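For orientation, here is a minimal sketch of the classical Polyak step-size rule referenced in the abstract, $\gamma_k = (f(x_k) - f_*)/\|\nabla f(x_k)\|^2$, applied within plain gradient descent. The accelerated momentum variant analyzed in the paper involves additional parameters that are not reproduced here; the function names and the quadratic test problem are illustrative assumptions.
```python
import numpy as np

def polyak_gd(f, grad_f, x0, f_star, n_iters=100):
    """Gradient descent with the classical Polyak step size
    gamma_k = (f(x_k) - f_*) / ||grad f(x_k)||^2."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad_f(x)
        gap = f(x) - f_star
        if gap <= 0.0 or not np.any(g):
            break  # already optimal up to numerical precision
        x = x - (gap / np.dot(g, g)) * g
    return x

# Toy strongly convex quadratic f(x) = 0.5 * x^T A x with f_* = 0.
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x
x_hat = polyak_gd(f, grad_f, x0=np.array([1.0, 1.0]), f_star=0.0)
```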
Related papers
- Non-convex Stochastic Composite Optimization with Polyak Momentum [25.060477077577154]
The stochastic proximal gradient method is a powerful generalization of the widely used stochastic gradient descent (SGD) method.
However, it is notoriously known that the method can fail to converge in numerous machine learning applications. (A generic heavy-ball momentum sketch appears after this list.)
arXiv Detail & Related papers (2024-03-05T13:43:58Z) - Incremental Quasi-Newton Methods with Faster Superlinear Convergence
Rates [50.36933471975506]
We consider the finite-sum optimization problem, where each component function is strongly convex and has a Lipschitz continuous gradient and Hessian.
The recently proposed incremental quasi-Newton method is based on the BFGS update and achieves a local superlinear convergence rate.
This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 update into the incremental framework.
arXiv Detail & Related papers (2024-02-04T05:54:51Z) - Acceleration and Implicit Regularization in Gaussian Phase Retrieval [5.484345596034159]
In this setting, we prove that for methods with Polyak or Nesterov momentum, an implicit regularization ensures a well-behaved, convex-like descent.
Experimental evidence shows that these methods converge faster than gradient descent in practice.
arXiv Detail & Related papers (2023-11-21T04:10:03Z) - Polynomial Preconditioning for Gradient Methods [9.13755431537592]
We propose a new family of preconditioners generated by symmetric polynomials.
They provide first-order optimization methods with a provable improvement of the condition number.
We show how to incorporate such preconditioning into the Gradient and Fast Gradient Methods and establish the corresponding global complexity bounds.
arXiv Detail & Related papers (2023-01-30T18:55:38Z) - Min-Max Optimization Made Simple: Approximating the Proximal Point
Method via Contraction Maps [77.8999425439444]
We present a first-order method that admits near-optimal convergence rates for convex/concave min-max problems.
Our work is based on the fact that the update rule of the Proximal Point method can be approximated up to any desired accuracy.
arXiv Detail & Related papers (2023-01-10T12:18:47Z) - On the Convergence Rates of Policy Gradient Methods [9.74841674275568]
We consider geometrically discounted Markov decision problems with finite state spaces.
We show that, with direct policy parametrization, we can analyze the convergence and sample complexity of policy gradient methods.
arXiv Detail & Related papers (2022-01-19T07:03:37Z) - A Discrete Variational Derivation of Accelerated Methods in Optimization [68.8204255655161]
We introduce a discrete variational framework which allows us to derive different methods for optimization.
We derive two families of optimization methods in one-to-one correspondence.
The preservation of symplecticity of autonomous systems occurs here solely on the fibers.
arXiv Detail & Related papers (2021-06-04T20:21:53Z) - Acceleration Methods [57.202881673406324]
We first use quadratic optimization problems to introduce two key families of acceleration methods.
We discuss momentum methods in detail, starting with the seminal work of Nesterov.
We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates.
arXiv Detail & Related papers (2021-01-23T17:58:25Z) - Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient
Clipping [69.9674326582747]
We propose a new accelerated first-order method called clipped-SSTM for smooth convex optimization with heavy-tailed distributed noise in gradients.
We prove new complexity bounds that outperform state-of-the-art results in this case.
We derive the first non-trivial high-probability complexity bounds for SGD with clipping without the light-tails assumption on the noise. (A generic gradient-clipping sketch appears after this list.)
arXiv Detail & Related papers (2020-05-21T17:05:27Z)