Complexity Guarantees for Polyak Steps with Momentum
- URL: http://arxiv.org/abs/2002.00915v2
- Date: Fri, 3 Jul 2020 15:31:22 GMT
- Title: Complexity Guarantees for Polyak Steps with Momentum
- Authors: Mathieu Barré, Adrien Taylor, Alexandre d'Aspremont
- Abstract summary: We study a class of methods, based on Polyak steps, where knowledge of the strong convexity parameter is substituted by that of the optimal value, $f_*$.
We first show slightly improved convergence bounds compared to those previously known for the classical case of simple gradient descent with Polyak steps; we then derive an accelerated gradient method with Polyak steps and momentum, along with convergence guarantees.
- Score: 76.97851351276165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In smooth strongly convex optimization, knowledge of the strong convexity
parameter is critical for obtaining simple methods with accelerated rates. In
this work, we study a class of methods, based on Polyak steps, where this
knowledge is substituted by that of the optimal value, $f_*$. We first show
slightly improved convergence bounds compared to those previously known for the
classical case of simple gradient descent with Polyak steps; we then derive an
accelerated gradient method with Polyak steps and momentum, along with
convergence guarantees.
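For orientation, here is a minimal sketch of the classical Polyak step-size rule referenced in the abstract, $\gamma_k = (f(x_k) - f_*)/\|\nabla f(x_k)\|^2$, applied within plain gradient descent. The accelerated momentum variant analyzed in the paper involves additional parameters that are not reproduced here; the function names and the quadratic test problem are illustrative assumptions.
```python
import numpy as np

def polyak_gd(f, grad_f, x0, f_star, n_iters=100):
    """Gradient descent with the classical Polyak step size
    gamma_k = (f(x_k) - f_*) / ||grad f(x_k)||^2."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad_f(x)
        gap = f(x) - f_star
        if gap <= 0.0 or not np.any(g):
            break  # already optimal up to numerical precision
        x = x - (gap / np.dot(g, g)) * g
    return x

# Toy strongly convex quadratic f(x) = 0.5 * x^T A x with f_* = 0.
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x
x_hat = polyak_gd(f, grad_f, x0=np.array([1.0, 1.0]), f_star=0.0)
```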
Related papers
- Non-convex Stochastic Composite Optimization with Polyak Momentum [25.060477077577154]
The stochastic proximal gradient method is a powerful generalization of the widely used stochastic gradient descent (SGD) method.
However, it is notoriously known that the method can fail to converge in numerous machine learning applications. (A generic heavy-ball momentum sketch appears after this list.)
arXiv Detail & Related papers (2024-03-05T13:43:58Z) - Incremental Quasi-Newton Methods with Faster Superlinear Convergence
Rates [50.36933471975506]
We consider the finite-sum optimization problem, where each component function is strongly convex and has a Lipschitz continuous gradient and Hessian.
The recently proposed incremental quasi-Newton method is based on the BFGS update and achieves a local superlinear convergence rate.
This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 update into the incremental framework.
arXiv Detail & Related papers (2024-02-04T05:54:51Z) - Acceleration and Implicit Regularization in Gaussian Phase Retrieval [5.484345596034159]
In this setting, we prove that for methods with Polyak or Nesterov momentum, an implicit regularization ensures a well-behaved, convex-like descent.
Experimental evidence shows that these methods converge faster than gradient descent in practice.
arXiv Detail & Related papers (2023-11-21T04:10:03Z) - Polynomial Preconditioning for Gradient Methods [9.13755431537592]
We propose a new family of preconditioners generated by symmetric polynomials.
They provide first-order optimization methods with a provable improvement of the condition number.
We show how to incorporate such preconditioning into the Gradient and Fast Gradient Methods and establish the corresponding global complexity bounds.
arXiv Detail & Related papers (2023-01-30T18:55:38Z) - Min-Max Optimization Made Simple: Approximating the Proximal Point
Method via Contraction Maps [77.8999425439444]
We present a first-order method that admits near-optimal convergence rates for convex/concave min-max problems.
Our work is based on the fact that the update rule of the Proximal Point method can be approximated up to any desired accuracy.
arXiv Detail & Related papers (2023-01-10T12:18:47Z) - On the Convergence Rates of Policy Gradient Methods [9.74841674275568]
We consider geometrically discounted Markov decision problems with finite state spaces.
We show that, with direct policy parametrization, we can analyze the convergence and sample complexity of policy gradient methods.
arXiv Detail & Related papers (2022-01-19T07:03:37Z) - A Discrete Variational Derivation of Accelerated Methods in Optimization [68.8204255655161]
We introduce a discrete variational framework which allows us to derive different methods for optimization.
We derive two families of optimization methods in one-to-one correspondence.
The preservation of symplecticity of autonomous systems occurs here solely on the fibers.
arXiv Detail & Related papers (2021-06-04T20:21:53Z) - Acceleration Methods [57.202881673406324]
We first use quadratic optimization problems to introduce two key families of acceleration methods.
We discuss momentum methods in detail, starting with the seminal work of Nesterov.
We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates.
arXiv Detail & Related papers (2021-01-23T17:58:25Z) - Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient
Clipping [69.9674326582747]
We propose a new accelerated first-order method called clipped-SSTM for smooth convex optimization with heavy-tailed distributed noise in gradients.
We prove new complexity bounds that outperform state-of-the-art results in this case.
We derive the first non-trivial high-probability complexity bounds for SGD with clipping without the light-tails assumption on the noise. (A generic gradient-clipping sketch appears after this list.)
arXiv Detail & Related papers (2020-05-21T17:05:27Z)