Acceleration Methods
- URL: http://arxiv.org/abs/2101.09545v4
- Date: Tue, 24 Sep 2024 20:19:22 GMT
- Title: Acceleration Methods
- Authors: Alexandre d'Aspremont, Damien Scieur, Adrien Taylor
- Abstract summary: We first use quadratic optimization problems to introduce two key families of acceleration methods.
We discuss momentum methods in detail, starting with the seminal work of Nesterov.
We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates.
- Score: 57.202881673406324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This monograph covers some recent advances in a range of acceleration techniques frequently used in convex optimization. We first use quadratic optimization problems to introduce two key families of methods, namely momentum and nested optimization schemes. They coincide in the quadratic case to form the Chebyshev method. We discuss momentum methods in detail, starting with the seminal work of Nesterov and structure convergence proofs using a few master templates, such as that for optimized gradient methods, which provide the key benefit of showing how momentum methods optimize convergence guarantees. We further cover proximal acceleration, at the heart of the Catalyst and Accelerated Hybrid Proximal Extragradient frameworks, using similar algorithmic patterns. Common acceleration techniques rely directly on the knowledge of some of the regularity parameters in the problem at hand. We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates while adapting to unobserved regularity parameters.
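The abstract centers on momentum methods in the line of Nesterov. Below is a minimal sketch, not taken from the monograph, of Nesterov's accelerated gradient method for a smooth convex function; the quadratic test problem and the step size 1/L are illustrative assumptions.

```python
# Minimal sketch of Nesterov's accelerated gradient method (momentum scheme).
# The quadratic objective and step size 1/L are illustrative choices.
import numpy as np

def nesterov_agd(grad, x0, L, iters=300):
    """Accelerated gradient descent with the standard t_k momentum sequence."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_next = y - grad(y) / L                         # gradient step at the extrapolated point
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + (t - 1.0) / t_next * (x_next - x)   # momentum (extrapolation) step
        x, t = x_next, t_next
    return x

# Example: ill-conditioned quadratic f(x) = 0.5 * x' A x, with L = lambda_max(A)
A = np.diag(np.linspace(1.0, 100.0, 50))
x_hat = nesterov_agd(lambda z: A @ z, np.ones(50), L=100.0)
```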
Related papers
- Incremental Quasi-Newton Methods with Faster Superlinear Convergence
Rates [50.36933471975506]
We consider the finite-sum optimization problem, where each component function is strongly convex and has Lipschitz continuous gradient and Hessian.
The recently proposed incremental quasi-Newton method is based on BFGS update and achieves a local superlinear convergence rate.
This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 update into the incremental framework.
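For reference, a minimal sketch of the symmetric rank-1 (SR1) secant update that the summary refers to; the incremental bookkeeping over component functions is omitted, and the skip tolerance is an illustrative safeguard rather than the paper's rule.

```python
# Minimal SR1 Hessian-approximation update (secant equation B s = y).
import numpy as np

def sr1_update(B, s, y, tol=1e-8):
    """Apply the symmetric rank-1 update to a Hessian approximation B.

    s : parameter displacement  x_{k+1} - x_k
    y : gradient displacement   grad f(x_{k+1}) - grad f(x_k)
    """
    r = y - B @ s                        # residual of the secant equation
    denom = r @ s
    if abs(denom) < tol * np.linalg.norm(r) * np.linalg.norm(s):
        return B                         # standard safeguard: skip ill-conditioned updates
    return B + np.outer(r, r) / denom    # rank-1 correction enforcing B_{k+1} s = y
```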
arXiv Detail & Related papers (2024-02-04T05:54:51Z) - Accelerating Cutting-Plane Algorithms via Reinforcement Learning
Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z) - Formal guarantees for heuristic optimization algorithms used in machine
learning [6.978625807687497]
Stochastic Gradient Descent (SGD) and its variants have become the dominant methods for large-scale optimization problems in machine learning (ML).
We provide formal guarantees for a few convex optimization methods and propose improved algorithms.
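As a point of reference, here is a minimal sketch of mini-batch SGD with a decaying step size and iterate averaging, the setting in which classical convex guarantees are usually stated; the least-squares objective and step-size schedule are illustrative assumptions, not the paper's setup.

```python
# Minimal mini-batch SGD with O(1/sqrt(t)) step size and Polyak-Ruppert averaging.
import numpy as np

rng = np.random.default_rng(0)
A, b = rng.standard_normal((1000, 20)), rng.standard_normal(1000)

def sgd_least_squares(A, b, batch=32, iters=2000, step0=0.1):
    x = np.zeros(A.shape[1])
    x_avg = np.zeros_like(x)
    for t in range(1, iters + 1):
        idx = rng.integers(0, A.shape[0], size=batch)     # sample a mini-batch
        g = A[idx].T @ (A[idx] @ x - b[idx]) / batch      # stochastic gradient
        x -= step0 / np.sqrt(t) * g                       # decaying step size
        x_avg += (x - x_avg) / t                          # running average of iterates
    return x_avg

x_hat = sgd_least_squares(A, b)
```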
arXiv Detail & Related papers (2022-07-31T19:41:22Z) - A Discrete Variational Derivation of Accelerated Methods in Optimization [68.8204255655161]
We introduce variational integrators which allow us to derive different methods for optimization.
We derive two families of optimization methods in one-to-one correspondence.
The preservation of symplecticity of autonomous systems occurs here solely on the fibers.
arXiv Detail & Related papers (2021-06-04T20:21:53Z) - Accelerated, Optimal, and Parallel: Some Results on Model-Based
Stochastic Optimization [33.71051480619541]
We extend the Approximate-Proximal Point (aProx) family of model-based methods for solving convex optimization problems.
We provide non-asymptotic convergence guarantees and an acceleration scheme for which we provide linear speedup in minibatch size.
We show improved convergence rates and matching lower bounds identifying new fundamental constants for "interpolation" problems.
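For illustration, a minimal sketch of one member of the model-based (aProx) family referenced above: the truncated-model update for nonnegative losses, which caps a Polyak-type step at the chosen step size. The example loss is an illustrative assumption, not the paper's experimental setup.

```python
# Minimal truncated-model (aProx-style) step for a nonnegative stochastic loss.
import numpy as np

def truncated_model_step(x, loss, grad, alpha):
    """One truncated-model update: step length is min(alpha, loss/||grad||^2)."""
    g = grad(x)
    gnorm2 = g @ g
    if gnorm2 == 0.0:
        return x
    step = min(alpha, loss(x) / gnorm2)   # truncation: never exceed the step size alpha
    return x - step * g

# Illustrative usage on a nonnegative quadratic loss.
x = truncated_model_step(np.zeros(5),
                         loss=lambda z: 0.5 * np.sum((z - 1.0) ** 2),
                         grad=lambda z: z - 1.0,
                         alpha=0.5)
```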
arXiv Detail & Related papers (2021-01-07T18:58:39Z) - A Unified Analysis of First-Order Methods for Smooth Games via Integral
Quadratic Constraints [10.578409461429626]
In this work, we adapt integral quadratic constraints (IQC) theory to first-order methods for smooth and strongly monotone games.
We provide, for the first time, a global convergence rate for the negative momentum method (NM) with a complexity of $\mathcal{O}(\kappa^{1.5})$, which matches its known lower bound.
We show that it is impossible for an algorithm with one step of memory to achieve acceleration if it only queries the gradient once per batch.
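To make the negative momentum idea concrete, here is a minimal sketch of alternating gradient updates with a negative momentum coefficient on a small strongly convex-concave quadratic game; the game, step size, and momentum value are illustrative assumptions, not the paper's analyzed setting or tuned parameters.

```python
# Minimal sketch: alternating gradient updates with negative momentum (beta < 0)
# on f(x, y) = (mu/2)||x||^2 + x' A y - (mu/2)||y||^2, where x minimizes and y maximizes.
import numpy as np

def negative_momentum_game(A, mu=0.1, steps=2000, eta=0.05, beta=-0.3):
    n = A.shape[0]
    x, y = np.ones(n), np.ones(n)
    x_prev, y_prev = x.copy(), y.copy()
    for _ in range(steps):
        gx = mu * x + A @ y                          # gradient for the min player
        x_new = x - eta * gx + beta * (x - x_prev)   # heavy-ball step, negative beta
        gy = A.T @ x_new - mu * y                    # gradient for the max player (alternating)
        y_new = y + eta * gy + beta * (y - y_prev)
        x_prev, y_prev, x, y = x, y, x_new, y_new
    return x, y

x_out, y_out = negative_momentum_game(np.array([[0.0, 1.0], [-1.0, 0.0]]))
```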
arXiv Detail & Related papers (2020-09-23T20:02:00Z) - IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method [64.15649345392822]
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.
Our approach consists of approximately solving a sequence of sub-problems induced by the accelerated augmented Lagrangian method.
When coupled with accelerated gradient descent, our framework yields a novel primal algorithm whose convergence rate is optimal and matched by recently derived lower bounds.
arXiv Detail & Related papers (2020-06-11T18:49:06Z) - Proximal Gradient Algorithm with Momentum and Flexible Parameter Restart
for Nonconvex Optimization [73.38702974136102]
Various types of parameter restart schemes have been proposed for accelerated algorithms to facilitate their practical convergence.
In this paper, we propose a proximal gradient algorithm with momentum and flexible parameter restart for solving nonsmooth nonconvex optimization problems.
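In the same spirit, here is a minimal sketch of an accelerated proximal gradient method with a simple function-value restart; the lasso objective, step size, and restart rule are illustrative assumptions rather than the paper's exact algorithm.

```python
# Minimal FISTA-style proximal gradient method with a function-value restart.
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_restart(A, b, lam, iters=500):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 with momentum and restarts."""
    L = np.linalg.norm(A, 2) ** 2                    # Lipschitz constant of the smooth part
    x = y = np.zeros(A.shape[1])
    t, f_prev = 1.0, np.inf
    for _ in range(iters):
        g = A.T @ (A @ y - b)
        x_next = soft_threshold(y - g / L, lam / L)  # proximal gradient step
        f = 0.5 * np.linalg.norm(A @ x_next - b) ** 2 + lam * np.abs(x_next).sum()
        if f > f_prev:                               # restart: drop the momentum sequence
            t, y = 1.0, x_next
        else:
            t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
            y = x_next + (t - 1.0) / t_next * (x_next - x)
            t = t_next
        x, f_prev = x_next, f
    return x
```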
arXiv Detail & Related papers (2020-02-26T16:06:27Z) - Average-case Acceleration Through Spectral Density Estimation [35.01931431231649]
We develop a framework for the average-case analysis of random quadratic problems.
We derive algorithms that are optimal under this analysis.
We develop explicit algorithms for the uniform, Marchenko-Pastur, and exponential distributions.
arXiv Detail & Related papers (2020-02-12T01:44:26Z)