Average-case Acceleration Through Spectral Density Estimation
- URL: http://arxiv.org/abs/2002.04756v6
- Date: Wed, 15 Dec 2021 13:18:31 GMT
- Title: Average-case Acceleration Through Spectral Density Estimation
- Authors: Fabian Pedregosa, Damien Scieur
- Abstract summary: We develop a framework for the average-case analysis of random quadratic problems.
We derive algorithms that are optimal under this analysis.
We develop explicit algorithms for the uniform, Marchenko-Pastur, and exponential distributions.
- Score: 35.01931431231649
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a framework for the average-case analysis of random quadratic
problems and derive algorithms that are optimal under this analysis. This
yields a new class of methods that achieve acceleration given a model of the
Hessian's eigenvalue distribution. We develop explicit algorithms for the
uniform, Marchenko-Pastur, and exponential distributions. These methods are
momentum-based algorithms, whose hyper-parameters can be estimated without
knowledge of the Hessian's smallest singular value, in contrast with classical
accelerated methods like Nesterov acceleration and Polyak momentum. Through
empirical benchmarks on quadratic and logistic regression problems, we identify
regimes in which the proposed methods improve over classical (worst-case)
accelerated methods.
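
The abstract's key idea is that momentum hyper-parameters can be set from a model of the Hessian's eigenvalue distribution rather than from its measured extreme eigenvalues. As a rough illustration (a minimal sketch, not the paper's average-case optimal method), the code below runs classical Polyak heavy-ball on a random least-squares problem, taking the step size and momentum from the Marchenko-Pastur support edges implied by the problem's aspect ratio; all dimensions and constants are arbitrary choices.

```python
# Minimal sketch: Polyak heavy-ball on a random quadratic, with step size and
# momentum set from a Marchenko-Pastur *model* of the Hessian spectrum instead
# of its exact extreme eigenvalues.
import numpy as np

n, d = 500, 250
rng = np.random.default_rng(0)
A = rng.standard_normal((n, d)) / np.sqrt(n)   # H = A^T A has an MP spectrum
b = rng.standard_normal(n)

# Marchenko-Pastur support edges for ratio r = d/n and unit variance:
# lambda_{min,max} = (1 -+ sqrt(r))^2. These stand in for mu and L.
r = d / n
mu, L = (1 - np.sqrt(r)) ** 2, (1 + np.sqrt(r)) ** 2

# Classical heavy-ball hyper-parameters computed from the modeled edges.
step = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
momentum = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2

x = np.zeros(d)
x_prev = x.copy()
for _ in range(200):
    grad = A.T @ (A @ x - b)                   # gradient of 0.5*||Ax - b||^2
    x, x_prev = x - step * grad + momentum * (x - x_prev), x
print(np.linalg.norm(A.T @ (A @ x - b)))       # gradient norm at the end
```

The paper's methods go further than this fixed-parameter scheme: they derive algorithms that are optimal on average under the assumed spectral density, rather than tuned to the worst case over the support edges.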
Related papers
- The Stochastic Conjugate Subgradient Algorithm For Kernel Support Vector Machines [1.738375118265695]
This paper proposes an innovative method specifically designed for kernel support vector machines (SVMs).
It not only achieves faster progress per iteration but also exhibits enhanced convergence compared to conventional stochastic first-order (SFO) techniques.
Our experimental results demonstrate that the proposed algorithm not only maintains but potentially exceeds the scalability of SFO methods.
arXiv Detail & Related papers (2024-07-30T17:03:19Z)
- A KL-based Analysis Framework with Applications to Non-Descent Optimization Methods [5.779838187603272]
We propose a novel analysis framework, based on the Kurdyka-Łojasiewicz property, for non-descent-type optimization methods.
arXiv Detail & Related papers (2024-06-04T12:49:46Z)
- Robust empirical risk minimization via Newton's method [9.797319790710711]
A new variant of Newton's method for empirical risk minimization is studied.
The gradient and Hessian of the objective function are replaced by robust estimators.
An algorithm for obtaining robust Newton directions based on the conjugate gradient method is also proposed.
arXiv Detail & Related papers (2023-01-30T18:54:54Z)
- Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
Using this framework, our analysis shows that these algorithms match the computational complexity of methods that have access to an unbiased estimate of the gradient.
arXiv Detail & Related papers (2021-11-29T15:10:09Z)
- A Discrete Variational Derivation of Accelerated Methods in Optimization [68.8204255655161]
We introduce variational integrators, which allow us to derive different methods for optimization.
We derive two families of optimization methods in one-to-one correspondence.
The preservation of symplecticity of autonomous systems occurs here solely on the fibers.
arXiv Detail & Related papers (2021-06-04T20:21:53Z)
- Acceleration Methods [57.202881673406324]
We first use quadratic optimization problems to introduce two key families of acceleration methods.
We discuss momentum methods in detail, starting with the seminal work of Nesterov.
We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates (a minimal restart sketch appears after this list).
arXiv Detail & Related papers (2021-01-23T17:58:25Z)
- Differentially Private Accelerated Optimization Algorithms [0.7874708385247353]
We present two classes of differentially private optimization algorithms.
The first algorithm is inspired by Polyak's heavy ball method.
The second class of algorithms is based on Nesterov's accelerated gradient method.
arXiv Detail & Related papers (2020-08-05T08:23:01Z)
- IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method [64.15649345392822]
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.
Our approach consists of approximately solving a sequence of sub-problems induced by the accelerated augmented Lagrangian method.
When coupled with accelerated gradient descent, our framework yields a novel primal algorithm whose convergence rate is optimal and matched by recently derived lower bounds.
arXiv Detail & Related papers (2020-06-11T18:49:06Z)
- Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs [71.26657499537366]
We propose a simple interpolation-based method for the efficient approximation of gradients in neural ODE models.
We compare it with the reverse dynamic method to train neural ODEs on classification, density estimation, and inference approximation tasks.
arXiv Detail & Related papers (2020-03-11T13:15:57Z)
- Active Model Estimation in Markov Decision Processes [108.46146218973189]
We study the problem of efficient exploration in order to learn an accurate model of an environment, modeled as a Markov decision process (MDP).
We show that our Markov-based algorithm outperforms both our original algorithm and the maximum entropy algorithm in the small sample regime.
arXiv Detail & Related papers (2020-03-06T16:17:24Z)
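
As a concrete companion to the restart discussion in the "Acceleration Methods" entry above, here is a minimal sketch of Nesterov's accelerated gradient with a function-value adaptive restart on an arbitrary random quadratic. The restart criterion (reset momentum whenever the objective increases) is one common heuristic, assumed here for illustration; this is not code from the survey.

```python
# Minimal sketch: Nesterov's accelerated gradient with a function-value
# adaptive restart. Momentum is reset whenever the objective increases,
# a standard heuristic for recovering fast rates without knowing the
# strong-convexity constant.
import numpy as np

rng = np.random.default_rng(1)
d = 100
H = rng.standard_normal((d, d))
H = H @ H.T / d + 0.01 * np.eye(d)       # positive-definite Hessian
b = rng.standard_normal(d)

f = lambda x: 0.5 * x @ H @ x - b @ x    # quadratic objective
grad = lambda x: H @ x - b
step = 1.0 / np.linalg.eigvalsh(H)[-1]   # step size 1/L (largest eigenvalue)

x = y = np.zeros(d)
t, f_prev = 1.0, np.inf
for _ in range(300):
    x_next = y - step * grad(y)          # gradient step from extrapolated point
    t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
    y = x_next + (t - 1) / t_next * (x_next - x)
    if f(x_next) > f_prev:               # restart: kill momentum on an increase
        y, t_next = x_next, 1.0
    x, t, f_prev = x_next, t_next, f(x_next)
print(np.linalg.norm(grad(x)))           # gradient norm at the end
```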