A Variational Perspective on High-Resolution ODEs
- URL: http://arxiv.org/abs/2311.02002v1
- Date: Fri, 3 Nov 2023 16:00:40 GMT
- Title: A Variational Perspective on High-Resolution ODEs
- Authors: Hoomaan Maskan, Konstantinos C. Zygalakis, Alp Yurtsever
- Abstract summary: We consider unconstrained minimization of smooth convex functions using a forced Euler-Lagrange equation.
We obtain a faster convergence rate for gradient norm minimization using Nesterov's accelerated gradient method.
- Score: 10.036727981085223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider unconstrained minimization of smooth convex functions. We propose
a novel variational perspective using a forced Euler-Lagrange equation that
allows for studying high-resolution ODEs. Through this, we obtain a faster
convergence rate for gradient norm minimization using Nesterov's accelerated
gradient method. Additionally, we show that Nesterov's method can be
interpreted as a rate-matching discretization of an appropriately chosen
high-resolution ODE. Finally, using the results from the new variational
perspective, we propose a stochastic method for noisy gradients. Several
numerical experiments compare and illustrate our stochastic algorithm with
state-of-the-art methods.
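For context, the sketch below shows the classical form of Nesterov's accelerated gradient (NAG) for smooth convex minimization, the iteration that the abstract interprets as a rate-matching discretization of a high-resolution ODE. The step size, the momentum schedule $k/(k+3)$, and the quadratic test problem are illustrative assumptions; this is a sketch of the standard method, not the paper's stochastic algorithm for noisy gradients.

```python
# Minimal sketch (not the paper's algorithm): classical Nesterov accelerated
# gradient for smooth convex minimization. Step size, momentum schedule, and
# the quadratic test problem are illustrative assumptions.
import numpy as np

def nesterov_agd(grad, x0, step, iters=500):
    """Run NAG with the standard k/(k+3) momentum schedule."""
    x_prev = x0.copy()
    y = x0.copy()
    for k in range(iters):
        x = y - step * grad(y)                   # gradient step at the look-ahead point
        y = x + (k / (k + 3.0)) * (x - x_prev)   # momentum extrapolation
        x_prev = x
    return x_prev

if __name__ == "__main__":
    # Test problem: f(x) = 0.5 * x^T A x with A symmetric positive semidefinite.
    rng = np.random.default_rng(0)
    M = rng.standard_normal((20, 20))
    A = M.T @ M / 20.0
    grad = lambda x: A @ x
    L = np.linalg.eigvalsh(A).max()              # smoothness constant of f
    x_out = nesterov_agd(grad, x0=rng.standard_normal(20), step=1.0 / L)
    print("final gradient norm:", np.linalg.norm(grad(x_out)))
```

The gradient norm printed at the end is the quantity whose convergence rate the abstract claims to improve through the high-resolution-ODE analysis.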
Related papers
- Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity [50.25258834153574]
We focus on the class of (strongly) convex $(L_0,L_1)$-smooth functions and derive new convergence guarantees for several existing methods.
In particular, we derive improved convergence rates for Gradient Descent with Smoothed Gradient Clipping and for Gradient Descent with Polyak Stepsizes.
arXiv Detail & Related papers (2024-09-23T13:11:37Z) - Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization [0.0]
We propose a novel algorithm that extends the methods of ball smoothing and Gaussian smoothing for noisy derivative-free optimization.
The algorithm dynamically adapts the shape of the smoothing kernel to approximate the Hessian of the objective function around a local optimum.
arXiv Detail & Related papers (2024-05-02T21:04:20Z) - Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right, by which we mean using specific insights from the optimisation and kernel communities, gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
arXiv Detail & Related papers (2023-10-31T16:15:13Z) - Understanding Accelerated Gradient Methods: Lyapunov Analyses and
Hamiltonian Assisted Interpretations [1.0152838128195465]
We formulate two classes of first-order algorithms, more general than those previously studied, for minimizing smooth and strongly convex functions.
We establish sufficient conditions, via new discrete Lyapunov analyses, for achieving accelerated convergence rates which match Nesterov's methods in the strongly and general convex settings.
We propose a novel class of discrete algorithms, called the Hamiltonian assisted gradient method, directly based on a Hamiltonian function and several interpretable operations.
arXiv Detail & Related papers (2023-04-20T03:03:30Z) - Faster One-Sample Stochastic Conditional Gradient Method for Composite
Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
arXiv Detail & Related papers (2022-02-26T19:10:48Z) - Nesterov Accelerated Shuffling Gradient Method for Convex Optimization [15.908060383231371]
We show that our algorithm has an improved rate of $\mathcal{O}(1/T)$ using unified shuffling schemes.
Our convergence analysis does not require a bounded domain assumption or a bounded gradient condition.
Numerical simulations demonstrate the efficiency of our algorithm.
arXiv Detail & Related papers (2022-02-07T21:23:17Z) - A Closed Loop Gradient Descent Algorithm applied to Rosenbrock's
function [0.0]
We introduce a novel adaptive damping technique for an inertial gradient system, which finds application as a gradient descent algorithm for unconstrained optimization.
Using Lyapunov stability analysis, we demonstrate the performance of the continuous-time version of the algorithm.
arXiv Detail & Related papers (2021-08-29T17:25:24Z) - Scalable Variational Gaussian Processes via Harmonic Kernel
Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z) - Quantized Variational Inference [6.09170287691728]
We show how Quantized Variational Inference produces variance-free gradients for ELBO optimization.
We show that using the Quantized Variational Inference framework leads to fast convergence for both score-function and reparameterized gradient estimators.
arXiv Detail & Related papers (2020-11-04T13:22:50Z) - IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method [64.15649345392822]
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.
Our approach consists of approximately solving a sequence of sub-problems induced by the accelerated augmented Lagrangian method.
When coupled with accelerated gradient descent, our framework yields a novel primal algorithm whose convergence rate is optimal and matched by recently derived lower bounds.
arXiv Detail & Related papers (2020-06-11T18:49:06Z) - Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs [71.26657499537366]
We propose a simple interpolation-based method for the efficient approximation of gradients in neural ODE models.
We compare it with the reverse dynamic method to train neural ODEs on classification, density estimation, and inference approximation tasks.
arXiv Detail & Related papers (2020-03-11T13:15:57Z)