A Variational Perspective on High-Resolution ODEs
- URL: http://arxiv.org/abs/2311.02002v1
- Date: Fri, 3 Nov 2023 16:00:40 GMT
- Title: A Variational Perspective on High-Resolution ODEs
- Authors: Hoomaan Maskan, Konstantinos C. Zygalakis, Alp Yurtsever
- Abstract summary: We consider unconstrained minimization of smooth convex functions via the forced Euler-Lagrange equation.
We obtain a faster convergence rate for gradient norm minimization using Nesterov's accelerated gradient method.
- Score: 10.036727981085223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider unconstrained minimization of smooth convex functions. We propose
a novel variational perspective using forced Euler-Lagrange equation that
allows for studying high-resolution ODEs. Through this, we obtain a faster
convergence rate for gradient norm minimization using Nesterov's accelerated
gradient method. Additionally, we show that Nesterov's method can be
interpreted as a rate-matching discretization of an appropriately chosen
high-resolution ODE. Finally, using the results from the new variational
perspective, we propose a stochastic method for noisy gradients. Several
numerical experiments compare and illustrate our stochastic algorithm with
state of the art methods.
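The abstract centers on Nesterov's accelerated gradient method. As a point of reference, here is the standard form of that method (the paper's rate-matching high-resolution discretization differs in its details); the quadratic test problem, step size, and iteration count are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def nesterov_agd(grad, x0, L, n_iters=500):
    """Nesterov's accelerated gradient method for an L-smooth convex function.

    Standard form: gradient step at the extrapolated point y_k, then a
    momentum update driven by the usual t_k sequence.
    """
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(n_iters):
        x_next = y - grad(y) / L                        # gradient step at y_k
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + (t - 1.0) / t_next * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x

# Illustrative quadratic f(x) = 0.5 * x^T A x, minimized at the origin.
A = np.diag([1.0, 2.0, 10.0])
grad = lambda x: A @ x
x_star = nesterov_agd(grad, np.ones(3), L=10.0)
```

On this quadratic the gradient norm at the final iterate is driven many orders of magnitude below its initial value, which is the quantity ("gradient norm minimization") whose rate the paper improves.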
Related papers
- Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
arXiv Detail & Related papers (2023-10-31T16:15:13Z)
- Variance reduction techniques for stochastic proximal point algorithms [5.374800961359305]
We propose the first unified study of variance reduction techniques for proximal point algorithms.
We introduce a generic proximal algorithm that can be specified to give the proximal version of SVRG, SAGA, and some of their variants for smooth and convex functions.
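The generic proximal scheme described above can be illustrated with a standard proximal-SVRG loop; this is a minimal sketch of the well-known SVRG variance-reduction pattern, not the paper's unified algorithm, and the lasso instance, step size, and epoch count are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_svrg(A, b, lam, step, n_epochs=30, seed=0):
    """Proximal-SVRG sketch for min_x (1/2n)*||Ax - b||^2 + lam*||x||_1.

    Each epoch recomputes the full gradient at a snapshot point, then takes
    variance-reduced stochastic proximal-gradient steps.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_epochs):
        snap = x.copy()
        full_grad = A.T @ (A @ snap - b) / n       # full gradient at the snapshot
        for _ in range(n):
            i = rng.integers(n)
            v = (A[i] * (A[i] @ x - b[i])          # per-sample gradient at x
                 - A[i] * (A[i] @ snap - b[i])     # minus same sample at snapshot
                 + full_grad)                      # plus full snapshot gradient
            x = soft_threshold(x - step * v, step * lam)
    return x

# Small noiseless lasso instance (illustrative sizes).
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10); x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true
x_hat = prox_svrg(A, b, lam=0.01, step=0.005)
```

The same loop yields a SAGA-style variant by replacing the snapshot gradients with a per-sample gradient table, which is the kind of specialization the paper's generic algorithm covers.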
arXiv Detail & Related papers (2023-08-18T05:11:50Z)
- First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities [91.46841922915418]
We present a unified approach for the theoretical analysis of first-order methods with Markovian noise.
Our approach covers both gradient methods for non-convex optimization and variational inequalities.
We provide bounds that match the oracle complexity in the case of strongly convex optimization problems.
arXiv Detail & Related papers (2023-05-25T11:11:31Z)
- Understanding Accelerated Gradient Methods: Lyapunov Analyses and Hamiltonian Assisted Interpretations [1.0152838128195465]
We formulate two classes of first-order algorithms more general than previously studied for minimizing smooth and strongly convex functions.
We establish sufficient conditions, via new discrete Lyapunov analyses, for achieving accelerated convergence rates which match Nesterov's methods in the strongly and general convex settings.
We propose a novel class of discrete algorithms, called the Hamiltonian assisted gradient method, directly based on a Hamiltonian function and several interpretable operations.
arXiv Detail & Related papers (2023-04-20T03:03:30Z)
- Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
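The conditional gradient (Frank-Wolfe) skeleton underlying the method above can be sketched as follows; this is the plain deterministic variant with a full gradient, and the paper's one-sample SAG estimator and composite non-smooth term are omitted. The l1-ball constraint and test problem are illustrative assumptions.

```python
import numpy as np

def frank_wolfe_l1(grad, x0, radius, n_iters=200):
    """Deterministic Frank-Wolfe (conditional gradient) over an l1 ball.

    The linear minimization oracle over {x : ||x||_1 <= radius} returns a
    signed coordinate vertex, so every iterate is a convex combination of
    feasible points and the method needs no projections.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(n_iters):
        g = grad(x)
        i = int(np.argmax(np.abs(g)))
        s = np.zeros_like(x)
        s[i] = -radius * np.sign(g[i])   # LMO: vertex minimizing <g, s>
        gamma = 2.0 / (k + 2.0)          # standard open-loop step size
        x = (1.0 - gamma) * x + gamma * s
    return x

# Least-squares objective f(x) = 0.5*||x - b||^2 over the l1 ball of radius 2.
b = np.array([0.5, -0.3, 0.2, 0.0, 0.0])
x_sol = frank_wolfe_l1(lambda x: x - b, np.zeros(5), radius=2.0)
```

Because the update is a convex combination, feasibility is maintained by construction; the stochastic variant in the paper replaces `grad(x)` with a running average of per-sample gradients.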
arXiv Detail & Related papers (2022-02-26T19:10:48Z)
- Nesterov Accelerated Shuffling Gradient Method for Convex Optimization [15.908060383231371]
We show that our algorithm has an improved rate of $\mathcal{O}(1/T)$ using unified shuffling schemes.
Our convergence analysis does not require an assumption on bounded domain or a bounded gradient condition.
Numerical simulations demonstrate the efficiency of our algorithm.
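The combination of shuffling and Nesterov-style momentum described above can be sketched schematically; the constant momentum coefficient, step size, and finite-sum test problem below are illustrative assumptions, not the paper's exact accelerated scheme.

```python
import numpy as np

def shuffling_nag(grads, x0, step, n_epochs=50, momentum=0.5, seed=0):
    """Shuffling gradient sketch with Nesterov-style momentum.

    Each epoch visits every component gradient exactly once in a freshly
    shuffled order (random reshuffling) rather than sampling i.i.d.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(n_epochs):
        for i in rng.permutation(len(grads)):  # one shuffled pass over the data
            g = grads[i](x + momentum * v)     # look-ahead (Nesterov) gradient
            v = momentum * v - step * g
            x = x + v
    return x

# Finite sum of quadratics f_i(x) = 0.5*||x - c_i||^2; the minimizer of the
# average is the mean of the centers c_i.
rng = np.random.default_rng(2)
C = rng.standard_normal((10, 3))
grads = [(lambda x, c=c: x - c) for c in C]
x_hat = shuffling_nag(grads, 5.0 * np.ones(3), step=0.05)
```

The key structural point matching the abstract is that each epoch is a full permuted pass over the components, which is what allows the analysis to avoid bounded-domain and bounded-gradient assumptions.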
arXiv Detail & Related papers (2022-02-07T21:23:17Z)
- A Closed Loop Gradient Descent Algorithm applied to Rosenbrock's function [0.0]
We introduce a novel adaptive technique for a gradient system, which finds application as a gradient descent algorithm with inertial damping for unconstrained optimization.
Also, using Lyapunov stability analysis, we demonstrate the performance of the continuous-time version of the algorithm.
arXiv Detail & Related papers (2021-08-29T17:25:24Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- Quantized Variational Inference [6.09170287691728]
We show how Quantized Variational Inference produces variance-free gradients for ELBO optimization.
We show that the Quantized Variational Inference framework leads to fast convergence for both score-function and reparameterized gradient estimators.
arXiv Detail & Related papers (2020-11-04T13:22:50Z)
- IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method [64.15649345392822]
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.
Our approach consists of approximately solving a sequence of sub-problems induced by the accelerated augmented Lagrangian method.
When coupled with accelerated gradient descent, our framework yields a novel primal algorithm whose convergence rate is optimal and matched by recently derived lower bounds.
arXiv Detail & Related papers (2020-06-11T18:49:06Z)
- Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs [71.26657499537366]
We propose a simple interpolation-based method for the efficient approximation of gradients in neural ODE models.
We compare it with the reverse dynamic method to train neural ODEs on classification, density estimation, and inference approximation tasks.
arXiv Detail & Related papers (2020-03-11T13:15:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.