A Discrete Variational Derivation of Accelerated Methods in Optimization
- URL: http://arxiv.org/abs/2106.02700v1
- Date: Fri, 4 Jun 2021 20:21:53 GMT
- Title: A Discrete Variational Derivation of Accelerated Methods in Optimization
- Authors: Cédric M. Campos, Alejandro Mahillo, David Martín de Diego
- Abstract summary: We introduce variational integrators which allow us to derive different methods for optimization.
We derive two families of optimization methods in one-to-one correspondence.
The preservation of symplecticity of autonomous systems occurs here solely on the fibers.
- Score: 68.8204255655161
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many of the new developments in machine learning are connected with
gradient-based optimization methods. Recently, these methods have been studied
using a variational perspective. This has opened up the possibility of
introducing variational and symplectic integration methods using geometric
integrators. In particular, in this paper, we introduce variational integrators
which allow us to derive different methods for optimization. Using both
Hamilton's principle and the Lagrange-d'Alembert principle, we derive two
families of optimization methods in one-to-one correspondence that generalize
Polyak's heavy ball and the well-known Nesterov accelerated gradient method,
mimicking the behavior of the latter, which reduces the oscillations of typical
momentum
methods. However, since the systems considered are explicitly time-dependent,
the preservation of symplecticity of autonomous systems occurs here solely on
the fibers. Several experiments exemplify the result.
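For reference, the two classical schemes that these variational families generalize can be written in a few lines. The following is a minimal NumPy sketch of the textbook Polyak heavy ball and Nesterov accelerated gradient iterations; the step size, momentum value and quadratic test problem are illustrative choices, not taken from the paper.
```python
import numpy as np

def heavy_ball(grad_f, x0, lr=0.01, mu=0.9, iters=1000):
    """Polyak's heavy ball: a gradient step plus momentum from the previous move."""
    x, x_prev = x0.copy(), x0.copy()
    for _ in range(iters):
        x, x_prev = x - lr * grad_f(x) + mu * (x - x_prev), x
    return x

def nesterov(grad_f, x0, lr=0.01, mu=0.9, iters=1000):
    """Nesterov accelerated gradient: the gradient is evaluated at a
    look-ahead point, which damps the oscillations of plain momentum."""
    x, x_prev = x0.copy(), x0.copy()
    for _ in range(iters):
        y = x + mu * (x - x_prev)           # extrapolated (look-ahead) point
        x, x_prev = y - lr * grad_f(y), x
    return x

# illustrative ill-conditioned quadratic, f(x) = 0.5 * x^T A x
A = np.diag([1.0, 50.0])
grad = lambda x: A @ x
print(heavy_ball(grad, np.array([5.0, 5.0])))
print(nesterov(grad, np.array([5.0, 5.0])))
```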
Related papers
- Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
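As a rough illustration of the dual view this line of work exploits, the sketch below runs plain gradient descent on the standard Gaussian process regression dual (kernel ridge) objective 0.5 a^T (K + s^2 I) a - a^T y. The kernel, data, step size and iteration count are hypothetical, and the actual stochastic dual descent algorithm adds stochastic gradients, momentum and iterate averaging on top of this template.
```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                     # hypothetical inputs
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)  # hypothetical targets

K = rbf_kernel(X, X)
sigma2 = 0.01                                     # assumed noise variance
alpha = np.zeros(200)
lr = 1e-3
for _ in range(5000):
    grad = K @ alpha + sigma2 * alpha - y         # gradient of the dual objective
    alpha -= lr * grad

mean_at_train = K @ alpha                         # GP posterior mean at the inputs
```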
arXiv Detail & Related papers (2023-10-31T16:15:13Z) - Sublinear Convergence Rates of Extragradient-Type Methods: A Survey on
Classical and Recent Developments [12.995632804090198]
The extragradient (EG) is a well-known method to approximate solutions of saddle-point problems.
Recently, these methods have gained popularity due to new applications in machine learning and robust optimization.
We provide a unified convergence analysis for different classes of algorithms, with an emphasis on sublinear best-iterate and last-iterate convergence rates.
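For concreteness, below is a minimal NumPy sketch of the basic extragradient step for a convex-concave problem min_x max_y L(x, y): take a trial (extrapolation) step, then update from the original point using the gradient evaluated at the trial point. The bilinear test problem L(x, y) = x*y and the step size are illustrative; the survey covers many refinements of this template.
```python
import numpy as np

def extragradient(grad_x, grad_y, x0, y0, eta=0.1, iters=1000):
    x, y = x0.copy(), y0.copy()
    for _ in range(iters):
        xt = x - eta * grad_x(x, y)   # extrapolation (trial) step
        yt = y + eta * grad_y(x, y)
        x = x - eta * grad_x(xt, yt)  # update using the trial-point gradient
        y = y + eta * grad_y(xt, yt)
    return x, y

# bilinear example L(x, y) = x * y, where plain gradient descent-ascent diverges
x, y = extragradient(lambda x, y: y, lambda x, y: x,
                     np.array([1.0]), np.array([1.0]))
print(x, y)  # both approach the saddle point at 0
```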
arXiv Detail & Related papers (2023-03-30T07:04:22Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth parts.
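The classical special case of this linearization idea is the conditional-gradient (Frank-Wolfe) method, where only the smooth objective is linearized and the linear model is minimized over the compact feasible set. Below is a minimal sketch over the probability simplex with an illustrative quadratic objective; the paper's algorithms extend this template to fully composite problems.
```python
import numpy as np

def frank_wolfe_simplex(grad_f, x0, iters=500):
    """Conditional-gradient sketch: minimize the linearized model over the
    probability simplex (attained at a vertex), then take a convex combination."""
    x = x0.copy()
    for k in range(iters):
        g = grad_f(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0          # vertex minimizing the linear model
        step = 2.0 / (k + 2.0)         # standard Frank-Wolfe step size
        x = (1 - step) * x + step * s
    return x

# illustrative objective: minimize ||x - b||^2 over the simplex
b = np.array([0.1, 0.7, 0.2])
print(frank_wolfe_simplex(lambda x: 2 * (x - b), np.ones(3) / 3))
```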
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Optimization on manifolds: A symplectic approach [127.54402681305629]
We propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems.
Our class of (accelerated) algorithms is not only simple and efficient but also applicable to a broad range of contexts.
arXiv Detail & Related papers (2021-07-23T13:43:34Z) - Monotonic Alpha-divergence Minimisation for Variational Inference [0.0]
We introduce a novel family of iterative algorithms which carry out $\alpha$-divergence minimisation in a Variational Inference context.
They do so by ensuring a systematic decrease at each step in the $\alpha$-divergence between the variational and the posterior distributions.
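For orientation, one common parameterization of the quantity being decreased (for normalized densities and $\alpha \notin \{0, 1\}$) is $D_\alpha(p \| q) = (1 - E_q[(p/q)^\alpha]) / (\alpha(1-\alpha))$. The sketch below is a plain Monte Carlo estimator of that divergence for two illustrative Gaussians; it is not the paper's descent algorithm, only the objective such algorithms decrease.
```python
import numpy as np
from scipy.stats import norm

def alpha_divergence_mc(alpha, logp, logq, sample_q, n=100_000, seed=0):
    """Monte Carlo estimate of D_alpha(p || q) for normalized densities,
    using samples from q: (1 - E_q[(p/q)^alpha]) / (alpha * (1 - alpha))."""
    rng = np.random.default_rng(seed)
    x = sample_q(rng, n)
    log_ratio = logp(x) - logq(x)
    return (1.0 - np.exp(alpha * log_ratio).mean()) / (alpha * (1.0 - alpha))

# illustrative example: p = N(0, 1), q = N(0.5, 1.2^2)
p, q = norm(0.0, 1.0), norm(0.5, 1.2)
print(alpha_divergence_mc(0.5, p.logpdf, q.logpdf,
                          lambda rng, n: rng.normal(0.5, 1.2, size=n)))
```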
arXiv Detail & Related papers (2021-03-09T19:41:03Z) - Acceleration Methods [57.202881673406324]
We first use quadratic optimization problems to introduce two key families of acceleration methods.
We discuss momentum methods in detail, starting with the seminal work of Nesterov.
We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates.
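As a small illustration of the restart idea mentioned here, the sketch below adds a gradient-based adaptive restart (in the spirit of O'Donoghue and Candès) to a constant-momentum Nesterov-style loop: whenever the momentum step opposes the gradient, the accumulated momentum is dropped. The step size, momentum value and quadratic test problem are illustrative choices.
```python
import numpy as np

def nesterov_with_restart(grad_f, x0, lr, mu=0.9, iters=500):
    x, x_prev = x0.copy(), x0.copy()
    for _ in range(iters):
        y = x + mu * (x - x_prev)          # look-ahead point
        g = grad_f(y)
        x_next = y - lr * g
        if g @ (x_next - x) > 0:           # momentum no longer making progress
            x_next = x - lr * grad_f(x)    # take a plain gradient step instead
            x_prev = x_next                # zero out the momentum ("restart")
        else:
            x_prev = x
        x = x_next
    return x

# illustrative ill-conditioned quadratic
A = np.diag([1.0, 100.0])
print(nesterov_with_restart(lambda x: A @ x, np.array([10.0, 10.0]), lr=0.01))
```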
arXiv Detail & Related papers (2021-01-23T17:58:25Z) - On dissipative symplectic integration with applications to
gradient-based optimization [77.34726150561087]
We propose a geometric framework in which discretizations can be realized systematically.
We show that a generalization of symplectic integrators to nonconservative and, in particular, dissipative Hamiltonian systems is able to preserve rates of convergence up to a controlled error.
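A minimal sketch of what such a dissipative (conformal symplectic) discretization can look like for H(q, p) = ||p||^2 / 2 + f(q) with linear friction gamma is given below: the momentum is damped by the exact factor exp(-gamma*h), then a symplectic-Euler kick-drift step is taken. The scheme, step size and test problem are illustrative choices, not the specific integrators constructed in the paper.
```python
import numpy as np

def conformal_symplectic_euler(grad_f, q0, h=0.01, gamma=1.0, steps=2000):
    """Dissipative Euler step for H(q, p) = 0.5 ||p||^2 + f(q) with friction:
    damp the momentum exactly, then kick with -grad f and drift."""
    q = q0.copy()
    p = np.zeros_like(q0)
    for _ in range(steps):
        p = np.exp(-gamma * h) * p - h * grad_f(q)  # damp, then kick
        q = q + h * p                               # drift
    return q

# quadratic example: the iterates approach the minimizer at the origin
A = np.diag([1.0, 10.0])
print(conformal_symplectic_euler(lambda q: A @ q, np.array([3.0, -2.0])))
```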
arXiv Detail & Related papers (2020-04-15T00:36:49Z) - Sparse Orthogonal Variational Inference for Gaussian Processes [34.476453597078894]
We introduce a new interpretation of sparse variational approximations for Gaussian processes using inducing points.
We show that this formulation recovers existing approximations and at the same time allows us to obtain tighter lower bounds on the marginal likelihood and new variational inference algorithms.
arXiv Detail & Related papers (2019-10-23T15:01:28Z)