On dissipative symplectic integration with applications to
gradient-based optimization
- URL: http://arxiv.org/abs/2004.06840v4
- Date: Wed, 28 Apr 2021 17:24:58 GMT
- Title: On dissipative symplectic integration with applications to
gradient-based optimization
- Authors: Guilherme França, Michael I. Jordan, René Vidal
- Abstract summary: We propose a geometric framework in which discretizations can be realized systematically.
We show that a generalization of symplectic to nonconservative and in particular dissipative Hamiltonian systems is able to preserve rates of convergence up to a controlled error.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, continuous-time dynamical systems have proved useful in providing
conceptual and quantitative insights into gradient-based optimization, widely
used in modern machine learning and statistics. An important question that
arises in this line of work is how to discretize the system in such a way that
its stability and rates of convergence are preserved. In this paper we propose
a geometric framework in which such discretizations can be realized
systematically, enabling the derivation of "rate-matching" algorithms without
the need for a discrete convergence analysis. More specifically, we show that a
generalization of symplectic integrators to nonconservative and in particular
dissipative Hamiltonian systems is able to preserve rates of convergence up to
a controlled error. Moreover, such methods preserve a shadow Hamiltonian
despite the absence of a conservation law, extending key results of symplectic
integrators to nonconservative cases. Our arguments rely on a combination of
backward error analysis with fundamental results from symplectic geometry. We
stress that although the original motivation for this work was the application
to optimization, where dissipative systems play a natural role, they are fully
general and not only provide a differential geometric framework for dissipative
Hamiltonian systems but also substantially extend the theory of
structure-preserving integration.
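As a concrete illustration of the kind of dissipative structure-preserving discretization the abstract describes, the following is a minimal sketch of a conformal symplectic Euler step for the damped Hamiltonian system q' = p, p' = -γp - ∇f(q). The quadratic objective, step size, and damping coefficient are illustrative assumptions, not values taken from the paper.

```python
import math

def grad_f(q):
    # Gradient of the illustrative quadratic objective f(q) = 0.5 * q**2.
    return q

def conformal_symplectic_euler(q, p, h, gamma, steps):
    """Integrate q' = p, p' = -gamma*p - grad_f(q) with a conformal
    symplectic Euler scheme: the momentum is damped exactly by the
    factor exp(-gamma*h), then kicked by the gradient, and the position
    is drifted with the updated momentum."""
    for _ in range(steps):
        p = math.exp(-gamma * h) * p - h * grad_f(q)
        q = q + h * p
    return q, p

q, p = conformal_symplectic_euler(q=1.0, p=0.0, h=0.1, gamma=1.0, steps=500)
# The dissipative dynamics drive (q, p) toward the minimizer at (0, 0).
```

The exact damping of the momentum (rather than a naive Euler treatment of the friction term) is what preserves the conformal symplectic structure of the dissipative flow, which is the discrete-level property the paper connects to preserved convergence rates.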
Related papers
- Structure-preserving quantum algorithms for linear and nonlinear Hamiltonian systems [0.0]
Hamiltonian systems of ordinary and partial differential equations are fundamental across modern science and engineering.
A critical property for the robustness and stability of computational methods in such systems is the symplectic structure.
We present quantum algorithms that incorporate symplectic integrators, ensuring the preservation of this key structure.
arXiv Detail & Related papers (2024-11-06T01:36:39Z) - Structure-Preserving Learning Using Gaussian Processes and Variational
Integrators [62.31425348954686]
We propose the combination of a variational integrator for the nominal dynamics of a mechanical system and learning residual dynamics with Gaussian process regression.
We extend our approach to systems with known kinematic constraints and provide formal bounds on the prediction uncertainty.
arXiv Detail & Related papers (2021-12-10T11:09:29Z) - On Convergence of Training Loss Without Reaching Stationary Points [62.41370821014218]
We show that neural network weight variables do not converge to stationary points, where the gradient of the loss function vanishes.
We propose a new perspective based on the ergodic theory of dynamical systems.
arXiv Detail & Related papers (2021-10-12T18:12:23Z) - Optimization on manifolds: A symplectic approach [127.54402681305629]
We propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems.
Our class of (accelerated) algorithms are not only simple and efficient but also applicable to a broad range of contexts.
arXiv Detail & Related papers (2021-07-23T13:43:34Z) - Towards Understanding Generalization via Decomposing Excess Risk
Dynamics [13.4379473119565]
We analyze the generalization dynamics to derive algorithm-dependent bounds, e.g., stability.
Inspired by the observation that neural networks show a slow convergence rate when fitting noise, we propose decomposing the excess risk dynamics.
Under the decomposition framework, the new bound accords better with the theoretical and empirical evidence compared to the stability-based bound and uniform convergence bound.
arXiv Detail & Related papers (2021-06-11T03:42:45Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded by the 'complexity' of the fractal structure that underlies its generalization measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - Symplectic Gaussian Process Dynamics [5.171909600633905]
We introduce a sparse Gaussian-process-based variational inference scheme that is able to discretize the underlying system with any explicit or implicit, single- or multi-step integrator.
In particular, we discuss Hamiltonian problems coupled with symplectic integrators, producing volume-preserving predictions.
arXiv Detail & Related papers (2021-02-02T17:02:55Z) - A Dynamical Systems Approach for Convergence of the Bayesian EM
Algorithm [59.99439951055238]
We show how (discrete-time) Lyapunov stability theory can serve as a powerful tool to aid, or even lead, in the analysis (and potential design) of optimization algorithms that are not necessarily gradient-based.
The particular ML problem that this paper focuses on is that of parameter estimation in an incomplete-data Bayesian framework via the popular optimization algorithm known as maximum a posteriori expectation-maximization (MAP-EM).
We show that fast convergence (linear or quadratic) is achieved, which could have been difficult to unveil without our adopted S&C approach.
arXiv Detail & Related papers (2020-06-23T01:34:18Z)
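Several of the papers above rely on symplectic integrators, whose defining properties are exact preservation of phase-space volume and near-conservation of a shadow Hamiltonian. As a minimal sketch (the harmonic-oscillator Hamiltonian and the step size are illustrative assumptions), a Störmer-Verlet (leapfrog) step looks like this:

```python
def leapfrog(q, p, h, steps):
    """Stormer-Verlet (leapfrog) integrator for H(q, p) = 0.5*p**2 + 0.5*q**2.
    Each step is symplectic, so phase-space volume is preserved exactly and
    the energy error stays bounded over arbitrarily long integrations."""
    for _ in range(steps):
        p -= 0.5 * h * q   # half kick:  p' = -dH/dq = -q
        q += h * p         # full drift: q' =  dH/dp =  p
        p -= 0.5 * h * q   # half kick
    return q, p

def energy(q, p):
    # The conserved Hamiltonian of the test system.
    return 0.5 * p * p + 0.5 * q * q

q0, p0 = 1.0, 0.0
q, p = leapfrog(q0, p0, h=0.1, steps=1000)
# The energy drift remains small even after many steps, reflecting
# near-conservation of a shadow Hamiltonian close to H.
```

In contrast, a non-symplectic scheme such as explicit Euler would see the energy grow without bound on the same problem, which is exactly the failure mode that structure-preserving discretization avoids.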
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.