Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time
- URL: http://arxiv.org/abs/2505.12553v1
- Date: Sun, 18 May 2025 21:45:59 GMT
- Title: Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time
- Authors: Qiang Fu, Andre Wibisono
- Abstract summary: We study the Hamiltonian flow for optimization (HF-opt), which simulates the Hamiltonian dynamics for some integration time. We show that by randomizing the integration time in HF-opt, the resulting randomized Hamiltonian flow (RHF) achieves accelerated convergence rates in continuous time.
- Score: 16.347507752603132
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the Hamiltonian flow for optimization (HF-opt), which simulates the Hamiltonian dynamics for some integration time and resets the velocity to $0$ to decrease the objective function; this is the optimization analogue of the Hamiltonian Monte Carlo algorithm for sampling. For short integration time, HF-opt has the same convergence rates as gradient descent for minimizing strongly and weakly convex functions. We show that by randomizing the integration time in HF-opt, the resulting randomized Hamiltonian flow (RHF) achieves accelerated convergence rates in continuous time, similar to the rates for the accelerated gradient flow. We study a discrete-time implementation of RHF as the randomized Hamiltonian gradient descent (RHGD) algorithm. We prove that RHGD achieves the same accelerated convergence rates as Nesterov's accelerated gradient descent (AGD) for minimizing smooth strongly and weakly convex functions. We provide numerical experiments to demonstrate that RHGD is competitive with classical accelerated methods such as AGD across all settings and outperforms them in certain regimes.
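Below is a minimal sketch of the randomized Hamiltonian gradient descent (RHGD) idea described in the abstract: each outer iteration resets the velocity to zero, draws a random number of integration steps, and simulates the Hamiltonian dynamics for H(x, v) = f(x) + ||v||^2/2. The leapfrog integrator, the uniform distribution over {1, ..., K_max}, and the fixed step size eta are illustrative assumptions; the paper's exact discretization and randomization scheme may differ.
```python
import numpy as np

def rhgd(grad_f, x0, eta, K_max, n_iters, rng=None):
    """Sketch of randomized Hamiltonian gradient descent (RHGD).

    Each outer iteration resets the velocity to zero, draws a random
    number of leapfrog steps, and simulates the Hamiltonian dynamics
    for H(x, v) = f(x) + ||v||^2 / 2.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iters):
        v = np.zeros_like(x)              # velocity reset to 0
        K = rng.integers(1, K_max + 1)    # randomized integration time
        g = grad_f(x)
        for _ in range(K):                # leapfrog integration
            v -= 0.5 * eta * g
            x += eta * v
            g = grad_f(x)
            v -= 0.5 * eta * g
    return x

# Toy example: minimize the quadratic f(x) = 0.5 * x^T A x
A = np.diag([1.0, 10.0])
x_min = rhgd(lambda x: A @ x, x0=np.ones(2), eta=0.1, K_max=10, n_iters=200)
```
Resetting the velocity to zero at each iteration is what distinguishes this optimization analogue from Hamiltonian Monte Carlo, which would instead refresh the velocity from a Gaussian.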
Related papers
- Least-Squares-Embedded Optimization for Accelerated Convergence of PINNs in Acoustic Wavefield Simulations [2.8948274245812327]
PINNs have shown promise in solving partial differential equations. For scattered acoustic wavefield simulation based on the Helmholtz equation, we derive a hybrid optimization framework. This framework accelerates training convergence by embedding a least-squares (LS) solver directly into the GD loss function.
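As a rough illustration of the LS-embedded idea (on a toy curve-fitting problem rather than a Helmholtz residual), the sketch below eliminates the linear output weights with an exact least-squares solve inside the loss, so gradient descent only acts on the nonlinear parameters. The names (`features`, `hybrid_loss`) and the finite-difference gradient are illustrative assumptions, not the paper's formulation.
```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 64)[:, None]          # collocation points
u_target = np.sin(4.0 * np.pi * x).ravel()      # toy "wavefield" to fit

def features(x, theta):
    # Nonlinear feature map with trainable frequencies theta
    # (a toy stand-in for the hidden layers of a PINN).
    return np.sin(x * theta)                     # shape (n_points, n_features)

def hybrid_loss(theta):
    # LS-embedded loss: the linear output weights c are eliminated by an
    # exact least-squares solve, so gradient descent only sees theta.
    Phi = features(x, theta)
    c, *_ = np.linalg.lstsq(Phi, u_target, rcond=None)
    r = Phi @ c - u_target
    return 0.5 * np.mean(r ** 2)

theta = rng.uniform(1.0, 20.0, size=(1, 8))
lr, eps = 1e-2, 1e-6
for _ in range(200):                             # plain GD on theta only
    grad = np.zeros_like(theta)
    for i in range(theta.shape[1]):              # finite-difference gradient
        d = np.zeros_like(theta)
        d[0, i] = eps
        grad[0, i] = (hybrid_loss(theta + d) - hybrid_loss(theta - d)) / (2 * eps)
    theta -= lr * grad
```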
arXiv Detail & Related papers (2025-04-23T09:32:14Z) - Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}\!\left(\ln(T) / T^{1 - \frac{1}{\alpha}}\right)$.
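For concreteness, here is a hedged sketch of what a server-side AdaGrad step on an over-the-air aggregated gradient could look like; the additive-noise model for the analog channel and all parameter names are assumptions, not the paper's exact system model.
```python
import numpy as np

def federated_adagrad_step(x, client_grads, state, lr=0.1, eps=1e-8,
                           noise_std=0.01, rng=np.random.default_rng()):
    """One server-side AdaGrad update on an over-the-air aggregated gradient:
    the average of the client gradients arrives with additive channel noise
    (noise_std is an illustrative stand-in for analog-aggregation distortion),
    then a standard AdaGrad step is applied."""
    g = np.mean(client_grads, axis=0) + noise_std * rng.standard_normal(x.shape)
    state["G"] = state.get("G", np.zeros_like(x)) + g ** 2   # accumulated squared gradients
    x_new = x - lr * g / (np.sqrt(state["G"]) + eps)
    return x_new, state
```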
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
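A hedged sketch of a symmetric (simultaneous) particle update for an entropy-regularized minimax problem over distributions is given below, using a toy bilinear payoff f(x, y) = x·y; the payoff, noise scale, and step size are illustrative, not the functionals analyzed in the paper.
```python
import numpy as np

def symmetric_mfl_step(xs, ys, eta=0.05, lam=0.1, rng=np.random.default_rng()):
    """One symmetric (simultaneous) mean-field Langevin update for a toy
    entropy-regularized minimax problem over particle distributions with
    bilinear payoff f(x, y) = x * y."""
    gx = ys.mean()                       # d/dx of mean_j f(x, y_j)
    gy = xs.mean()                       # d/dy of mean_i f(x_i, y)
    noise = np.sqrt(2.0 * eta * lam)
    xs_new = xs - eta * gx + noise * rng.standard_normal(xs.shape)   # descent in x
    ys_new = ys + eta * gy + noise * rng.standard_normal(ys.shape)   # ascent in y
    return xs_new, ys_new
```
Both populations are updated from the same old iterates, which is what makes the update symmetric rather than alternating.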
arXiv Detail & Related papers (2023-12-02T13:01:29Z) - Self-Tuning Hamiltonian Monte Carlo for Accelerated Sampling [12.163119957680802]
Hamiltonian Monte Carlo simulations crucially depend on the integration timestep and the number of integration steps.
We present an adaptive general-purpose framework to automatically tune such parameters.
We show that a good correspondence between loss and autocorrelation time can be established.
arXiv Detail & Related papers (2023-09-24T09:35:25Z) - Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
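To make the distribution-dependent drift concrete, here is a hedged finite-particle Euler step for a toy functional F(mu) = E[|X|^2]/2 + E[|X - X'|^2]/4, whose first-variation gradient at a particle x is x + (x - mean of the particles); the functional, step size, and entropy weight lam are illustrative choices, not the setting of the paper.
```python
import numpy as np

def mfld_particle_step(X, eta=0.01, lam=0.1, rng=np.random.default_rng()):
    """One Euler step of finite-particle mean-field Langevin dynamics for the
    toy functional above. X has shape (n_particles, dim); the drift depends on
    the empirical distribution through the particle mean."""
    drift = X + (X - X.mean(axis=0, keepdims=True))   # distribution-dependent drift
    noise = np.sqrt(2.0 * eta * lam) * rng.standard_normal(X.shape)
    return X - eta * drift + noise
```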
arXiv Detail & Related papers (2023-06-12T16:28:11Z) - Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization [108.35402316802765]
We propose a new first-order optimization algorithm -- AcceleratedGradient-OptimisticGradient (AG-OG) Ascent.
We show that AG-OG achieves the optimal convergence rate (up to a constant) for a variety of settings.
We further extend our algorithm to achieve the optimal convergence rate in both bi-SC-SC and bi-C-SC settings.
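The sketch below shows only the optimistic-gradient ingredient (the 2g_t - g_{t-1} extrapolation) on a toy bilinear saddle problem; AG-OG's coupling of this update with Nesterov acceleration follows the paper and is not reproduced here.
```python
import numpy as np

def optimistic_gda(grad_x, grad_y, x, y, eta=0.1, n_iters=100):
    """Optimistic gradient descent-ascent: each player moves along the
    extrapolated gradient 2*g_t - g_{t-1}."""
    gx_prev, gy_prev = grad_x(x, y), grad_y(x, y)
    for _ in range(n_iters):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x = x - eta * (2 * gx - gx_prev)     # descent on the min player
        y = y + eta * (2 * gy - gy_prev)     # ascent on the max player
        gx_prev, gy_prev = gx, gy
    return x, y

# Toy bilinear saddle problem f(x, y) = x^T y
x, y = optimistic_gda(lambda x, y: y, lambda x, y: x,
                      np.ones(2), np.ones(2), eta=0.1, n_iters=500)
```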
arXiv Detail & Related papers (2022-10-31T17:59:29Z) - Losing momentum in continuous-time stochastic optimisation [42.617042045455506]
Momentum-based optimisation algorithms have become particularly widespread.
In this work, we analyse a continuous-time model for gradient descent with momentum.
We also train a convolutional neural network on an image classification problem.
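As a reference point for such continuous-time models, a generic damped (heavy-ball) ODE with a time-dependent friction coefficient is sketched below; the particular schedule for $\gamma(t)$ and the stochastic forcing analysed in the paper are not specified here, so the form shown is only an illustrative assumption.
```latex
% Illustrative continuous-time momentum model (assumed form, not
% necessarily the paper's exact dynamics): a heavy-ball ODE in which
% the friction gamma(t) grows over time, so momentum is gradually lost
% and the dynamics approach the plain gradient flow.
\[
  \dot{X}_t = V_t, \qquad
  \dot{V}_t = -\gamma(t)\, V_t - \nabla f(X_t), \qquad
  \gamma(t) \nearrow \infty .
\]
```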
arXiv Detail & Related papers (2022-09-08T10:46:05Z) - Nesterov Accelerated ADMM for Fast Diffeomorphic Image Registration [63.15453821022452]
Recent developments in approaches based on deep learning have achieved sub-second runtimes for DiffIR.
We propose a simple iterative scheme that functionally composes intermediate non-stationary velocity fields.
We then propose a convex optimisation model that uses a regularisation term of arbitrary order to impose smoothness on these velocity fields.
arXiv Detail & Related papers (2021-09-26T19:56:45Z) - Improving the Transient Times for Distributed Stochastic Gradient Methods [5.215491794707911]
We study a distributed stochastic gradient algorithm, called exact diffusion with adaptive stepsizes (EDAS).
We show that EDAS achieves the same network-independent convergence rate as centralized stochastic gradient descent (SGD).
To the best of our knowledge, EDAS achieves the shortest transient time when the average of the $n$ cost functions is strongly convex.
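For context, one iteration of the underlying exact-diffusion recursion (adapt, correct, combine) is sketched below; EDAS's adaptive stepsize rule is not reproduced, and the fixed stepsize mu and the mixing matrix W are illustrative.
```python
import numpy as np

def exact_diffusion_step(X, Psi_prev, grads, W, mu=0.05):
    """One exact-diffusion iteration over a network of agents.
    X[i] is agent i's iterate, grads[i] its local gradient at X[i],
    and W a doubly stochastic mixing matrix over the network."""
    Psi = X - mu * grads                 # adapt: local gradient step
    Phi = Psi + X - Psi_prev             # correct: remove the previous step's bias
    X_next = W @ Phi                     # combine: averaging over neighbors
    return X_next, Psi
```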
arXiv Detail & Related papers (2021-05-11T08:09:31Z) - Fast and differentiable simulation of driven quantum systems [58.720142291102135]
We introduce a semi-analytic method based on the Dyson expansion that allows us to time-evolve driven quantum systems much faster than standard numerical methods.
We show results of the optimization of a two-qubit gate using transmon qubits in the circuit QED architecture.
arXiv Detail & Related papers (2020-12-16T21:43:38Z) - Hessian-Free High-Resolution Nesterov Acceleration for Sampling [55.498092486970364]
Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous-time limit (noiseless kinetic Langevin) when a finite step-size is employed.
This work explores the sampling counterpart of this phenomenon and proposes a diffusion process whose discretizations can yield accelerated gradient-based MCMC methods.
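For reference, here is a hedged Euler-Maruyama step of the kinetic (underdamped) Langevin diffusion that appears as NAG's continuous-time limit in the noiseless case; this is the baseline dynamics, not the Hessian-free high-resolution diffusion proposed in the paper, and the friction and step size are illustrative.
```python
import numpy as np

def kinetic_langevin_step(x, v, grad_logpi, h=0.05, gamma=2.0,
                          rng=np.random.default_rng()):
    """One Euler-Maruyama step of the kinetic Langevin diffusion
    dx = v dt,  dv = (grad log pi(x) - gamma v) dt + sqrt(2 gamma) dB."""
    x_new = x + h * v
    v_new = (v + h * (grad_logpi(x) - gamma * v)
             + np.sqrt(2.0 * gamma * h) * rng.standard_normal(np.shape(v)))
    return x_new, v_new
```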
arXiv Detail & Related papers (2020-06-16T15:07:37Z) - Adaptive Sampling Distributed Stochastic Variance Reduced Gradient for Heterogeneous Distributed Datasets [14.945821529085896]
We study distributed optimization algorithms for minimizing the average of heterogeneous functions with a focus on communication efficiency.
We propose a novel adaptive sampling of machines specially catered to these settings.
We show that the new sampling scheme improves the dependence of the convergence rate from the maximum Lipschitz constant to the average Lipschitz constant across machines.
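A hedged sketch of the sampling idea: pick machines with probability proportional to their local Lipschitz constants and reweight the sampled gradient by 1/(n p_i) so the estimate stays unbiased for the average gradient; the exact scheme and weights in the paper may differ.
```python
import numpy as np

def machine_sampling_probs(lipschitz):
    # Probabilities proportional to the machines' local Lipschitz constants.
    L = np.asarray(lipschitz, dtype=float)
    return L / L.sum()

def sampled_gradient(grads, probs, rng=np.random.default_rng()):
    # Importance-weighted single-machine gradient: unbiased estimate of the
    # average of the machines' gradients.
    i = rng.choice(len(probs), p=probs)
    return grads[i] / (len(probs) * probs[i])
```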
arXiv Detail & Related papers (2020-02-20T01:55:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.