Whiplash Gradient Descent Dynamics
- URL: http://arxiv.org/abs/2203.02140v4
- Date: Mon, 19 Jun 2023 12:13:50 GMT
- Title: Whiplash Gradient Descent Dynamics
- Authors: Subhransu S. Bhattacharjee and Ian R. Petersen
- Abstract summary: We introduce the symplectic convergence analysis for the Whiplash system for convex functions.
We study the algorithm's performance for various costs and provide a practical methodology for analyzing convergence rates.
- Score: 2.0508733018954843
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this paper, we propose the Whiplash Inertial Gradient dynamics, a
closed-loop optimization method that utilises gradient information, to find the
minima of a cost function in finite-dimensional settings. We introduce the
symplectic asymptotic convergence analysis for the Whiplash system for convex
functions. We also introduce relaxation sequences to explain the non-classical
nature of the algorithm and an exploring heuristic variant of the Whiplash
algorithm to escape saddle points deterministically. We study the algorithm's
performance for various costs and provide a practical methodology for analyzing
convergence rates using integral constraint bounds and a novel Lyapunov rate
method. Our results demonstrate polynomial and exponential rates of convergence
for quadratic cost functions.
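The paper's closed-loop damping law is not reproduced here, but the general shape of an inertial gradient method can be sketched. The following is a minimal illustration assuming a generic inertial gradient flow $\ddot{x} + c\,\dot{x} + \nabla f(x) = 0$ with a *constant* damping coefficient, discretised by semi-implicit Euler; the Whiplash system instead uses a closed-loop (state-dependent) damping, and all names and parameter values below are illustrative assumptions.

```python
import numpy as np

def inertial_gradient_descent(grad, x0, steps=1000, dt=0.01, damping=1.0):
    """Semi-implicit Euler discretisation of the inertial gradient ODE
    x'' + c x' + grad f(x) = 0 with constant damping c.
    NOTE: this is a generic sketch, not the Whiplash closed-loop dynamics,
    whose damping depends on the system state."""
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        # update velocity from damping and gradient forces, then position
        v += dt * (-damping * v - grad(x))
        x += dt * v
    return x

# quadratic cost f(x) = 0.5 * ||x||^2, whose gradient is x itself
x_min = inertial_gradient_descent(lambda x: x, [2.0, -1.5])
```

For the quadratic cost above the trajectory spirals into the minimiser at the origin, which is the regime where the abstract reports polynomial and exponential convergence rates.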
Related papers
- Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization [0.0]
We propose a novel algorithm that extends the methods of ball smoothing and Gaussian smoothing for noisy derivative-free optimization.
The algorithm dynamically adapts the shape of the smoothing kernel to approximate the Hessian of the objective function around a local optimum.
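The smoothing idea behind this entry can be sketched in isolation. The code below is a minimal *isotropic* Gaussian-smoothing gradient estimator, a standard construction in derivative-free optimization; the paper's contribution is to adapt the kernel's shape (its covariance) to the local Hessian, which is not implemented here, and all function names and parameters are illustrative assumptions.

```python
import numpy as np

def gaussian_smoothed_grad(f, x, sigma=0.1, n_samples=64, rng=None):
    """Monte-Carlo estimate of the gradient of the Gaussian-smoothed
    surrogate f_sigma(x) = E[f(x + sigma * u)], u ~ N(0, I), using the
    forward-difference estimator E[(f(x + sigma*u) - f(x)) * u] / sigma.
    Isotropic sketch only: the paper adapts the kernel shape dynamically."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    u = rng.standard_normal((n_samples, x.size))
    fx = f(x)
    vals = np.array([f(x + sigma * ui) for ui in u])
    # weight each perturbation direction by the observed function change
    return ((vals - fx)[:, None] * u).mean(axis=0) / sigma
```

Only function evaluations are required, which is what makes this family of methods usable when derivatives of a noisy objective are unavailable.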
arXiv Detail & Related papers (2024-05-02T21:04:20Z)
- A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparameterized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z)
- Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}(\ln(T) / T^{1 - \frac{1}{\alpha}})$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z)
- Shuffling Momentum Gradient Algorithm for Convex Optimization [22.58278411628813]
The Stochastic Gradient Descent method (SGD) and its variants have become the methods of choice for solving finite-sum optimization problems from machine learning and data science.
We provide the first analysis of shuffling momentum-based methods for the strongly convex setting.
arXiv Detail & Related papers (2024-03-05T18:19:02Z)
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable components separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
- A Gradient Smoothed Functional Algorithm with Truncated Cauchy Random Perturbations for Stochastic Optimization [10.820943271350442]
We present a gradient algorithm for minimizing a smooth objective function that is an expectation over noisy cost samples.
We also show that our algorithm avoids unstable equilibria, implying convergence to local minima.
arXiv Detail & Related papers (2022-07-30T18:50:36Z)
- Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
By using this framework, our analysis shows these algorithms to match the computational complexity of methods that have access to an unbiased estimate of the gradient.
arXiv Detail & Related papers (2021-11-29T15:10:09Z)
- A Closed Loop Gradient Descent Algorithm applied to Rosenbrock's function [0.0]
We introduce a novel adaptive damping technique for an inertial gradient system, which finds application as a gradient descent algorithm for unconstrained optimization.
Using Lyapunov stability analysis, we demonstrate the performance of the continuous-time version of the algorithm.
arXiv Detail & Related papers (2021-08-29T17:25:24Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work is on zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost.
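The coordinate-sampling idea mentioned above can be sketched with a standard finite-difference estimator. The code below samples a few coordinates according to given importance weights and queries the function twice per coordinate; it is a minimal illustration of coordinate-wise ZO estimation, not the paper's actual sampling design, and all names and parameters are illustrative assumptions.

```python
import numpy as np

def zo_coordinate_grad(f, x, probs, n_coords=3, mu=1e-4, rng=None):
    """Zeroth-order gradient estimate via central finite differences on a
    random subset of coordinates drawn with importance weights `probs`.
    Importance-weighting keeps the estimate unbiased under sampling with
    replacement. Sketch only; the paper's sampling design is more elaborate."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    idx = rng.choice(x.size, size=n_coords, replace=True, p=probs)
    for i in idx:
        e = np.zeros_like(x)
        e[i] = mu
        # two function queries per sampled coordinate (central difference)
        g[i] += (f(x + e) - f(x - e)) / (2 * mu) / (n_coords * probs[i])
    return g
```

Each call costs only `2 * n_coords` function queries regardless of dimension, which is the query-cost advantage such coordinate-sampling schemes aim for.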
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- Quantized Variational Inference [6.09170287691728]
We show how Quantized Variational Inference produces variance-free gradients for ELBO optimization.
We show that using the Quantized Variational Inference framework leads to fast convergence for both score-function and reparametrized gradients.
arXiv Detail & Related papers (2020-11-04T13:22:50Z)
- Conditional gradient methods for stochastically constrained convex minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems.
The most important feature of our framework is that only a subset of the constraints is processed at each iteration.
Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees.
arXiv Detail & Related papers (2020-07-07T21:26:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.