AEGD: Adaptive Gradient Descent with Energy
- URL: http://arxiv.org/abs/2010.05109v2
- Date: Fri, 1 Oct 2021 14:40:23 GMT
- Title: AEGD: Adaptive Gradient Descent with Energy
- Authors: Hailiang Liu and Xuping Tian
- Abstract summary: We propose AEGD, a new algorithm for first-order gradient-based optimization of non-convex objective functions, based on a dynamically updated energy variable.
We prove energy-dependent convergence rates of AEGD for both non-convex and convex objectives, which for a suitably small step size recover the desired rates of batch gradient descent.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose AEGD, a new algorithm for first-order gradient-based optimization
of non-convex objective functions, based on a dynamically updated energy
variable. The method is shown to be unconditionally energy stable, irrespective
of the step size. We prove energy-dependent convergence rates of AEGD for both
non-convex and convex objectives, which for a suitably small step size recovers
desired convergence rates for the batch gradient descent. We also provide an
energy-dependent bound on the stationary convergence of AEGD in the stochastic
non-convex setting. The method is straightforward to implement and requires
little tuning of hyper-parameters. Experimental results demonstrate that AEGD
works well for a large variety of optimization problems: it is robust with
respect to initial data, capable of making rapid initial progress. The
stochastic AEGD shows comparable and often better generalization performance
than SGD with momentum for deep neural networks.
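The abstract does not spell out the update rule, so the following is only a minimal sketch of an energy-adaptive gradient step in the spirit of AEGD, assuming a commonly cited element-wise form: an energy variable r is initialized from the square root of the shifted objective, can only shrink at each step (which is what makes the scheme energy stable irrespective of the step size), and scales the parameter update. The shift constant c, the element-wise energy, and all function names here are illustrative assumptions rather than the paper's verbatim algorithm.
```python
import numpy as np

def aegd_sketch(f, grad_f, x0, eta=0.1, c=1.0, num_iters=100):
    """Hedged sketch of an AEGD-style update with an element-wise energy r."""
    x = np.asarray(x0, dtype=float)
    r = np.full_like(x, np.sqrt(f(x) + c))         # energy initialized to sqrt(f(x0) + c)
    for _ in range(num_iters):
        v = grad_f(x) / (2.0 * np.sqrt(f(x) + c))  # gradient of sqrt(f + c)
        r = r / (1.0 + 2.0 * eta * v**2)           # r never increases: unconditional energy stability
        x = x - 2.0 * eta * r * v                  # step scaled by the (shrinking) energy
    return x

# Usage on a toy quadratic f(x) = ||x||^2, whose minimizer is the origin.
f = lambda x: float(np.dot(x, x))
grad_f = lambda x: 2.0 * x
x_min = aegd_sketch(f, grad_f, x0=np.ones(5), eta=0.1, c=1.0, num_iters=500)
```
Because the energy r is non-increasing by construction, the update stays bounded for any step size eta, which matches the unconditional energy stability claimed in the abstract.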
Related papers
- Gradient Normalization with(out) Clipping Ensures Convergence of Nonconvex SGD under Heavy-Tailed Noise with Improved Results [60.92029979853314]
This paper investigates normalized SGD with clipping (NSGDC) and its variance-reduced variant (NSGDC-VR).
We present significant improvements in the theoretical results for both algorithms.
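For context, the two ingredients named in the title, gradient normalization and gradient clipping, look roughly as follows in isolation; this is a generic illustration of those standard building blocks, not the exact NSGDC/NSGDC-VR updates analyzed in the paper, and the function names are hypothetical.
```python
import numpy as np

def normalized_step(x, g, eta):
    # Normalized SGD: step of length eta in the direction of the stochastic
    # gradient g, so heavy-tailed gradient magnitudes cannot blow up the update.
    return x - eta * g / (np.linalg.norm(g) + 1e-12)

def clipped_step(x, g, eta, c):
    # Clipped SGD: use the raw gradient unless its norm exceeds the threshold c,
    # in which case it is rescaled to have norm c.
    scale = min(1.0, c / (np.linalg.norm(g) + 1e-12))
    return x - eta * scale * g
```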
arXiv Detail & Related papers (2024-10-21T22:40:42Z) - SGEM: stochastic gradient with energy and momentum [0.0]
We propose SGEM, Stochastic Gradient with Energy and Momentum, to solve a class of general non-convex stochastic optimization problems.
SGEM incorporates both energy and momentum so as to derive energy-dependent convergence rates.
Our results show that SGEM converges faster than AEGD in neural network training.
arXiv Detail & Related papers (2022-08-03T16:45:22Z) - Formal guarantees for heuristic optimization algorithms used in machine
learning [6.978625807687497]
Stochastic Gradient Descent (SGD) and its variants have become the dominant methods for large-scale machine learning (ML) optimization problems.
We provide formal guarantees for a few convex optimization methods and propose improved algorithms.
arXiv Detail & Related papers (2022-07-31T19:41:22Z) - An Adaptive Gradient Method with Energy and Momentum [0.0]
We introduce a novel algorithm for gradient-based optimization of objective functions.
The method is simple to implement, computationally efficient, and well suited for large-scale machine learning problems.
arXiv Detail & Related papers (2022-03-23T04:48:38Z) - Active Learning for Transition State Calculation [3.399187058548169]
Transition state (TS) calculation is a grand challenge for computationally intensive energy functions.
To reduce the number of expensive computations of the true gradients, we propose an active learning framework.
We show that the new method significantly decreases the required number of energy or force evaluations of the original model.
arXiv Detail & Related papers (2021-08-10T13:57:31Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with
Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the Stochastic ExtraGradient (SEG) method with constant step size, along with variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work is on the iteration complexity of zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel stochastic bilevel optimization algorithm, named stocBiO, built on a sample-efficient gradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - Balancing Rates and Variance via Adaptive Batch-Size for Stochastic
Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that attenuating step-size is required for exact convergence with the fact that constant step-size learns faster in time up to an error.
Rather than fixing the minibatch size and the step-size at the outset, we propose to allow these parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z) - Bayesian Sparse learning with preconditioned stochastic gradient MCMC
and its applications [5.660384137948734]
The proposed algorithm converges to the correct distribution with a controllable bias under mild conditions.
arXiv Detail & Related papers (2020-06-29T20:57:20Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for nonconvex-nonconcave min-max problems.
Our experiments show that the advantage of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.