GeoAdaLer: Geometric Insights into Adaptive Stochastic Gradient Descent Algorithms
- URL: http://arxiv.org/abs/2405.16255v1
- Date: Sat, 25 May 2024 14:36:33 GMT
- Title: GeoAdaLer: Geometric Insights into Adaptive Stochastic Gradient Descent Algorithms
- Authors: Chinedu Eleh, Masuzyo Mwanza, Ekene Aguegboh, Hans-Werner van Wyk,
- Abstract summary: We introduce GeoAdaLer (Geometric Adaptive Learner), a novel adaptive learning method for gradient descent optimization.
The proposed method extends the concept of adaptive learning by introducing a geometrically inclined approach.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Adam optimization method has achieved remarkable success in addressing contemporary challenges in stochastic optimization. This method falls within the realm of adaptive sub-gradient techniques, yet the underlying geometric principles guiding its performance have remained shrouded in mystery, and have long confounded researchers. In this paper, we introduce GeoAdaLer (Geometric Adaptive Learner), a novel adaptive learning method for stochastic gradient descent optimization, which draws from the geometric properties of the optimization landscape. Beyond emerging as a formidable contender, the proposed method extends the concept of adaptive learning by introducing a geometrically inclined approach that enhances the interpretability and effectiveness in complex optimization scenarios.
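Since the abstract frames GeoAdaLer relative to Adam, the following is a minimal NumPy sketch of the standard Adam update it is compared against. This is the Adam baseline only, not the GeoAdaLer rule (which the abstract describes only qualitatively); the toy quadratic objective and the hyperparameters are illustrative choices.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moment estimates with bias correction."""
    m = beta1 * m + (1 - beta1) * grad        # first moment (running mean of gradients)
    v = beta2 * v + (1 - beta2) * grad**2     # second moment (running mean of squared gradients)
    m_hat = m / (1 - beta1**t)                # bias-corrected first moment
    v_hat = v / (1 - beta2**t)                # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = 0.5 * ||theta||^2, whose gradient is theta.
theta = np.array([1.0, -2.0])
m, v = np.zeros_like(theta), np.zeros_like(theta)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, theta, m, v, t)
print(theta)  # close to the minimizer at (0, 0)
```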
Related papers
- Optimization Methods in Deep Learning: A Comprehensive Overview [0.0]
Deep learning has achieved remarkable success in various fields such as image recognition, natural language processing, and speech recognition.
The effectiveness of deep learning largely depends on the optimization methods used to train deep neural networks.
We provide an overview of first-order optimization methods such as Gradient Descent, Adagrad, Adadelta, and RMSprop, as well as recent momentum-based and adaptive gradient methods such as Nesterov accelerated gradient, Adam, Nadam, AdaMax, and AMSGrad; two of these update rules are sketched after this entry.
arXiv Detail & Related papers (2023-02-19T13:01:53Z)
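As a concrete illustration of two of the first-order methods listed in the overview entry above, here is a minimal sketch of the standard Adagrad and RMSprop per-coordinate updates. The default learning rates, decay, and epsilon values are illustrative and not taken from that paper.

```python
import numpy as np

def adagrad_step(theta, grad, accum, lr=0.5, eps=1e-8):
    """Adagrad: scale the step by the root of the running *sum* of squared gradients."""
    accum = accum + grad**2
    return theta - lr * grad / (np.sqrt(accum) + eps), accum

def rmsprop_step(theta, grad, avg, lr=0.05, decay=0.9, eps=1e-8):
    """RMSprop: scale the step by the root of an *exponential moving average* of squared gradients."""
    avg = decay * avg + (1 - decay) * grad**2
    return theta - lr * grad / (np.sqrt(avg) + eps), avg
```

Both functions take the current parameters, the gradient, and the per-coordinate accumulator state, and return the updated pair, so they can be called in a plain training loop like the Adam sketch above.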
- BFE and AdaBFE: A New Approach in Learning Rate Automation for Stochastic Optimization [3.541406632811038]
A new gradient-based optimization approach that automatically adjusts the learning rate is proposed.
This approach could be an alternative method to optimize the learning rate based on the stochastic gradient descent (SGD) algorithm.
arXiv Detail & Related papers (2022-07-06T15:55:53Z)
- Local Quadratic Convergence of Stochastic Gradient Descent with Adaptive Step Size [29.15132344744801]
We establish local convergence for gradient descent with adaptive step size for problems such as matrix inversion; a small worked example of this formulation follows this entry.
We show that these first-order optimization methods can achieve sub-linear or linear convergence.
arXiv Detail & Related papers (2021-12-30T00:50:30Z)
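To make the matrix-inversion example in the entry above concrete, the following sketch casts inversion of a matrix A as minimizing f(X) = 0.5 * ||AX - I||_F^2 by plain gradient descent. The fixed step size is an illustrative assumption and does not reproduce the paper's adaptive step-size rule.

```python
import numpy as np

def invert_by_gd(A, steps=2000, lr=None):
    """Approximate A^{-1} by minimizing f(X) = 0.5 * ||A @ X - I||_F^2.
    The gradient is A.T @ (A @ X - I); a fixed step of 1/sigma_max(A)^2 is used here."""
    n = A.shape[0]
    X = np.zeros((n, n))
    if lr is None:
        lr = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    I = np.eye(n)
    for _ in range(steps):
        X -= lr * (A.T @ (A @ X - I))
    return X

A = np.array([[3.0, 1.0], [0.0, 2.0]])
print(invert_by_gd(A) @ A)   # should be close to the 2x2 identity
```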
- Private Adaptive Gradient Methods for Convex Optimization [32.3523019355048]
We propose and analyze differentially private variants of the Stochastic Gradient Descent (SGD) algorithm with adaptive stepsizes.
We provide upper bounds on the regret of both algorithms and show that the bounds are (worst-case) optimal.
arXiv Detail & Related papers (2021-06-25T16:46:45Z)
- SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients [99.13839450032408]
It is desirable to design a universal framework for adaptive algorithms to solve general problems.
In particular, our novel framework provides convergence support for adaptive methods in the nonconvex setting.
arXiv Detail & Related papers (2021-06-15T15:16:28Z)
- AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods [7.486132958737807]
We present an adaptive variance reduced method with an implicit approach for adaptivity.
We provide convergence guarantees for finite-sum minimization problems and show that faster convergence than SARAH can be achieved when the local geometry permits.
This algorithm implicitly computes the step size and efficiently estimates the local Lipschitz smoothness of functions.
arXiv Detail & Related papers (2021-02-19T01:17:15Z)
- EOS: a Parallel, Self-Adaptive, Multi-Population Evolutionary Algorithm for Constrained Global Optimization [68.8204255655161]
EOS is a global optimization algorithm for constrained and unconstrained problems of real-valued variables.
It implements a number of improvements to the well-known Differential Evolution (DE) algorithm.
Results show that EOS is capable of achieving increased performance compared to state-of-the-art single-population self-adaptive DE algorithms; the baseline DE step is sketched after this entry.
arXiv Detail & Related papers (2020-07-09T10:19:22Z)
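Because the EOS entry above builds on Differential Evolution, here is a minimal sketch of the classic DE/rand/1/bin generation it improves upon (mutation, binomial crossover, greedy selection). The population size, F, and CR values are illustrative defaults, and none of EOS's parallel, self-adaptive, or multi-population extensions are shown.

```python
import numpy as np

def de_rand_1_bin(pop, fitness, objective, F=0.8, CR=0.9, rng=np.random.default_rng(0)):
    """One generation of DE/rand/1/bin: mutate, crossover, then greedy selection."""
    n_pop, dim = pop.shape
    new_pop, new_fit = pop.copy(), fitness.copy()
    for i in range(n_pop):
        # mutation: combine three distinct random members, all different from i
        r1, r2, r3 = rng.choice([j for j in range(n_pop) if j != i], size=3, replace=False)
        mutant = pop[r1] + F * (pop[r2] - pop[r3])
        # binomial crossover, taking at least one coordinate from the mutant
        cross = rng.random(dim) < CR
        cross[rng.integers(dim)] = True
        trial = np.where(cross, mutant, pop[i])
        # greedy selection: keep the trial only if it improves the objective
        f_trial = objective(trial)
        if f_trial < fitness[i]:
            new_pop[i], new_fit[i] = trial, f_trial
    return new_pop, new_fit

# Toy usage: minimize the sphere function over a small population.
rng = np.random.default_rng(0)
sphere = lambda x: float(np.sum(x**2))
pop = rng.uniform(-5, 5, size=(20, 3))
fit = np.array([sphere(x) for x in pop])
for _ in range(100):
    pop, fit = de_rand_1_bin(pop, fit, sphere)
print(fit.min())  # should be close to 0
```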
- Disentangling Adaptive Gradient Methods from Learning Rates [65.0397050979662]
We take a deeper look at how adaptive gradient methods interact with the learning rate schedule.
We introduce a "grafting" experiment which decouples an update's magnitude from its direction (sketched after this entry).
We present some empirical and theoretical retrospectives on the generalization of adaptive gradient methods.
arXiv Detail & Related papers (2020-02-26T21:42:49Z)
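A minimal sketch of the grafting idea from the entry above: take the update direction from one optimizer and the update magnitude (step norm) from another. The choice of plain SGD as the magnitude source and an Adagrad-style step as the direction source, along with the step sizes, are illustrative assumptions rather than the paper's exact experimental setup.

```python
import numpy as np

def graft(step_m, step_d, eps=1e-12):
    """Combine two step proposals: the norm of step_m with the direction of step_d."""
    return step_d * (np.linalg.norm(step_m) / (np.linalg.norm(step_d) + eps))

def grafted_update(theta, grad, accum, lr_m=0.1, lr_d=0.1, eps=1e-8):
    """One grafted step: magnitude from plain SGD, direction from an Adagrad-style step."""
    step_m = -lr_m * grad                              # magnitude-providing optimizer M (SGD)
    accum = accum + grad**2
    step_d = -lr_d * grad / (np.sqrt(accum) + eps)     # direction-providing optimizer D (Adagrad-style)
    return theta + graft(step_m, step_d), accum
```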
- Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization [71.03797261151605]
Adaptivity is an important yet under-studied property in modern optimization theory.
Our algorithm is proved to achieve the best-available convergence for non-PL objectives while simultaneously outperforming existing algorithms for PL objectives.
arXiv Detail & Related papers (2020-02-13T05:42:27Z)
- Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of an optimistic adaptive gradient algorithm for nonconvex-nonconcave min-max problems.
Our experiments show that the advantage of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
- On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization [80.03647903934723]
We prove convergence in expectation to stationary points for adaptive gradient methods.
Our analyses shed light on a better understanding of adaptive gradient methods in nonconvex optimization.
arXiv Detail & Related papers (2018-08-16T20:25:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.