SANIA: Polyak-type Optimization Framework Leads to Scale Invariant
Stochastic Algorithms
- URL: http://arxiv.org/abs/2312.17369v1
- Date: Thu, 28 Dec 2023 21:28:08 GMT
- Title: SANIA: Polyak-type Optimization Framework Leads to Scale Invariant
Stochastic Algorithms
- Authors: Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Robert Gower,
Martin Takáč
- Abstract summary: Techniques such as Adam, AdaGrad, and AdaHessian utilize a preconditioner that modifies the search direction by incorporating the curvature of the objective function.
This paper proposes SANIA to tackle these challenges.
- Score: 1.21748738176366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adaptive optimization methods are widely recognized as among the most popular
approaches for training Deep Neural Networks (DNNs). Techniques such as Adam,
AdaGrad, and AdaHessian utilize a preconditioner that modifies the search
direction by incorporating information about the curvature of the objective
function. However, despite their adaptive characteristics, these methods still
require manual fine-tuning of the step-size. This, in turn, impacts the time
required to solve a particular problem. This paper presents an optimization
framework named SANIA to tackle these challenges. Beyond eliminating the need
for manual step-size hyperparameter settings, SANIA incorporates techniques to
address poorly scaled or ill-conditioned problems. We also explore several
preconditioning methods, including Hutchinson's method, which approximates the
Hessian diagonal of the loss function. We conclude with an extensive empirical
examination of the proposed techniques across classification tasks, covering
both convex and non-convex contexts.
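
Example: the abstract names two generic ingredients that can be sketched concretely: Hutchinson's estimator of the Hessian diagonal (diag(H) ≈ E[z ⊙ Hz] for Rademacher z) and a Polyak-type step size that removes manual step-size tuning. The snippet below is a minimal illustration of those two ideas on a toy quadratic, not the paper's SANIA update rule; the loss, the preconditioner clipping, and the assumption of a known optimal value f* are illustrative choices only.

```python
import numpy as np

# Minimal sketch (not the paper's SANIA method): Hutchinson's diagonal-Hessian
# estimator combined with a preconditioned Polyak-type step on a toy quadratic
# loss f(w) = 0.5 * w^T A w - b^T w, whose optimum is known in closed form.

rng = np.random.default_rng(0)
d = 5
A = rng.standard_normal((d, d))
A = A @ A.T + np.eye(d)              # symmetric positive definite Hessian
b = rng.standard_normal(d)

def loss(w):
    return 0.5 * w @ A @ w - b @ w

def grad(w):
    return A @ w - b

def hvp(w, z):
    # Hessian-vector product; for this quadratic it is simply A @ z.
    return A @ z

def hutchinson_diag(w, num_samples=200):
    # diag(H) ~= E[z * (H z)] for Rademacher z with entries +-1.
    est = np.zeros(d)
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=d)
        est += z * hvp(w, z)
    return est / num_samples

w = np.zeros(d)
f_star = loss(np.linalg.solve(A, b))     # optimal value, assumed known here

for t in range(50):
    g = grad(w)
    D = np.maximum(hutchinson_diag(w), 1e-8)   # diagonal preconditioner
    # Polyak-type step size in the preconditioned norm:
    # eta = (f(w) - f*) / (g^T D^{-1} g), so no manual step-size tuning.
    eta = (loss(w) - f_star) / (g @ (g / D) + 1e-12)
    w -= eta * (g / D)

print("final loss gap:", loss(w) - f_star)
```

In this toy setting the Hessian diagonal is constant, so the estimator converges to diag(A); in a DNN setting the same estimator would be driven by automatic-differentiation Hessian-vector products on minibatch losses.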