Gradient Descent Algorithm Survey
- URL: http://arxiv.org/abs/2511.20725v1
- Date: Tue, 25 Nov 2025 09:30:44 GMT
- Title: Gradient Descent Algorithm Survey
- Authors: Deng Fucheng, Wang Wanjie, Gong Ao, Wang Xiaoqi, Wang Fan,
- Abstract summary: This article concentrates on five major algorithms: SGD, Mini-batch SGD, Momentum, Adam, and Lion.<n>It systematically analyzes the core advantages, limitations, and key practical recommendations of each algorithm.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Focusing on the practical configuration needs of optimization algorithms in deep learning, this article concentrates on five major algorithms: SGD, Mini-batch SGD, Momentum, Adam, and Lion. It systematically analyzes the core advantages, limitations, and key practical recommendations of each algorithm. The research aims to gain an in-depth understanding of these algorithms and provide a standardized reference for the reasonable selection, parameter tuning, and performance improvement of optimization algorithms in both academic research and engineering practice, helping to solve optimization challenges in different scales of models and various training scenarios.
Related papers
- Training Neural Networks at Any Scale [57.048948400182354]
This article reviews modern optimization methods for training neural networks with an emphasis on efficiency and scale.<n>We present state-of-the-art optimization algorithms under a unified algorithmic template that highlights the importance of adapting to the structures in the problem.
arXiv Detail & Related papers (2025-11-14T10:58:07Z) - Deep Reinforcement Learning for Dynamic Algorithm Selection: A
Proof-of-Principle Study on Differential Evolution [27.607740475924448]
We propose a deep reinforcement learning-based dynamic algorithm selection framework to accomplish this task.
We employ a sophisticated deep neural network model to infer the optimal action, ensuring informed algorithm selections.
As a proof-of-principle study, we apply this framework to a group of Differential Evolution algorithms.
arXiv Detail & Related papers (2024-03-04T15:40:28Z) - Stochastic Ratios Tracking Algorithm for Large Scale Machine Learning
Problems [0.7614628596146599]
We propose a novel algorithm for adaptive step length selection in the classical SGD framework.
Under reasonable conditions, the algorithm produces step lengths in line with well-established theoretical requirements.
We show that the algorithm can generate step lengths comparable to the best step length obtained from manual tuning.
arXiv Detail & Related papers (2023-05-17T06:22:11Z) - Neural Combinatorial Optimization: a New Player in the Field [69.23334811890919]
This paper presents a critical analysis on the incorporation of algorithms based on neural networks into the classical optimization framework.
A comprehensive study is carried out to analyse the fundamental aspects of such algorithms, including performance, transferability, computational cost and to larger-sized instances.
arXiv Detail & Related papers (2022-05-03T07:54:56Z) - Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and deterministic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
By using this framework, our analysis shows these algorithms to match the computational complexity of methods that have access to an unbiased estimate of the gradient.
arXiv Detail & Related papers (2021-11-29T15:10:09Z) - A survey on multi-objective hyperparameter optimization algorithms for
Machine Learning [62.997667081978825]
This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms.
We distinguish between metaheuristic-based algorithms, metamodel-based algorithms, and approaches using a mixture of both.
We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
arXiv Detail & Related papers (2021-11-23T10:22:30Z) - An Overview and Experimental Study of Learning-based Optimization
Algorithms for Vehicle Routing Problem [49.04543375851723]
Vehicle routing problem (VRP) is a typical discrete optimization problem.
Many studies consider learning-based optimization algorithms to solve VRP.
This paper reviews recent advances in this field and divides relevant approaches into end-to-end approaches and step-by-step approaches.
arXiv Detail & Related papers (2021-07-15T02:13:03Z) - Identifying Co-Adaptation of Algorithmic and Implementational
Innovations in Deep Reinforcement Learning: A Taxonomy and Case Study of
Inference-based Algorithms [15.338931971492288]
We focus on a series of inference-based actor-critic algorithms to decouple their algorithmic innovations and implementation decisions.
We identify substantial performance drops whenever implementation details are mismatched for algorithmic choices.
Results show which implementation details are co-adapted and co-evolved with algorithms.
arXiv Detail & Related papers (2021-03-31T17:55:20Z) - PAMELI: A Meta-Algorithm for Computationally Expensive Multi-Objective
Optimization Problems [0.0]
The proposed algorithm is based on solving a set of surrogate problems defined by models of the real one.
Our algorithm also performs a meta-search for optimal surrogate models and navigation strategies for the optimization landscape.
arXiv Detail & Related papers (2021-03-19T11:18:03Z) - Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization [71.03797261151605]
Adaptivity is an important yet under-studied property in modern optimization theory.
Our algorithm is proved to achieve the best-available convergence for non-PL objectives simultaneously while outperforming existing algorithms for PL objectives.
arXiv Detail & Related papers (2020-02-13T05:42:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.