Stochastic Ratios Tracking Algorithm for Large Scale Machine Learning Problems
- URL: http://arxiv.org/abs/2305.09978v1
- Date: Wed, 17 May 2023 06:22:11 GMT
- Title: Stochastic Ratios Tracking Algorithm for Large Scale Machine Learning Problems
- Authors: Shigeng Sun and Yuchen Xie
- Abstract summary: We propose a novel algorithm for adaptive step length selection in the classical SGD framework.
Under reasonable conditions, the algorithm produces step lengths in line with well-established theoretical requirements.
We show that the algorithm can generate step lengths comparable to the best step length obtained from manual tuning.
- Score: 0.7614628596146599
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many machine learning applications and tasks rely on the stochastic gradient
descent (SGD) algorithm and its variants. Effective step length selection is
crucial for the success of these algorithms, which has motivated the
development of algorithms such as ADAM or AdaGrad. In this paper, we propose a
novel algorithm for adaptive step length selection in the classical SGD
framework, which can be readily adapted to other stochastic algorithms. Our
proposed algorithm is inspired by traditional nonlinear optimization techniques
and is supported by analytical findings. We show that under reasonable
conditions, the algorithm produces step lengths in line with well-established
theoretical requirements, and generates iterates that converge to a stationary
neighborhood of a solution in expectation. We test the proposed algorithm on
logistic regressions and deep neural networks and demonstrate that the
algorithm can generate step lengths comparable to the best step length obtained
from manual tuning.
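As a rough illustration of the idea, the sketch below runs SGD on a synthetic logistic regression (logistic regression is one of the paper's test settings) and adapts the step length from a measured ratio of actual to first-order predicted decrease on the sampled minibatch. The abstract does not specify the paper's exact ratio statistic, so the ratio test, thresholds, and growth/shrink factors here are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic logistic regression problem; the data set is made up for illustration.
n, d = 1000, 10
X = rng.normal(size=(n, d))
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ rng.normal(size=d)))).astype(float)

def loss(w, idx):
    # Numerically stable logistic loss on a minibatch.
    z = X[idx] @ w
    return np.mean(np.logaddexp(0.0, -z) + (1.0 - y[idx]) * z)

def grad(w, idx):
    p = 1.0 / (1.0 + np.exp(-(X[idx] @ w)))
    return X[idx].T @ (p - y[idx]) / len(idx)

w = np.zeros(d)
alpha = 1.0                          # step length, adapted online
for k in range(2000):
    idx = rng.choice(n, 32, replace=False)
    g = grad(w, idx)
    predicted = alpha * (g @ g)      # first-order model of the loss decrease
    actual = loss(w, idx) - loss(w - alpha * g, idx)
    ratio = actual / max(predicted, 1e-12)
    # Grow the step while the local model predicts the decrease well,
    # shrink it when it does not (thresholds and factors are assumptions).
    alpha = min(1.1 * alpha, 10.0) if ratio > 0.5 else max(0.5 * alpha, 1e-6)
    w = w - alpha * g
print("final training loss:", loss(w, np.arange(n)))
```

The point of a ratio test of this kind is that the step length is driven by observed progress rather than a hand-tuned schedule, which is what lets it match a manually tuned constant step.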
Related papers
- Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
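For context, a plain cutting-plane loop (the baseline such RL surrogates accelerate) alternates between solving a relaxation and adding a violated cut from a separation oracle. Below is a minimal sketch on a continuous toy set, with the learned cut-selection policy replaced by an analytic oracle; everything here is illustrative rather than the paper's method.

```python
import numpy as np
from scipy.optimize import linprog

# Cutting-plane loop for min c.x over the unit Euclidean ball, represented by
# an outer polyhedral approximation tightened with one cut per round. The
# paper learns *which* cuts to generate; here the separation oracle is the
# plain analytic one, i.e. this is the baseline scheme being accelerated.
c = np.array([1.0, -2.0, 0.5])
A, b = [], []                                      # accumulated cuts  A x <= b
for it in range(50):
    res = linprog(c,
                  A_ub=np.array(A) if A else None,
                  b_ub=np.array(b) if b else None,
                  bounds=[(-2.0, 2.0)] * len(c))   # box keeps the LP bounded
    x = res.x
    if np.linalg.norm(x) <= 1.0 + 1e-9:            # separation oracle: feasible?
        break
    g = x / np.linalg.norm(x)
    A.append(g); b.append(1.0)                     # violated cut  g.x <= 1
print("cuts used:", len(A), "solution:", np.round(x, 4))
```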
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
- Online Learning Under A Separable Stochastic Approximation Framework [20.26530917721778]
We propose an online learning algorithm for a class of machine learning models under a separable stochastic approximation framework.
We show that the proposed algorithm produces more robust results and better test performance when compared to other popular learning algorithms.
arXiv Detail & Related papers (2023-05-12T13:53:03Z)
- Genetically Modified Wolf Optimization with Stochastic Gradient Descent for Optimising Deep Neural Networks [0.0]
This research aims to analyze an alternative approach to optimizing neural network (NN) weights, with the use of population-based metaheuristic algorithms.
A hybrid between the Grey Wolf Optimizer (GWO) and Genetic Algorithms (GA) is explored, in conjunction with Stochastic Gradient Descent (SGD).
This algorithm allows for a combination between exploitation and exploration, whilst also tackling the issue of high-dimensionality.
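A minimal sketch of this exploration/exploitation split: a population is refined locally with gradient steps, then pulled toward the current leader with mutation noise. The real GWO and GA operators are more elaborate; the objective and all constants below are made-up stand-ins.

```python
import numpy as np

# Toy population + gradient hybrid: local gradient steps give exploitation,
# recombination toward the leader plus mutation noise gives exploration.
rng = np.random.default_rng(1)
f = lambda w: np.sum((w - 3.0) ** 2) + np.sum(np.sin(5.0 * w))   # toy loss
df = lambda w: 2.0 * (w - 3.0) + 5.0 * np.cos(5.0 * w)           # its gradient

pop = rng.uniform(-5.0, 5.0, size=(12, 4))      # 12 candidate weight vectors
for gen in range(100):
    for _ in range(5):                          # exploitation: gradient steps
        pop -= 0.01 * df(pop)
    order = np.argsort([f(w) for w in pop])
    leader = pop[order[0]]                      # "alpha wolf": best candidate
    # Exploration: pull the rest toward the leader, plus mutation noise.
    rest = pop[order[1:]]
    pop[order[1:]] = (rest + leader) / 2.0 + rng.normal(0.0, 0.3, rest.shape)
print("best loss:", min(f(w) for w in pop))
```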
arXiv Detail & Related papers (2023-01-21T13:22:09Z)
- Adaptive First- and Second-Order Algorithms for Large-Scale Machine Learning [3.0204520109309843]
We consider first- and second-order techniques to address continuous optimization problems in machine learning.
In the first-order case, we propose a framework of transition from semi-deterministic to stochastic quadratic regularization methods.
In the second-order case, we propose a novel algorithm with adaptive sampling and adaptive step size.
arXiv Detail & Related papers (2021-11-29T18:10:00Z)
- Bolstering Stochastic Gradient Descent with Model Building [0.0]
The stochastic gradient descent method and its variants constitute the core optimization algorithms that achieve good convergence rates.
We propose an alternative approach to line search by using a new algorithm based on forward step model building.
We show that the proposed algorithm achieves faster convergence and better generalization in well-known test problems.
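The summary does not spell out the model-building rule; one standard forward-step construction, sketched below under that assumption, takes a trial step and, if the loss does not decrease, fits a one-dimensional quadratic through the current value, the directional derivative, and the trial value, then moves to the model's minimizer.

```python
import numpy as np

def model_building_step(f, g, w, alpha0=1.0):
    """Generic forward-step model-building trial (an assumption, not
    necessarily the paper's exact rule): take a trial step; if the loss
    does not decrease, fit the quadratic m(t) = f0 + g0*t + c*t**2 and
    move to its minimizer along the descent direction."""
    d = -g(w)                              # descent direction
    f0, g0 = f(w), g(w) @ d                # value and directional derivative
    w_trial = w + alpha0 * d
    f1 = f(w_trial)
    if f1 < f0:                            # trial step already succeeds
        return w_trial
    c = (f1 - f0 - g0 * alpha0) / alpha0**2
    t = -g0 / (2.0 * c) if c > 0 else 0.1 * alpha0   # model minimizer
    return w + t * d

# Toy usage on a quadratic bowl.
f = lambda w: 0.5 * (w @ w)
g = lambda w: w
w = np.array([3.0, -4.0])
for _ in range(20):
    w = model_building_step(f, g, w)
print("final point:", np.round(w, 6))
```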
arXiv Detail & Related papers (2021-11-13T06:54:36Z)
- Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve a complexity of $\mathcal{O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude.
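For readers new to the setting, bilevel optimization nests an inner problem inside an outer one. The toy double-loop scheme below (not the paper's algorithms, whose accelerations are the contribution) shows the basic shape on a problem with a known closed-form answer:

```python
# Toy bilevel problem (illustrative only):
#   inner: y*(x) = argmin_y 0.5*(y - x)**2          =>  y*(x) = x
#   outer: min_x F(x, y*(x)),  F(x, y) = 0.5*x**2 + 0.5*(y - 1)**2
# Closed form: F(x, x) is minimized at x = 0.5. A double-loop scheme runs a
# few inner gradient steps to track y*(x), then takes an outer step along a
# hypergradient. Here dy*/dx = 1 is known analytically; the algorithms the
# summary refers to estimate such sensitivities efficiently instead.
x, y = 2.0, 0.0
for k in range(200):
    for _ in range(10):                    # inner loop: y tracks y*(x)
        y -= 0.5 * (y - x)                 # gradient step on the inner loss
    dy_dx = 1.0                            # analytic sensitivity of y*(x)
    hypergrad = x + (y - 1.0) * dy_dx      # total derivative dF/dx
    x -= 0.1 * hypergrad                   # outer step
print("x ->", round(x, 4), "(analytic optimum: 0.5)")
```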
arXiv Detail & Related papers (2021-06-08T21:05:30Z)
- Leveraging Benchmarking Data for Informed One-Shot Dynamic Algorithm Selection [0.9281671380673306]
A key challenge in the application of evolutionary algorithms is the selection of an algorithm instance that best suits the problem at hand.
We analyze in this work how prior performance data from benchmarking can be used to infer informed dynamic algorithm selection schemes for the solution of pseudo-Boolean optimization problems.
arXiv Detail & Related papers (2021-02-12T12:27:02Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm that employs a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
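Schematically, this is best-first search whose node-ordering heuristic comes from a trained network. In the sketch below the learned estimator is replaced by a hand-written stand-in on a made-up weighted graph:

```python
import heapq

# Best-first (A*-style) search in which the node-ordering heuristic h would
# come from a trained deep network in HATS; here h is a hand-written stand-in.
graph = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 2.0), ("goal", 5.0)],
    "b": [("goal", 1.0)],
    "goal": [],
}
h = {"s": 3.0, "a": 3.0, "b": 1.0, "goal": 0.0}   # learned estimator stand-in

def best_first(start, goal):
    frontier = [(h[start], 0.0, start, [start])]  # (f = g + h, g, node, path)
    expanded = {}                                 # best g seen per node
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if expanded.get(node, float("inf")) <= g:
            continue
        expanded[node] = g
        for nxt, w in graph[node]:
            heapq.heappush(frontier, (g + w + h[nxt], g + w, nxt, path + [nxt]))
    return None

print(best_first("s", "goal"))   # -> (4.0, ['s', 'a', 'b', 'goal'])
```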
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- Run2Survive: A Decision-theoretic Approach to Algorithm Selection based on Survival Analysis [75.64261155172856]
Survival analysis (SA) naturally supports censored data and offers appropriate ways to use such data for learning distributional models of algorithm runtime.
We leverage such models as the basis of a sophisticated decision-theoretic approach to algorithm selection, which we dub Run2Survive.
In an extensive experimental study with the standard benchmark ASlib, our approach is shown to be highly competitive and in many cases even superior to state-of-the-art AS approaches.
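The decision-theoretic step can be illustrated as follows: given a fitted survival function per algorithm, choose the algorithm minimizing expected penalized runtime under a cutoff. The exponential survival models and rates below are made-up stand-ins for models fitted from censored data, and PAR10 is one common penalty choice:

```python
import numpy as np

# Pick the algorithm minimizing expected PAR10 cost under a runtime cutoff,
# given a survival model S_a(t) per algorithm (exponential stand-ins here).
C = 100.0                                        # runtime cutoff
rates = {"algoA": 1 / 40.0, "algoB": 1 / 25.0, "algoC": 1 / 200.0}

def expected_par10(lam, C, n=100_000):
    t = np.linspace(0.0, C, n)
    dt = t[1] - t[0]
    pdf = lam * np.exp(-lam * t)                 # exponential runtime density
    finish = float(np.sum(t * pdf) * dt)         # E[T ; T <= C]
    timeout = np.exp(-lam * C)                   # S(C): prob. of a timeout
    return finish + 10.0 * C * timeout           # PAR10: timeouts cost 10*C

costs = {name: expected_par10(lam, C) for name, lam in rates.items()}
print(costs, "->", min(costs, key=costs.get))
```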
arXiv Detail & Related papers (2020-07-06T15:20:17Z)
- Active Model Estimation in Markov Decision Processes [108.46146218973189]
We study the problem of efficient exploration in order to learn an accurate model of an environment, modeled as a Markov decision process (MDP).
We show that our Markov-based algorithm outperforms both our original algorithm and the maximum entropy algorithm in the small sample regime.
arXiv Detail & Related papers (2020-03-06T16:17:24Z)
- Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization [71.03797261151605]
Adaptivity is an important yet under-studied property in modern optimization theory.
Our algorithm is proved to achieve the best-available convergence rate for non-PL objectives while simultaneously outperforming existing algorithms for PL objectives.
arXiv Detail & Related papers (2020-02-13T05:42:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.