Stochastic Ratios Tracking Algorithm for Large Scale Machine Learning Problems
- URL: http://arxiv.org/abs/2305.09978v1
- Date: Wed, 17 May 2023 06:22:11 GMT
- Title: Stochastic Ratios Tracking Algorithm for Large Scale Machine Learning Problems
- Authors: Shigeng Sun and Yuchen Xie
- Abstract summary: We propose a novel algorithm for adaptive step length selection in the classical SGD framework.
Under reasonable conditions, the algorithm produces step lengths in line with well-established theoretical requirements.
We show that the algorithm can generate step lengths comparable to the best step length obtained from manual tuning.
- Score: 0.7614628596146599
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many machine learning applications and tasks rely on the stochastic gradient
descent (SGD) algorithm and its variants. Effective step length selection is
crucial for the success of these algorithms, which has motivated the
development of algorithms such as ADAM or AdaGrad. In this paper, we propose a
novel algorithm for adaptive step length selection in the classical SGD
framework, which can be readily adapted to other stochastic algorithms. Our
proposed algorithm is inspired by traditional nonlinear optimization techniques
and is supported by analytical findings. We show that under reasonable
conditions, the algorithm produces step lengths in line with well-established
theoretical requirements, and generates iterates that converge to a stationary
neighborhood of a solution in expectation. We test the proposed algorithm on
logistic regressions and deep neural networks and demonstrate that the
algorithm can generate step lengths comparable to the best step length obtained
from manual tuning.
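As a rough illustration of the idea, the sketch below runs SGD on a synthetic logistic regression (logistic regression is one of the paper's test settings) and adapts the step length from a measured ratio of actual to first-order predicted decrease on the sampled minibatch. The abstract does not specify the paper's exact ratio statistic, so the ratio test, thresholds, and growth/shrink factors here are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic logistic regression problem; the data set is made up for illustration.
n, d = 1000, 10
X = rng.normal(size=(n, d))
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ rng.normal(size=d)))).astype(float)

def loss(w, idx):
    # Numerically stable logistic loss on a minibatch.
    z = X[idx] @ w
    return np.mean(np.logaddexp(0.0, -z) + (1.0 - y[idx]) * z)

def grad(w, idx):
    p = 1.0 / (1.0 + np.exp(-(X[idx] @ w)))
    return X[idx].T @ (p - y[idx]) / len(idx)

w = np.zeros(d)
alpha = 1.0                          # step length, adapted online
for k in range(2000):
    idx = rng.choice(n, 32, replace=False)
    g = grad(w, idx)
    predicted = alpha * (g @ g)      # first-order model of the loss decrease
    actual = loss(w, idx) - loss(w - alpha * g, idx)
    ratio = actual / max(predicted, 1e-12)
    # Grow the step while the local model predicts the decrease well,
    # shrink it when it does not (thresholds and factors are assumptions).
    alpha = min(1.1 * alpha, 10.0) if ratio > 0.5 else max(0.5 * alpha, 1e-6)
    w = w - alpha * g
print("final training loss:", loss(w, np.arange(n)))
```

The point of a ratio test of this kind is that the step length is driven by observed progress rather than a hand-tuned schedule, which is what lets it match a manually tuned constant step.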
Related papers
- Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
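For context, a plain cutting-plane loop (the baseline such RL surrogates accelerate) alternates between solving a relaxation and adding a violated cut from a separation oracle. Below is a minimal sketch on a continuous toy set, with the learned cut-selection policy replaced by an analytic oracle; everything here is illustrative rather than the paper's method.

```python
import numpy as np
from scipy.optimize import linprog

# Cutting-plane loop for min c.x over the unit Euclidean ball, represented by
# an outer polyhedral approximation tightened with one cut per round. The
# paper learns *which* cuts to generate; here the separation oracle is the
# plain analytic one, i.e. this is the baseline scheme being accelerated.
c = np.array([1.0, -2.0, 0.5])
A, b = [], []                                      # accumulated cuts  A x <= b
for it in range(50):
    res = linprog(c,
                  A_ub=np.array(A) if A else None,
                  b_ub=np.array(b) if b else None,
                  bounds=[(-2.0, 2.0)] * len(c))   # box keeps the LP bounded
    x = res.x
    if np.linalg.norm(x) <= 1.0 + 1e-9:            # separation oracle: feasible?
        break
    g = x / np.linalg.norm(x)
    A.append(g); b.append(1.0)                     # violated cut  g.x <= 1
print("cuts used:", len(A), "solution:", np.round(x, 4))
```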
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
- Online Learning Under A Separable Stochastic Approximation Framework [20.26530917721778]
We propose an online learning algorithm for a class of machine learning models under a separable stochastic approximation framework.
We show that the proposed algorithm produces more robust results and better test performance when compared to other popular learning algorithms.
arXiv Detail & Related papers (2023-05-12T13:53:03Z)
- Genetically Modified Wolf Optimization with Stochastic Gradient Descent for Optimising Deep Neural Networks [0.0]
This research aims to analyze an alternative approach to optimizing neural network (NN) weights, with the use of population-based metaheuristic algorithms.
A hybrid between the Grey Wolf Optimizer (GWO) and Genetic Algorithms (GA) is explored, in conjunction with Stochastic Gradient Descent (SGD).
This algorithm allows for a combination between exploitation and exploration, whilst also tackling the issue of high-dimensionality.
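A minimal sketch of this exploration/exploitation split: a population is refined locally with gradient steps, then pulled toward the current leader with mutation noise. The real GWO and GA operators are more elaborate; the objective and all constants below are made-up stand-ins.

```python
import numpy as np

# Toy population + gradient hybrid: local gradient steps give exploitation,
# recombination toward the leader plus mutation noise gives exploration.
rng = np.random.default_rng(1)
f = lambda w: np.sum((w - 3.0) ** 2) + np.sum(np.sin(5.0 * w))   # toy loss
df = lambda w: 2.0 * (w - 3.0) + 5.0 * np.cos(5.0 * w)           # its gradient

pop = rng.uniform(-5.0, 5.0, size=(12, 4))      # 12 candidate weight vectors
for gen in range(100):
    for _ in range(5):                          # exploitation: gradient steps
        pop -= 0.01 * df(pop)
    order = np.argsort([f(w) for w in pop])
    leader = pop[order[0]]                      # "alpha wolf": best candidate
    # Exploration: pull the rest toward the leader, plus mutation noise.
    rest = pop[order[1:]]
    pop[order[1:]] = (rest + leader) / 2.0 + rng.normal(0.0, 0.3, rest.shape)
print("best loss:", min(f(w) for w in pop))
```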
arXiv Detail & Related papers (2023-01-21T13:22:09Z)
- Adaptive First- and Second-Order Algorithms for Large-Scale Machine Learning [3.0204520109309843]
We consider first- and second-order techniques to address continuous optimization problems in machine learning.
In the first-order case, we propose a framework of transition from semi-deterministic to stochastic quadratic regularization methods.
In the second-order case, we propose a novel algorithm with adaptive sampling and adaptive step size.
arXiv Detail & Related papers (2021-11-29T18:10:00Z)
- Bolstering Stochastic Gradient Descent with Model Building [0.0]
The stochastic gradient descent method and its variants constitute the core optimization algorithms that achieve good convergence rates.
We propose an alternative approach to line search by using a new algorithm based on forward step model building.
We show that the proposed algorithm achieves faster convergence and better generalization in well-known test problems.
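The summary does not spell out the model-building rule; one standard forward-step construction, sketched below under that assumption, takes a trial step and, if the loss does not decrease, fits a one-dimensional quadratic through the current value, the directional derivative, and the trial value, then moves to the model's minimizer.

```python
import numpy as np

def model_building_step(f, g, w, alpha0=1.0):
    """Generic forward-step model-building trial (an assumption, not
    necessarily the paper's exact rule): take a trial step; if the loss
    does not decrease, fit the quadratic m(t) = f0 + g0*t + c*t**2 and
    move to its minimizer along the descent direction."""
    d = -g(w)                              # descent direction
    f0, g0 = f(w), g(w) @ d                # value and directional derivative
    w_trial = w + alpha0 * d
    f1 = f(w_trial)
    if f1 < f0:                            # trial step already succeeds
        return w_trial
    c = (f1 - f0 - g0 * alpha0) / alpha0**2
    t = -g0 / (2.0 * c) if c > 0 else 0.1 * alpha0   # model minimizer
    return w + t * d

# Toy usage on a quadratic bowl.
f = lambda w: 0.5 * (w @ w)
g = lambda w: w
w = np.array([3.0, -4.0])
for _ in range(20):
    w = model_building_step(f, g, w)
print("final point:", np.round(w, 6))
```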
arXiv Detail & Related papers (2021-11-13T06:54:36Z)
- Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve a complexity of $\mathcal{O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude.
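For readers new to the setting, bilevel optimization nests an inner problem inside an outer one. The toy double-loop scheme below (not the paper's algorithms, whose accelerations are the contribution) shows the basic shape on a problem with a known closed-form answer:

```python
# Toy bilevel problem (illustrative only):
#   inner: y*(x) = argmin_y 0.5*(y - x)**2          =>  y*(x) = x
#   outer: min_x F(x, y*(x)),  F(x, y) = 0.5*x**2 + 0.5*(y - 1)**2
# Closed form: F(x, x) is minimized at x = 0.5. A double-loop scheme runs a
# few inner gradient steps to track y*(x), then takes an outer step along a
# hypergradient. Here dy*/dx = 1 is known analytically; the algorithms the
# summary refers to estimate such sensitivities efficiently instead.
x, y = 2.0, 0.0
for k in range(200):
    for _ in range(10):                    # inner loop: y tracks y*(x)
        y -= 0.5 * (y - x)                 # gradient step on the inner loss
    dy_dx = 1.0                            # analytic sensitivity of y*(x)
    hypergrad = x + (y - 1.0) * dy_dx      # total derivative dF/dx
    x -= 0.1 * hypergrad                   # outer step
print("x ->", round(x, 4), "(analytic optimum: 0.5)")
```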
arXiv Detail & Related papers (2021-06-08T21:05:30Z)
- Leveraging Benchmarking Data for Informed One-Shot Dynamic Algorithm Selection [0.9281671380673306]
A key challenge in the application of evolutionary algorithms is the selection of an algorithm instance that best suits the problem at hand.
We analyze in this work how prior performance data from benchmarking can be used to infer informed dynamic algorithm selection schemes for the solution of pseudo-Boolean optimization problems.
arXiv Detail & Related papers (2021-02-12T12:27:02Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm that employs a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
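Schematically, this is best-first search whose node-ordering heuristic comes from a trained network. In the sketch below the learned estimator is replaced by a hand-written stand-in on a made-up weighted graph:

```python
import heapq

# Best-first (A*-style) search in which the node-ordering heuristic h would
# come from a trained deep network in HATS; here h is a hand-written stand-in.
graph = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 2.0), ("goal", 5.0)],
    "b": [("goal", 1.0)],
    "goal": [],
}
h = {"s": 3.0, "a": 3.0, "b": 1.0, "goal": 0.0}   # learned estimator stand-in

def best_first(start, goal):
    frontier = [(h[start], 0.0, start, [start])]  # (f = g + h, g, node, path)
    expanded = {}                                 # best g seen per node
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if expanded.get(node, float("inf")) <= g:
            continue
        expanded[node] = g
        for nxt, w in graph[node]:
            heapq.heappush(frontier, (g + w + h[nxt], g + w, nxt, path + [nxt]))
    return None

print(best_first("s", "goal"))   # -> (4.0, ['s', 'a', 'b', 'goal'])
```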
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- Run2Survive: A Decision-theoretic Approach to Algorithm Selection based on Survival Analysis [75.64261155172856]
Survival analysis (SA) naturally supports censored data and offers appropriate ways to use such data for learning distributional models of algorithm runtime.
We leverage such models as the basis of a sophisticated decision-theoretic approach to algorithm selection, which we dub Run2Survive.
In an extensive experimental study with the standard benchmark ASlib, our approach is shown to be highly competitive and in many cases even superior to state-of-the-art AS approaches.
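The decision-theoretic step can be illustrated as follows: given a fitted survival function per algorithm, choose the algorithm minimizing expected penalized runtime under a cutoff. The exponential survival models and rates below are made-up stand-ins for models fitted from censored data, and PAR10 is one common penalty choice:

```python
import numpy as np

# Pick the algorithm minimizing expected PAR10 cost under a runtime cutoff,
# given a survival model S_a(t) per algorithm (exponential stand-ins here).
C = 100.0                                        # runtime cutoff
rates = {"algoA": 1 / 40.0, "algoB": 1 / 25.0, "algoC": 1 / 200.0}

def expected_par10(lam, C, n=100_000):
    t = np.linspace(0.0, C, n)
    dt = t[1] - t[0]
    pdf = lam * np.exp(-lam * t)                 # exponential runtime density
    finish = float(np.sum(t * pdf) * dt)         # E[T ; T <= C]
    timeout = np.exp(-lam * C)                   # S(C): prob. of a timeout
    return finish + 10.0 * C * timeout           # PAR10: timeouts cost 10*C

costs = {name: expected_par10(lam, C) for name, lam in rates.items()}
print(costs, "->", min(costs, key=costs.get))
```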
arXiv Detail & Related papers (2020-07-06T15:20:17Z)
- Active Model Estimation in Markov Decision Processes [108.46146218973189]
We study the problem of efficient exploration in order to learn an accurate model of an environment, modeled as a Markov decision process (MDP).
We show that our Markov-based algorithm outperforms both our original algorithm and the maximum entropy algorithm in the small sample regime.
arXiv Detail & Related papers (2020-03-06T16:17:24Z)
- Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization [71.03797261151605]
Adaptivity is an important yet under-studied property in modern optimization theory.
Our algorithm is proved to achieve the best-available convergence rate for non-PL objectives while simultaneously outperforming existing algorithms for PL objectives.
arXiv Detail & Related papers (2020-02-13T05:42:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.