Related papers: Convergence Conditions for Stochastic Line Search Based Optimization of Over-parametrized Models

Convergence Conditions for Stochastic Line Search Based Optimization of Over-parametrized Models

URL: http://arxiv.org/abs/2408.03199v2
Date: Wed, 11 Jun 2025 07:48:15 GMT
Title: Convergence Conditions for Stochastic Line Search Based Optimization of Over-parametrized Models
Authors: Matteo Lapucci, Davide Pucci,
Abstract summary: We focus on approaches based on PLetrized line searches and employing general search directions.<n>We shed light on the additional property of directions needed to prove fast convergence of the general class of algorithms.<n>It could be of interest to integrate line searches within momentum, conjugate gradient or adaptive preconditioning methods.
Score: 0.5156484100374059
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we deal with algorithms to solve the finite-sum problems related to fitting over-parametrized models, that typically satisfy the interpolation condition. In particular, we focus on approaches based on stochastic line searches and employing general search directions. We define conditions on the sequence of search directions that guarantee finite termination and bounds for the backtracking procedure. Moreover, we shed light on the additional property of directions needed to prove fast (linear) convergence of the general class of algorithms when applied to PL functions in the interpolation regime. From the point of view of algorithms design, the proposed analysis identifies safeguarding conditions that could be employed in relevant algorithmic frameworks. In particular, it could be of interest to integrate stochastic line searches within momentum, conjugate gradient or adaptive preconditioning methods.

Related papers

Approximate Counting in Local Lemma Regimes [0.0]
We consider the probability of intersection of events and the dimension of intersection of subspaces.<n>For general projectors, we provide two algorithms: a fully-time approximation scheme under a global inclusion-stability condition, and an efficient affine approximation under a spectral gap assumption.
arXiv Detail & Related papers (2025-12-10T22:26:36Z)
A Non-Asymptotic Theory of Seminorm Lyapunov Stability: From Deterministic to Stochastic Iterative Algorithms [15.764613607477887]
We study the problem of solving fixed-point equations for seminorm-contractive operators. We establish the non-asymptotic behavior of iterative algorithms in both deterministic and foundational settings.
arXiv Detail & Related papers (2025-02-20T02:39:37Z)
Effectively Leveraging Momentum Terms in Stochastic Line Search Frameworks for Fast Optimization of Finite-Sum Problems [0.5156484100374059]
We explore the relationship between recent line search approaches for deep optimization in the overparametrized regime and momentum directions. We introduce algorithmic that exploits a mix of data persistency, conjugateient type rules for the definition of the momentum parameter. The resulting algorithm is empirically shown to outperform other popular methods.
arXiv Detail & Related papers (2024-11-11T16:26:33Z)
Convergence of Expectation-Maximization Algorithm with Mixed-Integer Optimization [5.319361976450982]
This paper introduces a set of conditions that ensure the convergence of a specific class of EM algorithms. Our results offer a new analysis technique for iterative algorithms that solve mixed-integer non-linear optimization problems.
arXiv Detail & Related papers (2024-01-31T11:42:46Z)
A unified consensus-based parallel ADMM algorithm for high-dimensional regression with combined regularizations [3.280169909938912]
parallel alternating multipliers (ADMM) is widely recognized for its effectiveness in handling large-scale distributed datasets. The proposed algorithms serve to demonstrate the reliability, stability, and scalability of a financial example.
arXiv Detail & Related papers (2023-11-21T03:30:38Z)
Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms. Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability. We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
Adaptive Stochastic Optimisation of Nonconvex Composite Objectives [2.1700203922407493]
We propose and analyse a family of generalised composite mirror descent algorithms. With adaptive step sizes, the proposed algorithms converge without requiring prior knowledge of the problem. We exploit the low-dimensional structure of the decision sets for high-dimensional problems.
arXiv Detail & Related papers (2022-11-21T18:31:43Z)
Optimal Rates for Random Order Online Optimization [60.011653053877126]
We study the citetgarber 2020online, where the loss functions may be chosen by an adversary, but are then presented online in a uniformly random order. We show that citetgarber 2020online algorithms achieve the optimal bounds and significantly improve their stability.
arXiv Detail & Related papers (2021-06-29T09:48:46Z)
A Stochastic Sequential Quadratic Optimization Algorithm for Nonlinear Equality Constrained Optimization with Rank-Deficient Jacobians [11.03311584463036]
A sequential quadratic optimization algorithm is proposed for solving smooth nonlinear equality constrained optimization problems. Results of numerical experiments demonstrate that the algorithm offers superior performance when compared to popular alternatives.
arXiv Detail & Related papers (2021-06-24T13:46:52Z)
Accelerated Message Passing for Entropy-Regularized MAP Inference [89.15658822319928]
Maximum a posteriori (MAP) inference in discrete-valued random fields is a fundamental problem in machine learning. Due to the difficulty of this problem, linear programming (LP) relaxations are commonly used to derive specialized message passing algorithms. We present randomized methods for accelerating these algorithms by leveraging techniques that underlie classical accelerated gradient.
arXiv Detail & Related papers (2020-07-01T18:43:32Z)
A Dynamical Systems Approach for Convergence of the Bayesian EM Algorithm [59.99439951055238]
We show how (discrete-time) Lyapunov stability theory can serve as a powerful tool to aid, or even lead, in the analysis (and potential design) of optimization algorithms that are not necessarily gradient-based. The particular ML problem that this paper focuses on is that of parameter estimation in an incomplete-data Bayesian framework via the popular optimization algorithm known as maximum a posteriori expectation-maximization (MAP-EM) We show that fast convergence (linear or quadratic) is achieved, which could have been difficult to unveil without our adopted S&C approach.
arXiv Detail & Related papers (2020-06-23T01:34:18Z)
Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope. Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
Optimization with Momentum: Dynamical, Control-Theoretic, and Symplectic Perspectives [97.16266088683061]
The article rigorously establishes why symplectic discretization schemes are important for momentum-based optimization algorithms. It provides a characterization of algorithms that exhibit accelerated convergence.
arXiv Detail & Related papers (2020-02-28T00:32:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.