Convergence and Stability of the Stochastic Proximal Point Algorithm
with Momentum
- URL: http://arxiv.org/abs/2111.06171v5
- Date: Tue, 27 Jun 2023 02:29:46 GMT
- Title: Convergence and Stability of the Stochastic Proximal Point Algorithm
with Momentum
- Authors: Junhyung Lyle Kim, Panos Toulis, Anastasios Kyrillidis
- Abstract summary: We show that the stochastic proximal point algorithm with momentum (SPPAM) allows faster linear convergence to a neighborhood than the stochastic proximal point algorithm (SPPA), with a better contraction factor.
- Score: 14.158845925610438
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stochastic gradient descent with momentum (SGDM) is the dominant algorithm in
many optimization scenarios, including convex optimization instances and
non-convex neural network training. Yet, in the stochastic setting, momentum
interferes with gradient noise, often leading to specific step size and
momentum choices in order to guarantee convergence, let alone acceleration.
Proximal point methods, on the other hand, have gained much attention due to
their numerical stability and elasticity against imperfect tuning. Their
stochastic accelerated variants though have received limited attention: how
momentum interacts with the stability of (stochastic) proximal point methods
remains largely unstudied. To address this, we focus on the convergence and
stability of the stochastic proximal point algorithm with momentum (SPPAM), and
show that SPPAM allows a faster linear convergence to a neighborhood compared
to the stochastic proximal point algorithm (SPPA) with a better contraction
factor, under proper hyperparameter tuning. In terms of stability, we show that
SPPAM depends on problem constants more favorably than SGDM, allowing a wider
range of step size and momentum that lead to convergence.
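As a concrete illustration of the kind of update analyzed here, the sketch below combines a heavy-ball-style momentum extrapolation with an exact stochastic proximal step on a least-squares problem, where the per-sample proximal map has a closed form. This is a generic SPPAM-style loop under assumed hyperparameters eta (step size) and beta (momentum); the precise update rule and hyperparameter coupling studied in the paper may differ.

```python
import numpy as np

def sppam_least_squares(A, b, eta=0.5, beta=0.5, n_iters=500, seed=0):
    """Hypothetical SPPAM-style loop on least squares (not the paper's exact scheme).

    Per-sample loss: f_i(x) = 0.5 * (a_i @ x - b_i)**2, whose proximal map has a
    closed form, so the stochastic proximal step can be computed exactly.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x_prev = np.zeros(d)
    x = np.zeros(d)
    for _ in range(n_iters):
        i = rng.integers(n)
        a_i, b_i = A[i], b[i]
        # Momentum extrapolation (assumed heavy-ball-style placement).
        z = x + beta * (x - x_prev)
        # Exact proximal step on the sampled loss:
        #   prox_{eta * f_i}(z) = z - eta * (a_i @ z - b_i) / (1 + eta * ||a_i||^2) * a_i
        residual = a_i @ z - b_i
        x_next = z - eta * residual / (1.0 + eta * a_i @ a_i) * a_i
        x_prev, x = x, x_next
    return x

# Usage: recover a planted solution from noisy linear measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
x_star = rng.standard_normal(10)
b = A @ x_star + 0.01 * rng.standard_normal(200)
print(np.linalg.norm(sppam_least_squares(A, b) - x_star))
```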
Related papers
- A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning.
This paper presents a comprehensive analysis of a broad range of variations of the stochastic proximal point method (SPPM).
arXiv Detail & Related papers (2024-05-24T21:09:19Z) - The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise [17.493808856903303]
One fundamental challenge in analyzing a stochastic approximation algorithm is to establish its stability.
In this paper, we extend the celebrated Borkar-Meyn theorem for stability from the martingale difference noise setting to the Markovian noise setting.
Central to our analysis is the diminishing rate of change of a few functions, which is implied by both a form of the strong law of large numbers and a commonly used V4 Lyapunov drift condition.
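For reference, the V4 (geometric) Lyapunov drift condition mentioned above is usually stated, in the Meyn-Tweedie form (paraphrased here; the paper's exact variant may differ), as:

```latex
% Geometric drift condition (V4) for a Markov chain (X_t), with a Lyapunov
% function V >= 1, constants beta > 0 and b < infinity, and a small set C:
\Delta V(x) \;:=\; \mathbb{E}\!\left[\, V(X_{t+1}) \mid X_t = x \,\right] - V(x)
\;\le\; -\beta\, V(x) \;+\; b\, \mathbf{1}_{C}(x) \qquad \text{for all } x .
```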
arXiv Detail & Related papers (2024-01-15T17:20:17Z) - Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z) - Last-iterate convergence analysis of stochastic momentum methods for
neural networks [3.57214198937538]
The momentum method is used to solve large-scale optimization problems in neural networks.
Current convergence results for momentum methods are established under artificial settings.
In this analysis, the momentum factors can be fixed to a constant, rather than the time-varying coefficients used in existing work.
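For concreteness, the stochastic heavy-ball (momentum) update analyzed in this line of work is commonly written as below; this is the standard formulation rather than the paper's exact notation, and holding the momentum factor $\beta_t$ constant is what the summary refers to.

```latex
x_{t+1} \;=\; x_t \;-\; \alpha_t\, \nabla f(x_t;\,\xi_t) \;+\; \beta_t\,\bigl(x_t - x_{t-1}\bigr),
\qquad \alpha_t > 0,\;\; \beta_t \in [0,1).
```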
arXiv Detail & Related papers (2022-05-30T02:17:44Z) - A Semismooth Newton Stochastic Proximal Point Algorithm with Variance Reduction [2.048226951354646]
We develop an implementable stochastic proximal point (SPP) method for a class of weakly convex, composite optimization problems.
The proposed algorithm incorporates a variance reduction mechanism and the resulting updates are solved using an inexact semismooth Newton framework.
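As a rough picture of the method's structure, each stochastic proximal point step solves a sampled, regularized subproblem of the generic form below (a sketch only; the paper's composite objective, inexactness criterion, and variance-reduction term are more specific), with the inner problem handled by an inexact semismooth Newton solver.

```latex
x_{t+1} \;\approx\; \operatorname*{arg\,min}_{x}\;
\Bigl\{\, f_{i_t}(x) \;+\; \tfrac{1}{2\eta_t}\,\lVert x - x_t \rVert^2 \,\Bigr\}.
```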
arXiv Detail & Related papers (2022-04-01T13:08:49Z) - Breaking the Convergence Barrier: Optimization via Fixed-Time Convergent
Flows [4.817429789586127]
We introduce a gradient-flow-based optimization framework for achieving acceleration, based on the notion of fixed-time stability of dynamical systems.
We validate the accelerated convergence properties of the proposed schemes on a range of numerical examples against the state-of-the-art optimization algorithms.
arXiv Detail & Related papers (2021-12-02T16:04:40Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study stochastic optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with an improved iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with
Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the Stochastic ExtraGradient (SEG) method with constant step size, and present variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
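To make the setting concrete, below is a minimal sketch of stochastic extragradient (SEG) with uniform iterate averaging on the bilinear game min_x max_y x^T A y, using additive Gaussian gradient noise as a stand-in for stochasticity; the paper's sampling model and scheduled restarting procedure are not reproduced here.

```python
import numpy as np

def seg_bilinear(A, eta=0.1, noise=0.1, n_iters=2000, seed=0):
    """Stochastic extragradient (SEG) with uniform iterate averaging on
    the bilinear game min_x max_y x^T A y; the Nash equilibrium is (0, 0)."""
    rng = np.random.default_rng(seed)
    d1, d2 = A.shape
    x, y = np.ones(d1), np.ones(d2)
    x_avg, y_avg = np.zeros(d1), np.zeros(d2)
    for t in range(1, n_iters + 1):
        # Exploration (half) step with one noisy gradient sample.
        gx = A @ y + noise * rng.standard_normal(d1)
        gy = A.T @ x + noise * rng.standard_normal(d2)
        x_half, y_half = x - eta * gx, y + eta * gy
        # Update step with a fresh noisy gradient sample at the half point.
        gx = A @ y_half + noise * rng.standard_normal(d1)
        gy = A.T @ x_half + noise * rng.standard_normal(d2)
        x, y = x - eta * gx, y + eta * gy
        # Running (uniform) average of the iterates.
        x_avg += (x - x_avg) / t
        y_avg += (y - y_avg) / t
    return x_avg, y_avg

# Usage: the averaged iterates should approach the equilibrium (0, 0).
x_bar, y_bar = seg_bilinear(np.eye(5))
print(np.linalg.norm(x_bar), np.linalg.norm(y_bar))
```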
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - Stochastic optimization with momentum: convergence, fluctuations, and
traps avoidance [0.0]
In this paper, a general stochastic optimization procedure is studied, unifying several variants of stochastic gradient descent such as, among others, the heavy ball method, the Stochastic Nesterov Accelerated Gradient method (S-NAG), and the widely used Adam method.
The avoidance of traps is studied by viewing the algorithm as a noisy discretization of a non-autonomous ordinary differential equation.
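The continuous-time object behind such an analysis is, roughly, the second-order non-autonomous ODE associated with momentum methods, shown below in a generic form for orientation (the paper's exact dynamics, in particular for the adaptive variants, involve additional terms); a constant damping $\gamma$ corresponds to the heavy ball method, while a vanishing damping such as $\gamma(t) = 3/t$ corresponds to Nesterov-type acceleration.

```latex
\ddot{x}(t) \;+\; \gamma(t)\,\dot{x}(t) \;+\; \nabla f\bigl(x(t)\bigr) \;=\; 0 .
```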
arXiv Detail & Related papers (2020-12-07T19:14:49Z) - Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave
Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on stochastic methods for solving smooth non-convex strongly-concave min-max problems, which have received increasing attention due to applications in deep learning (e.g., deep AUC maximization).
arXiv Detail & Related papers (2020-06-12T00:32:21Z) - Optimization with Momentum: Dynamical, Control-Theoretic, and Symplectic
Perspectives [97.16266088683061]
The article rigorously establishes why symplectic discretization schemes are important for momentum-based optimization algorithms.
It provides a characterization of algorithms that exhibit accelerated convergence.
arXiv Detail & Related papers (2020-02-28T00:32:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.