Stochastic Difference-of-Convex Optimization with Momentum
- URL: http://arxiv.org/abs/2510.17503v1
- Date: Mon, 20 Oct 2025 13:00:32 GMT
- Title: Stochastic Difference-of-Convex Optimization with Momentum
- Authors: El Mahdi Chayti, Martin Jaggi,
- Abstract summary: We show that momentum convergence under smoothness enables batch convergence assumptions (of the concave part) for any size.<n>Our momentum-based algorithm achieves provable convergence step by step.
- Score: 43.7624716532625
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stochastic difference-of-convex (DC) optimization is prevalent in numerous machine learning applications, yet its convergence properties under small batch sizes remain poorly understood. Existing methods typically require large batches or strong noise assumptions, which limit their practical use. In this work, we show that momentum enables convergence under standard smoothness and bounded variance assumptions (of the concave part) for any batch size. We prove that without momentum, convergence may fail regardless of stepsize, highlighting its necessity. Our momentum-based algorithm achieves provable convergence and demonstrates strong empirical performance.
Related papers
- The Implicit Bias of Steepest Descent with Mini-batch Stochastic Gradient [32.97211471008323]
We study how batch size, momentum, and variance reduction shape the limiting max-margin behavior and convergence rates.<n>We show that without momentum, convergence only occurs with large batches, yielding a batch-dependent margin gap but the full-batch convergence rate.
arXiv Detail & Related papers (2026-02-12T04:25:38Z) - Revisiting Convergence: Shuffling Complexity Beyond Lipschitz Smoothness [50.78508362183774]
Shuffling-type gradient methods are favored in practice for their simplicity and rapid empirical performance.<n>Most require the Lipschitz condition, which is often not met in common machine learning schemes.
arXiv Detail & Related papers (2025-07-11T15:36:48Z) - A Novel Unified Parametric Assumption for Nonconvex Optimization [53.943470475510196]
Non optimization is central to machine learning, but the general framework non convexity enables weak convergence guarantees too pessimistic compared to the other hand.<n>We introduce a novel unified assumption in non convex algorithms.
arXiv Detail & Related papers (2025-02-17T21:25:31Z) - High-Probability Bounds for Stochastic Optimization and Variational
Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z) - Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z) - On the fast convergence of minibatch heavy ball momentum [5.4755933832880865]
We show that heavy ball momentum retains the fast linear rate of (deterministic) heavy ball momentum on optimization problems.<n>The algorithm we study can be interpreted as an accelerated randomized Kaczmarz algorithm with minibatching and heavy ball momentum.
arXiv Detail & Related papers (2022-06-15T14:12:45Z) - Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave
Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on methods for solving smooth non-concave min-max problems, which have received increasing attention due to deep learning (e.g., deep AUC)
arXiv Detail & Related papers (2020-06-12T00:32:21Z) - Convergence of a Stochastic Gradient Method with Momentum for Non-Smooth
Non-Convex Optimization [25.680334940504405]
This paper establishes the convergence of the rate of a non-smooth subient method with momentum for constrained problems.
For problems, we show how the unconstrained case can be analyzed under weaker assumptions than the state-of-the-art.
arXiv Detail & Related papers (2020-02-13T12:10:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.