Optimal Asynchronous Stochastic Nonconvex Optimization under Heavy-Tailed Noise
- URL: http://arxiv.org/abs/2601.19379v1
- Date: Tue, 27 Jan 2026 09:04:35 GMT
- Title: Optimal Asynchronous Stochastic Nonconvex Optimization under Heavy-Tailed Noise
- Authors: Yidong Wu, Luo Luo
- Abstract summary: We propose an asynchronous normalized stochastic gradient descent algorithm with momentum. We show that our method achieves the optimal time complexity under a bounded $p$th-order central moment assumption.
- Score: 21.48964748830889
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper considers the problem of asynchronous stochastic nonconvex optimization with heavy-tailed gradient noise and arbitrarily heterogeneous computation times across workers. We propose an asynchronous normalized stochastic gradient descent algorithm with momentum. Our analysis shows that the method achieves the optimal time complexity under the assumption of bounded $p$th-order central moment with $p\in(1,2]$. We also provide numerical experiments demonstrating the effectiveness of the proposed method.
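The core update behind a normalized SGD-with-momentum scheme can be sketched in a few lines. This is a minimal single-worker illustration of the general technique, not the paper's asynchronous multi-worker algorithm: the function name, step size, and momentum parameter are illustrative assumptions. The key point is that the step direction is the normalized momentum buffer, so a single heavy-tailed gradient sample cannot produce an arbitrarily large step.

```python
import numpy as np

def normalized_sgd_momentum(grad_fn, x0, lr=0.01, beta=0.9, steps=2000):
    """Single-worker sketch of normalized SGD with momentum.

    grad_fn : returns a (possibly heavy-tailed) stochastic gradient at x
    lr      : fixed step length (every step has Euclidean length lr)
    beta    : momentum (exponential moving average) parameter
    """
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)
    for _ in range(steps):
        g = grad_fn(x)                   # stochastic gradient sample
        m = beta * m + (1.0 - beta) * g  # momentum: moving average of gradients
        # Normalized step: direction from momentum, length fixed at lr,
        # so heavy-tailed noise cannot blow up any single update.
        x = x - lr * m / (np.linalg.norm(m) + 1e-12)
    return x
```

On a simple quadratic the iterates contract toward the minimizer at a fixed step length and then oscillate in a small neighborhood whose radius scales with `lr`, which is why such methods typically pair normalization with a decreasing or tuned step size.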
Related papers
- Accelerated stochastic first-order method for convex optimization under heavy-tailed noise [3.5877352183559386]
We study convex composite optimization problems, where the objective function is given by the sum of a prox-friendly function and a convex function whose subgradients are estimated under heavy-tailed noise. We demonstrate that a vanilla algorithm can achieve optimal complexity for these problems without additional modifications such as clipping or normalization.
arXiv Detail & Related papers (2025-10-13T17:45:05Z) - Zeroth-Order Methods for Stochastic Nonconvex Nonsmooth Composite Optimization [51.63258496873442]
Previous works on composite optimization problems require smoothness of the major part of the objective, which excludes some machine learning examples. In this work we propose two new algorithms for such problems. Numerical experiments show that these algorithms are effective.
arXiv Detail & Related papers (2025-10-06T02:35:42Z) - Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
We consider clipped stochastic optimization problems under heavy-tailed noise with structured density.
We show that it is possible to achieve faster convergence rates than $\mathcal{O}(K^{-(\alpha-1)/\alpha})$ when the gradients have finite moments of order $\alpha$.
We prove that the resulting estimates have negligible bias and controllable variance.
arXiv Detail & Related papers (2023-11-07T17:39:17Z) - First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities [91.46841922915418]
We present a unified approach for the theoretical analysis of first-order methods under Markovian noise.
Our approach covers both non-convex optimization problems and strongly monotone variational inequalities.
We provide bounds that match the oracle complexity in the case of strongly convex optimization problems.
arXiv Detail & Related papers (2023-05-25T11:11:31Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Unified Convergence Analysis for Adaptive Optimization with Moving Average Estimator [75.05106948314956]
We show that an increasingly large momentum parameter for the first-order moment is sufficient for adaptive scaling. We also give insights into increasing the momentum in a stagewise manner, in accordance with a stagewise decreasing step size.
arXiv Detail & Related papers (2021-04-30T08:50:24Z) - Riemannian stochastic recursive momentum method for non-convex optimization [36.79189106909088]
We propose a method that finds an $\epsilon$-approximate solution with a near-optimal $\tilde{\mathcal{O}}$ bound on the number of stochastic gradient evaluations, requiring only one gradient evaluation per iteration.
Experiments demonstrate the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-08-11T07:05:58Z) - Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping [69.9674326582747]
We propose a new accelerated first-order method called clipped-SSTM for smooth convex optimization with heavy-tailed distributed noise in gradients.
We prove new complexity bounds that outperform state-of-the-art results in this case.
We derive the first non-trivial high-probability complexity bounds for SGD with clipping without light-tails assumption on the noise.
arXiv Detail & Related papers (2020-05-21T17:05:27Z)
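Several of the entries above (clipped-SSTM, the high-probability bounds for non-smooth optimization, and the structured-density results) build on the same gradient clipping operator. A minimal sketch of that generic operator is below; the threshold `tau` is a tunable parameter, not a value taken from any of the listed papers:

```python
import numpy as np

def clip_gradient(g, tau):
    """Rescale the stochastic gradient g so its Euclidean norm is at most tau.

    Clipping bounds the contribution of any single heavy-tailed gradient
    sample while leaving small gradients untouched.
    """
    norm = np.linalg.norm(g)
    if norm > tau:
        return g * (tau / norm)  # shrink onto the ball of radius tau
    return g                     # already within the threshold
```

Clipping and normalization (as in the main paper above) address the same failure mode of plain SGD under heavy tails, but differ in that clipping preserves small gradients exactly, while normalization fixes the step length for every update.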
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.