Gradient-Variation Online Adaptivity for Accelerated Optimization with Hölder Smoothness
- URL: http://arxiv.org/abs/2511.02276v1
- Date: Tue, 04 Nov 2025 05:27:57 GMT
- Title: Gradient-Variation Online Adaptivity for Accelerated Optimization with Hölder Smoothness
- Authors: Yuheng Zhao, Yu-Hu Yan, Kfir Yehuda Levy, Peng Zhao
- Abstract summary: We investigate online learning with Hölder smooth functions, a general class encompassing both smooth and non-smooth functions. We design the corresponding gradient-variation online learning algorithm whose regret smoothly interpolates between the optimal guarantees in the smooth and non-smooth regimes. For offline strongly convex optimization, we integrate online adaptivity with a detection-based guess-and-check procedure, which, for the first time, yields a universal offline method.
- Score: 31.293577718154225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Smoothness is known to be crucial for acceleration in offline optimization, and for gradient-variation regret minimization in online learning. Interestingly, these two problems are actually closely connected -- accelerated optimization can be understood through the lens of gradient-variation online learning. In this paper, we investigate online learning with Hölder smooth functions, a general class encompassing both smooth and non-smooth (Lipschitz) functions, and explore its implications for offline optimization. For (strongly) convex online functions, we design the corresponding gradient-variation online learning algorithm whose regret smoothly interpolates between the optimal guarantees in smooth and non-smooth regimes. Notably, our algorithms do not require prior knowledge of the Hölder smoothness parameter, exhibiting strong adaptivity over existing methods. Through online-to-batch conversion, this gradient-variation online adaptivity yields an optimal universal method for stochastic convex optimization under Hölder smoothness. However, achieving universality in offline strongly convex optimization is more challenging. We address this by integrating online adaptivity with a detection-based guess-and-check procedure, which, for the first time, yields a universal offline method that achieves accelerated convergence in the smooth regime while maintaining near-optimal convergence in the non-smooth one.
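For reference, the Hölder smoothness condition discussed in the abstract is typically formulated as below; this is the standard definition, and the paper's exact notation and constants may differ.

```latex
% (L, \nu)-Hölder smoothness with exponent \nu in [0, 1]
% (standard formulation; notation may differ from the paper's):
\[
  \|\nabla f(x) - \nabla f(y)\| \;\le\; L\,\|x - y\|^{\nu}
  \qquad \text{for all } x, y .
\]
% \nu = 1 recovers the usual L-smooth case, where accelerated rates apply;
% \nu = 0 corresponds to non-smooth (Lipschitz) functions; intermediate \nu
% interpolates between the two regimes, matching the interpolating regret
% guarantees described above.
```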
Related papers
- Adaptive Matrix Online Learning through Smoothing with Guarantees for Nonsmooth Nonconvex Optimization [54.723834588133165]
We study online linear optimization with matrix variables, a setting where the geometry renders designing data-dependent and efficient adaptive algorithms challenging. We instantiate this framework with two efficient methods that avoid projections. We show both methods admit closed-form updates and match one-sided Shampoo's regret up to a constant factor, while significantly reducing computational cost.
arXiv Detail & Related papers (2026-02-09T03:09:47Z)
- A Tale of Two Geometries: Adaptive Optimizers and Non-Euclidean Descent [12.958474356193427]
Adaptive optimizers reduce to normalized steepest descent (NSD) when only the current gradient is used. In the convex setting, adaptive methods are governed by a stronger adaptive smoothness condition. We show that adaptive smoothness enables Nesterov-style acceleration of adaptive methods in the convex setting.
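As a point of reference, a minimal Euclidean instance of normalized steepest descent looks like the sketch below; this is illustrative only and is not the adaptive optimizers or non-Euclidean geometries analyzed in the paper.

```python
import numpy as np

def nsd_step(x, grad, eta=0.1, eps=1e-12):
    """One normalized steepest descent (NSD) step in the Euclidean norm:
    move a fixed distance eta along the negative gradient direction.
    Illustrative only; the paper works with general non-Euclidean norms."""
    norm = np.linalg.norm(grad)
    if norm <= eps:            # (near-)zero gradient: stay put
        return x
    return x - eta * grad / norm
```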
arXiv Detail & Related papers (2025-11-25T18:13:53Z) - Optimistic Online-to-Batch Conversions for Accelerated Convergence and Universality [61.14065222161553]
We study offline convex optimization with smooth objectives, where Nesterov's classical Accelerated Gradient method achieves the optimal convergence rate. We propose novel optimistic online-to-batch conversions that incorporate optimism into the analysis. We achieve the optimal accelerated convergence rate for strongly convex and smooth objectives for the first time through the lens of online-to-batch conversion.
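For context, the classical (non-optimistic) online-to-batch conversion that this line of work builds on can be sketched as follows, here instantiated with online gradient descent; `grad_oracle` and the step-size schedule are illustrative assumptions, and this is not the optimistic conversion proposed in the paper.

```python
import numpy as np

def online_to_batch_ogd(grad_oracle, x0, T, eta=0.1):
    """Classical online-to-batch conversion with online gradient descent:
    run OGD on the linearized losses and return the averaged iterate, whose
    suboptimality is bounded by the average regret. A minimal sketch, not
    the optimistic conversion from the paper."""
    x = np.asarray(x0, dtype=float)
    avg = np.zeros_like(x)
    for t in range(1, T + 1):
        g = grad_oracle(x)                # (stochastic) gradient at the current point
        avg += (x - avg) / t              # running average of online iterates
        x = x - eta / np.sqrt(t) * g      # OGD step with a 1/sqrt(t) step size
    return avg
```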
arXiv Detail & Related papers (2025-11-10T01:07:51Z) - Gradient-Variation Online Learning under Generalized Smoothness [56.38427425920781]
Gradient-variation online learning aims to achieve regret guarantees that scale with the variation in the gradients of the online functions.
Recent efforts in neural network optimization suggest a generalized smoothness condition, allowing smoothness to correlate with gradient norms.
We provide applications to fast-rate convergence in games and to extended adversarial optimization.
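The gradient-variation quantity that such regret bounds scale with is usually defined as below (standard notation from the gradient-variation literature; the paper's generalized-smoothness version may differ in its details).

```latex
\[
  V_T \;=\; \sum_{t=2}^{T} \sup_{x \in \mathcal{X}}
  \big\| \nabla f_t(x) - \nabla f_{t-1}(x) \big\|^2 ,
\]
% so regret remains small whenever consecutive online functions have
% slowly varying gradients, even if T is large.
```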
arXiv Detail & Related papers (2024-08-17T02:22:08Z)
- Optimal Guarantees for Algorithmic Reproducibility and Gradient Complexity in Convex Optimization [55.115992622028685]
Previous work suggests that first-order methods would need to trade off convergence rate (gradient complexity) for better reproducibility.
We demonstrate that both optimal complexity and near-optimal convergence guarantees can be achieved for smooth convex minimization and smooth convex-concave minimax problems.
arXiv Detail & Related papers (2023-10-26T19:56:52Z)
- A simple uniformly optimal method without line search for convex optimization [9.280355951055865]
We show that line search is superfluous in attaining the optimal rate of convergence for solving a convex optimization problem whose parameters are not given a priori.
We present a novel accelerated gradient descent-type algorithm called AC-FGM that can achieve an optimal $\mathcal{O}(1/k^2)$ rate of convergence for smooth convex optimization.
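For comparison, the textbook Nesterov accelerated gradient method attains the same $\mathcal{O}(1/k^2)$ rate but requires knowing the smoothness constant $L$, which AC-FGM avoids; a minimal sketch of that baseline is shown below.

```python
import numpy as np

def nesterov_agd(grad, x0, L, iters=100):
    """Textbook Nesterov acceleration for an L-smooth convex objective,
    attaining the O(1/k^2) rate. Shown only as a familiar baseline; it is
    not AC-FGM, which dispenses with knowledge of L and with line search."""
    x = np.asarray(x0, dtype=float)
    y, t = x.copy(), 1.0
    for _ in range(iters):
        x_next = y - grad(y) / L                         # gradient step at the extrapolated point
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_next + (t - 1) / t_next * (x_next - x)     # momentum extrapolation
        x, t = x_next, t_next
    return x
```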
arXiv Detail & Related papers (2023-10-16T05:26:03Z)
- ELRA: Exponential learning rate adaption gradient descent optimization method [83.88591755871734]
We present a novel, fast (exponential rate), ab initio (hyperparameter-free) gradient-based adaptation method.
The main idea of the method is to adapt the learning rate $\alpha$ by situational awareness.
It can be applied to problems of any dimension $n$ and scales only linearly.
arXiv Detail & Related papers (2023-09-12T14:36:13Z)
- Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization [70.4342220499858]
We introduce novel online algorithms that can exploit smoothness and replace the dependence on $T$ in dynamic regret with problem-dependent quantities.
Our results are adaptive to the intrinsic difficulty of the problem, since the bounds are tighter than existing results for easy problems and safeguard the same rate in the worst case.
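For reference, dynamic regret is measured against a time-varying comparator sequence, and problem-dependent bounds of this kind replace the worst-case dependence on $T$ with quantities such as the gradient variation; the notation below is the standard one and may differ from the paper's.

```latex
\[
  \mathrm{D\text{-}Regret}_T
  \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \sum_{t=1}^{T} f_t(u_t),
  \qquad
  P_T \;=\; \sum_{t=2}^{T} \|u_t - u_{t-1}\| ,
\]
% where u_1, ..., u_T is an arbitrary comparator sequence and P_T is its
% path length; problem-dependent bounds scale with quantities such as the
% gradient variation instead of T, while matching the worst-case rate.
```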
arXiv Detail & Related papers (2021-12-29T02:42:59Z)
- Adaptive Gradient Methods for Constrained Convex Optimization and Variational Inequalities [32.51470158863247]
Our main algorithms, AdaACSA and AdaAGD+, are accelerated methods for constrained convex optimization.
We complement them with a simpler algorithm AdaGrad+ which enjoys the same features, and achieves the standard non-accelerated convergence rate.
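As background, a plain diagonal AdaGrad update, the non-accelerated baseline that these adaptive methods build upon, looks roughly like the sketch below; it is the standard update only, not the paper's constrained or accelerated algorithms.

```python
import numpy as np

def adagrad_step(x, grad, accum, eta=0.1, eps=1e-8):
    """One diagonal AdaGrad step: per-coordinate step sizes shrink with the
    accumulated squared gradients. Standard baseline sketch only; the paper's
    AdaGrad+, AdaACSA, and AdaAGD+ additionally handle constraints and
    acceleration."""
    accum = accum + grad ** 2                        # running sum of squared gradients
    x_new = x - eta * grad / (np.sqrt(accum) + eps)  # per-coordinate scaled step
    return x_new, accum
```

Here `accum` would be initialized to zeros (e.g. `np.zeros_like(x0)`) and threaded through successive calls.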
arXiv Detail & Related papers (2020-07-17T09:10:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.