Improved Impossible Tuning and Lipschitz-Adaptive Universal Online Learning with Gradient Variations
- URL: http://arxiv.org/abs/2505.21095v1
- Date: Tue, 27 May 2025 12:22:21 GMT
- Title: Improved Impossible Tuning and Lipschitz-Adaptive Universal Online Learning with Gradient Variations
- Authors: Kei Takemura, Ryuta Matsuno, Keita Sakuma
- Abstract summary: A central goal in online learning is to achieve adaptivity to unknown problem characteristics. We propose a novel optimistic online mirror descent algorithm with an auxiliary initial round using large learning rates. We develop the first UOL algorithm that simultaneously achieves state-of-the-art GV bounds and LA under standard assumptions.
- Score: 1.9897061813159418
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A central goal in online learning is to achieve adaptivity to unknown problem characteristics, such as environmental changes captured by gradient variation (GV), function curvature (universal online learning, UOL), and gradient scales (Lipschitz adaptivity, LA). Simultaneously achieving these with optimal performance is a major challenge, partly due to limitations in algorithms for prediction with expert advice. These algorithms often serve as meta-algorithms in online ensemble frameworks, and their sub-optimality hinders overall UOL performance. Specifically, existing algorithms addressing the ``impossible tuning'' issue incur an excess $\sqrt{\log T}$ factor in their regret bound compared to the lower bound. To solve this problem, we propose a novel optimistic online mirror descent algorithm with an auxiliary initial round using large learning rates. This design enables a refined analysis where a generated negative term cancels the gap-related factor, resolving the impossible tuning issue up to $\log\log T$ factors. Leveraging our improved algorithm as a meta-algorithm, we develop the first UOL algorithm that simultaneously achieves state-of-the-art GV bounds and LA under standard assumptions. Our UOL result overcomes key limitations of prior works, notably resolving the conflict between LA mechanisms and regret analysis for GV bounds -- an open problem highlighted by Xie et al.
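The meta-algorithm described above builds on optimistic online mirror descent over experts. As a rough illustration only, the sketch below shows a generic optimistic Hedge-style meta-learner (optimistic OMD with a negative-entropy regularizer) combining base learners in an online-ensemble fashion. The function names, the fixed learning rate `eta`, and the loss/prediction interfaces are assumptions made for the example; the paper's auxiliary initial round with large learning rates and its refined negative-term cancellation analysis are not reproduced here.

```python
import numpy as np

def optimistic_hedge(loss_fns, base_preds, T, eta=0.1):
    """Minimal sketch of an optimistic Hedge-style meta-algorithm that
    combines base learners, in the spirit of online-ensemble frameworks
    for universal online learning (illustrative only).

    loss_fns[t](x) -> scalar loss of a combined prediction x at round t
    base_preds[t]  -> array of shape (K, d): the K base learners' predictions
    """
    K = base_preds[0].shape[0]
    w = np.full(K, 1.0 / K)          # weights over base learners (experts)
    m = np.zeros(K)                  # optimistic hint: previous expert losses
    total_loss = 0.0
    for t in range(T):
        # optimistic step: tilt the weights by the hint before predicting
        w_hat = w * np.exp(-eta * m)
        w_hat /= w_hat.sum()
        x_t = w_hat @ base_preds[t]  # combined prediction
        total_loss += loss_fns[t](x_t)
        # expert losses = losses of each base learner's own prediction
        ell = np.array([loss_fns[t](p) for p in base_preds[t]])
        # mirror-descent update on the realized losses, then refresh the hint
        w = w * np.exp(-eta * ell)
        w /= w.sum()
        m = ell
    return total_loss
```

In online-ensemble frameworks of this kind, the base learners typically run with different learning-rate or curvature configurations, and the meta-learner's regret against the best of them is what drives the universal guarantees.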
Related papers
- Learning to optimize with guarantees: a complete characterization of linearly convergent algorithms [1.4747234049753448]
In high-stakes engineering applications, optimization algorithms must come with provable worst-case guarantees over a mathematically defined class of problems. We describe the class of algorithms that achieve linear convergence for classes of nonsmooth composite optimization problems.
arXiv Detail & Related papers (2025-08-01T16:56:42Z) - Gradient-Variation Online Learning under Generalized Smoothness [56.38427425920781]
Gradient-variation online learning aims to achieve regret guarantees that scale with variations in the gradients of the online functions.
Recent efforts in neural network optimization suggest a generalized smoothness condition, allowing smoothness to correlate with gradient norms.
We provide applications to fast-rate convergence in games and extended adversarial optimization.
arXiv Detail & Related papers (2024-08-17T02:22:08Z) - Learning Constrained Optimization with Deep Augmented Lagrangian Methods [54.22290715244502]
A machine learning (ML) model is trained to emulate a constrained optimization solver.
This paper proposes an alternative approach, in which the ML model is trained to predict dual solution estimates directly.
This enables an end-to-end training scheme in which the dual objective serves as the loss function and solution estimates are driven toward primal feasibility, emulating a Dual Ascent method; a minimal sketch of classical Dual Ascent is given after this list.
arXiv Detail & Related papers (2024-03-06T04:43:22Z) - Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach [57.92727189589498]
We propose an online convex optimization approach with two different levels of adaptivity.
We obtain $\mathcal{O}(\log V_T)$, $\mathcal{O}(d \log V_T)$, and $\hat{\mathcal{O}}(\sqrt{V_T})$ regret bounds for strongly convex, exp-concave, and convex loss functions.
arXiv Detail & Related papers (2023-07-17T09:55:35Z) - Online Dynamic Submodular Optimization [0.0]
We propose new algorithms with provable performance for online binary optimization.
We numerically test our algorithms in two power system applications: fast-timescale demand response and real-time distribution network reconfiguration.
arXiv Detail & Related papers (2023-06-19T10:37:15Z) - Optimizing NOTEARS Objectives via Topological Swaps [41.18829644248979]
In this paper, we develop an effective method for a set of candidate algorithms.
At the inner level, given a topological order, we utilize off-the-shelf solvers to handle the constraints.
The proposed method significantly improves on the scores attained by other algorithms.
arXiv Detail & Related papers (2023-05-26T21:49:37Z) - Accelerated First-Order Optimization under Nonlinear Constraints [61.98523595657983]
We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms. An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - Learning-Augmented Algorithms for Online Linear and Semidefinite
Programming [9.849604820019394]
Semidefinite programming (SDP) is a unifying framework that generalizes both linear and quadratically-constrained programming.
There exist known impossibility results for approximating the optimal solution when constraints for covering SDPs arrive in an online fashion.
We show that if the predictor is accurate, we can efficiently bypass these impossibility results and achieve a constant-factor approximation to the optimal solution.
arXiv Detail & Related papers (2022-09-21T19:16:29Z) - Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization [70.4342220499858]
We introduce novel online algorithms that can exploit smoothness and replace the dependence on $T$ in dynamic regret with problem-dependent quantities.
Our results are adaptive to the intrinsic difficulty of the problem, since the bounds are tighter than existing results for easy problems and safeguard the same rate in the worst case.
arXiv Detail & Related papers (2021-12-29T02:42:59Z) - Online estimation and control with optimal pathlength regret [52.28457815067461]
A natural goal when designing online learning algorithms is to bound the regret of the algorithm in terms of the temporal variation of the input sequence.
Data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including OCO and bandits.
arXiv Detail & Related papers (2021-10-24T22:43:15Z) - An Accelerated Variance-Reduced Conditional Gradient Sliding Algorithm for First-order and Zeroth-order Optimization [111.24899593052851]
The conditional gradient algorithm (also known as the Frank-Wolfe algorithm) has recently regained popularity in the machine learning community; a minimal vanilla Frank-Wolfe sketch appears after this list.
ARCS is the first zeroth-order conditional gradient sliding-type algorithm for solving convex problems in zeroth-order optimization.
In first-order optimization, the convergence results of ARCS substantially outperform previous algorithms in terms of the number of gradient oracle queries.
arXiv Detail & Related papers (2021-09-18T07:08:11Z) - Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate in Gradient Descent [20.47598828422897]
We propose \textit{Meta-Regularization}, a novel approach for the adaptive choice of the learning rate in first-order descent methods.
Our approach modifies the objective function by adding a regularization term on the learning rate, and casts the joint update of the model parameters and the learning rate as a single optimization process.
arXiv Detail & Related papers (2021-04-12T13:13:34Z)
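As referenced in the Learning Constrained Optimization with Deep Augmented Lagrangian Methods entry above, the classical Dual Ascent method alternates a primal minimization of the Lagrangian with a gradient ascent step on the dual variables. The following is a minimal sketch on a toy equality-constrained quadratic program; the problem instance and function name are illustrative assumptions, not taken from that paper, which instead trains an ML model to predict the dual estimates.

```python
import numpy as np

def dual_ascent(Q, c, A, b, alpha=0.1, iters=500):
    """Classical Dual Ascent for: min 0.5*x'Qx + c'x  s.t.  Ax = b.

    Alternates an exact primal minimization of the Lagrangian with a
    gradient ascent step on the dual variables; the dual objective here
    plays the role that the learned model's loss plays in the paper above.
    """
    lam = np.zeros(A.shape[0])
    for _ in range(iters):
        # primal step: x minimizes L(x, lam) = f(x) + lam'(Ax - b)
        x = np.linalg.solve(Q, -(c + A.T @ lam))
        # dual step: the gradient of the dual objective is the constraint residual
        lam = lam + alpha * (A @ x - b)
    return x, lam

# toy instance (illustrative only): min x1^2 + x2^2 - 2*x1 - 4*x2  s.t.  x1 + x2 = 1
Q = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-2.0, -4.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
x_opt, lam_opt = dual_ascent(Q, c, A, b)
```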
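Likewise, the conditional gradient (Frank-Wolfe) method named in the ARCS entry above can be sketched in its vanilla form. The simplex domain, quadratic objective, and step-size schedule below are illustrative choices and do not represent the accelerated variance-reduced sliding variants studied in that paper.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, iters=200):
    """Vanilla Frank-Wolfe (conditional gradient) over the probability simplex.

    Each step solves a linear minimization oracle instead of a projection:
    on the simplex, the minimizer of <g, s> is the vertex (coordinate basis
    vector) at the smallest gradient entry.
    """
    x = x0.copy()
    for t in range(iters):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0            # linear minimization oracle
        gamma = 2.0 / (t + 2.0)          # standard step-size schedule
        x = (1 - gamma) * x + gamma * s
    return x

# illustrative objective: f(x) = 0.5 * ||x - y||^2, so grad f(x) = x - y
y = np.array([0.2, 0.5, 0.3])
x_star = frank_wolfe_simplex(lambda x: x - y, np.full(3, 1.0 / 3.0))
```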
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.