Revisiting Projection-Free Online Learning with Time-Varying Constraints
- URL: http://arxiv.org/abs/2501.16046v1
- Date: Mon, 27 Jan 2025 13:38:51 GMT
- Title: Revisiting Projection-Free Online Learning with Time-Varying Constraints
- Authors: Yibo Wang, Yuanyu Wan, Lijun Zhang
- Abstract summary: We investigate constrained online convex optimization, in which decisions must belong to a fixed and typically complicated domain.
Several projection-free methods have been proposed with an $\mathcal{O}(T^{3/4} \sqrt{\log T})$ regret bound and an $\mathcal{O}(T^{7/8})$ cumulative constraint violation (CCV) bound for general convex losses.
In this paper, we improve this result and further establish \textit{novel} regret and CCV bounds when loss functions are strongly convex.
- Score: 35.573654458435854
- Abstract: We investigate constrained online convex optimization, in which decisions must belong to a fixed and typically complicated domain, and are required to approximately satisfy additional time-varying constraints over the long term. In this setting, the commonly used projection operations are often computationally expensive or even intractable. To avoid the time-consuming operation, several projection-free methods have been proposed with an $\mathcal{O}(T^{3/4} \sqrt{\log T})$ regret bound and an $\mathcal{O}(T^{7/8})$ cumulative constraint violation (CCV) bound for general convex losses. In this paper, we improve this result and further establish \textit{novel} regret and CCV bounds when loss functions are strongly convex. The primary idea is to first construct a composite surrogate loss, involving the original loss and constraint functions, by utilizing the Lyapunov-based technique. Then, we propose a parameter-free variant of the classical projection-free method, namely online Frank-Wolfe (OFW), and run this new extension over the online-generated surrogate loss. Theoretically, for general convex losses, we achieve an $\mathcal{O}(T^{3/4})$ regret bound and an $\mathcal{O}(T^{3/4} \log T)$ CCV bound, both of which are order-wise tighter than existing results. For strongly convex losses, we establish new guarantees of an $\mathcal{O}(T^{2/3})$ regret bound and an $\mathcal{O}(T^{5/6})$ CCV bound. Moreover, we also extend our methods to a more challenging setting with bandit feedback, obtaining similar theoretical findings. Empirically, experiments on real-world datasets have demonstrated the effectiveness of our methods.
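Below is a minimal sketch of the pipeline described in the abstract (play a point, update a Lyapunov-style virtual queue with the observed constraint violation, then take a single linear-optimization step on a composite surrogate), assuming a linear minimization oracle `lmo` over the domain and user-supplied callables `loss_grad` and `constraint`. The penalty weight `lam` and step size `gamma` are illustrative placeholders, not the paper's parameter-free schedule or exact surrogate construction.

```python
import numpy as np

def projection_free_oco(T, d, lmo, loss_grad, constraint):
    """Sketch of projection-free OCO with time-varying constraints.

    Each round uses one call to a linear minimization oracle (a
    Frank-Wolfe-style step), so no projection onto the domain is computed.
    Constraint violations are tracked by a Lyapunov-style virtual queue Q
    that inflates the constraint term in the composite surrogate gradient.
    """
    x = lmo(np.zeros(d))            # any feasible starting point returned by the oracle
    Q = 0.0                         # virtual queue: accumulated constraint violation
    lam = T ** 0.25                 # hypothetical penalty weight (not the paper's choice)
    for t in range(1, T + 1):
        g_val, g_grad = constraint(t, x)      # observe g_t(x_t) and a (sub)gradient
        f_grad = loss_grad(t, x)              # observe a (sub)gradient of f_t at x_t
        # gradient of the composite surrogate: loss + queue-weighted hinge of g_t
        surrogate_grad = f_grad + lam * Q * g_grad * float(g_val > 0)
        v = lmo(surrogate_grad)               # argmin_v <surrogate_grad, v> over the domain
        gamma = 2.0 / (t + 2)                 # classical OFW-style step size
        x = (1 - gamma) * x + gamma * v       # convex combination keeps x in the domain
        Q = max(Q + g_val, 0.0)               # Lyapunov queue update
    return x

# Example domain: probability simplex, whose LMO just picks the best vertex.
# lmo = lambda g: np.eye(len(g))[np.argmin(g)]
```

This is only a sketch under the stated assumptions; the paper's analysis hinges on the specific surrogate and parameter-free OFW variant it constructs, which are not reproduced here.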
Related papers
- An Optimistic Algorithm for Online Convex Optimization with Adversarial Constraints [55.2480439325792]
We study Online Convex Optimization (OCO) with adversarial constraints.
We focus on a setting where the algorithm has access to predictions of the loss and constraint functions.
Our results show that we can improve the current best bounds of $O(\sqrt{T})$ regret and $\tilde{O}(\sqrt{T})$ cumulative constraint violations.
arXiv Detail & Related papers (2024-12-11T03:06:42Z) - Optimistic Safety for Online Convex Optimization with Unknown Linear Constraints [31.526232903811533]
We introduce an algorithm that we term Optimistically Safe OCO (OSOCO) and show it enjoys $\tilde{O}(\sqrt{T})$ regret and no constraint violation.
In the case of static linear constraints, this improves on the previous best known $\tilde{O}(T^{2/3})$ regret under the same assumptions.
In the case of time-varying constraints, our work supplements existing results that show $O(\sqrt{T})$ regret and $O(\sqrt{T})$ cumulative violation.
arXiv Detail & Related papers (2024-03-09T04:01:39Z) - Multi-point Feedback of Bandit Convex Optimization with Hard Constraints [1.8130068086063336]
We study bandit convex optimization with constraints, where the learner aims to generate a sequence of decisions under partial information of loss functions.
We adopt the cumulative \textit{hard} constraint violation as the metric of constraint violation.
Our algorithm attains $O(d^2 T^{\max\{c, 1-c\}})$ regret bounds and $O(d^2 T^{1-\frac{c}{2}})$ cumulative hard constraint violation bounds for convex loss functions and time-varying constraints.
arXiv Detail & Related papers (2023-10-17T02:43:22Z) - Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach [57.92727189589498]
We propose an online convex optimization approach with two different levels of adaptivity.
We obtain $\mathcal{O}(\log V_T)$, $\mathcal{O}(d \log V_T)$ and $\hat{\mathcal{O}}(\sqrt{V_T})$ regret bounds for strongly convex, exp-concave and convex loss functions.
arXiv Detail & Related papers (2023-07-17T09:55:35Z) - Riemannian Projection-free Online Learning [5.918057694291832]
The projection operation is a critical component in a range of optimization algorithms, such as online gradient descent (OGD).
It suffers from computational limitations in high-dimensional settings or when dealing with ill-conditioned constraint sets.
This paper presents methods for obtaining sub-linear regret guarantees in online geodesically convex optimization on curved spaces.
arXiv Detail & Related papers (2023-05-30T18:22:09Z) - Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints [24.97580261894342]
This paper considers online convex optimization with long term constraints, where constraints can be violated in intermediate rounds, but need to be satisfied in the long run.
A novel algorithm is first proposed and it achieves an $\mathcal{O}(T^{\max\{c, 1-c\}})$ bound for static regret and an $\mathcal{O}(T^{(1-c)/2})$ bound for cumulative constraint violation.
arXiv Detail & Related papers (2021-06-09T15:18:06Z) - Projection-free Online Learning over Strongly Convex Sets [24.517908972536432]
We study the special case of online learning over strongly convex sets, for which we first prove that OFW can enjoy a better regret bound of $O(T^{2/3})$ for general convex losses.
We show that it achieves a regret bound of $O(T^{2/3})$ over general convex sets and a better regret bound of $O(\sqrt{T})$ over strongly convex sets.
arXiv Detail & Related papers (2020-10-16T05:42:50Z) - Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation [107.06364966905821]
We study the exploration-exploitation dilemma in the linear quadratic regulator (LQR) setting.
Inspired by the extended value iteration algorithm used in optimistic algorithms for finite MDPs, we propose to relax the optimistic optimization of OFU-LQ.
We show that an $\epsilon$-optimistic controller can be computed efficiently by solving at most $O\big(\log(1/\epsilon)\big)$ Riccati equations.
arXiv Detail & Related papers (2020-07-13T16:30:47Z) - Dynamic Regret of Convex and Smooth Functions [93.71361250701075]
We investigate online convex optimization in non-stationary environments.
We choose the dynamic regret as the performance measure.
We show that it is possible to further enhance the dynamic regret by exploiting the smoothness condition.
arXiv Detail & Related papers (2020-07-07T14:10:57Z) - Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss [145.54544979467872]
We consider online learning for episodically constrained Markov decision processes (CMDPs).
We propose a new \emph{upper confidence} primal-dual algorithm, which only requires the trajectories sampled from the transition model.
Our analysis incorporates a new high-probability drift analysis of Lagrange multiplier processes into the celebrated regret analysis of upper confidence reinforcement learning.
arXiv Detail & Related papers (2020-03-02T05:02:23Z)