On Dynamic Regret and Constraint Violations in Constrained Online Convex
Optimization
- URL: http://arxiv.org/abs/2301.09808v1
- Date: Tue, 24 Jan 2023 04:22:13 GMT
- Title: On Dynamic Regret and Constraint Violations in Constrained Online Convex
Optimization
- Authors: Rahul Vaze
- Abstract summary: We propose an algorithm that follows projected gradient descent over a suitably chosen set around the current action.
We show that both the dynamic regret and the constraint violation is order-wise bounded by the it path-length, the sum of the distances between the consecutive optimal actions.
- Score: 17.412117389855222
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A constrained version of the online convex optimization (OCO) problem is
considered. With slotted time, for each slot, first an action is chosen.
Subsequently the loss function and the constraint violation penalty evaluated
at the chosen action point is revealed. For each slot, both the loss function
as well as the function defining the constraint set is assumed to be smooth and
strongly convex. In addition, once an action is chosen, local information about
a feasible set within a small neighborhood of the current action is also
revealed. An algorithm is allowed to compute at most one gradient at its point
of choice given the described feedback to choose the next action. The goal of
an algorithm is to simultaneously minimize the dynamic regret (loss incurred
compared to the oracle's loss) and the constraint violation penalty (penalty
accrued compared to the oracle's penalty). We propose an algorithm that follows
projected gradient descent over a suitably chosen set around the current
action. We show that both the dynamic regret and the constraint violation is
order-wise bounded by the {\it path-length}, the sum of the distances between
the consecutive optimal actions. Moreover, we show that the derived bounds are
the best possible.
Related papers
- Distributed Online Bandit Nonconvex Optimization with One-Point Residual Feedback via Dynamic Regret [10.700891331004799]
This paper considers the distributed online bandit optimization problem with non loss functions over a time-varying digraph.
Players select an adversary and then the adversary assigns an arbitrary non-linear loss function to this player.
The expected regret of our algorithms is comparable to existing algorithms that use two-point deviation.
arXiv Detail & Related papers (2024-09-24T02:37:33Z) - Multi-point Feedback of Bandit Convex Optimization with Hard Constraints [1.8130068086063336]
We study bandit convex optimization with constraints, where the learner aims to generate a sequence of decisions under partial information of loss functions.
We adopt the cumulative textithard constraint violation as the metric of constraint violation.
Our algorithm attains $O(d2Tmaxc,1-c)$ regret bounds and $O(d2T1-fracc2)$ cumulative hard constraint violation bounds for convex loss functions and time-varying constraints.
arXiv Detail & Related papers (2023-10-17T02:43:22Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - A Unifying Framework for Online Optimization with Long-Term Constraints [62.35194099438855]
We study online learning problems in which a decision maker has to take a sequence of decisions subject to $m$ long-term constraints.
The goal is to maximize their total reward, while at the same time achieving small cumulative violation across the $T$ rounds.
We present the first best-of-both-world type algorithm for this general class problems, with no-regret guarantees both in the case in which rewards and constraints are selected according to an unknown model, and in the case in which they are selected at each round by an adversary.
arXiv Detail & Related papers (2022-09-15T16:59:19Z) - On Constraints in First-Order Optimization: A View from Non-Smooth
Dynamical Systems [99.59934203759754]
We introduce a class of first-order methods for smooth constrained optimization.
Two distinctive features of our approach are that projections or optimizations over the entire feasible set are avoided.
The resulting algorithmic procedure is simple to implement even when constraints are nonlinear.
arXiv Detail & Related papers (2021-07-17T11:45:13Z) - Regret and Cumulative Constraint Violation Analysis for Online Convex
Optimization with Long Term Constraints [24.97580261894342]
This paper considers online convex optimization with long term constraints, where constraints can be violated in intermediate rounds, but need to be satisfied in the long run.
A novel algorithm is first proposed and it achieves an $mathcalO(Tmaxc,1-c)$ bound for static regret and an $mathcalO(T(1-c)/2)$ bound for cumulative constraint violation.
arXiv Detail & Related papers (2021-06-09T15:18:06Z) - Regret and Cumulative Constraint Violation Analysis for Distributed
Online Constrained Convex Optimization [24.97580261894342]
This paper considers the distributed online convex optimization problem with time-varying constraints over a network of agents.
Two algorithms with full-information and bandit feedback are proposed.
arXiv Detail & Related papers (2021-05-01T18:28:53Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z) - Dynamic Regret of Convex and Smooth Functions [93.71361250701075]
We investigate online convex optimization in non-stationary environments.
We choose the dynamic regret as the performance measure.
We show that it is possible to further enhance the dynamic regret by exploiting the smoothness condition.
arXiv Detail & Related papers (2020-07-07T14:10:57Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.