Distributed Online Convex Optimization with Adversarial Constraints:
Reduced Cumulative Constraint Violation Bounds under Slater's Condition
- URL: http://arxiv.org/abs/2306.00149v1
- Date: Wed, 31 May 2023 19:39:15 GMT
- Title: Distributed Online Convex Optimization with Adversarial Constraints:
Reduced Cumulative Constraint Violation Bounds under Slater's Condition
- Authors: Xinlei Yi, Xiuxian Li, Tao Yang, Lihua Xie, Yiguang Hong, Tianyou
Chai, and Karl H. Johansson
- Abstract summary: This paper considers distributed online convex optimization with adversarial constraints.
Agents collaborate to minimize network regret and cumulative constraint violation.
To the best of our knowledge, this paper is the first to achieve reduced (network) cumulative constraint violation bounds for (distributed) online convex optimization with adversarial constraints.
- Score: 29.809415829907525
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper considers distributed online convex optimization with adversarial
constraints. In this setting, a network of agents makes decisions at each
round, and then only a portion of the loss function and a coordinate block of
the constraint function are privately revealed to each agent. The loss and
constraint functions are convex and can vary arbitrarily across rounds. The
agents collaborate to minimize network regret and cumulative constraint
violation. A novel distributed online algorithm is proposed and it achieves an
$\mathcal{O}(T^{\max\{c,1-c\}})$ network regret bound and an
$\mathcal{O}(T^{1-c/2})$ network cumulative constraint violation bound, where
$T$ is the number of rounds and $c\in(0,1)$ is a user-defined trade-off
parameter. When Slater's condition holds (i.e., there is a point that strictly
satisfies the inequality constraints), the network cumulative constraint
violation bound is reduced to $\mathcal{O}(T^{1-c})$. Moreover, if the loss
functions are strongly convex, then the network regret bound is reduced to
$\mathcal{O}(\log(T))$, and the network cumulative constraint violation bound
is reduced to $\mathcal{O}(\sqrt{\log(T)T})$ and $\mathcal{O}(\log(T))$ without
and with Slater's condition, respectively. To the best of our knowledge, this
paper is the first to achieve reduced (network) cumulative constraint violation
bounds for (distributed) online convex optimization with adversarial
constraints under Slater's condition. Finally, the theoretical results are
verified through numerical simulations.
Related papers
- Nearly Optimal Regret for Decentralized Online Convex Optimization [53.433398074919]
Decentralized online convex optimization (D-OCO) aims to minimize a sequence of global loss functions using only local computations and communications.
We develop novel D-OCO algorithms that can respectively reduce the regret bounds for convex and strongly convex functions.
Our algorithms are nearly optimal in terms of $T$, $n$, and $\rho$.
arXiv Detail & Related papers (2024-02-14T13:44:16Z)
- Multi-point Feedback of Bandit Convex Optimization with Hard Constraints [1.8130068086063336]
We study bandit convex optimization with constraints, where the learner aims to generate a sequence of decisions under partial information of loss functions.
We adopt the cumulative hard constraint violation as the metric of constraint violation (see the note on violation metrics after this list).
Our algorithm attains $O(d^2T^{\max\{c,1-c\}})$ regret bounds and $O(d^2T^{1-\frac{c}{2}})$ cumulative hard constraint violation bounds for convex loss functions and time-varying constraints.
arXiv Detail & Related papers (2023-10-17T02:43:22Z)
- Rectified Pessimistic-Optimistic Learning for Stochastic Continuum-armed Bandit with Constraints [4.879346089164413]
We optimize a black-box reward function $f(x)$ subject to a black-box constraint function $g(x)\leq 0$ over a continuous space.
We propose a Rectified Pessimistic-Optimistic Learning framework (RPOL), a penalty-based method incorporating optimistic and pessimistic GP bandit learning for reward and constraint functions.
arXiv Detail & Related papers (2022-11-27T04:28:16Z)
- The Sample Complexity of Online Contract Design [120.9833763323407]
We study the hidden-action principal-agent problem in an online setting.
In each round, the principal posts a contract that specifies the payment to the agent based on each outcome.
The agent then makes a strategic choice of action that maximizes her own utility, but the action is not directly observable by the principal.
arXiv Detail & Related papers (2022-11-10T17:59:42Z)
- Private Isotonic Regression [54.32252900997422]
We consider the problem of isotonic regression over a partially ordered set (poset) $\mathcal{X}$ and for any Lipschitz loss function.
We obtain a pure-DP algorithm that has an expected excess empirical risk of roughly $\mathrm{width}(\mathcal{X}) \cdot \log|\mathcal{X}| / n$, where $\mathrm{width}(\mathcal{X})$ is the width of the poset.
We show that the bounds above are essentially the best that can be achieved.
arXiv Detail & Related papers (2022-10-27T05:08:07Z)
- Towards Painless Policy Optimization for Constrained MDPs [46.12526917024248]
We study policy optimization in an infinite-horizon, $\gamma$-discounted constrained Markov decision process (CMDP).
Our objective is to return a policy that achieves large expected reward with a small constraint violation.
We propose a generic primal-dual framework that allows us to bound the reward sub-optimality and constraint violation for arbitrary algorithms.
arXiv Detail & Related papers (2022-04-11T15:08:09Z)
- Distributed Saddle-Point Problems Under Similarity [173.19083235638104]
We show that a given suboptimality $\epsilon > 0$ is achieved over master/workers networks in $\Omega(\cdot)$ rounds.
We then propose algorithms matching the lower bounds for either type of network (up to logarithmic factors).
We assess the effectiveness of the proposed algorithms on a robust logistic regression problem.
arXiv Detail & Related papers (2021-07-22T14:25:16Z)
- Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints [24.97580261894342]
This paper considers online convex optimization with long term constraints, where constraints can be violated in intermediate rounds, but need to be satisfied in the long run.
A novel algorithm is first proposed and it achieves an $\mathcal{O}(T^{\max\{c,1-c\}})$ bound for static regret and an $\mathcal{O}(T^{(1-c)/2})$ bound for cumulative constraint violation.
arXiv Detail & Related papers (2021-06-09T15:18:06Z)
- Regret and Cumulative Constraint Violation Analysis for Distributed Online Constrained Convex Optimization [24.97580261894342]
This paper considers the distributed online convex optimization problem with time-varying constraints over a network of agents.
Two algorithms with full-information and bandit feedback are proposed.
arXiv Detail & Related papers (2021-05-01T18:28:53Z)
- Gradient-Variation Bound for Online Convex Optimization with Constraints [25.002868073267464]
We study online convex optimization with constraints consisting of multiple functional constraints and a relatively simple constraint set, such as a Euclidean ball.
First-order methods achieve an $\mathcal{O}(\sqrt{T})$ regret and an $\mathcal{O}(1)$ constraint violation, but do not take into account the structural information of the problem.
In this paper, we provide an instance-dependent bound for online convex optimization with complex constraints obtained by a novel online primal-dual mirror-prox algorithm.
arXiv Detail & Related papers (2020-06-22T17:38:14Z)
- Stochastic Bandits with Linear Constraints [69.757694218456]
We study a constrained contextual linear bandit setting, where the goal of the agent is to produce a sequence of policies.
We propose an upper-confidence bound algorithm for this problem, called optimistic pessimistic linear bandit (OPLB).
arXiv Detail & Related papers (2020-06-17T22:32:19Z)