Unconstrained Robust Online Convex Optimization
- URL: http://arxiv.org/abs/2506.12781v1
- Date: Sun, 15 Jun 2025 09:21:15 GMT
- Title: Unconstrained Robust Online Convex Optimization
- Authors: Jiujia Zhang, Ashok Cutkosky
- Abstract summary: We focus on the difficult ``unconstrained'' setting in which our algorithm must maintain low regret. Our algorithms guarantee regret $\|u\|G(\sqrt{T} + k)$ when $G \ge \max_t \|g_t\|$ is known, where $k$ is a measure of the total amount of corruption.
- Score: 38.9652176513385
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses online learning with ``corrupted'' feedback. Our learner is provided with potentially corrupted gradients $\tilde g_t$ instead of the ``true'' gradients $g_t$. We make no assumptions about how the corruptions arise: they could be the result of outliers, mislabeled data, or even malicious interference. We focus on the difficult ``unconstrained'' setting in which our algorithm must maintain low regret with respect to any comparison point $u \in \mathbb{R}^d$. The unconstrained setting is significantly more challenging as existing algorithms suffer extremely high regret even with very tiny amounts of corruption (which is not true in the case of a bounded domain). Our algorithms guarantee regret $ \|u\|G (\sqrt{T} + k) $ when $G \ge \max_t \|g_t\|$ is known, where $k$ is a measure of the total amount of corruption. When $G$ is unknown we incur an extra additive penalty of $(\|u\|^2+G^2) k$.
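As a rough illustration of the feedback model described in the abstract (not the paper's algorithm), the sketch below simulates $T$ rounds in which a stand-in learner (plain online gradient descent) only ever observes the corrupted gradients $\tilde g_t$, while regret against a comparator $u$ is tallied with the true gradients $g_t$. The corruption pattern, step size, and all constants are hypothetical choices made for this example.

```python
# Minimal sketch of the corrupted-feedback OCO protocol (illustrative only).
# The update rule is plain OGD as a stand-in; the paper's algorithms are
# unconstrained (parameter-free) methods with the stated ||u|| G (sqrt(T)+k) bound.
import numpy as np

rng = np.random.default_rng(0)
T, d, G = 1000, 5, 1.0                   # horizon, dimension, known gradient bound
u = rng.normal(size=d)                   # an arbitrary comparison point in R^d

w = np.zeros(d)                          # learner's iterate w_t
loss_learner, loss_u = 0.0, 0.0

for t in range(1, T + 1):
    g = rng.normal(size=d)
    g *= G / max(np.linalg.norm(g), G)   # enforce ||g_t|| <= G

    corrupted = (t % 50 == 0)            # roughly k = T/50 corrupted rounds (hypothetical)
    g_tilde = -10.0 * g if corrupted else g  # learner only ever sees g_tilde

    loss_learner += g @ w                # linearized losses use the TRUE gradients
    loss_u += g @ u

    w = w - (1.0 / np.sqrt(t)) * g_tilde # stand-in OGD update on the corrupted gradient

print("linearized regret vs u:", loss_learner - loss_u)
```

Even in this toy setup, the occasional flipped-and-scaled gradients can drag the OGD iterate far from where the true gradients point, which is the failure mode the paper's corruption-robust unconstrained algorithms are designed to avoid.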
Related papers
- Efficient Near-Optimal Algorithm for Online Shortest Paths in Directed Acyclic Graphs with Bandit Feedback Against Adaptive Adversaries [34.38978643261337]
We study the online shortest path problem in directed acyclic graphs (DAGs) under bandit feedback against an adaptive adversary. We propose the first computationally efficient algorithm to achieve a near-minimax optimal regret bound of $\tilde{O}(\sqrt{|E| T \log |X|})$ with high probability against any adaptive adversary.
arXiv Detail & Related papers (2025-04-01T06:35:42Z) - Unconstrained Online Learning with Unbounded Losses [45.801991005821804]
We develop a new setting for online learning with unbounded domains and non-Lipschitz losses.
We provide an algorithm which guarantees $R_T(u) \le \tilde{O}(G\|u\|\sqrt{T} + L\|u\|^2\sqrt{T})$ regret on any problem where the subgradients satisfy $\|g_t\| \le G + L\|w_t\|$.
arXiv Detail & Related papers (2023-06-08T03:55:33Z) - Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial
Corruptions [98.75618795470524]
We study the linear contextual bandit problem in the presence of adversarial corruption.
We propose a new algorithm based on the principle of optimism in the face of uncertainty.
arXiv Detail & Related papers (2022-05-13T17:58:58Z) - Corralling a Larger Band of Bandits: A Case Study on Switching Regret
for Linear Bandits [99.86860277006318]
We consider the problem of combining and learning over a set of adversarial algorithms with the goal of adaptively tracking the best one on the fly.
The CORRAL algorithm of Agarwal et al. achieves this goal with a regret overhead of order $\widetilde{O}(\sqrt{MT})$, where $M$ is the number of base algorithms and $T$ is the time horizon.
Motivated by this issue, we propose a new recipe to corral a larger band of bandit algorithms whose regret overhead has only logarithmic dependence on $M$, as long as certain conditions are satisfied.
arXiv Detail & Related papers (2022-02-12T21:55:44Z) - Logarithmic Regret from Sublinear Hints [76.87432703516942]
We show that an algorithm can obtain $O(\log T)$ regret with just $O(\sqrt{T})$ hints under a natural query model.
We also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret.
arXiv Detail & Related papers (2021-11-09T16:50:18Z) - Corruption Robust Active Learning [43.279169081740726]
We conduct theoretical studies on streaming-based active learning for binary classification under unknown adversarial label corruptions.
We show that, in a benign corruption setting, the classical RobustCAL framework can achieve nearly the same label complexity guarantee as in the non-corrupted setting.
We propose a new algorithm which is provably correct without any assumptions on the presence of corruptions.
arXiv Detail & Related papers (2021-06-21T16:06:38Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - Near-Optimal Regret Bounds for Contextual Combinatorial Semi-Bandits
with Linear Payoff Functions [53.77572276969548]
We show that the C$^2$UCB algorithm has the optimal regret bound $\tilde{O}(d\sqrt{kT} + dk)$ for the partition matroid constraints.
For general constraints, we propose an algorithm that modifies the reward estimates of arms in the C$2$UCB algorithm.
arXiv Detail & Related papers (2021-01-20T04:29:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.