Related papers: Beyond Worst-Case Online Classification: VC-Based Regret Bounds for Relaxed Benchmarks

Beyond Worst-Case Online Classification: VC-Based Regret Bounds for Relaxed Benchmarks

URL: http://arxiv.org/abs/2504.10598v1
Date: Mon, 14 Apr 2025 18:00:23 GMT
Title: Beyond Worst-Case Online Classification: VC-Based Regret Bounds for Relaxed Benchmarks
Authors: Omar Montasser, Abhishek Shetty, Nikita Zhivotovskiy,
Abstract summary: We revisit online binary classification by shifting the focus from competing with the best-in-class binary loss to competing against relaxed benchmarks.<n>We consider comparing predictors that are robust to small input perturbations, perform well under Gaussian smoothing, or maintain a prescribed output margin.<n>Our algorithms achieve regret guarantees that depend only on the VC dimension and the complexity of the instance space.
Score: 19.642496463491053
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We revisit online binary classification by shifting the focus from competing with the best-in-class binary loss to competing against relaxed benchmarks that capture smoothed notions of optimality. Instead of measuring regret relative to the exact minimal binary error -- a standard approach that leads to worst-case bounds tied to the Littlestone dimension -- we consider comparing with predictors that are robust to small input perturbations, perform well under Gaussian smoothing, or maintain a prescribed output margin. Previous examples of this were primarily limited to the hinge loss. Our algorithms achieve regret guarantees that depend only on the VC dimension and the complexity of the instance space (e.g., metric entropy), and notably, they incur only an $O(\log(1/\gamma))$ dependence on the generalized margin $\gamma$. This stands in contrast to most existing regret bounds, which typically exhibit a polynomial dependence on $1/\gamma$. We complement this with matching lower bounds. Our analysis connects recent ideas from adversarial robustness and smoothed online learning.

Related papers

Non-stationary Bandit Convex Optimization: A Comprehensive Study [28.086802754034828]
Bandit Convex Optimization is a class of sequential decision-making problems.<n>We study this problem in non-stationary environments.<n>We aim to minimize the regret under three standard measures of non-stationarity.
arXiv Detail & Related papers (2025-06-03T15:18:41Z)
An Equivalence Between Static and Dynamic Regret Minimization [10.812831455376218]
We show that for linear losses, dynamic regret minimization is equivalent to static regret minimization in an extended decision space. We provide an algorithm guaranteeing dynamic regret of the form $R_T(u_1,dots,u_T)le tilde.
arXiv Detail & Related papers (2024-06-03T17:54:58Z)
No-Regret is not enough! Bandits with General Constraints through Adaptive Regret Minimization [26.415300249303748]
We show that it is possible to circumvent the issue of sublinear violations of constraints by requiring the primal and dual algorithm to be weakly adaptive. In the first case, we show that the algorithm guarantees sublinear regret. In the latter case, we establish a tight competitive ratio of $rho/(1+rho)$. This results allow us to obtain new result for the problem of contextual bandits with linear constraints.
arXiv Detail & Related papers (2024-05-10T16:22:33Z)
Adversarial Contextual Bandits Go Kernelized [21.007410990554522]
We study a generalization of the problem of online learning in adversarial linear contextual bandits by incorporating loss functions that belong to a Hilbert kernel space. We propose a new optimistically biased estimator for the loss functions and reproducing near-optimal regret guarantees.
arXiv Detail & Related papers (2023-10-02T19:59:39Z)
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits [53.281230333364505]
This paper studies the problem of contextual dueling bandits, where the binary comparison of dueling arms is generated from a generalized linear model (GLM) We propose a new SupLinUCB-type algorithm that enjoys computational efficiency and a variance-aware regret bound $tilde Obig(dsqrtsum_t=1Tsigma_t2 + dbig)$. Our regret bound naturally aligns with the intuitive expectation in scenarios where the comparison is deterministic, the algorithm only suffers from an $tilde O(d)$ regret.
arXiv Detail & Related papers (2023-10-02T08:15:52Z)
Data-Dependent Bounds for Online Portfolio Selection Without Lipschitzness and Smoothness [2.315156126698557]
We introduce the first instances of data-dependent bounds for online convex optimization with non-Lipschitz, non-smooth losses. The algorithms we propose exhibit sublinear regret rates in the worst cases and achieve logarithmic regrets when the data is "easy"
arXiv Detail & Related papers (2023-05-23T11:16:01Z)
The Best of Both Worlds: Reinforcement Learning with Logarithmic Regret and Policy Switches [84.54669549718075]
We study the problem of regret minimization for episodic Reinforcement Learning (RL) We focus on learning with general function classes and general model classes. We show that a logarithmic regret bound is realizable by algorithms with $O(log T)$ switching cost.
arXiv Detail & Related papers (2022-03-03T02:55:55Z)
Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization [70.4342220499858]
We introduce novel online algorithms that can exploit smoothness and replace the dependence on $T$ in dynamic regret with problem-dependent quantities. Our results are adaptive to the intrinsic difficulty of the problem, since the bounds are tighter than existing results for easy problems and safeguard the same rate in the worst case.
arXiv Detail & Related papers (2021-12-29T02:42:59Z)
Dynamic Regret for Strongly Adaptive Methods and Optimality of Online KRR [13.165557713537389]
We show that Strongly Adaptive (SA) algorithms can be viewed as a principled way of controlling dynamic regret. We derive a new lower bound on a certain penalized regret which establishes the near minimax optimality of online Kernel Ridge Regression (KRR)
arXiv Detail & Related papers (2021-11-22T21:52:47Z)
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning [52.76230802067506]
A novel model-free algorithm is proposed to minimize regret in episodic reinforcement learning. The proposed algorithm employs an em early-settled reference update rule, with the aid of two Q-learning sequences. The design principle of our early-settled variance reduction method might be of independent interest to other RL settings.
arXiv Detail & Related papers (2021-10-09T21:13:48Z)
On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization [55.937424268654645]
We consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm. We provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability.
arXiv Detail & Related papers (2020-08-20T03:48:14Z)
Dynamic Regret of Convex and Smooth Functions [93.71361250701075]
We investigate online convex optimization in non-stationary environments. We choose the dynamic regret as the performance measure. We show that it is possible to further enhance the dynamic regret by exploiting the smoothness condition.
arXiv Detail & Related papers (2020-07-07T14:10:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.