Simple Combinatorial Algorithms for Combinatorial Bandits: Corruptions
and Approximations
- URL: http://arxiv.org/abs/2106.06712v1
- Date: Sat, 12 Jun 2021 08:14:53 GMT
- Title: Simple Combinatorial Algorithms for Combinatorial Bandits: Corruptions
and Approximations
- Authors: Haike Xu, Jian Li
- Abstract summary: We provide a simple algorithm that can achieve a regret of $tildeOleft(C+d2K/Delta_minright)$ where $C$ is the total amount of corruptions.
Our algorithm is very simple, only worse than the best known regret bound by $sqrtd$, and has much lower oracle complexity than previous work.
- Score: 6.2140753425381785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the stochastic combinatorial semi-bandit problem with adversarial
corruptions. We provide a simple combinatorial algorithm that can achieve a
regret of $\tilde{O}\left(C+d^2K/\Delta_{min}\right)$ where $C$ is the total
amount of corruptions, $d$ is the maximal number of arms one can play in each
round, $K$ is the number of arms. If one selects only one arm in each round, we
achieves a regret of $\tilde{O}\left(C+\sum_{\Delta_i>0}(1/\Delta_i)\right)$.
Our algorithm is combinatorial and improves on the previous combinatorial
algorithm by [Gupta et al., COLT2019] (their bound is
$\tilde{O}\left(KC+\sum_{\Delta_i>0}(1/\Delta_i)\right)$), and almost matches
the best known bounds obtained by [Zimmert et al., ICML2019] and [Zimmert and
Seldin, AISTATS2019] (up to logarithmic factor). Note that the algorithms in
[Zimmert et al., ICML2019] and [Zimmert and Seldin, AISTATS2019] require one to
solve complex convex programs while our algorithm is combinatorial, very easy
to implement, requires weaker assumptions and has very low oracle complexity
and running time. We also study the setting where we only get access to an
approximation oracle for the stochastic combinatorial semi-bandit problem. Our
algorithm achieves an (approximation) regret bound of
$\tilde{O}\left(d\sqrt{KT}\right)$. Our algorithm is very simple, only worse
than the best known regret bound by $\sqrt{d}$, and has much lower oracle
complexity than previous work.
Related papers
- The Oracle Complexity of Simplex-based Matrix Games: Linear Separability and Nash Equilibria [37.300102993926046]
We study the problem of solving matrix games of the form $max_mathbfwinmathcalWmin_mathbfpinDeltamathbfptopAmathbfw$.
This problem encapsulates canonical tasks such as finding a linear separator and computing Nash equilibria in zero-sum games.
arXiv Detail & Related papers (2024-12-09T20:58:26Z) - Dueling Optimization with a Monotone Adversary [35.850072415395395]
We study the problem of dueling optimization with a monotone adversary, which is a generalization of (noiseless) dueling convex optimization.
The goal is to design an online algorithm to find a minimizer $mathbfx*$ for a function $fcolon X to mathbbRd.
arXiv Detail & Related papers (2023-11-18T23:55:59Z) - Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback [30.337826346496385]
We study online reinforcement learning in linear Markov decision processes with adversarial losses and bandit feedback.
We introduce two algorithms that achieve improved regret performance compared to existing approaches.
arXiv Detail & Related papers (2023-10-17T19:43:37Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $mathbbRd$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/epsilon)O(k)$, whereepsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Private estimation algorithms for stochastic block models and mixture
models [63.07482515700984]
General tools for designing efficient private estimation algorithms.
First efficient $(epsilon, delta)$-differentially private algorithm for both weak recovery and exact recovery.
arXiv Detail & Related papers (2023-01-11T09:12:28Z) - Corralling a Larger Band of Bandits: A Case Study on Switching Regret
for Linear Bandits [99.86860277006318]
We consider the problem of combining and learning over a set of adversarial algorithms with the goal of adaptively tracking the best one on the fly.
The CORRAL of Agarwal et al. achieves this goal with a regret overhead of order $widetildeO(sqrtd S T)$ where $M$ is the number of base algorithms and $T$ is the time horizon.
Motivated by this issue, we propose a new recipe to corral a larger band of bandit algorithms whose regret overhead has only emphlogarithmic dependence on $M$ as long
arXiv Detail & Related papers (2022-02-12T21:55:44Z) - Lifted Primal-Dual Method for Bilinearly Coupled Smooth Minimax
Optimization [47.27237492375659]
We study the bilinearly coupled minimax problem: $min_x max_y f(x) + ytop A x - h(y)$, where $f$ and $h$ are both strongly convex smooth functions.
No known first-order algorithms have hitherto achieved the lower complexity bound of $Omega(sqrtfracL_xmu_x + frac|A|sqrtmu_x,mu_y) log(frac1vareps
arXiv Detail & Related papers (2022-01-19T05:56:19Z) - Contextual Recommendations and Low-Regret Cutting-Plane Algorithms [49.91214213074933]
We consider the following variant of contextual linear bandits motivated by routing applications in navigational engines and recommendation systems.
We design novel cutting-plane algorithms with low "regret" -- the total distance between the true point $w*$ and the hyperplanes the separation oracle returns.
arXiv Detail & Related papers (2021-06-09T05:39:05Z) - Near-Optimal Regret Bounds for Contextual Combinatorial Semi-Bandits
with Linear Payoff Functions [53.77572276969548]
We show that the C$2$UCB algorithm has the optimal regret bound $tildeO(dsqrtkT + dk)$ for the partition matroid constraints.
For general constraints, we propose an algorithm that modifies the reward estimates of arms in the C$2$UCB algorithm.
arXiv Detail & Related papers (2021-01-20T04:29:18Z) - Practical and Parallelizable Algorithms for Non-Monotone Submodular
Maximization with Size Constraint [20.104148319012854]
We present and parallelizable for a submodular function, not necessarily a monotone, with respect to a size constraint.
We improve the best approximation factor achieved by an algorithm that has optimal adaptivity and nearly optimal complexity query to $0.193 - varepsilon$.
arXiv Detail & Related papers (2020-09-03T22:43:55Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Statistically Efficient, Polynomial Time Algorithms for Combinatorial
Semi Bandits [3.9919781077062497]
We consider semi-bandits over a set of arms $cal X subset 0,1d$ where rewards are uncorrelated across items.
For this problem, the algorithm ESCB yields the smallest known regret bound $R(T) = cal OBig( d (ln m)2 (ln T) over Delta_min Big)$, but it has computational complexity $cal O(|cal X|)$ which is typically exponential in $d$, and cannot
arXiv Detail & Related papers (2020-02-17T21:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.