Statistically Efficient, Polynomial Time Algorithms for Combinatorial
Semi Bandits
- URL: http://arxiv.org/abs/2002.07258v2
- Date: Wed, 13 Jan 2021 17:12:58 GMT
- Title: Statistically Efficient, Polynomial Time Algorithms for Combinatorial
Semi Bandits
- Authors: Thibaut Cuvelier and Richard Combes and Eric Gourdin
- Abstract summary: We consider semi-bandits over a set of arms $cal X subset 0,1d$ where rewards are uncorrelated across items.
For this problem, the algorithm ESCB yields the smallest known regret bound $R(T) = cal OBig( d (ln m)2 (ln T) over Delta_min Big)$, but it has computational complexity $cal O(|cal X|)$ which is typically exponential in $d$, and cannot
- Score: 3.9919781077062497
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider combinatorial semi-bandits over a set of arms ${\cal X} \subset
\{0,1\}^d$ where rewards are uncorrelated across items. For this problem, the
algorithm ESCB yields the smallest known regret bound $R(T) = {\cal O}\Big( {d
(\ln m)^2 (\ln T) \over \Delta_{\min} }\Big)$, but it has computational
complexity ${\cal O}(|{\cal X}|)$ which is typically exponential in $d$, and
cannot be used in large dimensions. We propose the first algorithm which is
both computationally and statistically efficient for this problem with regret
$R(T) = {\cal O} \Big({d (\ln m)^2 (\ln T)\over \Delta_{\min} }\Big)$ and
computational complexity ${\cal O}(T {\bf poly}(d))$. Our approach involves
carefully designing an approximate version of ESCB with the same regret
guarantees, showing that this approximate algorithm can be implemented in time
${\cal O}(T {\bf poly}(d))$ by repeatedly maximizing a linear function over
${\cal X}$ subject to a linear budget constraint, and showing how to solve this
maximization problems efficiently.
Related papers
- Implicit High-Order Moment Tensor Estimation and Learning Latent Variable Models [39.33814194788341]
We study the task of learning latent-variable models.
Motivated by such applications, we develop a general efficient algorithm for implicit moment computation.
By leveraging our general algorithm, we obtain the first-time learners for the following models.
arXiv Detail & Related papers (2024-11-23T23:13:24Z) - Dueling Optimization with a Monotone Adversary [35.850072415395395]
We study the problem of dueling optimization with a monotone adversary, which is a generalization of (noiseless) dueling convex optimization.
The goal is to design an online algorithm to find a minimizer $mathbfx*$ for a function $fcolon X to mathbbRd.
arXiv Detail & Related papers (2023-11-18T23:55:59Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $Omega(d1/2/(maxp, epsilon)2)$.
Our lower bound suggests that this quadratic dependence on $1/epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Memory-Constrained Algorithms for Convex Optimization via Recursive
Cutting-Planes [23.94542304111204]
First class of algorithms that provides a positive trade-off between gradient descent and cutting-plane methods in any regime with $epsilonleq 1/sqrt d$.
In the regime $epsilon leq d-Omega(d)$, our algorithm with $p=d$ achieves the information-theoretic optimal memory usage and improves the oracle-complexity of gradient descent.
arXiv Detail & Related papers (2023-06-16T17:00:51Z) - An Optimal Algorithm for Strongly Convex Min-min Optimization [79.11017157526815]
Existing optimal first-order methods require $mathcalO(sqrtmaxkappa_x,kappa_y log 1/epsilon)$ of computations of both $nabla_x f(x,y)$ and $nabla_y f(x,y)$.
We propose a new algorithm that only requires $mathcalO(sqrtkappa_x log 1/epsilon)$ of computations of $nabla_x f(x,
arXiv Detail & Related papers (2022-12-29T19:26:12Z) - Computational Complexity of Normalizing Constants for the Product of
Determinantal Point Processes [12.640283469603357]
We study the computational complexity of computing the normalizing constant.
We show that $sum_Sdet(bf A_S,S)p$ exactly for every (fixed) positive even integer $p$ is UP-hard and Mod$_3$P-hard.
arXiv Detail & Related papers (2021-11-28T14:08:25Z) - Contextual Recommendations and Low-Regret Cutting-Plane Algorithms [49.91214213074933]
We consider the following variant of contextual linear bandits motivated by routing applications in navigational engines and recommendation systems.
We design novel cutting-plane algorithms with low "regret" -- the total distance between the true point $w*$ and the hyperplanes the separation oracle returns.
arXiv Detail & Related papers (2021-06-09T05:39:05Z) - An Efficient Pessimistic-Optimistic Algorithm for Constrained Linear
Bandits [16.103685478563072]
This paper considers linear bandits with general constraints.
The objective is to maximize the expected cumulative reward over horizon $T$ subject to a set of constraints in each round.
We propose a pessimistic-optimistic algorithm for this problem, which is efficient in two aspects.
arXiv Detail & Related papers (2021-02-10T07:30:37Z) - Near-Optimal Regret Bounds for Contextual Combinatorial Semi-Bandits
with Linear Payoff Functions [53.77572276969548]
We show that the C$2$UCB algorithm has the optimal regret bound $tildeO(dsqrtkT + dk)$ for the partition matroid constraints.
For general constraints, we propose an algorithm that modifies the reward estimates of arms in the C$2$UCB algorithm.
arXiv Detail & Related papers (2021-01-20T04:29:18Z) - Hybrid Stochastic-Deterministic Minibatch Proximal Gradient:
Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
We show that HSDMPG can attain an $mathcalObig (1/sttnbig)$ which is at the order of excess error on a learning model.
For loss factors, we prove that HSDMPG can attain an $mathcalObig (1/sttnbig)$ which is at the order of excess error on a learning model.
arXiv Detail & Related papers (2020-09-18T02:18:44Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.