A Simple Unified Framework for High Dimensional Bandit Problems
- URL: http://arxiv.org/abs/2102.09626v1
- Date: Thu, 18 Feb 2021 21:35:32 GMT
- Title: A Simple Unified Framework for High Dimensional Bandit Problems
- Authors: Wenjie Li and Adarsh Barik and Jean Honorio
- Abstract summary: We present a general analysis framework for the regret upper bound of our algorithm and show that the algorithm can be applied to different high dimensional bandit problems.
- Score: 33.139925285802825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic high dimensional bandit problems with low dimensional structure
are useful in different applications such as online advertising and drug
discovery. In this work, we propose a simple unified algorithm for such
problems and present a general analysis framework for the regret upper bound of
our algorithm. We show that under some mild unified assumptions, our algorithm
can be applied to different high dimensional bandit problems. Our framework
utilizes the low dimensional structure to guide the parameter estimation in the
problem; as a result, our algorithm achieves the best regret bounds in the LASSO
bandit, better bounds in the low-rank matrix bandit and the group sparse matrix
bandit, as well as a novel bound in the multi-agent LASSO bandit.
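To make the estimation-guided arm selection concrete, below is a minimal sketch of a sparse linear (LASSO) bandit loop. This is an illustrative sketch under assumed choices (forced-exploration length, the decaying penalty schedule, a greedy selection rule, and scikit-learn's Lasso as the estimator), not the paper's exact algorithm.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Minimal sparse linear bandit loop (an illustrative sketch, not the paper's
# exact algorithm): an L1-regularized estimate of the unknown sparse parameter
# guides which arm to pull, exploiting the low dimensional structure.
rng = np.random.default_rng(0)
d, K, T, s = 100, 10, 500, 5             # ambient dim, arms, horizon, sparsity
theta = np.zeros(d)
theta[:s] = 1.0                          # unknown s-sparse parameter

X_hist, y_hist = [], []
theta_hat = np.zeros(d)
for t in range(1, T + 1):
    contexts = rng.normal(size=(K, d))   # one feature vector per arm
    if t <= 20:                          # short forced exploration (assumed length)
        arm = int(rng.integers(K))
    else:                                # greedy w.r.t. the current LASSO estimate
        arm = int(np.argmax(contexts @ theta_hat))
    reward = contexts[arm] @ theta + 0.1 * rng.normal()
    X_hist.append(contexts[arm])
    y_hist.append(reward)
    lam = 0.1 * np.sqrt(np.log(d) / t)   # decaying penalty, a common LASSO-bandit choice
    theta_hat = Lasso(alpha=lam, max_iter=5000).fit(
        np.array(X_hist), np.array(y_hist)).coef_
```

The decaying penalty reflects how the sparsity assumption keeps estimation feasible even when the ambient dimension d exceeds the number of rounds; the matrix bandit variants mentioned in the abstract would swap the L1 penalty for a nuclear-norm or group penalty.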
Related papers
- Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits [15.342585350280535]
We study how representation learning can improve the learning efficiency of contextual bandit problems.
We present a new algorithm based on alternating projected gradient descent (GD) and a minimization estimator.
arXiv Detail & Related papers (2024-10-02T22:30:29Z)
- Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits [55.938644481736446]
Indexed Minimum Empirical Divergence (IMED) is a highly effective approach to the multi-armed bandit problem.
It has been observed to empirically outperform UCB-based algorithms and Thompson Sampling.
We present novel linear versions of the IMED algorithm, which we call the family of LinIMED algorithms.
arXiv Detail & Related papers (2024-05-24T04:11:58Z)
- An Optimal Algorithm for the Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit [65.268245109828]
We study the real-valued combinatorial pure exploration problem in the multi-armed bandit setting (R-CPE-MAB).
Existing methods for the R-CPE-MAB can be seen as a special case of so-called transductive linear bandits.
We propose an algorithm named the gap-based exploration (CombGapE) algorithm, whose sample complexity matches the lower bound.
arXiv Detail & Related papers (2023-06-15T15:37:31Z)
- Optimal Gradient-based Algorithms for Non-concave Bandit Optimization [76.57464214864756]
This work considers a large family of bandit problems where the unknown underlying reward function is non-concave.
Our algorithms are based on a unified zeroth-order optimization paradigm that applies in great generality.
We show that the standard optimistic algorithms are sub-optimal by dimension factors.
arXiv Detail & Related papers (2021-07-09T16:04:24Z)
- Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms [74.55200180156906]
The contextual bandit problem models the trade-off between exploration and exploitation.
We show our Syndicated Bandits framework can achieve the optimal regret upper bounds.
arXiv Detail & Related papers (2021-06-05T22:30:21Z)
- Upper Confidence Bounds for Combining Stochastic Bandits [52.10197476419621]
We provide a simple method to combine bandit algorithms.
Our approach is based on a "meta-UCB" procedure that treats each of $N$ individual bandit algorithms as arms in a higher-level $N$-armed bandit problem (a minimal sketch of this idea follows the list below).
arXiv Detail & Related papers (2020-12-24T05:36:29Z)
- Approximation Theory Based Methods for RKHS Bandits [9.391375268580806]
The RKHS bandit problem is an online optimization problem of non-linear functions with noisy feedback.
No general algorithm was previously known for the adversarial RKHS bandit problem.
We propose efficient algorithms for the RKHS bandit problem, along with the first general algorithm for its adversarial variant.
arXiv Detail & Related papers (2020-10-23T05:14:21Z)
- A Novel Confidence-Based Algorithm for Structured Bandits [129.30402124516507]
We study finite-armed bandits where the rewards of each arm might be correlated to those of other arms.
We introduce a novel phased algorithm that exploits the given structure to build confidence sets over the parameters of the true bandit problem.
arXiv Detail & Related papers (2020-05-23T19:52:44Z)
- Bandit algorithms to emulate human decision making using probabilistic distortions [20.422725678982726]
We formulate two example multi-armed bandit problems with distorted probabilities on the reward distributions.
We consider the aforementioned problems in the regret minimization as well as best arm identification framework for multi-armed bandits.
arXiv Detail & Related papers (2016-11-30T17:37:51Z)
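For the "Upper Confidence Bounds for Combining Stochastic Bandits" entry above, the meta-UCB idea is easy to sketch: run UCB1 over the base algorithms themselves, treating each one as a meta-arm. The base-algorithm interface, the epsilon-greedy stand-ins, and the UCB1 constant below are illustrative assumptions; the actual procedure needs additional care (e.g., how feedback is shared with unchosen base algorithms) to obtain the paper's guarantees.

```python
import math
import random

class EpsGreedyBase:
    """A toy base algorithm over K arms (stands in for any bandit algorithm)."""
    def __init__(self, K, eps):
        self.K, self.eps = K, eps
        self.counts = [0] * K
        self.sums = [0.0] * K
    def select(self):
        if random.random() < self.eps or 0 in self.counts:
            return random.randrange(self.K)
        return max(range(self.K), key=lambda a: self.sums[a] / self.counts[a])
    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward

def meta_ucb(bases, pull_arm, T):
    """Run UCB1 over the base algorithms; each meta-arm is one base algorithm."""
    n = [0] * len(bases)      # times each base algorithm was chosen
    s = [0.0] * len(bases)    # cumulative reward earned while following it
    for t in range(1, T + 1):
        if 0 in n:
            i = n.index(0)    # play each base algorithm once first
        else:
            i = max(range(len(bases)),
                    key=lambda j: s[j] / n[j] + math.sqrt(2 * math.log(t) / n[j]))
        arm = bases[i].select()     # delegate the arm choice to the chosen base
        r = pull_arm(arm)
        bases[i].update(arm, r)     # only the chosen base learns this round
        n[i] += 1
        s[i] += r
    return s

# Usage: combine two epsilon-greedy learners on a 5-armed Bernoulli bandit.
means = [0.1, 0.3, 0.5, 0.7, 0.9]
pull = lambda a: 1.0 if random.random() < means[a] else 0.0
meta_ucb([EpsGreedyBase(5, 0.1), EpsGreedyBase(5, 0.3)], pull, T=2000)
```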
This list is automatically generated from the titles and abstracts of the papers on this site.