A Simple Unified Framework for High Dimensional Bandit Problems
- URL: http://arxiv.org/abs/2102.09626v1
- Date: Thu, 18 Feb 2021 21:35:32 GMT
- Title: A Simple Unified Framework for High Dimensional Bandit Problems
- Authors: Wenjie Li and Adarsh Barik and Jean Honorio
- Abstract summary: We present a general analysis framework for the regret upper bound of our algorithm and show that the algorithm can be applied to different high dimensional bandit problems.
- Score: 33.139925285802825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic high dimensional bandit problems with low dimensional structure
are useful in different applications such as online advertising and drug
discovery. In this work, we propose a simple unified algorithm for such
problems and present a general analysis framework for the regret upper bound of
our algorithm. We show that under some mild unified assumptions, our algorithm
can be applied to different high dimensional bandit problems. Our framework
utilizes the low dimensional structure to guide the parameter estimation in the
problem; as a result, our algorithm achieves the best regret bounds in the LASSO
bandit, better bounds in the low-rank matrix bandit and the group sparse matrix
bandit, as well as a novel bound in the multi-agent LASSO bandit.
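To make the estimation-guided arm selection concrete, below is a minimal sketch of a sparse linear (LASSO) bandit loop. This is an illustrative sketch under assumed choices (forced-exploration length, the decaying penalty schedule, a greedy selection rule, and scikit-learn's Lasso as the estimator), not the paper's exact algorithm.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Minimal sparse linear bandit loop (an illustrative sketch, not the paper's
# exact algorithm): an L1-regularized estimate of the unknown sparse parameter
# guides which arm to pull, exploiting the low dimensional structure.
rng = np.random.default_rng(0)
d, K, T, s = 100, 10, 500, 5             # ambient dim, arms, horizon, sparsity
theta = np.zeros(d)
theta[:s] = 1.0                          # unknown s-sparse parameter

X_hist, y_hist = [], []
theta_hat = np.zeros(d)
for t in range(1, T + 1):
    contexts = rng.normal(size=(K, d))   # one feature vector per arm
    if t <= 20:                          # short forced exploration (assumed length)
        arm = int(rng.integers(K))
    else:                                # greedy w.r.t. the current LASSO estimate
        arm = int(np.argmax(contexts @ theta_hat))
    reward = contexts[arm] @ theta + 0.1 * rng.normal()
    X_hist.append(contexts[arm])
    y_hist.append(reward)
    lam = 0.1 * np.sqrt(np.log(d) / t)   # decaying penalty, a common LASSO-bandit choice
    theta_hat = Lasso(alpha=lam, max_iter=5000).fit(
        np.array(X_hist), np.array(y_hist)).coef_
```

The decaying penalty reflects how the sparsity assumption keeps estimation feasible even when the ambient dimension d exceeds the number of rounds; the matrix bandit variants mentioned in the abstract would swap the L1 penalty for a nuclear-norm or group penalty.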
Related papers
- Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits [15.342585350280535]
We study how representation learning can improve the learning efficiency of contextual bandit problems.
We present a new algorithm based on alternating projected gradient descent (GD) and a minimization estimator.
arXiv Detail & Related papers (2024-10-02T22:30:29Z)
- Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits [55.938644481736446]
Indexed Minimum Empirical Divergence (IMED) is a highly effective approach to the multi-armed bandit problem.
It has been observed to empirically outperform UCB-based algorithms and Thompson Sampling.
We present novel linear versions of the IMED algorithm, which we call the family of LinIMED algorithms.
arXiv Detail & Related papers (2024-05-24T04:11:58Z)
- An Optimal Algorithm for the Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit [65.268245109828]
We study the real-valued combinatorial pure exploration problem in the multi-armed bandit setting (R-CPE-MAB).
Existing methods for the R-CPE-MAB can be seen as a special case of so-called transductive linear bandits.
We propose an algorithm named the gap-based exploration (CombGapE) algorithm, whose sample complexity matches the lower bound.
arXiv Detail & Related papers (2023-06-15T15:37:31Z)
- Optimal Gradient-based Algorithms for Non-concave Bandit Optimization [76.57464214864756]
This work considers a large family of bandit problems where the unknown underlying reward function is non-concave.
Our algorithms are based on a unified zeroth-order optimization paradigm that applies in great generality.
We show that the standard optimistic algorithms are sub-optimal by dimension factors.
arXiv Detail & Related papers (2021-07-09T16:04:24Z)
- Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms [74.55200180156906]
The contextual bandit problem models the trade-off between exploration and exploitation.
We show our Syndicated Bandits framework can achieve the optimal regret upper bounds.
arXiv Detail & Related papers (2021-06-05T22:30:21Z)
- Upper Confidence Bounds for Combining Stochastic Bandits [52.10197476419621]
We provide a simple method to combine bandit algorithms.
Our approach is based on a "meta-UCB" procedure that treats each of $N$ individual bandit algorithms as arms in a higher-level $N$-armed bandit problem (a minimal sketch of this idea follows the list below).
arXiv Detail & Related papers (2020-12-24T05:36:29Z)
- Approximation Theory Based Methods for RKHS Bandits [9.391375268580806]
The RKHS bandit problem is an online optimization problem of non-linear functions with noisy feedback.
No general algorithm was previously known for the adversarial RKHS bandit problem.
We propose efficient algorithms for the RKHS bandit problem, along with the first general algorithm for its adversarial variant.
arXiv Detail & Related papers (2020-10-23T05:14:21Z)
- A Novel Confidence-Based Algorithm for Structured Bandits [129.30402124516507]
We study finite-armed bandits where the rewards of each arm might be correlated to those of other arms.
We introduce a novel phased algorithm that exploits the given structure to build confidence sets over the parameters of the true bandit problem.
arXiv Detail & Related papers (2020-05-23T19:52:44Z)
- Bandit algorithms to emulate human decision making using probabilistic distortions [20.422725678982726]
We formulate two example multi-armed bandit problems with distorted probabilities on the reward distributions.
We consider the aforementioned problems in the regret minimization as well as best arm identification framework for multi-armed bandits.
arXiv Detail & Related papers (2016-11-30T17:37:51Z)
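For the "Upper Confidence Bounds for Combining Stochastic Bandits" entry above, the meta-UCB idea is easy to sketch: run UCB1 over the base algorithms themselves, treating each one as a meta-arm. The base-algorithm interface, the epsilon-greedy stand-ins, and the UCB1 constant below are illustrative assumptions; the actual procedure needs additional care (e.g., how feedback is shared with unchosen base algorithms) to obtain the paper's guarantees.

```python
import math
import random

class EpsGreedyBase:
    """A toy base algorithm over K arms (stands in for any bandit algorithm)."""
    def __init__(self, K, eps):
        self.K, self.eps = K, eps
        self.counts = [0] * K
        self.sums = [0.0] * K
    def select(self):
        if random.random() < self.eps or 0 in self.counts:
            return random.randrange(self.K)
        return max(range(self.K), key=lambda a: self.sums[a] / self.counts[a])
    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward

def meta_ucb(bases, pull_arm, T):
    """Run UCB1 over the base algorithms; each meta-arm is one base algorithm."""
    n = [0] * len(bases)      # times each base algorithm was chosen
    s = [0.0] * len(bases)    # cumulative reward earned while following it
    for t in range(1, T + 1):
        if 0 in n:
            i = n.index(0)    # play each base algorithm once first
        else:
            i = max(range(len(bases)),
                    key=lambda j: s[j] / n[j] + math.sqrt(2 * math.log(t) / n[j]))
        arm = bases[i].select()     # delegate the arm choice to the chosen base
        r = pull_arm(arm)
        bases[i].update(arm, r)     # only the chosen base learns this round
        n[i] += 1
        s[i] += r
    return s

# Usage: combine two epsilon-greedy learners on a 5-armed Bernoulli bandit.
means = [0.1, 0.3, 0.5, 0.7, 0.9]
pull = lambda a: 1.0 if random.random() < means[a] else 0.0
meta_ucb([EpsGreedyBase(5, 0.1), EpsGreedyBase(5, 0.3)], pull, T=2000)
```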
This list is automatically generated from the titles and abstracts of the papers on this site.