Towards Minimax Optimal Best Arm Identification in Linear Bandits
- URL: http://arxiv.org/abs/2105.13017v1
- Date: Thu, 27 May 2021 09:19:10 GMT
- Title: Towards Minimax Optimal Best Arm Identification in Linear Bandits
- Authors: Junwen Yang, Vincent Y. F. Tan
- Abstract summary: We study the problem of best arm identification in linear bandits in the fixed-budget setting.
By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm.
We provide a theoretical analysis of the failure probability of OD-LinBAI.
- Score: 95.22854522340938
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of best arm identification in linear bandits in the
fixed-budget setting. By leveraging properties of the G-optimal design and
incorporating it into the arm allocation rule, we design a parameter-free
algorithm, Optimal Design-based Linear Best Arm Identification (OD-LinBAI). We
provide a theoretical analysis of the failure probability of OD-LinBAI. While
the performances of existing methods (e.g., BayesGap) depend on all the
optimality gaps, OD-LinBAI depends on the gaps of the top $d$ arms, where $d$
is the effective dimension of the linear bandit instance. Furthermore, we
present a minimax lower bound for this problem. The upper and lower bounds show
that OD-LinBAI is minimax optimal up to multiplicative factors in the exponent.
Finally, numerical experiments corroborate our theoretical findings.
Related papers
- Minimum Empirical Divergence for Sub-Gaussian Linear Bandits [10.750348548547704]
LinMED is a randomized algorithm that admits a closed-form computation of the arm sampling probabilities.
Our empirical study shows that LinMED has a competitive performance with the state-of-the-art algorithms.
arXiv Detail & Related papers (2024-10-31T21:54:44Z) - Optimal Multi-Fidelity Best-Arm Identification [65.23078799972188]
In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible.
We study multi-fidelity best-arm identification, in which the can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost.
Several methods have been proposed for tackling this problem, but their optimality remain elusive, notably due to loose lower bounds on the total cost needed to identify the best arm.
arXiv Detail & Related papers (2024-06-05T08:02:40Z) - Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits [55.938644481736446]
Indexed Minimum Empirical Divergence (IMED) is a highly effective approach to the multi-armed bandit problem.
It has been observed to empirically outperform UCB-based algorithms and Thompson Sampling.
We present novel linear versions of the IMED algorithm, which we call the family of LinIMED algorithms.
arXiv Detail & Related papers (2024-05-24T04:11:58Z) - Fixed-Budget Best-Arm Identification in Sparse Linear Bandits [69.6194614504832]
We study the best-arm identification problem in sparse linear bandits under the fixed-budget setting.
We design a two-phase algorithm, Lasso and Optimal-Design- (Lasso-OD) based linear best-arm identification.
We show that Lasso-OD is almost minimax optimal in the exponent.
arXiv Detail & Related papers (2023-11-01T12:32:17Z) - PopArt: Efficient Sparse Regression and Experimental Design for Optimal
Sparse Linear Bandits [29.097522376094624]
We propose a simple and computationally efficient sparse linear estimation method called PopArt.
We derive sparse linear bandit algorithms that enjoy improved regret upper bounds upon the state of the art.
arXiv Detail & Related papers (2022-10-25T19:13:20Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best identification in discounted linear Markov+Delta Decision in the fixed confidence setting under a generative model.
The lower bound as the solution of an intricate non- optimization program can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Fixed-Budget Best-Arm Identification in Structured Bandits [33.27743152847947]
Best-arm identification (BAI) in a fixed-budget setting is a bandit problem where the learning agent maximizes the probability of identifying the optimal (best) arm after a fixed number of observations.
We propose a general tractable algorithm that incorporates the structure, by eliminating suboptimal arms based on their mean reward estimates from a joint generalization model.
arXiv Detail & Related papers (2021-06-09T01:32:43Z) - Optimal Best-arm Identification in Linear Bandits [79.3239137440876]
We devise a simple algorithm whose sampling complexity matches known instance-specific lower bounds.
Unlike existing best-arm identification strategies, our algorithm uses a stopping rule that does not depend on the number of arms.
arXiv Detail & Related papers (2020-06-29T14:25:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.