Bayesian Fixed-Budget Best-Arm Identification
- URL: http://arxiv.org/abs/2211.08572v3
- Date: Thu, 15 Jun 2023 13:29:25 GMT
- Title: Bayesian Fixed-Budget Best-Arm Identification
- Authors: Alexia Atsidakou, Sumeet Katariya, Sujay Sanghavi, Branislav Kveton
- Abstract summary: Fixed-budget best-arm identification (BAI) is a bandit problem where the agent maximizes the probability of identifying the optimal arm within a fixed budget of observations.
We propose a Bayesian elimination algorithm and derive an upper bound on its probability of misidentifying the optimal arm.
- Score: 24.31655036648236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fixed-budget best-arm identification (BAI) is a bandit problem where the
agent maximizes the probability of identifying the optimal arm within a fixed
budget of observations. In this work, we study this problem in the Bayesian
setting. We propose a Bayesian elimination algorithm and derive an upper bound
on its probability of misidentifying the optimal arm. The bound reflects the
quality of the prior and is the first distribution-dependent bound in this
setting. We prove it using a frequentist-like argument, where we carry the
prior through, and then integrate out the bandit instance at the end. We also
provide a lower bound on the probability of misidentification in a $2$-armed
Bayesian bandit and show that our upper bound (almost) matches it for any
budget. Our experiments show that Bayesian elimination is superior to
frequentist methods and competitive with the state-of-the-art Bayesian
algorithms that have no guarantees in our setting.
Related papers
- UCB Exploration for Fixed-Budget Bayesian Best Arm Identification [0.0]
We study best-arm identification (BAI) in the fixed-budget setting.
We propose an UCB exploration algorithm that is both theoretically and empirically efficient for the fixed budget BAI problem under a Bayesian setting.
arXiv Detail & Related papers (2024-08-09T05:15:36Z) - Pure Exploration for Constrained Best Mixed Arm Identification with a Fixed Budget [6.22018632187078]
We introduce the constrained best mixed arm identification (CBMAI) problem with a fixed budget.
The goal is to find the best mixed arm that maximizes the expected reward subject to constraints on the expected costs with a given learning budget $N$.
We provide a theoretical upper bound on the mis-identification (of the the support of the best mixed arm) probability and show that it decays exponentially in the budget $N$.
arXiv Detail & Related papers (2024-05-23T22:35:11Z) - Fixed Confidence Best Arm Identification in the Bayesian Setting [6.083234045523298]
We consider the fixed-confidence best arm identification (FC-BAI) problem in the Bayesian setting.
This problem aims to find the arm of the largest mean with a fixed confidence level when the bandit model has been sampled from the known prior.
arXiv Detail & Related papers (2024-02-16T03:36:03Z) - Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
In particular, we present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z) - Best Arm Identification under Additive Transfer Bandits [49.69203462561861]
We study the setting when, despite the means being unknown, there is a known additive relationship between the source and target MAB instances.
We propose and theoretically analyze an LUCB-style algorithm to identify an $epsilon$-optimal target arm with high probability.
arXiv Detail & Related papers (2021-12-08T02:20:18Z) - Mean-based Best Arm Identification in Stochastic Bandits under Reward
Contamination [80.53485617514707]
This paper proposes two algorithms, a gap-based algorithm and one based on the successive elimination, for best arm identification in sub-Gaussian bandits.
Specifically, for the gap-based algorithm, the sample complexity is optimal up to constant factors, while for the successive elimination, it is optimal up to logarithmic factors.
arXiv Detail & Related papers (2021-11-14T21:49:58Z) - Guaranteed Fixed-Confidence Best Arm Identification in Multi-Armed
Bandit [1.5469452301122177]
We consider the problem of finding, through adaptive sampling, which of n populations (arms) has the largest mean.
Our objective is to determine a rule which identifies the best population with a fixed minimum confidence using as few observations as possible.
arXiv Detail & Related papers (2021-06-12T20:05:29Z) - Towards Minimax Optimal Best Arm Identification in Linear Bandits [95.22854522340938]
We study the problem of best arm identification in linear bandits in the fixed-budget setting.
By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm.
We provide a theoretical analysis of the failure probability of OD-LinBAI.
arXiv Detail & Related papers (2021-05-27T09:19:10Z) - Quantile Bandits for Best Arms Identification [10.294977861990203]
We consider a variant of the best arm identification task in multi-armed bandits.
Motivated by risk-averse decision-making problems, our goal is to identify a set of $m$ arms with the highest $tau$-quantile values within a fixed budget.
arXiv Detail & Related papers (2020-10-22T09:58:54Z) - Optimal Best-arm Identification in Linear Bandits [79.3239137440876]
We devise a simple algorithm whose sampling complexity matches known instance-specific lower bounds.
Unlike existing best-arm identification strategies, our algorithm uses a stopping rule that does not depend on the number of arms.
arXiv Detail & Related papers (2020-06-29T14:25:51Z) - Robustness Guarantees for Mode Estimation with an Application to Bandits [131.21717367564963]
We introduce a theory for multi-armed bandits where the values are the modes of the reward distributions instead of the mean.
We show in simulations that our algorithms are robust to perturbation of the arms by adversarial noise sequences.
arXiv Detail & Related papers (2020-03-05T21:29:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.