Fixed Confidence Best Arm Identification in the Bayesian Setting
- URL: http://arxiv.org/abs/2402.10429v2
- Date: Sun, 23 Jun 2024 03:50:12 GMT
- Title: Fixed Confidence Best Arm Identification in the Bayesian Setting
- Authors: Kyoungseok Jang, Junpei Komiyama, Kazutoshi Yamazaki,
- Abstract summary: We consider the fixed-confidence best arm identification (FC-BAI) problem in the Bayesian setting.
This problem aims to find the arm of the largest mean with a fixed confidence level when the bandit model has been sampled from the known prior.
- Score: 6.083234045523298
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the fixed-confidence best arm identification (FC-BAI) problem in the Bayesian setting. This problem aims to find the arm of the largest mean with a fixed confidence level when the bandit model has been sampled from the known prior. Most studies on the FC-BAI problem have been conducted in the frequentist setting, where the bandit model is predetermined before the game starts. We show that the traditional FC-BAI algorithms studied in the frequentist setting, such as track-and-stop and top-two algorithms, result in arbitrarily suboptimal performances in the Bayesian setting. We also obtain a lower bound of the expected number of samples in the Bayesian setting and introduce a variant of successive elimination that has a matching performance with the lower bound up to a logarithmic factor. Simulations verify the theoretical results.
Related papers
- Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
In particular, we present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z) - Calibrating Neural Simulation-Based Inference with Differentiable
Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z) - Bayesian Fixed-Budget Best-Arm Identification [24.31655036648236]
Fixed-budget best-arm identification (BAI) is a bandit problem where the agent maximizes the probability of identifying the optimal arm within a fixed budget of observations.
We propose a Bayesian elimination algorithm and derive an upper bound on its probability of misidentifying the optimal arm.
arXiv Detail & Related papers (2022-11-15T23:29:51Z) - SPRT-based Efficient Best Arm Identification in Stochastic Bandits [31.359578768463752]
This paper investigates the best arm identification problem in multi-armed bandits in the fixed confidence setting.
Existing algorithms for the exponential family of bandits face computational challenges.
A framework is proposed that adopts the likelihood ratio-based tests known to be effective for sequential testing.
arXiv Detail & Related papers (2022-07-22T15:54:53Z) - Rate-optimal Bayesian Simple Regret in Best Arm Identification [11.389780431092914]
We consider best arm identification in the multi-armed bandit problem.
We propose a simple and easy-to-compute algorithm with its leading term matching with the lower bound up to a constant factor.
arXiv Detail & Related papers (2021-11-18T18:59:35Z) - Bayesian decision-making under misspecified priors with applications to
meta-learning [64.38020203019013]
Thompson sampling and other sequential decision-making algorithms are popular approaches to tackle explore/exploit trade-offs in contextual bandits.
We show that performance degrades gracefully with misspecified priors.
arXiv Detail & Related papers (2021-07-03T23:17:26Z) - Guaranteed Fixed-Confidence Best Arm Identification in Multi-Armed
Bandit [1.5469452301122177]
We consider the problem of finding, through adaptive sampling, which of n populations (arms) has the largest mean.
Our objective is to determine a rule which identifies the best population with a fixed minimum confidence using as few observations as possible.
arXiv Detail & Related papers (2021-06-12T20:05:29Z) - Bias-Robust Bayesian Optimization via Dueling Bandit [57.82422045437126]
We consider Bayesian optimization in settings where observations can be adversarially biased.
We propose a novel approach for dueling bandits based on information-directed sampling (IDS)
Thereby, we obtain the first efficient kernelized algorithm for dueling bandits that comes with cumulative regret guarantees.
arXiv Detail & Related papers (2021-05-25T10:08:41Z) - Causal Bandits without prior knowledge using separating sets [3.1000291317725]
The Causal Bandit is a variant of the classic Bandit problem where an agent must identify the best action in a sequential decision-making process.
Methods proposed for this problem thus far in the literature rely on exact prior knowledge of the full causal graph.
We formulate new causal bandit algorithms that no longer necessarily rely on prior causal knowledge.
arXiv Detail & Related papers (2020-09-16T20:08:03Z) - Optimal Best-arm Identification in Linear Bandits [79.3239137440876]
We devise a simple algorithm whose sampling complexity matches known instance-specific lower bounds.
Unlike existing best-arm identification strategies, our algorithm uses a stopping rule that does not depend on the number of arms.
arXiv Detail & Related papers (2020-06-29T14:25:51Z) - Best Arm Identification for Cascading Bandits in the Fixed Confidence
Setting [81.70513857417106]
We design and analyze CascadeBAI, an algorithm for finding the best set of $K$ items.
An upper bound on the time complexity of CascadeBAI is derived by overcoming a crucial analytical challenge.
We show, through the derivation of a lower bound on the time complexity, that the performance of CascadeBAI is optimal in some practical regimes.
arXiv Detail & Related papers (2020-01-23T16:47:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.