Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm
Identification in Structured Bandits
- URL: http://arxiv.org/abs/2402.05878v1
- Date: Thu, 8 Feb 2024 18:13:26 GMT
- Title: Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm
Identification in Structured Bandits
- Authors: Nicolas Nguyen, Imad Aouali, András György, Claire Vernade
- Abstract summary: We study the problem of Bayesian fixed-budget best-arm identification (BAI) in structured bandits.
We propose an algorithm that uses fixed allocations based on the prior information and the structure of the environment.
- Score: 5.362453227879925
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of Bayesian fixed-budget best-arm identification (BAI)
in structured bandits. We propose an algorithm that uses fixed allocations
based on the prior information and the structure of the environment. We provide
theoretical bounds on its performance across diverse models, including the
first prior-dependent upper bounds for linear and hierarchical BAI. Our key
contribution is introducing new proof methods that result in tighter bounds for
multi-armed BAI compared to existing methods. We extensively compare our
approach to other fixed-budget BAI methods, demonstrating its consistent and
robust performance in various settings. Our work improves our understanding of
Bayesian fixed-budget BAI in structured bandits and highlights the
effectiveness of our approach in practical scenarios.
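The abstract does not spell out the allocation rule, so the following is a minimal sketch of the general recipe in a Gaussian multi-armed setting: compute a deterministic allocation from the prior before any sampling, spend the budget accordingly, and recommend the arm with the highest posterior mean. The allocation used here (proportional to the prior probability that each arm is optimal) is an illustrative stand-in, not the paper's rule.

```python
import numpy as np

def prior_optimality_probs(prior_means, prior_vars, n_samples=10_000, rng=None):
    """Monte Carlo estimate of P(arm i is best) under independent Gaussian priors."""
    rng = rng or np.random.default_rng(0)
    K = len(prior_means)
    draws = rng.normal(prior_means, np.sqrt(prior_vars), size=(n_samples, K))
    return np.bincount(draws.argmax(axis=1), minlength=K) / n_samples

def fixed_allocation_bai(budget, prior_means, prior_vars, noise_var, pull, rng=None):
    """Spend the whole budget by a fixed, prior-dependent allocation, then
    recommend the arm with the highest Gaussian posterior mean."""
    rng = rng or np.random.default_rng(1)
    K = len(prior_means)
    # Illustrative allocation: proportional to the prior probability of
    # optimality (rounding may shift a few pulls; fine for a sketch).
    probs = prior_optimality_probs(prior_means, prior_vars, rng=rng)
    counts = np.maximum(1, np.floor(budget * probs).astype(int))
    post_means = np.empty(K)
    for i in range(K):
        rewards = np.array([pull(i) for _ in range(counts[i])])
        # Conjugate Gaussian update with known observation noise.
        precision = 1.0 / prior_vars[i] + counts[i] / noise_var
        post_means[i] = (prior_means[i] / prior_vars[i]
                         + rewards.sum() / noise_var) / precision
    return int(post_means.argmax())

# Toy run: three Gaussian arms with unit observation noise.
true_means = np.array([0.2, 0.5, 0.45])
env_rng = np.random.default_rng(7)
best = fixed_allocation_bai(
    budget=300,
    prior_means=np.array([0.0, 0.4, 0.4]),
    prior_vars=np.array([1.0, 1.0, 1.0]),
    noise_var=1.0,
    pull=lambda i: env_rng.normal(true_means[i], 1.0),
)
print("recommended arm:", best)
```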
Related papers
- Improving Thompson Sampling via Information Relaxation for Budgeted Multi-armed Bandits [1.4732811715354452]
We consider a $K$-armed bandit problem in which each arm consumes a different amount of resources when selected.
We propose a series of algorithms that are randomized like Thompson Sampling but more carefully optimize their decisions with respect to the budget constraint.
arXiv Detail & Related papers (2024-08-28T04:56:06Z)
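For the budgeted Thompson Sampling entry above, here is a minimal sketch of the standard baseline such work refines (the information-relaxation machinery itself is not shown): Bernoulli arms with Beta posteriors, where each round plays the affordable arm with the best sampled reward-per-cost ratio.

```python
import numpy as np

def budgeted_thompson_sampling(costs, pull, budget, rng=None):
    """Baseline budgeted Thompson Sampling for Bernoulli arms: sample a mean
    for each arm from its Beta posterior and play the best reward-per-cost
    arm while the remaining budget covers its cost."""
    rng = rng or np.random.default_rng(0)
    costs = np.asarray(costs, dtype=float)
    K = len(costs)
    alpha, beta = np.ones(K), np.ones(K)      # Beta(1, 1) priors
    total_reward, remaining = 0.0, float(budget)
    while remaining >= costs.min():
        theta = rng.beta(alpha, beta)         # one posterior sample per arm
        feasible = costs <= remaining         # arms we can still afford
        ratio = np.where(feasible, theta / costs, -np.inf)
        i = int(ratio.argmax())
        r = pull(i)                           # Bernoulli reward in {0, 1}
        alpha[i] += r
        beta[i] += 1 - r
        total_reward += r
        remaining -= costs[i]
    return total_reward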
- UCB Exploration for Fixed-Budget Bayesian Best Arm Identification [0.0]
We study best-arm identification (BAI) in the fixed-budget setting.
We propose a UCB exploration algorithm that is both theoretically and empirically efficient for the fixed-budget BAI problem in a Bayesian setting.
arXiv Detail & Related papers (2024-08-09T05:15:36Z)
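The summary above does not give the paper's exact index, so here is a textbook UCB1-style sketch of UCB exploration under a fixed budget, recommending the empirically best arm once the budget is spent.

```python
import numpy as np

def ucb_fixed_budget_bai(K, budget, pull, c=2.0):
    """Generic UCB exploration under a fixed budget: pull the arm with the
    largest upper-confidence index each round, then recommend the arm with
    the highest empirical mean."""
    counts = np.zeros(K, dtype=int)
    sums = np.zeros(K)
    for t in range(budget):
        if t < K:                    # pull every arm once first
            i = t
        else:
            means = sums / counts
            bonus = np.sqrt(c * np.log(t + 1) / counts)
            i = int((means + bonus).argmax())
        sums[i] += pull(i)
        counts[i] += 1
    return int((sums / counts).argmax())
```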
- Bayesian Fixed-Budget Best-Arm Identification [24.31655036648236]
Fixed-budget best-arm identification (BAI) is a bandit problem where the agent maximizes the probability of identifying the optimal arm within a fixed budget of observations.
We propose a Bayesian elimination algorithm and derive an upper bound on its probability of misidentifying the optimal arm.
arXiv Detail & Related papers (2022-11-15T23:29:51Z)
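A sketch in the spirit of the Bayesian elimination idea above, assuming independent Gaussian priors and known noise variance: sequential-halving phases that drop the worse half of the surviving arms by posterior mean. The paper's actual phase schedule and elimination rule may differ.

```python
import numpy as np

def bayes_elim(prior_means, prior_vars, noise_var, budget, pull):
    """Elimination sketch: split the budget into ~log2(K) phases; each phase
    samples surviving arms equally, updates Gaussian posteriors, and keeps
    the better half by posterior mean."""
    K = len(prior_means)
    active = list(range(K))
    mu = np.array(prior_means, dtype=float)
    var = np.array(prior_vars, dtype=float)
    phases = max(1, int(np.ceil(np.log2(K))))
    per_phase = budget // phases
    for _ in range(phases):
        if len(active) == 1:
            break
        n = max(1, per_phase // len(active))
        for i in active:
            rewards = np.array([pull(i) for _ in range(n)])
            # Conjugate Gaussian posterior update.
            post_prec = 1.0 / var[i] + n / noise_var
            mu[i] = (mu[i] / var[i] + rewards.sum() / noise_var) / post_prec
            var[i] = 1.0 / post_prec
        active.sort(key=lambda i: mu[i], reverse=True)
        active = active[:max(1, len(active) // 2)]
    return active[0]
```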
- Best Arm Identification under Additive Transfer Bandits [49.69203462561861]
We study the setting when, despite the means being unknown, there is a known additive relationship between the source and target MAB instances.
We propose and theoretically analyze an LUCB-style algorithm to identify an $\epsilon$-optimal target arm with high probability.
arXiv Detail & Related papers (2021-12-08T02:20:18Z)
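For the LUCB-style entry above, a plain LUCB skeleton for $\epsilon$-best-arm identification (the additive transfer component, which would shift target estimates by the known source-target offsets, is omitted for brevity):

```python
import numpy as np

def lucb_epsilon(K, pull, epsilon=0.1, delta=0.05, max_pulls=100_000):
    """LUCB skeleton: repeatedly sample the empirical leader and its
    strongest challenger until the challenger's UCB falls within epsilon
    of the leader's LCB."""
    counts = np.ones(K)
    sums = np.array([pull(i) for i in range(K)], dtype=float)
    t = K
    while t < max_pulls:
        means = sums / counts
        rad = np.sqrt(np.log(4 * K * t**2 / delta) / (2 * counts))
        leader = int(means.argmax())
        ucb = means + rad
        ucb[leader] = -np.inf
        challenger = int(ucb.argmax())
        # Stop when the challenger cannot plausibly beat the leader by epsilon.
        if means[challenger] + rad[challenger] - (means[leader] - rad[leader]) < epsilon:
            return leader
        for i in (leader, challenger):        # pull both contenders
            sums[i] += pull(i)
            counts[i] += 1
            t += 1
    return int((sums / counts).argmax())
```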
- Bayesian decision-making under misspecified priors with applications to meta-learning [64.38020203019013]
Thompson sampling and other sequential decision-making algorithms are popular approaches to tackle explore/exploit trade-offs in contextual bandits.
We show that performance degrades gracefully with misspecified priors.
arXiv Detail & Related papers (2021-07-03T23:17:26Z)
- Fixed-Budget Best-Arm Identification in Structured Bandits [33.27743152847947]
Best-arm identification (BAI) in a fixed-budget setting is a bandit problem where the learning agent maximizes the probability of identifying the optimal (best) arm after a fixed number of observations.
We propose a general tractable algorithm that incorporates the structure by eliminating suboptimal arms based on their mean reward estimates from a joint generalization model.
arXiv Detail & Related papers (2021-06-09T01:32:43Z)
- Towards Minimax Optimal Best Arm Identification in Linear Bandits [95.22854522340938]
We study the problem of best arm identification in linear bandits in the fixed-budget setting.
By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm, OD-LinBAI.
We provide a theoretical analysis of its failure probability.
arXiv Detail & Related papers (2021-05-27T09:19:10Z)
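A simplified, single-phase version of a G-optimal-design allocation for linear fixed-budget BAI (OD-LinBAI itself proceeds in phases with elimination): compute design weights by Frank-Wolfe on the log-det criterion, sample arms by those weights, and recommend the best arm under the least-squares estimate.

```python
import numpy as np

def g_optimal_weights(X, iters=200):
    """Frank-Wolfe on log det(A(w)); the resulting design controls the
    worst-case prediction variance max_x x^T A(w)^{-1} x."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    for t in range(iters):
        A = X.T @ (w[:, None] * X) + 1e-8 * np.eye(d)
        Ainv = np.linalg.inv(A)
        g = np.einsum("ij,jk,ik->i", X, Ainv, X)   # x_i^T A^{-1} x_i
        j = int(g.argmax())
        step = 2.0 / (t + 3)
        w = (1 - step) * w
        w[j] += step
    return w / w.sum()

def linear_bai_fixed_budget(X, budget, pull, rng=None):
    """Single-phase sketch: sample arms by G-optimal weights, fit least
    squares, recommend the arm with the largest estimated reward."""
    rng = rng or np.random.default_rng(0)
    d = X.shape[1]
    counts = rng.multinomial(budget, g_optimal_weights(X))
    A, b = np.zeros((d, d)), np.zeros(d)
    for i, n_i in enumerate(counts):
        for _ in range(n_i):
            A += np.outer(X[i], X[i])
            b += pull(i) * X[i]
    theta = np.linalg.solve(A + 1e-8 * np.eye(d), b)
    return int((X @ theta).argmax())
```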
- Upper Confidence Bounds for Combining Stochastic Bandits [52.10197476419621]
We provide a simple method to combine bandit algorithms.
Our approach is based on a "meta-UCB" procedure that treats each of $N$ individual bandit algorithms as arms in a higher-level $N$-armed bandit problem.
arXiv Detail & Related papers (2020-12-24T05:36:29Z)
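A minimal sketch of the meta-UCB idea above: $N$ base algorithms are treated as arms of a higher-level UCB1. The propose()/update() interface of the base algorithms is a hypothetical stand-in, and the paper's actual index includes corrections not reproduced here.

```python
import numpy as np

class MetaUCB:
    """Treat N base bandit algorithms as arms of a higher-level UCB1: each
    round, pick the base algorithm with the largest index, play the action
    it proposes, and credit it with the observed reward."""
    def __init__(self, base_algs):
        self.base = base_algs
        self.counts = np.zeros(len(base_algs))
        self.sums = np.zeros(len(base_algs))
        self.t = 0

    def step(self, env_pull):
        self.t += 1
        if self.t <= len(self.base):
            j = self.t - 1                       # try each base alg once
        else:
            means = self.sums / self.counts
            bonus = np.sqrt(2 * np.log(self.t) / self.counts)
            j = int((means + bonus).argmax())
        action = self.base[j].propose()          # hypothetical interface
        reward = env_pull(action)
        self.base[j].update(action, reward)      # feed reward back
        self.sums[j] += reward
        self.counts[j] += 1
        return reward
```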
- Posterior Differential Regularization with f-divergence for Improving Model Robustness [95.05725916287376]
We focus on methods that regularize the model posterior difference between clean and noisy inputs.
We generalize the posterior differential regularization to the family of $f$-divergences.
Our experiments show that regularizing the posterior differential with $f$-divergence can result in well-improved model robustness.
arXiv Detail & Related papers (2020-10-23T19:58:01Z)
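A sketch of the regularizer above with symmetric KL, one member of the $f$-divergence family, standing in for the paper's general construction: the training loss adds a penalty on the divergence between clean and noisy posteriors.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between rows of two probability matrices."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def posterior_differential_loss(logits_clean, logits_noisy, labels, lam=1.0):
    """Cross-entropy on clean inputs plus a symmetric-KL penalty between
    clean and noisy posteriors."""
    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)
    p, q = softmax(logits_clean), softmax(logits_noisy)
    ce = -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
    reg = 0.5 * (kl(p, q) + kl(q, p)).mean()
    return ce + lam * reg
```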
- Structure Adaptive Algorithms for Stochastic Bandits [22.871155520200773]
We study reward maximisation in a class of structured multi-armed bandit problems.
The mean rewards of arms satisfy some given structural constraints.
We develop algorithms from instance-dependent lower bounds using iterative saddle-point solvers.
arXiv Detail & Related papers (2020-07-02T08:59:54Z)
- Optimal Best-arm Identification in Linear Bandits [79.3239137440876]
We devise a simple algorithm whose sampling complexity matches known instance-specific lower bounds.
Unlike existing best-arm identification strategies, our algorithm uses a stopping rule that does not depend on the number of arms.
arXiv Detail & Related papers (2020-06-29T14:25:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.