Related papers: Towards Minimax Optimal Best Arm Identification in Linear Bandits

Towards Minimax Optimal Best Arm Identification in Linear Bandits

URL: http://arxiv.org/abs/2105.13017v1
Date: Thu, 27 May 2021 09:19:10 GMT
Title: Towards Minimax Optimal Best Arm Identification in Linear Bandits
Authors: Junwen Yang, Vincent Y. F. Tan
Abstract summary: We study the problem of best arm identification in linear bandits in the fixed-budget setting. By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm. We provide a theoretical analysis of the failure probability of OD-LinBAI.
Score: 95.22854522340938
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study the problem of best arm identification in linear bandits in the fixed-budget setting. By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm, Optimal Design-based Linear Best Arm Identification (OD-LinBAI). We provide a theoretical analysis of the failure probability of OD-LinBAI. While the performances of existing methods (e.g., BayesGap) depend on all the optimality gaps, OD-LinBAI depends on the gaps of the top $d$ arms, where $d$ is the effective dimension of the linear bandit instance. Furthermore, we present a minimax lower bound for this problem. The upper and lower bounds show that OD-LinBAI is minimax optimal up to multiplicative factors in the exponent. Finally, numerical experiments corroborate our theoretical findings.

Related papers

Optimal and Practical Batched Linear Bandit Algorithm [8.087699764574788]
We study the linear bandit problem under limited adaptivity, known as the batched linear bandit.<n>We propose textttBLAE, a novel batched algorithm that integrates arm elimination with regularized G-optimal design.<n>Our analysis introduces new techniques for batch-wise optimal design and refined concentration bounds.
arXiv Detail & Related papers (2025-07-11T09:29:28Z)
Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget [55.938644481736446]
We introduce a novel algorithm for best feasible arm identification that guarantees an exponential decay in the error probability.<n>We validate our algorithm through comprehensive empirical evaluations across various problem instances with different levels of complexity.
arXiv Detail & Related papers (2025-06-03T02:56:26Z)
Minimax Rate-Optimal Algorithms for High-Dimensional Stochastic Linear Bandits [1.2010968598596632]
We study the linear bandit problem with multiple arms over $T$ rounds.<n>We show that Lasso estimators are provably suboptimal in the sequential setting.<n>We propose a three-stage arm selection algorithm that uses thresholded Lasso as the main estimation method.
arXiv Detail & Related papers (2025-05-23T02:20:00Z)
Minimum Empirical Divergence for Sub-Gaussian Linear Bandits [10.750348548547704]
LinMED is a randomized algorithm that admits a closed-form computation of the arm sampling probabilities. Our empirical study shows that LinMED has a competitive performance with the state-of-the-art algorithms.
arXiv Detail & Related papers (2024-10-31T21:54:44Z)
Sparse Linear Bandits with Blocking Constraints [22.01704171400845]
We investigate the high-dimensional sparse linear bandits problem in a data-poor regime.<n>We show novel offline statistical guarantees of the lasso estimator for the linear model.<n>We propose a meta-algorithm based on corralling that does not need knowledge of optimal sparsity parameter $k$ at minimal cost to regret.
arXiv Detail & Related papers (2024-10-26T01:42:03Z)
Optimal Multi-Fidelity Best-Arm Identification [65.23078799972188]
In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible. We study multi-fidelity best-arm identification, in which the can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost. Several methods have been proposed for tackling this problem, but their optimality remain elusive, notably due to loose lower bounds on the total cost needed to identify the best arm.
arXiv Detail & Related papers (2024-06-05T08:02:40Z)
Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits [55.938644481736446]
Indexed Minimum Empirical Divergence (IMED) is a highly effective approach to the multi-armed bandit problem. It has been observed to empirically outperform UCB-based algorithms and Thompson Sampling. We present novel linear versions of the IMED algorithm, which we call the family of LinIMED algorithms.
arXiv Detail & Related papers (2024-05-24T04:11:58Z)
Cost Aware Best Arm Identification [13.380383930882784]
We call it emphCost Aware Best Arm Identification (CABAI) We propose a simple algorithm called emphChernoff Overlap (CO), based on a square-root rule. Our results show that (i.e. ignoring the heterogeneous action cost results in sub-optimality in practice, and (ii.) simple algorithms can deliver near-optimal performance over a wide range of problems.
arXiv Detail & Related papers (2024-02-26T16:27:08Z)
Fixed-Budget Best-Arm Identification in Sparse Linear Bandits [69.6194614504832]
We study the best-arm identification problem in sparse linear bandits under the fixed-budget setting. We design a two-phase algorithm, Lasso and Optimal-Design- (Lasso-OD) based linear best-arm identification. We show that Lasso-OD is almost minimax optimal in the exponent.
arXiv Detail & Related papers (2023-11-01T12:32:17Z)
PopArt: Efficient Sparse Regression and Experimental Design for Optimal Sparse Linear Bandits [29.097522376094624]
We propose a simple and computationally efficient sparse linear estimation method called PopArt. We derive sparse linear bandit algorithms that enjoy improved regret upper bounds upon the state of the art.
arXiv Detail & Related papers (2022-10-25T19:13:20Z)
Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best identification in discounted linear Markov+Delta Decision in the fixed confidence setting under a generative model. The lower bound as the solution of an intricate non- optimization program can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z)
Fixed-Budget Best-Arm Identification in Structured Bandits [33.27743152847947]
Best-arm identification (BAI) in a fixed-budget setting is a bandit problem where the learning agent maximizes the probability of identifying the optimal (best) arm after a fixed number of observations. We propose a general tractable algorithm that incorporates the structure, by eliminating suboptimal arms based on their mean reward estimates from a joint generalization model.
arXiv Detail & Related papers (2021-06-09T01:32:43Z)
Optimal Best-arm Identification in Linear Bandits [79.3239137440876]
We devise a simple algorithm whose sampling complexity matches known instance-specific lower bounds. Unlike existing best-arm identification strategies, our algorithm uses a stopping rule that does not depend on the number of arms.
arXiv Detail & Related papers (2020-06-29T14:25:51Z)
Locally Differentially Private (Contextual) Bandits Learning [55.63825598391525]
We study locally differentially private (LDP) bandits learning in this paper. We propose simple black-box reduction frameworks that can solve a large family of context-free bandits learning problems with LDP guarantee.
arXiv Detail & Related papers (2020-06-01T04:02:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.