Optimal Thresholding Linear Bandit
- URL: http://arxiv.org/abs/2402.09467v1
- Date: Sun, 11 Feb 2024 21:25:14 GMT
- Title: Optimal Thresholding Linear Bandit
- Authors: Eduardo Ochoa Rivera and Ambuj Tewari
- Abstract summary: We study a novel pure exploration problem: the $epsilon$-Thresholding Bandit Problem (TBP) with fixed confidence in linear bandits.
We extend an algorithm designed for Best Arm Identification in the case to TBP that is complexityally optimal.
- Score: 21.2523248114561
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a novel pure exploration problem: the $\epsilon$-Thresholding Bandit
Problem (TBP) with fixed confidence in stochastic linear bandits. We prove a
lower bound for the sample complexity and extend an algorithm designed for Best
Arm Identification in the linear case to TBP that is asymptotically optimal.
Related papers
- Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits [55.938644481736446]
Indexed Minimum Empirical Divergence (IMED) is a highly effective approach to the multi-armed bandit problem.
It has been observed to empirically outperform UCB-based algorithms and Thompson Sampling.
We present novel linear versions of the IMED algorithm, which we call the family of LinIMED algorithms.
arXiv Detail & Related papers (2024-05-24T04:11:58Z) - LinearAPT: An Adaptive Algorithm for the Fixed-Budget Thresholding
Linear Bandit Problem [4.666048091337632]
We present LinearAPT, a novel algorithm designed for the fixed budget setting of the Thresholding Linear Bandit (TLB) problem.
Our contributions highlight the adaptability, simplicity, and computational efficiency of LinearAPT, making it a valuable addition to the toolkit for addressing complex sequential decision-making challenges.
arXiv Detail & Related papers (2024-03-10T15:01:50Z) - An Optimal Algorithm for the Real-Valued Combinatorial Pure Exploration
of Multi-Armed Bandit [65.268245109828]
We study the real-valued pure exploration problem in the multi-armed bandit (R-CPE-MAB)
Existing methods in the R-CPE-MAB can be seen as a special case of the so-called transductive linear bandits.
We propose an algorithm named the gap-based exploration (CombGapE) algorithm, whose sample complexity matches the lower bound.
arXiv Detail & Related papers (2023-06-15T15:37:31Z) - Mean-based Best Arm Identification in Stochastic Bandits under Reward
Contamination [80.53485617514707]
This paper proposes two algorithms, a gap-based algorithm and one based on the successive elimination, for best arm identification in sub-Gaussian bandits.
Specifically, for the gap-based algorithm, the sample complexity is optimal up to constant factors, while for the successive elimination, it is optimal up to logarithmic factors.
arXiv Detail & Related papers (2021-11-14T21:49:58Z) - Upper Confidence Bounds for Combining Stochastic Bandits [52.10197476419621]
We provide a simple method to combine bandit algorithms.
Our approach is based on a "meta-UCB" procedure that treats each of $N$ individual bandit algorithms as arms in a higher-level $N$-armed bandit problem.
arXiv Detail & Related papers (2020-12-24T05:36:29Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z) - Gamification of Pure Exploration for Linear Bandits [34.16123941778227]
We investigate an active pure-exploration setting, that includes best-arm identification, in the context of linear bandits.
Whileally optimal algorithms exist for standard multi-arm bandits, the existence of such algorithms for the best-arm identification in linear bandits has been elusive.
We design the first insightally optimal algorithm for fixed-confidence pure exploration in linear bandits.
arXiv Detail & Related papers (2020-07-02T08:20:35Z) - Optimal Best-arm Identification in Linear Bandits [79.3239137440876]
We devise a simple algorithm whose sampling complexity matches known instance-specific lower bounds.
Unlike existing best-arm identification strategies, our algorithm uses a stopping rule that does not depend on the number of arms.
arXiv Detail & Related papers (2020-06-29T14:25:51Z) - An Empirical Process Approach to the Union Bound: Practical Algorithms
for Combinatorial and Linear Bandits [34.06611065493047]
This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings.
We provide an algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms.
We also propose the first algorithm for linear bandits in the the fixed budget setting.
arXiv Detail & Related papers (2020-06-21T00:56:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.