The Batch Complexity of Bandit Pure Exploration
- URL: http://arxiv.org/abs/2502.01425v1
- Date: Mon, 03 Feb 2025 15:03:45 GMT
- Title: The Batch Complexity of Bandit Pure Exploration
- Authors: Adrienne Tuynman, Rémy Degenne
- Abstract summary: In a pure exploration problem in multi-armed bandits, an algorithm iteratively samples arms and should stop as early as possible and return the correct answer to a query about the arms' distributions.
We are interested in batched methods, which change their sampling behaviour only a few times, between batches of observations.
We give an instance-dependent lower bound on the number of batches used by any sample efficient algorithm for any pure exploration task.
- Score: 10.036727981085223
- Abstract: In a fixed-confidence pure exploration problem in stochastic multi-armed bandits, an algorithm iteratively samples arms and should stop as early as possible and return the correct answer to a query about the arms' distributions. We are interested in batched methods, which change their sampling behaviour only a few times, between batches of observations. We give an instance-dependent lower bound on the number of batches used by any sample efficient algorithm for any pure exploration task. We then give a general batched algorithm and prove upper bounds on its expected sample complexity and batch complexity. We illustrate both lower and upper bounds on best-arm identification and thresholding bandits.
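To make the batched setting concrete, below is a minimal Python sketch of a generic batched elimination scheme for best-arm identification: the sampling rule is frozen within each batch and updated only between batches, so the batch count is exactly the quantity the paper's lower bound constrains. This is an illustrative scheme with standard Hoeffding-style confidence radii and assumed constants, not the paper's algorithm.

```python
import numpy as np

def batched_elimination(means, delta=0.05, growth=2, rng=None):
    """Toy batched best-arm identification on Bernoulli arms.

    The sampling rule is fixed within each batch (every active arm gets
    the same number of pulls); the active set is updated only between
    batches. Illustrative Hoeffding-style radii, not the paper's rules.
    """
    rng = np.random.default_rng() if rng is None else rng
    k = len(means)
    active = list(range(k))
    counts, sums = np.zeros(k), np.zeros(k)
    batches, pulls_per_arm = 0, 1
    while len(active) > 1:
        batches += 1
        for a in active:  # one batch: non-adaptive inside
            counts[a] += pulls_per_arm
            sums[a] += rng.binomial(pulls_per_arm, means[a])
        mu = sums[active] / counts[active]
        n = counts[active][0]          # active arms share the same count
        rad = np.sqrt(np.log(4 * k * n**2 / delta) / (2 * n))
        best = mu.max()
        # keep only arms not confidently worse than the empirical best
        active = [a for a, m in zip(active, mu) if m + rad >= best - rad]
        pulls_per_arm *= growth        # geometric growth keeps batches few
    return active[0], int(counts.sum()), batches

# Example: the sketch should return arm 2 after a handful of batches.
arm, samples, n_batches = batched_elimination([0.4, 0.5, 0.8])
```

Geometric growth of the per-arm budget is the usual way batched methods keep the number of policy updates logarithmic in the sample complexity.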
Related papers
- Breaking the $\log(1/\Delta_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids [28.547030766096956]
We introduce an algorithm that achieves near-optimal sample complexity and features an instance-sensitive batch complexity.
We extend this framework to the problem of batched best arm identification in linear bandits and achieve similar improvements.
arXiv Detail & Related papers (2025-01-29T01:40:36Z)
- A Fast Algorithm for the Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit [55.2480439325792]
We study the real-valued combinatorial pure exploration problem in multi-armed bandits (R-CPE-MAB).
We introduce an algorithm named the gap-based exploration (CombGapE) algorithm, whose sample complexity upper bound matches the lower bound up to a problem-dependent constant factor.
We numerically show that the CombGapE algorithm outperforms existing methods significantly in both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-06-15T15:37:31Z)
- Some performance considerations when using multi-armed bandit algorithms in the presence of missing data [1.0499611180329804]
When using multi-armed bandit algorithms, the potential impact of missing data is often overlooked.
We investigate the impact of missing data on several bandit algorithms via a simulation study assuming the rewards are missing at random.
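For intuition, here is a minimal sketch of one such simulation, assuming a standard UCB1 learner on Bernoulli arms whose rewards are independently discarded with a hypothetical probability p_missing: the pull still happens and still incurs regret, but an unobserved reward contributes nothing to the arm's estimate.

```python
import numpy as np

def ucb1_with_missing_rewards(means, horizon=10_000, p_missing=0.3, rng=None):
    """UCB1 on Bernoulli arms where each reward is independently lost
    with probability p_missing (missing at random). A hypothetical setup
    in the spirit of the simulation study described above."""
    rng = np.random.default_rng() if rng is None else rng
    k = len(means)
    counts, sums = np.zeros(k), np.zeros(k)  # observed rewards only
    regret, best = 0.0, max(means)
    for t in range(1, horizon + 1):
        if np.any(counts == 0):
            arm = int(np.argmin(counts))     # get one observation per arm
        else:
            ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
            arm = int(np.argmax(ucb))
        regret += best - means[arm]
        if rng.random() > p_missing:         # reward is observed
            counts[arm] += 1
            sums[arm] += rng.binomial(1, means[arm])
        # else: the pull happened, but its reward is never seen
    return regret
```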
arXiv Detail & Related papers (2022-05-08T09:20:10Z)
- Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback.
The proposed algorithm, BOC, includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as a subroutine.
We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower bound asymptotically, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z)
- From Optimality to Robustness: Dirichlet Sampling Strategies in Stochastic Bandits [0.0]
We study a generic Dirichlet Sampling (DS) algorithm, based on pairwise comparisons of empirical indices computed with re-sampling of the arms' observations.
We show that different variants of this strategy achieve provably optimal regret guarantees when the distributions are bounded and logarithmic regret for semi-bounded distributions with a mild quantile condition.
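One way to picture the re-sampling index: draw random Dirichlet weights over an arm's observation history and use the re-weighted mean as its index in a pairwise duel. The sketch below is a deliberate simplification of this idea (uniform Dirichlet weights only, without any of the refinements the actual DS variants add), with hypothetical numbers in the usage example.

```python
import numpy as np

def dirichlet_index(history, rng):
    """Re-sampled empirical mean: average an arm's past observations
    under random Dirichlet(1, ..., 1) weights. A simplified stand-in
    for the re-sampling indices the abstract mentions."""
    w = rng.dirichlet(np.ones(len(history)))
    return float(w @ np.asarray(history))

# A pairwise duel: a challenger beats the leader if its re-sampled
# index exceeds the leader's empirical mean (hypothetical numbers).
rng = np.random.default_rng(0)
challenger_wins = dirichlet_index([0.2, 0.9, 1.0, 0.4], rng) >= 0.6
```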
arXiv Detail & Related papers (2021-11-18T14:34:21Z)
- Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination [80.53485617514707]
This paper proposes two algorithms for best arm identification in sub-Gaussian bandits: a gap-based algorithm and one based on successive elimination.
Specifically, the sample complexity of the gap-based algorithm is optimal up to constant factors, while that of the successive-elimination algorithm is optimal up to logarithmic factors.
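As a concrete illustration of the gap-based style (not the paper's contamination-robust version), here is a minimal LUCB-style stopping check with standard sub-Gaussian confidence widths and illustrative constants:

```python
import numpy as np

def gap_based_stop(counts, sums, delta):
    """LUCB-style gap-based stopping check: stop once the empirical best
    arm's lower confidence bound clears every other arm's upper bound.
    Illustrative constants; no contamination-robust estimation here."""
    mu = sums / counts
    rad = np.sqrt(np.log(1.0 / delta) / (2 * counts))
    best = int(np.argmax(mu))
    others_ucb = np.delete(mu + rad, best).max()
    return (mu[best] - rad[best]) >= others_ucb, best

# Example with hypothetical statistics for three arms:
stop, best = gap_based_stop(np.array([200.0, 200.0, 200.0]),
                            np.array([90.0, 120.0, 160.0]), delta=0.05)
```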
arXiv Detail & Related papers (2021-11-14T21:49:58Z)
- Efficient Pure Exploration for Combinatorial Bandits with Semi-Bandit Feedback [51.21673420940346]
Combinatorial bandits generalize multi-armed bandits, where the agent chooses sets of arms and observes a noisy reward for each arm contained in the chosen set.
We focus on the pure-exploration problem of identifying the best arm with fixed confidence, as well as a more general setting, where the structure of the answer set differs from the one of the action set.
The proposed method, based on a projection-free online learning algorithm for finite polytopes, is the first computationally efficient algorithm that is asymptotically optimal and has competitive empirical performance.
arXiv Detail & Related papers (2021-01-21T10:35:09Z)
- The Price of Incentivizing Exploration: A Characterization via Thompson Sampling and Sample Complexity [83.81297078039836]
We consider incentivized exploration: a version of multi-armed bandits where the choice of arms is controlled by self-interested agents.
We focus on the price of incentives: the loss in performance, broadly construed, incurred for the sake of incentive-compatibility.
arXiv Detail & Related papers (2020-02-03T04:58:51Z)
- The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling called the Simulator.
We prove the first instance-based lower bounds for the top-k problem which incorporate the appropriate log-factors.
Our new analysis inspires a simple and near-optimal algorithm for best-arm and top-k identification, the first practical algorithm of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)