Constrained Best Arm Identification with Tests for Feasibility
- URL: http://arxiv.org/abs/2511.09808v1
- Date: Fri, 14 Nov 2025 01:10:41 GMT
- Title: Constrained Best Arm Identification with Tests for Feasibility
- Authors: Ting Cai, Kirthevasan Kandasamy,
- Abstract summary: Best arm identification (BAI) aims to identify the highest-performance arm among a set of $K$ arms by collecting samples from each arm.<n>We propose an efficient algorithm and upper-bound its sample complexity, showing our algorithm can naturally adapt to the problem's difficulty.<n>We empirically show that our algorithm outperforms other state-of-the-art BAI algorithms in both synthetic and real-world datasets.
- Score: 7.968868442731488
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Best arm identification (BAI) aims to identify the highest-performance arm among a set of $K$ arms by collecting stochastic samples from each arm. In real-world problems, the best arm needs to satisfy additional feasibility constraints. While there is limited prior work on BAI with feasibility constraints, they typically assume the performance and constraints are observed simultaneously on each pull of an arm. However, this assumption does not reflect most practical use cases, e.g., in drug discovery, we wish to find the most potent drug whose toxicity and solubility are below certain safety thresholds. These safety experiments can be conducted separately from the potency measurement. Thus, this requires designing BAI algorithms that not only decide which arm to pull but also decide whether to test for the arm's performance or feasibility. In this work, we study feasible BAI which allows a decision-maker to choose a tuple $(i,\ell)$, where $i\in [K]$ denotes an arm and $\ell$ denotes whether she wishes to test for its performance ($\ell=0$) or any of its $N$ feasibility constraints ($\ell\in[N]$). We focus on the fixed confidence setting, which is to identify the \textit{feasible} arm with the \textit{highest performance}, with a probability of at least $1-δ$. We propose an efficient algorithm and upper-bound its sample complexity, showing our algorithm can naturally adapt to the problem's difficulty and eliminate arms by worse performance or infeasibility, whichever is easier. We complement this upper bound with a lower bound showing that our algorithm is \textit{asymptotically ($δ\rightarrow 0$) optimal}. Finally, we empirically show that our algorithm outperforms other state-of-the-art BAI algorithms in both synthetic and real-world datasets.
Related papers
- Rate-optimal Design for Anytime Best Arm Identification [19.803714682639903]
We consider the best arm identification problem, where the goal is to identify the arm with the highest mean reward from a set of $K$ arms under a limited sampling budget.<n>This problem models many practical scenarios such as A/B testing.<n>We consider a class of algorithms for this problem, which is provably minimax optimal up to a constant factor.
arXiv Detail & Related papers (2025-10-27T10:36:56Z) - Optimal Multi-Objective Best Arm Identification with Fixed Confidence [62.36929749450298]
We consider a multi-armed bandit setting in which each arm yields an $M$-dimensional vector reward upon selection.<n>The end goal is to identify the best arm of em every objective in the shortest (expected) time subject to an upper bound on the probability of error.<n>We propose an algorithm that uses the novel idea of em surrogate proportions to sample the arms at each time step, eliminating the need to solve the max-min optimisation problem at each step.
arXiv Detail & Related papers (2025-01-23T12:28:09Z) - Optimal Multi-Fidelity Best-Arm Identification [65.23078799972188]
In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible.<n>We study multi-fidelity best-arm identification, in which the can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost.<n>Several methods have been proposed for tackling this problem, but their optimality remain elusive, notably due to loose lower bounds on the total cost needed to identify the best arm.
arXiv Detail & Related papers (2024-06-05T08:02:40Z) - Optimal Top-Two Method for Best Arm Identification and Fluid Analysis [15.353009236788262]
We propose an optimal top-2 type algorithm for the best arm identification problem.<n>We show that the proposed algorithm is optimal as $delta rightarrow 0$.
arXiv Detail & Related papers (2024-03-14T06:14:07Z) - Cost Aware Best Arm Identification [13.380383930882784]
We call it emphCost Aware Best Arm Identification (CABAI)
We propose a simple algorithm called emphChernoff Overlap (CO), based on a square-root rule.
Our results show that (i.e. ignoring the heterogeneous action cost results in sub-optimality in practice, and (ii.) simple algorithms can deliver near-optimal performance over a wide range of problems.
arXiv Detail & Related papers (2024-02-26T16:27:08Z) - Optimal Batched Best Arm Identification [29.635422073014638]
We study the batched best arm identification (BBAI) problem, where the learner's goal is to identify the best arm while switching the policy as less as possible.<n>We propose the three-batch best arm identification (Tri-BBAI), which is the first batched algorithm that achieves the optimal sample complexity.<n>We further propose the almost optimal batched best arm identification (Opt-BBAI) algorithm, which is the first achieves that the near-optimal sample and batch complexity in the non-asymptotic setting.
arXiv Detail & Related papers (2023-10-21T22:55:50Z) - Contextual Combinatorial Bandits with Probabilistically Triggered Arms [55.9237004478033]
We study contextual bandits with probabilistically triggered arms (C$2$MAB-T) under a variety of smoothness conditions.
Under the triggering modulated (TPM) condition, we devise the C$2$-UC-T algorithm and derive a regret bound $tildeO(dsqrtT)$.
arXiv Detail & Related papers (2023-03-30T02:51:00Z) - PAC Best Arm Identification Under a Deadline [101.10352416022559]
We study $(epsilon, delta)$-PAC best arm identification, where a decision-maker must identify an optimal arm with probability at least $1 - delta$, while minimizing the number of arm pulls (samples)
In this work, the decision-maker is given a deadline of $T$ rounds, where, on each round, it can adaptively choose which arms to pull and how many times to pull them.
We propose Elastic Batch Racing (EBR), a novel algorithm for this setting and bound its sample complexity, showing that EBR is optimal with respect to both
arXiv Detail & Related papers (2021-06-06T19:48:32Z) - Towards Minimax Optimal Best Arm Identification in Linear Bandits [95.22854522340938]
We study the problem of best arm identification in linear bandits in the fixed-budget setting.
By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm.
We provide a theoretical analysis of the failure probability of OD-LinBAI.
arXiv Detail & Related papers (2021-05-27T09:19:10Z) - Top-$k$ eXtreme Contextual Bandits with Arm Hierarchy [71.17938026619068]
We study the top-$k$ extreme contextual bandits problem, where the total number of arms can be enormous.
We first propose an algorithm for the non-extreme realizable setting, utilizing the Inverse Gap Weighting strategy.
We show that our algorithm has a regret guarantee of $O(ksqrt(A-k+1)T log (|mathcalF|T))$.
arXiv Detail & Related papers (2021-02-15T19:10:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.