Best Arm Identification with Resource Constraints
- URL: http://arxiv.org/abs/2402.19090v1
- Date: Thu, 29 Feb 2024 12:17:54 GMT
- Title: Best Arm Identification with Resource Constraints
- Authors: Zitian Li, Wang Chi Cheung
- Abstract summary: We study the Best Arm Identification with Resource Constraints (BAIwRC) problem.
The agent aims to identify the best arm under resource constraints, where resources are consumed for each arm pull.
We design and analyze the Successive Halving with Resource Rationing algorithm (SH-RR)
- Score: 6.236743421605786
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motivated by the cost heterogeneity in experimentation across different
alternatives, we study the Best Arm Identification with Resource Constraints
(BAIwRC) problem. The agent aims to identify the best arm under resource
constraints, where resources are consumed for each arm pull. We make two novel
contributions. We design and analyze the Successive Halving with Resource
Rationing algorithm (SH-RR). The SH-RR achieves a near-optimal non-asymptotic
rate of convergence in terms of the probability of successively identifying an
optimal arm. Interestingly, we identify a difference in convergence rates
between the cases of deterministic and stochastic resource consumption.
Related papers
- Best Arm Identification with Minimal Regret [55.831935724659175]
Best arm identification problem elegantly amalgamates regret minimization and BAI.
Agent's goal is to identify the best arm with a prescribed confidence level.
Double KL-UCB algorithm achieves optimality as the confidence level tends to zero.
arXiv Detail & Related papers (2024-09-27T16:46:02Z) - Optimal Multi-Fidelity Best-Arm Identification [65.23078799972188]
In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible.
We study multi-fidelity best-arm identification, in which the can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost.
Several methods have been proposed for tackling this problem, but their optimality remain elusive, notably due to loose lower bounds on the total cost needed to identify the best arm.
arXiv Detail & Related papers (2024-06-05T08:02:40Z) - Pure Exploration for Constrained Best Mixed Arm Identification with a Fixed Budget [6.22018632187078]
We introduce the constrained best mixed arm identification (CBMAI) problem with a fixed budget.
The goal is to find the best mixed arm that maximizes the expected reward subject to constraints on the expected costs with a given learning budget $N$.
We provide a theoretical upper bound on the mis-identification (of the the support of the best mixed arm) probability and show that it decays exponentially in the budget $N$.
arXiv Detail & Related papers (2024-05-23T22:35:11Z) - Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
In particular, we present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z) - Covariance Adaptive Best Arm Identification [0.0]
The goal is to identify the arm with the highest mean reward with a probability of at least 1 -- $delta$, while minimizing the number of arm pulls.
We propose a more flexible scenario where arms can be dependent and rewards can be sampled simultaneously.
This framework is relevant in various applications, such as clinical trials, where similarities between patients or drugs suggest underlying correlations.
arXiv Detail & Related papers (2023-06-05T06:57:09Z) - Constrained Pure Exploration Multi-Armed Bandits with a Fixed Budget [4.226118870861363]
We consider a constrained, pure exploration, multi-armed bandit formulation under a fixed budget.
We propose an algorithm called textscConstrained-SR based on the Successive Rejects framework.
We show that the associated decay rate is nearly optimal relative to an information theoretic lower bound in certain special cases.
arXiv Detail & Related papers (2022-11-27T08:58:16Z) - Semiparametric Best Arm Identification with Contextual Information [10.915684166086026]
We study best-arm identification with a fixed budget and contextual information in bandit problems.
We develop the "Contextual RS-AIPW strategy," which consists of the random sampling rule tracking a target allocation ratio and the recommendation rule.
Our proposed Contextual RS-AIPW strategy is optimal because the upper bound for the probability of misidentification matches the semi lower bound when the budget goes to infinity.
arXiv Detail & Related papers (2022-09-15T14:38:47Z) - Best Arm Identification under Additive Transfer Bandits [49.69203462561861]
We study the setting when, despite the means being unknown, there is a known additive relationship between the source and target MAB instances.
We propose and theoretically analyze an LUCB-style algorithm to identify an $epsilon$-optimal target arm with high probability.
arXiv Detail & Related papers (2021-12-08T02:20:18Z) - Censored Semi-Bandits for Resource Allocation [12.450488894967773]
We consider the problem of sequentially allocating resources in a censored semi-bandits setup.
The loss depends on two hidden parameters, one specific to the arm but independent of the resource allocation, and the other depends on the allocated resource.
We derive optimal algorithms for our problem setting using known algorithms for MP-MAB and Combinatorial Semi-Bandits.
arXiv Detail & Related papers (2021-04-12T19:15:32Z) - Coordinated Online Learning for Multi-Agent Systems with Coupled
Constraints and Perturbed Utility Observations [91.02019381927236]
We introduce a novel method to steer the agents toward a stable population state, fulfilling the given resource constraints.
The proposed method is a decentralized resource pricing method based on the resource loads resulting from the augmentation of the game's Lagrangian.
arXiv Detail & Related papers (2020-10-21T10:11:17Z) - Resource Allocation via Model-Free Deep Learning in Free Space Optical
Communications [119.81868223344173]
The paper investigates the general problem of resource allocation for mitigating channel fading effects in Free Space Optical (FSO) communications.
Under this framework, we propose two algorithms that solve FSO resource allocation problems.
arXiv Detail & Related papers (2020-07-27T17:38:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.