Learning with a Budget: Identifying the Best Arm with Resource Constraints
- URL: http://arxiv.org/abs/2602.24146v1
- Date: Fri, 27 Feb 2026 16:21:33 GMT
- Title: Learning with a Budget: Identifying the Best Arm with Resource Constraints
- Authors: Zitian Li, Wang Chi Cheung,
- Abstract summary: We study the Best Arm Identification with Resource Constraints (BAIwRC) problem, where an agent seeks to identify the best alternative (aka arm) in the presence of resource constraints.<n>First, we propose the Successive Halving with Resource Rationing (SH-RR) algorithm, which integrates resource-aware allocation into the classical successive framework on best arm identification.
- Score: 8.450904497835262
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In many applications, evaluating the effectiveness of different alternatives comes with varying costs or resource usage. Motivated by such heterogeneity, we study the Best Arm Identification with Resource Constraints (BAIwRC) problem, where an agent seeks to identify the best alternative (aka arm) in the presence of resource constraints. Each arm pull consumes one or more types of limited resources. We make two key contributions. First, we propose the Successive Halving with Resource Rationing (SH-RR) algorithm, which integrates resource-aware allocation into the classical successive halving framework on best arm identification. The SH-RR algorithm unifies the theoretical analysis for both the stochastic and deterministic consumption settings, with a new \textit{effective consumption measure
Related papers
- Multiple-play Stochastic Bandits with Prioritized Arm Capacity Sharing [52.124267908936396]
The model is composed of $M$ arms and $K$ plays.<n>Each arm has a number of capacities, and each unit of capacity is associated with a reward function.<n>When multiple plays compete for the arm capacity, the arm capacity is allocated in a larger priority weight first manner.
arXiv Detail & Related papers (2025-12-25T11:19:09Z) - Optimal Multi-Fidelity Best-Arm Identification [65.23078799972188]
In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible.<n>We study multi-fidelity best-arm identification, in which the can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost.<n>Several methods have been proposed for tackling this problem, but their optimality remain elusive, notably due to loose lower bounds on the total cost needed to identify the best arm.
arXiv Detail & Related papers (2024-06-05T08:02:40Z) - Best Arm Identification with Resource Constraints [5.279257531335345]
We study the Best Arm Identification with Resource Constraints (BAIwRC) problem.<n>The agent aims to identify the best arm under resource constraints, where resources are consumed for each arm pull.<n>We design and analyze the Successive Halving with Resource Rationing algorithm (SH-RR)
arXiv Detail & Related papers (2024-02-29T12:17:54Z) - Constrained Pure Exploration Multi-Armed Bandits with a Fixed Budget [4.226118870861363]
We consider a constrained, pure exploration, multi-armed bandit formulation under a fixed budget.
We propose an algorithm called textscConstrained-SR based on the Successive Rejects framework.
We show that the associated decay rate is nearly optimal relative to an information theoretic lower bound in certain special cases.
arXiv Detail & Related papers (2022-11-27T08:58:16Z) - Resource-Aware Distributed Submodular Maximization: A Paradigm for
Multi-Robot Decision-Making [3.5788754401889022]
Resource-Aware distributed Greedy is the first algorithm to account for each robot's on-board resources independently.
RAG balances the trade-off of centralization, for global near-optimality, vs. decentralization, for near-minimal on-board computation, communication, and memory resources.
arXiv Detail & Related papers (2022-04-15T15:47:05Z) - Best Arm Identification under Additive Transfer Bandits [49.69203462561861]
We study the setting when, despite the means being unknown, there is a known additive relationship between the source and target MAB instances.
We propose and theoretically analyze an LUCB-style algorithm to identify an $epsilon$-optimal target arm with high probability.
arXiv Detail & Related papers (2021-12-08T02:20:18Z) - Censored Semi-Bandits for Resource Allocation [12.450488894967773]
We consider the problem of sequentially allocating resources in a censored semi-bandits setup.
The loss depends on two hidden parameters, one specific to the arm but independent of the resource allocation, and the other depends on the allocated resource.
We derive optimal algorithms for our problem setting using known algorithms for MP-MAB and Combinatorial Semi-Bandits.
arXiv Detail & Related papers (2021-04-12T19:15:32Z) - Coordinated Online Learning for Multi-Agent Systems with Coupled
Constraints and Perturbed Utility Observations [91.02019381927236]
We introduce a novel method to steer the agents toward a stable population state, fulfilling the given resource constraints.
The proposed method is a decentralized resource pricing method based on the resource loads resulting from the augmentation of the game's Lagrangian.
arXiv Detail & Related papers (2020-10-21T10:11:17Z) - Resource Allocation via Model-Free Deep Learning in Free Space Optical
Communications [119.81868223344173]
The paper investigates the general problem of resource allocation for mitigating channel fading effects in Free Space Optical (FSO) communications.
Under this framework, we propose two algorithms that solve FSO resource allocation problems.
arXiv Detail & Related papers (2020-07-27T17:38:51Z) - Hierarchical Adaptive Contextual Bandits for Resource Constraint based
Recommendation [49.69139684065241]
Contextual multi-armed bandit (MAB) achieves cutting-edge performance on a variety of problems.
In this paper, we propose a hierarchical adaptive contextual bandit method (HATCH) to conduct the policy learning of contextual bandits with a budget constraint.
arXiv Detail & Related papers (2020-04-02T17:04:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.