Best arm identification in rare events
- URL: http://arxiv.org/abs/2303.07627v1
- Date: Tue, 14 Mar 2023 04:51:24 GMT
- Title: Best arm identification in rare events
- Authors: Anirban Bhattacharjee, Sushant Vijayan and Sandeep K Juneja
- Abstract summary: A key application of this framework is in online advertising where click rates of advertisements could be a fraction of a single percent and final conversion to sales, while highly profitable, may again be a small fraction of the click rates.
Lately, algorithms for BAI problems have been developed that minimise sample complexity while providing statistical guarantees on the correct arm selection.
We exploit the fact that the reward process for each arm is well approximated by a Compound Poisson process to arrive at algorithms that are faster, with a small increase in sample complexity.
- Score: 0.43012765978447565
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We consider the best arm identification problem in the stochastic multi-armed
bandit framework where each arm has a tiny probability of realizing large
rewards while with overwhelming probability the reward is zero. A key
application of this framework is in online advertising where click rates of
advertisements could be a fraction of a single percent and final conversion to
sales, while highly profitable, may again be a small fraction of the click
rates. Lately, algorithms for BAI problems have been developed that minimise
sample complexity while providing statistical guarantees on the correct arm
selection. As we observe, these algorithms can be computationally prohibitive.
We exploit the fact that the reward process for each arm is well approximated
by a Compound Poisson process to arrive at algorithms that are faster, with a
small increase in sample complexity. We analyze the problem in an asymptotic
regime as rarity of reward occurrence reduces to zero, and reward amounts
increase to infinity. This helps illustrate the benefits of the proposed
algorithm. It also sheds light on the underlying structure of the optimal BAI
algorithms in the rare event setting.
Related papers
- A General Framework for Clustering and Distribution Matching with Bandit Feedback [81.50716021326194]
We develop a general framework for clustering and distribution matching problems with bandit feedback.
We derive a non-asymptotic lower bound on the average number of arm pulls for any online algorithm with an error probability not exceeding $delta$.
arXiv Detail & Related papers (2024-09-08T12:19:12Z) - Representative Arm Identification: A fixed confidence approach to identify cluster representatives [7.459521930846415]
We study the representative arm identification problem in the multi-armed bandits (MAB) framework.
The RAI problem covers as special cases several well-studied MAB problems such as identifying the best arm or any $M$ out of the top $K$.
We propose two algorithms, based on the idea of confidence intervals, and provide high probability upper bounds on their sample complexity.
arXiv Detail & Related papers (2024-08-26T11:47:52Z) - Optimal Multi-Fidelity Best-Arm Identification [65.23078799972188]
In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible.
We study multi-fidelity best-arm identification, in which the can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost.
Several methods have been proposed for tackling this problem, but their optimality remain elusive, notably due to loose lower bounds on the total cost needed to identify the best arm.
arXiv Detail & Related papers (2024-06-05T08:02:40Z) - lil'HDoC: An Algorithm for Good Arm Identification under Small Threshold
Gap [4.666048091337632]
Good arm identification (GAI) is a pure-exploration bandit problem in which a single learner outputs an arm as soon as it is identified as a good arm.
This paper focuses on the GAI problem under a small threshold gap, which refers to the distance between the expected rewards of arms and the given threshold.
We propose a new algorithm called lil'HDoC to significantly improve the total sample complexity of the HDoC algorithm.
arXiv Detail & Related papers (2024-01-29T04:21:47Z) - Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
In particular, we present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z) - Federated Best Arm Identification with Heterogeneous Clients [62.36929749450298]
We study best arm identification in a federated multi-armed bandit setting with a central server and multiple clients.
We show that for any algorithm whose upper bound on the expected stopping time matches with the lower bound up to a multiplicative constant (em almost-optimal algorithm)
We propose a novel algorithm that communicates at exponential time instants, and demonstrate that it is almost-optimal.
arXiv Detail & Related papers (2022-10-14T13:09:11Z) - Mean-based Best Arm Identification in Stochastic Bandits under Reward
Contamination [80.53485617514707]
This paper proposes two algorithms, a gap-based algorithm and one based on the successive elimination, for best arm identification in sub-Gaussian bandits.
Specifically, for the gap-based algorithm, the sample complexity is optimal up to constant factors, while for the successive elimination, it is optimal up to logarithmic factors.
arXiv Detail & Related papers (2021-11-14T21:49:58Z) - Optimal Best-arm Identification in Linear Bandits [79.3239137440876]
We devise a simple algorithm whose sampling complexity matches known instance-specific lower bounds.
Unlike existing best-arm identification strategies, our algorithm uses a stopping rule that does not depend on the number of arms.
arXiv Detail & Related papers (2020-06-29T14:25:51Z) - On Regret with Multiple Best Arms [12.315392649501101]
We study a regret problem with the existence of multiple best/near-optimal arms in a bandit setting.
Our goal is to design algorithms that can automatically adapt to the unknown hardness of the problem.
arXiv Detail & Related papers (2020-06-26T04:01:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.