Related papers: Fixing the Loose Brake: Exponential-Tailed Stopping Time in Best Arm Identification

Fixing the Loose Brake: Exponential-Tailed Stopping Time in Best Arm Identification

URL: http://arxiv.org/abs/2411.01808v1
Date: Mon, 04 Nov 2024 05:26:05 GMT
Title: Fixing the Loose Brake: Exponential-Tailed Stopping Time in Best Arm Identification
Authors: Kapilan Balagopalan, Tuan Ngo Nguyen, Yao Zhao, Kwang-Sung Jun,
Abstract summary: In a fixed confidence setting, an algorithm must stop data-dependently and return the estimated best arm with a correctness guarantee. We prove that this never-stopping event can indeed happen for some popular algorithms. The first algorithm is based on a fixed budget algorithm called Sequential Halving along with a doubling trick. The second algorithm is a meta algorithm that takes in any fixed confidence algorithm with a high probability stopping guarantee and turns it into one that enjoys an exponential-tailed stopping time.
Score: 39.751528705097414
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The best arm identification problem requires identifying the best alternative (i.e., arm) in active experimentation using the smallest number of experiments (i.e., arm pulls), which is crucial for cost-efficient and timely decision-making processes. In the fixed confidence setting, an algorithm must stop data-dependently and return the estimated best arm with a correctness guarantee. Since this stopping time is random, we desire its distribution to have light tails. Unfortunately, many existing studies focus on high probability or in expectation bounds on the stopping time, which allow heavy tails and, for high probability bounds, even not stopping at all. We first prove that this never-stopping event can indeed happen for some popular algorithms. Motivated by this, we propose algorithms that provably enjoy an exponential-tailed stopping time, which improves upon the polynomial tail bound reported by Kalyanakrishnan et al. (2012). The first algorithm is based on a fixed budget algorithm called Sequential Halving along with a doubling trick. The second algorithm is a meta algorithm that takes in any fixed confidence algorithm with a high probability stopping guarantee and turns it into one that enjoys an exponential-tailed stopping time. Our results imply that there is much more to be desired for contemporary fixed confidence algorithms.

Related papers

Optimal Multi-Objective Best Arm Identification with Fixed Confidence [62.36929749450298]
We consider a multi-armed bandit setting in which each arm yields an $M$-dimensional vector reward upon selection. The end goal is to identify the best arm of em every objective in the shortest (expected) time subject to an upper bound on the probability of error. We propose an algorithm that uses the novel idea of em surrogate proportions to sample the arms at each time step, eliminating the need to solve the max-min optimisation problem at each step.
arXiv Detail & Related papers (2025-01-23T12:28:09Z)
Best-Arm Identification in Unimodal Bandits [24.001611176749158]
We study the fixed-confidence best-arm identification problem in unimodal bandits. We derive two lower on the stopping time of any bounds. We show that a linear dependence on the number of arms is unavoidable in the confidence-independent cost.
arXiv Detail & Related papers (2024-11-04T09:05:11Z)
Private Edge Density Estimation for Random Graphs: Optimal, Efficient and Robust [5.037313459134419]
We give the first-time, differentially node-private, and robust algorithm for estimating the edge density of ErdHos-R'enyi random graphs. We prove information-theoretical lower bounds, showing that the error rate of our algorithm is optimal up to logarithmic factors.
arXiv Detail & Related papers (2024-05-26T18:59:44Z)
Concentration Tail-Bound Analysis of Coevolutionary and Bandit Learning Algorithms [1.104960878651584]
This paper considers the problem of deriving concentration tail-bounds on the runtime/regret of algorithms. It provides a novel drift theorem that gives precise exponential tail-bounds given positive, weak, zero and even negative drift. Our drift theorem can be used to prove a strong concentration of the runtime/regret of algorithms in AI.
arXiv Detail & Related papers (2024-05-07T16:45:15Z)
Replicability is Asymptotically Free in Multi-armed Bandits [45.729105054410745]
This work is motivated by the growing demand for reproducible machine learning. In particular, we consider a replicable algorithm that ensures, with high probability, that the algorithm's sequence of actions is not affected by the randomness inherent in the dataset.
arXiv Detail & Related papers (2024-02-12T03:31:34Z)
Federated Best Arm Identification with Heterogeneous Clients [62.36929749450298]
We study best arm identification in a federated multi-armed bandit setting with a central server and multiple clients. We show that for any algorithm whose upper bound on the expected stopping time matches with the lower bound up to a multiplicative constant (em almost-optimal algorithm) We propose a novel algorithm that communicates at exponential time instants, and demonstrate that it is almost-optimal.
arXiv Detail & Related papers (2022-10-14T13:09:11Z)
Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting. We show that our algorithm achieves for any k an almost optimal consistency-robustness tradeoff.
arXiv Detail & Related papers (2021-03-02T11:04:33Z)
Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism [107.48538091418412]
We study exploration in multi-armed bandits when we have access to a divisible resource that can be allocated in varying amounts to arm pulls. We focus in particular on the allocation of distributed computing resources, where we may obtain results faster by allocating more resources per pull.
arXiv Detail & Related papers (2020-10-31T18:19:29Z)
Optimal Best-arm Identification in Linear Bandits [79.3239137440876]
We devise a simple algorithm whose sampling complexity matches known instance-specific lower bounds. Unlike existing best-arm identification strategies, our algorithm uses a stopping rule that does not depend on the number of arms.
arXiv Detail & Related papers (2020-06-29T14:25:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.