Related papers: Federated Best Arm Identification with Heterogeneous Clients

Federated Best Arm Identification with Heterogeneous Clients

URL: http://arxiv.org/abs/2210.07780v3
Date: Tue, 19 Dec 2023 09:31:06 GMT
Title: Federated Best Arm Identification with Heterogeneous Clients
Authors: Zhirui Chen, P. N. Karthik, Vincent Y. F. Tan, and Yeow Meng Chee
Abstract summary: We study best arm identification in a federated multi-armed bandit setting with a central server and multiple clients. We show that for any algorithm whose upper bound on the expected stopping time matches with the lower bound up to a multiplicative constant (em almost-optimal algorithm) We propose a novel algorithm that communicates at exponential time instants, and demonstrate that it is almost-optimal.
Score: 62.36929749450298
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study best arm identification in a federated multi-armed bandit setting with a central server and multiple clients, when each client has access to a {\em subset} of arms and each arm yields independent Gaussian observations. The goal is to identify the best arm of each client subject to an upper bound on the error probability; here, the best arm is one that has the largest {\em average} value of the means averaged across all clients having access to the arm. Our interest is in the asymptotics as the error probability vanishes. We provide an asymptotic lower bound on the growth rate of the expected stopping time of any algorithm. Furthermore, we show that for any algorithm whose upper bound on the expected stopping time matches with the lower bound up to a multiplicative constant ({\em almost-optimal} algorithm), the ratio of any two consecutive communication time instants must be {\em bounded}, a result that is of independent interest. We thereby infer that an algorithm can communicate no more sparsely than at exponential time instants in order to be almost-optimal. For the class of almost-optimal algorithms, we present the first-of-its-kind asymptotic lower bound on the expected number of {\em communication rounds} until stoppage. We propose a novel algorithm that communicates at exponential time instants, and demonstrate that it is asymptotically almost-optimal.

Related papers

Optimal Multi-Objective Best Arm Identification with Fixed Confidence [62.36929749450298]
We consider a multi-armed bandit setting in which each arm yields an $M$-dimensional vector reward upon selection. The end goal is to identify the best arm of em every objective in the shortest (expected) time subject to an upper bound on the probability of error. We propose an algorithm that uses the novel idea of em surrogate proportions to sample the arms at each time step, eliminating the need to solve the max-min optimisation problem at each step.
arXiv Detail & Related papers (2025-01-23T12:28:09Z)
Best-Arm Identification in Unimodal Bandits [24.001611176749158]
We study the fixed-confidence best-arm identification problem in unimodal bandits. We derive two lower on the stopping time of any bounds. We show that a linear dependence on the number of arms is unavoidable in the confidence-independent cost.
arXiv Detail & Related papers (2024-11-04T09:05:11Z)
Fixing the Loose Brake: Exponential-Tailed Stopping Time in Best Arm Identification [39.751528705097414]
In a fixed confidence setting, an algorithm must stop data-dependently and return the estimated best arm with a correctness guarantee. We prove that this never-stopping event can indeed happen for some popular algorithms. The first algorithm is based on a fixed budget algorithm called Sequential Halving along with a doubling trick. The second algorithm is a meta algorithm that takes in any fixed confidence algorithm with a high probability stopping guarantee and turns it into one that enjoys an exponential-tailed stopping time.
arXiv Detail & Related papers (2024-11-04T05:26:05Z)
A General Framework for Clustering and Distribution Matching with Bandit Feedback [81.50716021326194]
We develop a general framework for clustering and distribution matching problems with bandit feedback. We derive a non-asymptotic lower bound on the average number of arm pulls for any online algorithm with an error probability not exceeding $delta$.
arXiv Detail & Related papers (2024-09-08T12:19:12Z)
Optimal Best Arm Identification with Fixed Confidence in Restless Bandits [66.700654953613]
We study best arm identification in a restless multi-armed bandit setting with finitely many arms. The discrete-time data generated by each arm forms a homogeneous Markov chain taking values in a common, finite state space. It is demonstrated that tracking the long-term behavior of a certain Markov decision process and its state-action visitation proportions are the key ingredients in analyzing the converse and achievability bounds.
arXiv Detail & Related papers (2023-10-20T10:04:05Z)
Beyond the Best: Estimating Distribution Functionals in Infinite-Armed Bandits [40.71199236098642]
In the infinite-armed bandit problem, each arm's average reward is sampled from an unknown distribution. We consider a general class of distribution functionals beyond the maximum, and propose unified meta algorithms for both the offline and online settings.
arXiv Detail & Related papers (2022-11-01T18:20:10Z)
Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback. It includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as its subroutines. We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower boundally, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z)
Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination [80.53485617514707]
This paper proposes two algorithms, a gap-based algorithm and one based on the successive elimination, for best arm identification in sub-Gaussian bandits. Specifically, for the gap-based algorithm, the sample complexity is optimal up to constant factors, while for the successive elimination, it is optimal up to logarithmic factors.
arXiv Detail & Related papers (2021-11-14T21:49:58Z)
Optimal Best-arm Identification in Linear Bandits [79.3239137440876]
We devise a simple algorithm whose sampling complexity matches known instance-specific lower bounds. Unlike existing best-arm identification strategies, our algorithm uses a stopping rule that does not depend on the number of arms.
arXiv Detail & Related papers (2020-06-29T14:25:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.