Globally Optimal Algorithms for Fixed-Budget Best Arm Identification
- URL: http://arxiv.org/abs/2206.04646v2
- Date: Fri, 10 Jun 2022 00:59:52 GMT
- Title: Globally Optimal Algorithms for Fixed-Budget Best Arm Identification
- Authors: Junpei Komiyama, Taira Tsuchiya, Junya Honda
- Abstract summary: We characterize the optimal rate as a result of global optimization over all possible parameters.
We show that this rate is indeed achievable by introducing a conceptual algorithm called delayed optimal tracking (DOT).
- Score: 16.500749121196986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the fixed-budget best arm identification problem where the goal
is to find the arm of the largest mean with a fixed number of samples. It is
known that the probability of misidentifying the best arm is exponentially
small in the number of rounds. However, the rate (exponent) governing this
probability has been characterized only in limited settings. In this paper, we characterize
the optimal rate as a result of global optimization over all possible
parameters. We introduce two rates, $R^{\mathrm{go}}$ and
$R^{\mathrm{go}}_{\infty}$, corresponding to lower bounds on the
misidentification probability, each of which is associated with a proposed
algorithm. The rate $R^{\mathrm{go}}$ is associated with
$R^{\mathrm{go}}$-tracking, which can be efficiently implemented by a neural
network and is shown to outperform existing algorithms. However, this rate
requires a nontrivial condition to be achievable. To deal with this issue, we
introduce the second rate $R^{\mathrm{go}}_\infty$. We show that this rate is
indeed achievable by introducing a conceptual algorithm called delayed optimal
tracking (DOT).
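To make the fixed-budget setting concrete, here is a minimal simulation of a uniform-allocation baseline (not the paper's $R^{\mathrm{go}}$-tracking or DOT algorithms): the budget is split evenly across arms, the empirical best arm is reported, and the misidentification rate is estimated over repeated trials. The arm means and budget below are illustrative.

```python
import random

def uniform_allocation_bai(means, budget, rng):
    """Fixed-budget best arm identification with uniform allocation:
    pull each arm budget // len(means) times (unit-variance Gaussian
    rewards) and return the arm with the best empirical mean."""
    pulls = budget // len(means)
    empirical = [sum(rng.gauss(mu, 1.0) for _ in range(pulls)) / pulls
                 for mu in means]
    return max(range(len(means)), key=empirical.__getitem__)

rng = random.Random(0)
means = [1.0, 0.5, 0.3]   # illustrative arm means; arm 0 is the best
trials = 200
errors = sum(uniform_allocation_bai(means, 600, rng) != 0
             for _ in range(trials))
print(errors / trials)    # empirical misidentification rate
```

The misidentification probability of even this naive allocation decays exponentially in the budget; the paper's contribution is characterizing the optimal exponent and how to attain it.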
Related papers
- Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
We propose two variance reduced ZO estimators for complex gradient problems.
We improve the state-of-the-art function complexities from $\mathcal{O}\left(\min\left\{\frac{dn^{1/2}}{\epsilon^2}, \frac{d}{\epsilon^3}\right\}\right)$ to $\tilde{\mathcal{O}}\left(\frac{d}{\epsilon^2}\right)$.
arXiv Detail & Related papers (2024-10-03T15:04:01Z)
- Optimal Top-Two Method for Best Arm Identification and Fluid Analysis [15.353009236788262]
We propose an optimal top-2 type algorithm for the best arm identification problem.
We show that the proposed algorithm is optimal as $\delta \rightarrow 0$.
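The top-two family this entry belongs to can be sketched generically (a hedged illustration of the common template, not this paper's exact method): with probability $\beta$ pull the empirical leader, otherwise pull the current challenger.

```python
import random

def top_two_choice(empirical_means, counts, beta, rng):
    """One sampling step of a generic top-two rule: pull each arm once
    first, then pull the empirical leader with probability beta and the
    runner-up (the challenger) otherwise."""
    for i, n in enumerate(counts):
        if n == 0:                       # initial forced exploration
            return i
    order = sorted(range(len(empirical_means)),
                   key=empirical_means.__getitem__, reverse=True)
    leader, challenger = order[0], order[1]
    return leader if rng.random() < beta else challenger

rng = random.Random(0)
arm = top_two_choice([0.9, 0.4, 0.1], [5, 5, 5], beta=0.5, rng=rng)
print(arm)   # either arm 0 (the leader) or arm 1 (the challenger)
```

The balance parameter $\beta$ splits samples between confirming the leader and separating it from its closest competitor; optimal methods tune this split.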
arXiv Detail & Related papers (2024-03-14T06:14:07Z)
- A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity [28.068960555415014]
We investigate the fixed-budget best-arm identification problem for linear bandits in a potentially non-stationary environment.
The algorithm aims to correctly identify the best arm $x^* := \arg\max_{x\in\mathcal{X}} x^\top \sum_{t=1}^{T}\theta_t$ with probability as high as possible.
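The best-arm definition above can be unpacked with a tiny worked example (all numbers hypothetical): each arm $x$ is scored by $x^\top \sum_{t=1}^{T}\theta_t$, and $x^*$ is the maximizer over the arm set $\mathcal{X}$.

```python
# Hypothetical two-dimensional arm set X and parameters theta_t for t = 1..3.
X = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)]
thetas = [(0.2, 0.1), (0.3, 0.0), (0.1, 0.4)]

theta_sum = [sum(col) for col in zip(*thetas)]   # sum_t theta_t = [0.6, 0.5]

def score(x):
    """Inner product x^T (sum_t theta_t)."""
    return sum(a * b for a, b in zip(x, theta_sum))

x_star = max(X, key=score)   # best arm over the whole horizon
print(x_star)                # → (0.7, 0.7), score 0.77
```

Note that under non-stationarity the per-round parameters $\theta_t$ differ, so the best arm is defined against their sum rather than any single round.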
arXiv Detail & Related papers (2023-07-27T19:03:36Z)
- An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z)
- On the Sample Complexity of Representation Learning in Multi-task Bandits with Global and Local structure [77.60508571062958]
We investigate the sample complexity of learning the optimal arm for multi-task bandit problems.
Arms consist of two components: one that is shared across tasks (which we call the representation) and one that is task-specific (which we call the predictor).
We devise an algorithm OSRL-SC whose sample complexity approaches the lower bound, and scales at most as $H(G\log(\delta_G) + X\log(\delta_H))$, with $X, G, H$ being, respectively, the number of tasks, representations and predictors.
arXiv Detail & Related papers (2022-11-28T08:40:12Z)
- Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination [80.53485617514707]
This paper proposes two algorithms for best arm identification in sub-Gaussian bandits: a gap-based algorithm and one based on successive elimination.
Specifically, the gap-based algorithm's sample complexity is optimal up to constant factors, while successive elimination's is optimal up to logarithmic factors.
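The successive-elimination idea referenced here can be sketched in a standard textbook form (a hedged illustration under bounded rewards, not this paper's contamination-robust variant): sample every surviving arm each round and drop any arm whose upper confidence bound falls below the best lower confidence bound.

```python
import math

def successive_elimination(pull, n_arms, delta, max_rounds=10_000):
    """Best arm identification via successive elimination.

    `pull(i)` returns a reward in [0, 1] for arm i.  Every surviving arm
    is pulled once per round; an arm is dropped when its upper confidence
    bound falls below the best lower confidence bound (Hoeffding radius).
    """
    active = list(range(n_arms))
    sums = [0.0] * n_arms
    for t in range(1, max_rounds + 1):
        for i in active:
            sums[i] += pull(i)
        radius = math.sqrt(math.log(4 * n_arms * t * t / delta) / (2 * t))
        means = {i: sums[i] / t for i in active}
        best_lcb = max(means[i] for i in active) - radius
        active = [i for i in active if means[i] + radius >= best_lcb]
        if len(active) == 1:
            return active[0]
    return max(active, key=lambda i: sums[i])

# Deterministic toy rewards so the run is reproducible.
best = successive_elimination(lambda i: [0.9, 0.3, 0.2][i], n_arms=3, delta=0.05)
print(best)   # → 0
```

With probability at least $1 - \delta$ the true best arm survives every round, which is what makes the logarithmic-factor optimality claim above meaningful.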
arXiv Detail & Related papers (2021-11-14T21:49:58Z)
- Robust Estimation for Random Graphs [47.07886511972046]
We study the problem of robustly estimating the parameter $p$ of an Erdős-Rényi random graph on $n$ nodes.
We give an inefficient algorithm with similar accuracy for all $\gamma < 1/2$, the information-theoretic limit.
arXiv Detail & Related papers (2021-11-09T18:43:25Z)
- Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}\left(\frac{\log N}{\sqrt{N}}\right)$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach that can achieve an exponential regret bound of order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z)
- Bandits with many optimal arms [68.17472536610859]
We write $p^*$ for the proportion of optimal arms and $\Delta$ for the minimal mean-gap between optimal and sub-optimal arms.
We characterize the optimal learning rates both in the cumulative regret setting, and in the best-arm identification setting.
arXiv Detail & Related papers (2021-03-23T11:02:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.