Generic Outlier Detection in Multi-Armed Bandit
- URL: http://arxiv.org/abs/2007.07293v1
- Date: Tue, 14 Jul 2020 18:42:44 GMT
- Title: Generic Outlier Detection in Multi-Armed Bandit
- Authors: Yikun Ban and Jingrui He
- Abstract summary: We propose a novel pulling algorithm named GOLD to identify such generic outlier arms.
In the experiments conducted on both synthetic and real-world data sets, the proposed algorithm achieves 98% accuracy.
- Score: 44.11480686973274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the problem of outlier arm detection in multi-armed
bandit settings, which finds plenty of applications in many high-impact domains
such as finance, healthcare, and online advertising. For this problem, a
learner aims to identify the arms whose expected rewards deviate significantly
from most of the other arms. Different from existing work, we target the
generic outlier arms or outlier arm groups whose expected rewards can be
larger, smaller, or even in between those of normal arms. To this end, we start
by providing a comprehensive definition of such generic outlier arms and
outlier arm groups. Then we propose a novel pulling algorithm named GOLD to
identify such generic outlier arms. It builds a real-time neighborhood graph
based on upper confidence bounds and captures the behavior patterns that
distinguish outlier arms from normal ones. We also analyze its performance from
various aspects. In the
experiments conducted on both synthetic and real-world data sets, the proposed
algorithm achieves 98% accuracy while saving 83% exploration cost on average
compared with state-of-the-art techniques.
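The confidence-bound neighborhood-graph idea can be illustrated with a minimal sketch. This is only an illustration of the general concept, not the authors' GOLD algorithm: the uniform pulling schedule, Gaussian rewards, Hoeffding-style radius, and component-size threshold are all simplifying assumptions.

```python
import numpy as np
from collections import deque

def ucb_outlier_arms(arm_means, horizon, min_group=2, noise=0.1, seed=0):
    """Illustrative sketch (not the authors' GOLD algorithm): pull arms
    uniformly, link arms whose confidence intervals overlap, and flag
    arms in small connected components as candidate outliers."""
    rng = np.random.default_rng(seed)
    k = len(arm_means)
    counts = np.zeros(k)
    sums = np.zeros(k)
    for t in range(horizon):
        i = t % k                                   # uniform round-robin pulls
        sums[i] += rng.normal(arm_means[i], noise)  # assumed Gaussian rewards
        counts[i] += 1
    means = sums / counts
    rad = np.sqrt(2.0 * np.log(horizon) / counts)   # Hoeffding-style radius
    lcb, ucb = means - rad, means + rad
    # two arms are neighbors when their confidence intervals overlap
    adj = {i: [j for j in range(k) if j != i
               and lcb[i] <= ucb[j] and lcb[j] <= ucb[i]]
           for i in range(k)}
    seen, outliers = set(), []
    for s in range(k):
        if s in seen:
            continue
        comp, queue = [], deque([s])
        seen.add(s)
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        if len(comp) < min_group:   # tiny components behave unlike the rest
            outliers.extend(comp)
    return sorted(outliers)
```

Because the definition is generic, an arm whose mean sits far above, far below, or in a gap between the normal arms all end up isolated in the graph the same way.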
Related papers
- Covariance Adaptive Best Arm Identification [0.0]
The goal is to identify the arm with the highest mean reward with a probability of at least $1 - \delta$, while minimizing the number of arm pulls.
We propose a more flexible scenario where arms can be dependent and rewards can be sampled simultaneously.
This framework is relevant in various applications, such as clinical trials, where similarities between patients or drugs suggest underlying correlations.
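The variance benefit of sampling dependent arms simultaneously can be seen in a small numerical sketch. This is purely illustrative; the correlated-Gaussian reward model and the specific means below are assumptions, not taken from the paper.

```python
import numpy as np

def paired_gap_stderr(rho, n=1000, seed=0):
    """Estimate the mean gap between two arms, and its standard error,
    when both rewards are drawn simultaneously with correlation rho.
    Illustrative only: assumes unit-variance Gaussian rewards."""
    rng = np.random.default_rng(seed)
    means = np.array([0.5, 0.6])                   # hypothetical arm means
    cov = np.array([[1.0, rho], [rho, 1.0]])
    samples = rng.multivariate_normal(means, cov, size=n)
    diffs = samples[:, 1] - samples[:, 0]          # paired differences
    return diffs.mean(), diffs.std(ddof=1) / np.sqrt(n)
```

With `rho = 0.9` the paired difference has variance about 0.2 instead of the 2.0 of independent sampling, so far fewer pulls are needed to resolve the same gap; this is the kind of structure a covariance-adaptive stopping rule can exploit.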
arXiv Detail & Related papers (2023-06-05T06:57:09Z)
- Differential Good Arm Identification [4.666048091337632]
This paper targets a variant of the multi-armed bandit problem called good arm identification (GAI)
GAI is a pure-exploration bandit problem whose goal is to output as many good arms as possible using as few samples as possible.
We propose DGAI - a differentiable good arm identification algorithm.
arXiv Detail & Related papers (2023-03-13T14:28:21Z)
- Almost Cost-Free Communication in Federated Best Arm Identification [76.12303738941254]
We study the problem of best arm identification in a federated learning multi-armed bandit setup with a central server and multiple clients.
We propose a novel algorithm FedElim that is based on successive elimination and communicates only in exponential time steps.
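Successive elimination, the single-agent building block, can be written compactly; the sketch below leaves out the federated, communication-efficient parts of FedElim entirely, and the Bernoulli rewards and confidence radius are assumptions.

```python
import numpy as np

def successive_elimination(arm_means, delta=0.05, max_rounds=5000, seed=0):
    """Single-agent successive elimination sketch with Bernoulli rewards:
    pull every surviving arm once per round and drop any arm whose UCB
    falls below the best LCB. Not the federated FedElim algorithm."""
    rng = np.random.default_rng(seed)
    k = len(arm_means)
    sums = np.zeros(k)
    active = list(range(k))
    for t in range(1, max_rounds + 1):
        for i in active:                 # one pull per surviving arm
            sums[i] += rng.binomial(1, arm_means[i])
        means = sums / t                 # active arms were pulled every round
        rad = np.sqrt(np.log(4.0 * k * t * t / delta) / (2.0 * t))
        best_lcb = max(means[i] - rad for i in active)
        active = [i for i in active if means[i] + rad >= best_lcb]
        if len(active) == 1:
            return active[0]
    return max(active, key=lambda i: sums[i])   # fall back to empirical best
```

An elimination-based structure is what makes the exponentially sparse communication schedule possible: an agent only needs to report changes to its surviving set.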
arXiv Detail & Related papers (2022-08-19T08:37:09Z)
- Best Arm Identification in Restless Markov Multi-Armed Bandits [85.55466536537293]
We study the problem of identifying the best arm in a multi-armed bandit environment.
A decision entity wishes to find the index of the best arm as quickly as possible, subject to an upper bound on the error probability.
We show that this policy achieves an upper bound that depends on $R$ and is monotonically non-increasing as $R \to \infty$.
arXiv Detail & Related papers (2022-03-29T04:58:04Z)
- Best Arm Identification under Additive Transfer Bandits [49.69203462561861]
We study the setting when, despite the means being unknown, there is a known additive relationship between the source and target MAB instances.
We propose and theoretically analyze an LUCB-style algorithm to identify an $\epsilon$-optimal target arm with high probability.
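A generic LUCB-style loop for $\epsilon$-optimal arm identification looks as follows. This is a fixed-confidence sketch with assumed Bernoulli rewards; it omits the additive source-to-target transfer structure that the paper's algorithm actually exploits.

```python
import numpy as np

def lucb_epsilon_best(arm_means, epsilon=0.1, delta=0.05,
                      max_pulls=20000, seed=0):
    """Generic LUCB sketch: each round, pull the empirical best arm and
    its strongest challenger; stop when the best arm's LCB clears the
    challenger's UCB up to epsilon. Bernoulli rewards assumed."""
    rng = np.random.default_rng(seed)
    k = len(arm_means)
    counts = np.ones(k)
    sums = rng.binomial(1, arm_means).astype(float)  # one pull of each arm
    pulls = k
    while pulls < max_pulls:
        means = sums / counts
        rad = np.sqrt(np.log(5.0 * k * pulls * pulls / (4.0 * delta))
                      / (2.0 * counts))
        best = int(np.argmax(means))
        ucb = means + rad
        ucb[best] = -np.inf                          # mask to find challenger
        challenger = int(np.argmax(ucb))
        if means[best] - rad[best] >= means[challenger] + rad[challenger] - epsilon:
            return best                              # confident within epsilon
        for i in (best, challenger):                 # pull the critical pair
            sums[i] += rng.binomial(1, arm_means[i])
            counts[i] += 1
            pulls += 1
    return int(np.argmax(sums / counts))
```

Pulling only the best arm and its closest challenger is what keeps the sample complexity tied to the gaps around the optimum rather than to all arms uniformly.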
arXiv Detail & Related papers (2021-12-08T02:20:18Z)
- Multi-Armed Bandits with Dependent Arms [18.81667618369821]
We study a variant of the classical multi-armed bandit problem (MABP) which we call Multi-Armed Bandits with Dependent Arms.
Multiple arms are grouped together to form a cluster, and the reward distributions of arms belonging to the same cluster are known functions of an unknown parameter that is a characteristic of the cluster.
We develop learning algorithms based on the UCB principle that appropriately utilize these additional side observations while managing the exploration-exploitation trade-off.
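A simplified version of the side-observation idea can be sketched by running UCB at the cluster level. The sketch assumes every arm in a cluster shares one unknown Bernoulli mean, which is a deliberate simplification: the paper allows arbitrary known functions of the cluster parameter.

```python
import numpy as np

def clustered_ucb(cluster_of, cluster_means, horizon, seed=0):
    """UCB over clusters rather than arms: a pull of any member arm
    updates the shared cluster estimate, so side observations tighten
    the bounds for every arm in that cluster. Simplified sketch assuming
    all arms in a cluster share one unknown Bernoulli mean."""
    rng = np.random.default_rng(seed)
    n_clusters = max(cluster_of) + 1
    counts = np.zeros(n_clusters)
    sums = np.zeros(n_clusters)
    pulls = np.zeros(len(cluster_of), dtype=int)
    for t in range(1, horizon + 1):
        safe = np.maximum(counts, 1)                 # avoid divide-by-zero
        ucb = np.where(counts > 0,
                       sums / safe + np.sqrt(2.0 * np.log(t) / safe),
                       np.inf)                       # try each cluster once
        c = int(np.argmax(ucb))
        arm = cluster_of.index(c)                    # any member arm will do
        sums[c] += rng.binomial(1, cluster_means[c])
        counts[c] += 1
        pulls[arm] += 1
    return pulls
```

Aggregating observations at the cluster level is why the dependent-arm structure helps: the effective number of unknowns is the number of clusters, not the number of arms.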
arXiv Detail & Related papers (2020-10-13T14:00:19Z)
- Fair Algorithms for Multi-Agent Multi-Armed Bandits [29.68201160277817]
We propose a multi-agent variant of the classical multi-armed bandit problem.
The goal is not to learn the "best arm"; indeed, each agent may perceive a different arm to be the best for her personally.
We show that multi-agent variants of three classic multi-armed bandit algorithms achieve sublinear regret.
arXiv Detail & Related papers (2020-07-13T21:20:04Z)
- Robustness Guarantees for Mode Estimation with an Application to Bandits [131.21717367564963]
We introduce a theory for multi-armed bandits where the values are the modes of the reward distributions instead of the mean.
We show in simulations that our algorithms are robust to perturbation of the arms by adversarial noise sequences.
arXiv Detail & Related papers (2020-03-05T21:29:27Z)
- The Price of Incentivizing Exploration: A Characterization via Thompson Sampling and Sample Complexity [83.81297078039836]
We consider incentivized exploration: a version of multi-armed bandits where the choice of arms is controlled by self-interested agents.
We focus on the price of incentives: the loss in performance, broadly construed, incurred for the sake of incentive-compatibility.
arXiv Detail & Related papers (2020-02-03T04:58:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.