Generic Outlier Detection in Multi-Armed Bandit
- URL: http://arxiv.org/abs/2007.07293v1
- Date: Tue, 14 Jul 2020 18:42:44 GMT
- Title: Generic Outlier Detection in Multi-Armed Bandit
- Authors: Yikun Ban and Jingrui He
- Abstract summary: We propose a novel pulling algorithm named GOLD to identify such generic outlier arms.
In the experiments conducted on both synthetic and real-world data sets, the proposed algorithm achieves 98% accuracy.
- Score: 44.11480686973274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the problem of outlier arm detection in multi-armed
bandit settings, which finds plenty of applications in many high-impact domains
such as finance, healthcare, and online advertising. For this problem, a
learner aims to identify the arms whose expected rewards deviate significantly
from most of the other arms. Different from existing work, we target the
generic outlier arms or outlier arm groups whose expected rewards can be
larger, smaller, or even in between those of normal arms. To this end, we start
by providing a comprehensive definition of such generic outlier arms and
outlier arm groups. Then we propose a novel pulling algorithm named GOLD to
identify such generic outlier arms. It builds a real-time neighborhood graph
based on upper confidence bounds and distinguishes the behavior patterns of
outliers from those of normal arms. We also analyze its performance from various aspects. In the
experiments conducted on both synthetic and real-world data sets, the proposed
algorithm achieves 98% accuracy while saving 83% exploration cost on average
compared with state-of-the-art techniques.
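The GOLD algorithm itself is not specified beyond the abstract. As a rough, hypothetical illustration of the underlying idea only (generic outliers may sit above, below, or between normal arms, so detection compares each arm's estimate against the bulk rather than against a single extreme), a simple deviation-from-median rule over uniformly sampled arms might look like this; the function names, the sampling budget, and the `k` threshold are all assumptions, not the paper's method:

```python
import math
import random

def pull(arm_mean):
    """Simulate one Bernoulli reward from an arm with the given mean."""
    return 1.0 if random.random() < arm_mean else 0.0

def detect_outliers(true_means, pulls_per_arm=2000, k=1.5, seed=0):
    """Uniformly sample every arm, then flag arms whose empirical mean
    deviates from the median of all means by more than k times the
    spread of the means. A naive k-sigma rule, NOT the GOLD algorithm."""
    random.seed(seed)
    means = []
    for mu in true_means:
        total = sum(pull(mu) for _ in range(pulls_per_arm))
        means.append(total / pulls_per_arm)
    med = sorted(means)[len(means) // 2]  # median of empirical means
    spread = math.sqrt(sum((x - med) ** 2 for x in means) / len(means))
    return [i for i, x in enumerate(means) if abs(x - med) > k * spread]

# Arms: most near 0.5, one high outlier, one low outlier.
arms = [0.48, 0.50, 0.52, 0.49, 0.51, 0.95, 0.05]
print(detect_outliers(arms))  # flags the high and the low outlier
```

Note that a fixed uniform budget is exactly what an adaptive pulling algorithm such as GOLD is designed to avoid; this sketch only shows the "deviates from most of the other arms" criterion.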
Related papers
- Bridging Rested and Restless Bandits with Graph-Triggering: Rising and Rotting [67.1631453378926]
Graph-Triggered Bandits is a framework to generalize rested and restless bandits.
In this work, we focus on two specific types of monotonic bandits: rising, where the expected reward of an arm grows as the number of triggers increases, and rotting, where the opposite behavior occurs.
(arXiv 2024-09-09)
- A General Framework for Clustering and Distribution Matching with Bandit Feedback [81.50716021326194]
We develop a general framework for clustering and distribution matching problems with bandit feedback.
We derive a non-asymptotic lower bound on the average number of arm pulls for any online algorithm with an error probability not exceeding $\delta$.
(arXiv 2024-09-08)
- Covariance Adaptive Best Arm Identification [0.0]
The goal is to identify the arm with the highest mean reward with probability at least $1-\delta$, while minimizing the number of arm pulls.
We propose a more flexible scenario where arms can be dependent and rewards can be sampled simultaneously.
This framework is relevant in various applications, such as clinical trials, where similarities between patients or drugs suggest underlying correlations.
(arXiv 2023-06-05)
- Differential Good Arm Identification [4.666048091337632]
This paper targets a variant of the multi-armed bandit problem called good arm identification (GAI).
GAI is a pure-exploration bandit problem whose goal is to output as many good arms as possible using as few samples as possible.
We propose DGAI - a differentiable good arm identification algorithm.
(arXiv 2023-03-13)
- Almost Cost-Free Communication in Federated Best Arm Identification [76.12303738941254]
We study the problem of best arm identification in a federated learning multi-armed bandit setup with a central server and multiple clients.
We propose a novel algorithm, FedElim, that is based on successive elimination and communicates only in exponential time steps.
(arXiv 2022-08-19)
- Best Arm Identification under Additive Transfer Bandits [49.69203462561861]
We study the setting when, despite the means being unknown, there is a known additive relationship between the source and target MAB instances.
We propose and theoretically analyze an LUCB-style algorithm to identify an $\epsilon$-optimal target arm with high probability.
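The paper's transfer-specific construction is not reproduced in this summary. As a generic illustration of what an LUCB-style stopping check for an ε-optimal arm looks like, the sketch below stops once the lower confidence bound of the empirical leader clears the best upper bound among the other arms by ε; the Hoeffding-style radius and all names here are illustrative assumptions, not the paper's algorithm:

```python
import math

def lucb_stop(means, counts, t, epsilon, delta):
    """LUCB-style stopping rule: with confidence radius
    r_i = sqrt(ln(4*K*t^2/delta) / (2*counts[i])) (one common
    Hoeffding-style choice, not necessarily the paper's), stop when the
    LCB of the empirical leader exceeds the largest UCB among the other
    arms minus epsilon."""
    k = len(means)
    r = [math.sqrt(math.log(4 * k * t * t / delta) / (2 * c))
         for c in counts]
    leader = max(range(k), key=lambda i: means[i])
    lcb = means[leader] - r[leader]
    ucb_rest = max(means[i] + r[i] for i in range(k) if i != leader)
    return lcb > ucb_rest - epsilon

# Well-separated arms with many pulls: the rule fires.
print(lucb_stop([0.9, 0.5, 0.4], [500, 500, 500], t=1500,
                epsilon=0.1, delta=0.05))  # prints True
```

With fewer pulls or closer means the bound intervals overlap and the check returns False, so sampling would continue.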
(arXiv 2021-12-08)
- Multi-Armed Bandits with Dependent Arms [18.81667618369821]
We study a variant of the classical multi-armed bandit problem (MABP) which we call Multi-Armed Bandits with Dependent Arms.
Multiple arms are grouped together to form a cluster, and the reward distributions of arms belonging to the same cluster are known functions of an unknown parameter that is a characteristic of the cluster.
We develop learning algorithms based on the UCB principle which utilize these additional side observations appropriately while performing exploration-exploitation trade-off.
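The cluster-aware algorithms themselves are not detailed in this summary; for reference, the plain UCB1 index that the UCB principle refers to is standard and can be sketched as follows (vanilla UCB1 over simulated Bernoulli arms, without the paper's side observations):

```python
import math
import random

def ucb1(reward_fns, horizon, seed=0):
    """Standard UCB1: pull each arm once, then repeatedly pull the arm
    maximizing mean + sqrt(2 ln t / n). This is the vanilla UCB rule
    only; the paper's cluster-aware variants are not shown."""
    random.seed(seed)
    k = len(reward_fns)
    n = [0] * k        # pull counts
    s = [0.0] * k      # reward sums
    for i, f in enumerate(reward_fns):  # initialization: one pull each
        s[i] += f()
        n[i] = 1
    for t in range(k + 1, horizon + 1):
        i = max(range(k),
                key=lambda j: s[j] / n[j] + math.sqrt(2 * math.log(t) / n[j]))
        s[i] += reward_fns[i]()
        n[i] += 1
    return n  # pull counts; the best arm should dominate

# Bernoulli arms with means 0.3, 0.5, 0.8 (default-arg capture of p).
arms = [lambda p=p: float(random.random() < p) for p in (0.3, 0.5, 0.8)]
counts = ucb1(arms, horizon=5000)
```

The index's exploration bonus shrinks as an arm accumulates pulls, so after a few thousand rounds the highest-mean arm receives the large majority of pulls.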
(arXiv 2020-10-13)
- Fair Algorithms for Multi-Agent Multi-Armed Bandits [29.68201160277817]
We propose a multi-agent variant of the classical multi-armed bandit problem.
The goal is not to learn the "best arm"; indeed, each agent may perceive a different arm to be the best for her personally.
We show that multi-agent variants of three classic multi-armed bandit algorithms achieve sublinear regret.
(arXiv 2020-07-13)
- Robustness Guarantees for Mode Estimation with an Application to Bandits [131.21717367564963]
We introduce a theory for multi-armed bandits where the values are the modes of the reward distributions instead of the mean.
We show in simulations that our algorithms are robust to perturbation of the arms by adversarial noise sequences.
(arXiv 2020-03-05)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.