Robust Stochastic Bandit Algorithms under Probabilistic Unbounded
Adversarial Attack
- URL: http://arxiv.org/abs/2002.07214v1
- Date: Mon, 17 Feb 2020 19:21:08 GMT
- Title: Robust Stochastic Bandit Algorithms under Probabilistic Unbounded
Adversarial Attack
- Authors: Ziwei Guan, Kaiyi Ji, Donald J Bucci Jr, Timothy Y Hu, Joseph Palombo,
Michael Liston, Yingbin Liang
- Abstract summary: This paper investigates the attack model where an adversary attacks with a certain probability at each round, and its attack value can be arbitrary and unbounded if it attacks.
We propose a novel sample median-based and exploration-aided UCB algorithm (called med-E-UCB) and a median-based $\epsilon$-greedy algorithm (called med-$\epsilon$-greedy).
Both algorithms are provably robust to the aforementioned attack model. More specifically, we show that both algorithms achieve $\mathcal{O}(\log T)$ pseudo-regret (i.e., the optimal regret without attacks).
- Score: 41.060507338755784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The multi-armed bandit formalism has been extensively studied under various
attack models, in which an adversary can modify the reward revealed to the
player. Previous studies focused on scenarios where the attack value either is
bounded at each round or has a vanishing probability of occurrence. These
models do not capture powerful adversaries that can catastrophically perturb
the revealed reward. This paper investigates the attack model where an
adversary attacks with a certain probability at each round, and its attack
value can be arbitrary and unbounded if it attacks. Furthermore, the attack
value does not necessarily follow a statistical distribution. We propose a
novel sample median-based and exploration-aided UCB algorithm (called
med-E-UCB) and a median-based $\epsilon$-greedy algorithm (called
med-$\epsilon$-greedy). Both of these algorithms are provably robust to the
aforementioned attack model. More specifically, we show that both algorithms
achieve $\mathcal{O}(\log T)$ pseudo-regret (i.e., the optimal regret without
attacks). We also provide a high probability guarantee of $\mathcal{O}(\log T)$
regret with respect to random rewards and random occurrence of attacks. These
bounds are achieved under arbitrary and unbounded reward perturbation as long
as the attack probability does not exceed a certain constant threshold. We
provide multiple synthetic simulations of the proposed algorithms to verify
these claims and showcase the inability of existing techniques to achieve
sublinear regret. We also provide experimental results of the algorithm
operating in a cognitive radio setting using multiple software-defined radios.
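To make the attack model and the median-based idea concrete, here is a minimal Python sketch of a med-$\epsilon$-greedy-style rule under a probabilistic unbounded attack: with some fixed probability the revealed reward is replaced by an arbitrarily large corruption, and each arm's value is estimated by the sample median rather than the sample mean. The attack parameters, the $\epsilon_t$ schedule, and all names below are illustrative assumptions, not the paper's exact construction (the paper's med-E-UCB and med-$\epsilon$-greedy come with specific exploration and attack-probability-threshold conditions).

```python
import random
import statistics

def corrupted_reward(true_mean, attack_prob=0.2):
    """Gaussian reward; with probability attack_prob the adversary replaces it
    with an arbitrarily large (unbounded) corruption. Illustrative only."""
    reward = random.gauss(true_mean, 1.0)
    if random.random() < attack_prob:
        reward -= random.uniform(1e3, 1e6)  # unbounded, non-statistical corruption
    return reward

def med_eps_greedy(true_means, horizon=5000, eps0=1.0):
    """Epsilon-greedy with sample-median value estimates (hypothetical sketch)."""
    k = len(true_means)
    observations = [[] for _ in range(k)]  # raw, possibly corrupted rewards per arm
    pulls = [0] * k
    for t in range(1, horizon + 1):
        eps_t = min(1.0, eps0 * k / t)  # a common decaying exploration schedule
        if any(p == 0 for p in pulls) or random.random() < eps_t:
            arm = random.randrange(k)  # explore
        else:
            # Exploit the largest sample MEDIAN: a minority of unbounded
            # outliers cannot move the median, unlike the sample mean.
            arm = max(range(k), key=lambda a: statistics.median(observations[a]))
        observations[arm].append(corrupted_reward(true_means[arm]))
        pulls[arm] += 1
    return pulls

if __name__ == "__main__":
    print(med_eps_greedy([0.1, 0.5, 0.9]))  # the best arm should dominate the pulls
```

Under the same setup, a mean-based estimate is pulled arbitrarily far off by a single corrupted observation, which is the intuition behind the abstract's claim that existing (mean-based) techniques fail to achieve sublinear regret.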
Related papers
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with $>99\%$ detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z) - Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback [58.66941279460248]
Learning from human feedback plays an important role in aligning generative models, such as large language models (LLMs).
We study a model within this problem domain--contextual dueling bandits with adversarial feedback, where the true preference label can be flipped by an adversary.
We propose an algorithm, namely robust contextual dueling bandit (algo), based on uncertainty-weighted maximum likelihood estimation.
arXiv Detail & Related papers (2024-04-16T17:59:55Z) - Replicability is Asymptotically Free in Multi-armed Bandits [45.729105054410745]
This work is motivated by the growing demand for reproducible machine learning.
In particular, we consider a replicable algorithm that ensures, with high probability, that the algorithm's sequence of actions is not affected by the randomness inherent in the dataset.
arXiv Detail & Related papers (2024-02-12T03:31:34Z) - Wasserstein distributional robustness of neural networks [9.79503506460041]
Deep neural networks are known to be vulnerable to adversarial attacks (AA).
For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified.
We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions.
arXiv Detail & Related papers (2023-06-16T13:41:24Z) - Detection and Mitigation of Byzantine Attacks in Distributed Training [24.951227624475443]
Abnormal Byzantine behavior of the worker nodes can derail the training and compromise the quality of the inference.
Recent work considers a wide range of attack models and has explored robust aggregation and/or computational redundancy to correct the distorted gradients (a minimal robust-aggregation sketch is given after this list).
In this work, we consider attack models ranging from strong ones ($q$ omniscient adversaries with full knowledge of the defense protocol, which can change from iteration to iteration) to weak ones ($q$ randomly chosen adversaries with limited collusion abilities).
arXiv Detail & Related papers (2022-08-17T05:49:52Z) - Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA).
arXiv Detail & Related papers (2022-07-25T03:24:58Z) - Adversarial Attacks on Gaussian Process Bandits [47.84198626686564]
We propose various adversarial attack methods with differing assumptions on the attacker's strength and prior information.
Our goal is to understand adversarial attacks on GP bandits from both a theoretical and practical perspective.
We demonstrate that adversarial attacks on GP bandits can succeed in forcing the algorithm towards $\mathcal{R}_{\rm target}$ even with a low attack budget.
arXiv Detail & Related papers (2021-10-16T02:39:10Z) - Composite Adversarial Attacks [57.293211764569996]
Adversarial attack is a technique for deceiving Machine Learning (ML) models.
In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching the best combination of attack algorithms.
CAA beats 10 top attackers on 11 diverse defenses with less elapsed time.
arXiv Detail & Related papers (2020-12-10T03:21:16Z) - Action-Manipulation Attacks Against Stochastic Bandits: Attacks and
Defense [45.408568528354216]
We introduce a new class of attack named action-manipulation attack.
In this attack, an adversary can change the action signal selected by the user.
To defend against this class of attacks, we introduce a novel algorithm that is robust to action-manipulation attacks.
arXiv Detail & Related papers (2020-02-19T04:09:15Z)
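Relating to the Byzantine-robust distributed training entry above, the snippet below is a minimal sketch of one member of the robust-aggregation family that abstract mentions: coordinate-wise median of worker gradients, which a minority of arbitrarily corrupted updates cannot dominate. It is an illustrative assumption of what such an aggregator looks like, not the specific defense proposed in that paper.

```python
import numpy as np

def coordinate_wise_median(gradients):
    """Aggregate worker gradients by the per-coordinate median (illustrative).
    `gradients`: list of 1-D arrays, one per worker; a minority of Byzantine
    workers sending arbitrary vectors cannot dominate the aggregate."""
    return np.median(np.stack(gradients, axis=0), axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    honest = [np.array([1.0, -0.5, 0.2]) + rng.normal(0, 0.01, 3) for _ in range(7)]
    byzantine = [np.array([1e6, 1e6, -1e6]) for _ in range(2)]  # q = 2 attackers
    print(coordinate_wise_median(honest + byzantine))  # stays close to the honest gradient
```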
This list is automatically generated from the titles and abstracts of the papers in this site.