Adversarial Attacks on Linear Contextual Bandits
- URL: http://arxiv.org/abs/2002.03839v3
- Date: Fri, 23 Oct 2020 08:18:10 GMT
- Title: Adversarial Attacks on Linear Contextual Bandits
- Authors: Evrard Garcelon, Baptiste Roziere, Laurent Meunier, Jean Tarbouriech,
Olivier Teytaud, Alessandro Lazaric, Matteo Pirotta
- Abstract summary: Malicious agents may have incentives to attack the bandit algorithm to induce it to perform a desired behavior.
We show that a malicious agent can force a linear contextual bandit algorithm to pull any desired arm $T - o(T)$ times over a horizon of $T$ steps.
We also investigate the case when a malicious agent is interested in affecting the behavior of the bandit algorithm in a single context.
- Score: 87.08004581867537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contextual bandit algorithms are applied in a wide range of domains, from
advertising to recommender systems, from clinical trials to education. In many
of these domains, malicious agents may have incentives to attack the bandit
algorithm to induce it to perform a desired behavior. For instance, an
unscrupulous ad publisher may try to increase their own revenue at the expense
of the advertisers; a seller may want to increase the exposure of their
products, or thwart a competitor's advertising campaign. In this paper, we
study several attack scenarios and show that a malicious agent can force a
linear contextual bandit algorithm to pull any desired arm $T - o(T)$ times
over a horizon of $T$ steps, while applying adversarial modifications to either
rewards or contexts that only grow logarithmically as $O(\log T)$. We also
investigate the case when a malicious agent is interested in affecting the
behavior of the bandit algorithm in a single context (e.g., a specific user).
We first provide sufficient conditions for the feasibility of the attack and we
then propose an efficient algorithm to perform the attack. We validate our
theoretical results on experiments performed on both synthetic and real-world
datasets.
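As a concrete illustration of the reward-perturbation scenario in the abstract, here is a minimal sketch (not the authors' exact algorithm) of a reward-poisoning attack on a standard LinUCB learner. All names and constants are illustrative assumptions: the contexts carry a constant bias feature, so subtracting a fixed offset `delta` from every non-target reward is exactly realizable by a "worse" linear parameter, and the learner is driven to prefer the target arm in every context.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, T = 5, 4, 5000
target = 0                                   # arm the attacker wants pulled
delta = 10.0                                 # offset; chosen to exceed any reward gap
theta = rng.normal(size=(K, d))              # true (unknown) arm parameters
A = [np.eye(d) for _ in range(K)]            # per-arm ridge Gram matrices
b = [np.zeros(d) for _ in range(K)]          # per-arm reward statistics
alpha = 1.0                                  # LinUCB exploration coefficient
cost, target_pulls = 0.0, 0

for t in range(T):
    z = rng.normal(size=d - 1)
    x = np.concatenate(([1.0], z / np.linalg.norm(z)))  # bias feature + unit part
    ucb = []
    for k in range(K):                       # LinUCB index: estimate + confidence width
        Ainv = np.linalg.inv(A[k])
        ucb.append(x @ (Ainv @ b[k]) + alpha * np.sqrt(x @ Ainv @ x))
    k = int(np.argmax(ucb))
    r = x @ theta[k] + 0.1 * rng.normal()    # true noisy reward
    r_obs = r if k == target else r - delta  # poison non-target rewards only
    cost += r - r_obs                        # attacker pays only on non-target pulls
    target_pulls += (k == target)
    A[k] += np.outer(x, x)                   # learner updates with poisoned data
    b[k] += r_obs * x

print(f"target pulled {target_pulls}/{T} times; total perturbation {cost:.1f}")
```

Because the bias coordinate is always 1, the shifted rewards are exactly those of the parameter $\theta_k - \delta e_1$, so the learner's estimate converges there and every non-target arm appears dominated; the attacker only pays when a non-target arm is pulled, which matches the logarithmic-cost flavor of the paper's result.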
Related papers
- Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback [58.66941279460248]
Learning from human feedback plays an important role in aligning generative models, such as large language models (LLMs).
We study a model within this problem domain: contextual dueling bandits with adversarial feedback, where the true preference label can be flipped by an adversary.
We propose a robust contextual dueling bandit algorithm based on uncertainty-weighted maximum likelihood estimation.
arXiv Detail & Related papers (2024-04-16T17:59:55Z)
- Adversarial Attacks on Adversarial Bandits [10.891819703383408]
We show that the attacker is able to mislead any no-regret adversarial bandit algorithm into selecting a suboptimal target arm.
This result implies a critical security concern in real-world bandit-based systems.
arXiv Detail & Related papers (2023-01-30T00:51:39Z)
- Incentivizing Combinatorial Bandit Exploration [87.08827496301839]
Consider a bandit algorithm that recommends actions to self-interested users in a recommendation system.
Users are free to choose other actions and need to be incentivized to follow the algorithm's recommendations.
While the users prefer to exploit, the algorithm can incentivize them to explore by leveraging the information collected from the previous users.
arXiv Detail & Related papers (2022-06-01T13:46:25Z)
- Efficient Action Poisoning Attacks on Linear Contextual Bandits [41.1063033715314]
We propose a new class of attacks: action poisoning attacks.
An adversary can change the action signal selected by the agent.
We show that, in both white-box and black-box settings, the proposed attack schemes can force the LinUCB agent to pull a target arm very frequently.
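A toy sketch of the action-poisoning idea, hedged: the paper targets LinUCB in the linear contextual setting, whereas this sketch uses a plain UCB learner on a 3-armed bandit for brevity; the swap-to-worst-arm rule and all constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
K, T, target = 3, 2000, 1
means = np.array([0.3, 0.6, 0.9])        # arm 2 is truly optimal
worst = int(np.argmin(means))            # arm the adversary substitutes
counts, sums = np.zeros(K), np.zeros(K)

for a in range(K):                       # initialization: one pull per arm
    executed = a if a == target else worst   # adversary swaps even here
    counts[a] += 1
    sums[a] += means[executed] + 0.1 * rng.normal()

for t in range(K, T):
    ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
    a = int(np.argmax(ucb))              # arm the agent intends to pull
    executed = a if a == target else worst   # adversary swaps the action signal
    r = means[executed] + 0.1 * rng.normal()
    counts[a] += 1                       # agent credits r to its own choice
    sums[a] += r

print("pulls per arm:", counts)          # the target arm ends up dominating
```

Every non-target arm accumulates the worst arm's reward statistics, so the learner comes to prefer the target even though it is suboptimal, while the adversary only intervenes on the (increasingly rare) non-target pulls.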
arXiv Detail & Related papers (2021-12-10T07:39:07Z)
- When Are Linear Stochastic Bandits Attackable? [47.25702824488642]
This paper studies the attackability of a $k$-armed linear bandit environment.
We propose a two-stage attack method against LinUCB and Robust Phase Elimination.
arXiv Detail & Related papers (2021-10-18T04:12:09Z)
- Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks [81.13338949407205]
Recent works show that optimal bandit algorithms are vulnerable to adversarial attacks and can fail completely in the presence of attacks.
Existing robust bandit algorithms only work in the non-contextual setting under reward attacks.
We provide the first robust bandit algorithm for linear contextual bandit setting under a fully adaptive and omniscient attack.
arXiv Detail & Related papers (2021-06-05T22:20:34Z)
- Stochastic Bandits with Linear Constraints [69.757694218456]
We study a constrained contextual linear bandit setting, where the goal of the agent is to produce a sequence of policies.
We propose an upper-confidence bound algorithm for this problem, called optimistic pessimistic linear bandit (OPLB).
arXiv Detail & Related papers (2020-06-17T22:32:19Z)
- Action-Manipulation Attacks Against Stochastic Bandits: Attacks and Defense [45.408568528354216]
We introduce a new class of attacks named action-manipulation attacks.
In this attack, an adversary can change the action signal selected by the user.
To defend against this class of attacks, we introduce a novel algorithm that is robust to action-manipulation attacks.
arXiv Detail & Related papers (2020-02-19T04:09:15Z)
- Robust Stochastic Bandit Algorithms under Probabilistic Unbounded Adversarial Attack [41.060507338755784]
This paper investigates the attack model where an adversary attacks with a certain probability at each round, and its attack value can be arbitrary and unbounded when it attacks.
We propose a novel sample-median-based and exploration-aided UCB algorithm (called med-E-UCB) and a median-based $\epsilon$-greedy algorithm (called med-$\epsilon$-greedy).
Both algorithms are provably robust to the aforementioned attack model and achieve $\mathcal{O}(\log T)$ pseudo-regret; a minimal sketch of the median-based index follows this entry.
arXiv Detail & Related papers (2020-02-17T19:21:08Z)
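To make the median-based idea concrete, here is a minimal sketch. It assumes symmetric reward noise (so the median coincides with the mean); the contamination rate, confidence width, and all constants are illustrative, and the paper's actual algorithms add an explicit exploration component and calibrated widths.

```python
import numpy as np

rng = np.random.default_rng(1)
K, T = 3, 3000
means = np.array([0.2, 0.5, 0.8])
attack_prob = 0.05                       # adversary corrupts a round w.p. 0.05
history = [[] for _ in range(K)]

for t in range(T):
    # UCB index built on the sample median, which is robust to a
    # minority of arbitrarily large corrupted rewards.
    idx = []
    for k in range(K):
        n = len(history[k])
        if n == 0:
            idx.append(np.inf)           # pull each arm at least once
        else:
            idx.append(np.median(history[k]) + np.sqrt(2 * np.log(t + 1) / n))
    k = int(np.argmax(idx))
    r = means[k] + 0.1 * rng.normal()    # clean reward
    if rng.random() < attack_prob:
        r = 1e6 * rng.normal()           # arbitrary, unbounded attack value
    history[k].append(r)

print("pulls per arm:", [len(h) for h in history])  # best arm should dominate
```

Because only a small fraction of samples is corrupted, the sample median stays close to each arm's true mean no matter how large the attack values are, whereas an empirical-mean UCB index would be destroyed by a single unbounded corruption.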