Learning to Attack: A Bandit Approach to Adversarial Context Poisoning
- URL: http://arxiv.org/abs/2603.00567v1
- Date: Sat, 28 Feb 2026 09:34:06 GMT
- Title: Learning to Attack: A Bandit Approach to Adversarial Context Poisoning
- Authors: Ray Telikani, Amir H. Gandomi
- Abstract summary: We introduce AdvBandit, a black-box adaptive attack that formulates context poisoning as a continuous-armed bandit problem. The attacker requires no access to the victim's internal parameters, reward function, or gradient information. We provide theoretical guarantees, including sublinear attacker regret and lower bounds on victim regret linear in the number of attacks.
- Score: 5.82233544807519
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural contextual bandits are vulnerable to adversarial attacks, where subtle perturbations to rewards, actions, or contexts induce suboptimal decisions. We introduce AdvBandit, a black-box adaptive attack that formulates context poisoning as a continuous-armed bandit problem, enabling the attacker to jointly learn and exploit the victim's evolving policy. The attacker requires no access to the victim's internal parameters, reward function, or gradient information; instead, it constructs a surrogate model using a maximum-entropy inverse reinforcement learning module from observed context-action pairs and optimizes perturbations against this surrogate using projected gradient descent. An upper confidence bound-aware Gaussian process guides arm selection. An attack-budget control mechanism is also introduced to limit detection risk and overhead. We provide theoretical guarantees, including sublinear attacker regret and lower bounds on victim regret linear in the number of attacks. Experiments on three real-world datasets (Yelp, MovieLens, and Disin) against various victim contextual bandits demonstrate that our attack model achieves higher cumulative victim regret than state-of-the-art baselines.
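To make the attack loop concrete, here is a minimal Python sketch of the pipeline the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: the max-entropy IRL surrogate is stubbed as a toy linear score model (the `Surrogate` class and all hyperparameters are invented), the continuous arm is the L2 radius of the context perturbation, arm selection uses a scikit-learn Gaussian process with a UCB acquisition, and a simple counter stands in for the attack-budget control.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

class Surrogate:
    """Toy stand-in for the IRL-fitted surrogate of the victim's policy."""
    def __init__(self, dim, n_arms, rng):
        self.W = rng.normal(size=(n_arms, dim))

    def scores(self, x):                  # estimated per-arm scores for context x
        return self.W @ x

    def grad_wrt_context(self, x, arm):  # d score(arm) / d x (linear model)
        return self.W[arm]

def pgd_perturbation(surrogate, x, target_arm, eps, steps=20, lr=0.05):
    """Projected gradient descent: push the surrogate toward target_arm while
    keeping the context perturbation inside an L2 ball of radius eps."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        delta += lr * surrogate.grad_wrt_context(x + delta, target_arm)
        norm = np.linalg.norm(delta)
        if norm > eps:
            delta *= eps / norm           # project back onto the eps-ball
    return delta

rng = np.random.default_rng(0)
dim, n_arms, horizon, budget = 8, 5, 200, 50
surrogate = Surrogate(dim, n_arms, rng)
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-3)
eps_grid = np.linspace(0.0, 1.0, 25).reshape(-1, 1)   # discretized continuous arms
obs_eps, obs_payoff, attacks_used = [], [], 0

for t in range(horizon):
    x = rng.normal(size=dim)
    if attacks_used < budget and len(obs_eps) >= 3:
        mu, sd = gp.predict(eps_grid, return_std=True)
        eps = float(eps_grid[np.argmax(mu + 2.0 * sd)])  # GP-UCB arm choice
    else:
        eps = float(rng.choice(eps_grid.ravel()))        # warm-up exploration
    delta = pgd_perturbation(surrogate, x, target_arm=0, eps=eps)
    if eps > 0:
        attacks_used += 1                 # crude attack-budget accounting
    # In the real setting the victim is a black box observed only through
    # context-action pairs; here the surrogate doubles as the victim so the
    # sketch stays self-contained.
    victim_choice = int(np.argmax(surrogate.scores(x + delta)))
    obs_eps.append([eps])
    obs_payoff.append(float(victim_choice == 0))          # did the target arm win?
    gp.fit(np.array(obs_eps), np.array(obs_payoff))
```

The GP here learns which perturbation radius pays off per attack spent, which is the continuous-armed bandit view of the problem; the paper's surrogate is refit from observed context-action pairs rather than fixed up front.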
Related papers
- Potent but Stealthy: Rethink Profile Pollution against Sequential Recommendation via Bi-level Constrained Reinforcement Paradigm [44.622203626828345]
Sequential Recommenders, which exploit dynamic user intents through interaction sequences, are vulnerable to adversarial attacks. This paper focuses on the Profile Pollution Attack, which subtly contaminates partial user interactions to induce targeted mispredictions. We propose a constrained reinforcement-driven attack, CREAT, that synergizes a bi-level optimization framework with multi-reward reinforcement learning to balance adversarial efficacy and stealthiness.
arXiv Detail & Related papers (2025-11-12T15:00:52Z)
- Provably Invincible Adversarial Attacks on Reinforcement Learning Systems: A Rate-Distortion Information-Theoretic Approach [22.90190828541341]
Reinforcement learning (RL) for the Markov Decision Process (MDP) has emerged in many security-related applications. In this paper, we propose a provably "invincible" or "uncounterable" type of adversarial attack on RL.
arXiv Detail & Related papers (2025-10-15T17:48:19Z)
- Robust Deep Reinforcement Learning against Adversarial Behavior Manipulation [10.411820336052784]
This study investigates behavior-targeted attacks on reinforcement learning and their countermeasures. To the best of our knowledge, this is the first defense strategy specifically designed for behavior-targeted attacks.
arXiv Detail & Related papers (2024-06-06T08:49:51Z)
- Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation [49.480978190805125]
Transfer attacks have generated significant interest in black-box applications.
Existing works essentially optimize a single-level objective directly w.r.t. the surrogate model.
We propose a bilevel optimization paradigm, which explicitly reformulates the nested relationship between the Upper-Level (UL) pseudo-victim attacker and the Lower-Level (LL) surrogate attacker.
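A minimal sketch of the bilevel idea, assuming linear attacker losses and a finite-difference upper-level gradient: the LL runs a short projected-gradient surrogate attack from an initialization, and the UL tunes that initialization so the LL output also raises a held-out pseudo-victim's loss. All names and constants here (`w_surr`, `w_vict`, step counts) are invented for illustration; the paper's algorithm differs in both the losses and how the UL gradient is derived.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, eps = 16, 1.0
w_surr = rng.normal(size=dim)                 # surrogate attacker's loss gradient
w_vict = w_surr + 0.5 * rng.normal(size=dim)  # pseudo-victim's loss gradient

def project(v, radius):
    n = np.linalg.norm(v)
    return v if n <= radius else v * (radius / n)

def ll_attack(delta0, steps=3, lr=0.05):
    """Lower level: ascend the surrogate loss from initialization delta0,
    staying inside the eps-ball (a projected-gradient surrogate attack)."""
    delta = delta0.copy()
    for _ in range(steps):
        delta = project(delta + lr * w_surr, eps)
    return delta

# Upper level: tune the *initialization* so the LL output also raises the
# pseudo-victim's loss; finite differences keep the sketch autodiff-free.
delta0 = np.zeros(dim)
for _ in range(50):
    g = np.zeros(dim)
    for i in range(dim):
        e = np.zeros(dim); e[i] = 1e-3
        g[i] = (w_vict @ ll_attack(delta0 + e)
                - w_vict @ ll_attack(delta0 - e)) / 2e-3
    delta0 = project(delta0 + 0.1 * g, eps)

print("pseudo-victim loss of the transferred attack:",
      float(w_vict @ ll_attack(delta0)))
```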
arXiv Detail & Related papers (2024-06-04T07:45:27Z)
- Multi-granular Adversarial Attacks against Black-box Neural Ranking Models [111.58315434849047]
We create high-quality adversarial examples by incorporating multi-granular perturbations.
We transform the multi-granular attack into a sequential decision-making process.
Our attack method surpasses prevailing baselines in both attack effectiveness and imperceptibility.
arXiv Detail & Related papers (2024-04-02T02:08:29Z)
- Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z)
- Adversarial Attacks on Adversarial Bandits [10.891819703383408]
We show that the attacker is able to mislead any no-regret adversarial bandit algorithm into selecting a suboptimal target arm.
This result implies a critical security concern for real-world bandit-based systems.
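As an illustration of this style of attack, the sketch below poisons the rewards seen by an EXP3 learner so it concentrates on a poor target arm. The suppression rule (zeroing every non-target observation) is a generic scheme chosen for simplicity, not necessarily the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(2)
K, T, target = 5, 20000, 3
true_means = np.array([0.9, 0.7, 0.6, 0.2, 0.5])   # arm `target` is poor

eta = np.sqrt(np.log(K) / (T * K))                 # standard EXP3 step size
weights = np.zeros(K)                              # log-weights
pulls = np.zeros(K, dtype=int)

for t in range(T):
    p = np.exp(weights - weights.max())
    p /= p.sum()
    arm = rng.choice(K, p=p)
    pulls[arm] += 1
    reward = float(rng.random() < true_means[arm])
    observed = reward if arm == target else 0.0    # attack: suppress non-target pulls
    weights[arm] += eta * observed / p[arm]        # importance-weighted gain update

print("fraction of pulls on the suboptimal target arm:", pulls[target] / T)
```

Since only the target arm ever yields positive observed gain, any no-regret learner facing the poisoned sequence is driven toward it; the attacker's cost shrinks as the learner converges.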
arXiv Detail & Related papers (2023-01-30T00:51:39Z)
- Generalizable Black-Box Adversarial Attack with Meta Learning [54.196613395045595]
In black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful perturbation based on query feedback under a query budget.
We propose to utilize the feedback information across historical attacks, dubbed example-level adversarial transferability.
The proposed framework with the two types of adversarial transferability can be naturally combined with any off-the-shelf query-based attack methods to boost their performance.
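A hypothetical sketch of the example-level idea: warm-start a plain random-search query attack with the average of perturbations that succeeded on earlier examples. The `victim_margin` stand-in and all constants are invented; the paper's meta-learned combination is more sophisticated than this simple averaging:

```python
import numpy as np

rng = np.random.default_rng(3)
dim, eps, query_budget = 32, 2.0, 500

def victim_margin(x):
    """Stand-in black box: a score the attacker wants to drive below zero.
    Only input/output access, matching the query-feedback setting."""
    w = np.linspace(1, -1, dim)
    return float(w @ np.tanh(x))

def query_attack(x, delta0):
    delta, best = delta0.copy(), victim_margin(x + delta0)
    for _ in range(query_budget):
        if best < 0:
            break                                  # attack succeeded
        cand = delta + 0.1 * rng.normal(size=dim)
        n = np.linalg.norm(cand)
        if n > eps:
            cand *= eps / n                        # stay inside the eps-ball
        m = victim_margin(x + cand)                # one query spent
        if m < best:
            delta, best = cand, m
    return delta, best

history = []                                       # successful past perturbations
for _ in range(20):                                # a stream of examples to attack
    x = rng.normal(size=dim)
    warm = np.mean(history, axis=0) if history else np.zeros(dim)
    delta, margin = query_attack(x, warm)
    if margin < 0:
        history.append(delta)                      # example-level transferability
```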
arXiv Detail & Related papers (2023-01-01T07:24:12Z)
- Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack [53.032801921915436]
Human Activity Recognition (HAR) has been employed in a wide range of applications, e.g. self-driving cars.
Recently, the robustness of skeleton-based HAR methods has been questioned due to their vulnerability to adversarial attacks.
We show such threats exist, even when the attacker only has access to the input/output of the model.
We propose the very first black-box adversarial attack approach in skeleton-based HAR called BASAR.
arXiv Detail & Related papers (2022-11-21T09:51:28Z)
- Zero-Query Transfer Attacks on Context-Aware Object Detectors [95.18656036716972]
Adversarial attacks perturb images such that a deep neural network produces incorrect classification results.
A promising approach to defend against adversarial attacks on natural multi-object scenes is to impose a context-consistency check.
We present the first approach for generating context-consistent adversarial attacks that can evade the context-consistency check.
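For orientation, the sketch below shows one plausible form of the context-consistency defense being evaded: flag any detected label whose co-occurrence with the rest of the scene is implausibly rare. The co-occurrence table and threshold are fabricated for illustration and are not the paper's detector:

```python
import numpy as np

labels = ["car", "road", "person", "toaster"]
# cooc[i, j]: empirical probability that labels i and j appear together,
# estimated offline from clean scenes (values here are made up).
cooc = np.array([
    [1.00, 0.90, 0.60, 0.01],
    [0.90, 1.00, 0.55, 0.02],
    [0.60, 0.55, 1.00, 0.05],
    [0.01, 0.02, 0.05, 1.00],
])

def flag_inconsistent(detected, tau=0.10):
    """Return labels whose mean co-occurrence with the rest falls below tau."""
    idx = [labels.index(d) for d in detected]
    flagged = []
    for i in idx:
        others = [j for j in idx if j != i]
        if others and np.mean([cooc[i, j] for j in others]) < tau:
            flagged.append(labels[i])
    return flagged

# A "toaster" detection in a street scene is rejected as out of context.
print(flag_inconsistent(["car", "road", "toaster"]))  # -> ['toaster']
```

A context-consistent attack must therefore perturb the scene so every induced misdetection still co-occurs plausibly with its neighbors, which is what makes evading this check hard without queries.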
arXiv Detail & Related papers (2022-03-29T04:33:06Z)
- When Are Linear Stochastic Bandits Attackable? [47.25702824488642]
This paper studies the attackability of a $k$-armed linear bandit environment.
We propose a two-stage attack method against LinUCB and Robust Phase Elimination.
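To ground the setting, here is an illustrative reward-poisoning sketch against a LinUCB learner (not the paper's two-stage method): the attacker overwrites each observed reward with a sample from a fake linear parameter `theta_adv` under which the target arm is optimal, echoing the paper's point that attackability hinges on whether such a parameter exists at all.

```python
import numpy as np

rng = np.random.default_rng(4)
d, K, T, target, alpha = 4, 6, 5000, 2, 1.0
arms = rng.normal(size=(K, d))
arms /= np.linalg.norm(arms, axis=1, keepdims=True)   # unit-norm arm features
theta_true = rng.normal(size=d)        # parameter of the real environment
theta_adv = arms[target].copy()        # fake parameter: target arm is optimal

A, b = np.eye(d), np.zeros(d)          # LinUCB ridge-regression statistics
pulls = np.zeros(K, dtype=int)

for t in range(T):
    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ b
    width = np.sqrt(np.einsum("kd,de,ke->k", arms, A_inv, arms))
    k = int(np.argmax(arms @ theta_hat + alpha * width))
    pulls[k] += 1
    true_reward = arms[k] @ theta_true + 0.1 * rng.normal()   # what nature pays
    observed = arms[k] @ theta_adv + 0.1 * rng.normal()       # what the learner sees
    A += np.outer(arms[k], arms[k])
    b += observed * arms[k]            # the poisoned learner never sees true_reward

print("target-arm pull fraction:", pulls[target] / T)
```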
arXiv Detail & Related papers (2021-10-18T04:12:09Z)