Reward Poisoning Attack Against Offline Reinforcement Learning
- URL: http://arxiv.org/abs/2402.09695v1
- Date: Thu, 15 Feb 2024 04:08:49 GMT
- Title: Reward Poisoning Attack Against Offline Reinforcement Learning
- Authors: Yinglun Xu, Rohan Gumaste, Gagandeep Singh
- Abstract summary: We study the problem of reward poisoning attacks against general offline reinforcement learning with deep neural networks for function approximation.
To the best of our knowledge, we propose the first black-box reward poisoning attack in the general offline RL setting.
- Score: 5.057241745123681
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of reward poisoning attacks against general offline
reinforcement learning with deep neural networks for function approximation. We
consider a black-box threat model in which the attacker is completely oblivious to
the learning algorithm, and whose budget is limited by constraints on both the
amount of corruption at each data point and the total perturbation. We propose
an attack strategy called the 'policy contrast attack'. The high-level idea is to
make some low-performing policies appear as high-performing while making
high-performing policies appear as low-performing. To the best of our
knowledge, we propose the first black-box reward poisoning attack in the
general offline RL setting. We provide theoretical insights on the attack
design and empirically show that our attack is efficient against current
state-of-the-art offline RL algorithms on different kinds of learning datasets.
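As a concrete illustration, here is a minimal sketch of the budgeted reward-poisoning setup described above. The dataset layout, the per-point cap `eps`, the total budget, and the quartile heuristic for flagging high- and low-performing trajectories are illustrative assumptions, not the authors' construction:

```python
import numpy as np

def policy_contrast_poison(rewards, eps=0.5, budget=1000.0):
    """Hedged sketch of a budgeted reward-poisoning pass: lower rewards on
    high-return trajectories and raise them on low-return ones, subject to
    a per-point cap `eps` and a total L1 budget `budget`.
    `rewards[i]` is the reward sequence of trajectory i."""
    returns = np.array([float(np.sum(r)) for r in rewards])
    order = np.argsort(returns)               # low-return trajectories first
    k = max(1, len(rewards) // 4)             # illustrative quartile split
    low, high = order[:k], order[-k:]
    poisoned = [np.array(r, dtype=float) for r in rewards]
    spent = 0.0
    for sign, group in ((-1.0, high), (+1.0, low)):
        for i in group:
            for t in range(len(poisoned[i])):
                delta = min(eps, budget - spent)
                if delta <= 0.0:
                    return poisoned           # total budget exhausted
                poisoned[i][t] += sign * delta
                spent += delta
    return poisoned
```

The actual attack must make these choices while remaining oblivious to the learner, which is what the paper's theoretical analysis addresses.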
Related papers
- Optimal Attack and Defense for Reinforcement Learning [11.36770403327493]
In adversarial RL, an external attacker has the power to manipulate the victim agent's interaction with the environment.
We formulate the attacker's problem of designing a stealthy attack that maximizes its own expected reward.
We argue that the optimal defense policy for the victim can be computed as the solution to a Stackelberg game.
arXiv Detail & Related papers (2023-11-30T21:21:47Z)
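The Stackelberg framing can be made concrete on a small matrix game. A minimal sketch in pure strategies (the payoff matrices and the enumeration-based solver are illustrative assumptions; the paper works with stealthy attacks on RL policies, not matrix games): the victim commits to a defense, the attacker best-responds, and the victim keeps the commitment whose induced response is best for it.

```python
import numpy as np

# Illustrative payoffs: rows index the leader's (victim's) defenses,
# columns index the follower's (attacker's) attacks.
U_def = np.array([[3.0, -1.0], [0.0, 2.0]])   # leader payoff[defense, attack]
U_att = np.array([[1.0, 4.0], [2.0, 0.0]])    # follower payoff[defense, attack]

def stackelberg_pure(U_leader, U_follower):
    """Enumerate leader commitments; the follower best-responds to each."""
    best_value, best_commitment = -np.inf, None
    for d in range(U_leader.shape[0]):
        a = int(np.argmax(U_follower[d]))     # follower's best response
        if U_leader[d, a] > best_value:
            best_value, best_commitment = U_leader[d, a], d
    return best_commitment, best_value

print(stackelberg_pure(U_def, U_att))         # -> (1, 0.0) for these payoffs
```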
- Adversarial Attacks on Online Learning to Rank with Stochastic Click Models [34.725468803108754]
We propose the first study of adversarial attacks on online learning to rank.
The goal of the adversary is to misguide the online learning to rank algorithm into placing the target item at the top of the ranking list a number of times linear in the time horizon $T$, while incurring only a sublinear attack cost.
arXiv Detail & Related papers (2023-05-30T17:05:49Z)
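One way to picture such an attack, reduced here to a single-slot bandit with click flipping (the UCB learner, the click probabilities, and the flipping rule are illustrative assumptions, not the paper's stochastic click models): erase clicks on non-target items until the learner's estimates rank the target first; once the target dominates, no further corruption is needed, which is why the cost can stay sublinear in $T$.

```python
import numpy as np

rng = np.random.default_rng(0)
K, T, target = 5, 20000, 3
p = np.array([0.9, 0.8, 0.7, 0.2, 0.1])   # true attraction probs (made up)
clicks, shows = np.zeros(K), np.ones(K)    # one optimistic pseudo-show each
cost = 0

for t in range(1, T + 1):
    ucb = clicks / shows + np.sqrt(2 * np.log(t) / shows)
    i = int(np.argmax(ucb))                # item placed on top this round
    c = rng.random() < p[i]                # user click under the true model
    if i != target and c:                  # attacker erases the click
        c, cost = 0, cost + 1
    clicks[i] += c
    shows[i] += 1

print(f"target shown {int(shows[target])} / {T} rounds, attack cost {cost}")
```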
- Black-Box Targeted Reward Poisoning Attack Against Online Deep Reinforcement Learning [2.3526458707956643]
We propose the first black-box targeted attack against online deep reinforcement learning through reward poisoning during training time.
Our attack is applicable to general environments with unknown dynamics learned by unknown algorithms.
arXiv Detail & Related papers (2023-05-18T03:37:29Z)
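A minimal sketch of what a black-box targeted reward poisoner can look like, written as an environment wrapper (the wrapper interface, the fixed perturbation `delta`, and the match-the-target rule are illustrative assumptions, not the paper's construction):

```python
class TargetedRewardPoisoner:
    """Hedged sketch of a black-box targeted reward-poisoning wrapper:
    rewards are raised when the agent matches a target policy and lowered
    otherwise, within a per-step cap `delta`. Needs only the observed
    transition, never the learner's internals."""

    def __init__(self, env, target_policy, delta=0.5):
        self.env, self.target_policy, self.delta = env, target_policy, delta
        self._last_obs = None

    def reset(self, **kwargs):
        self._last_obs = self.env.reset(**kwargs)
        return self._last_obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if action == self.target_policy(self._last_obs):
            reward += self.delta           # reinforce the target behavior
        else:
            reward -= self.delta           # discourage everything else
        self._last_obs = obs
        return obs, reward, done, info
```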
- Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning [6.414910263179327]
We study reward poisoning attacks on online deep reinforcement learning (DRL).
We demonstrate the intrinsic vulnerability of state-of-the-art DRL algorithms by designing a general, black-box reward poisoning framework called adversarial MDP attacks.
Our results show that our attacks efficiently poison agents learning in several popular classical control and MuJoCo environments.
arXiv Detail & Related papers (2022-05-30T04:07:19Z)
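For contrast with the targeted wrapper above, a hedged untargeted variant (purely illustrative; not the paper's adversarial-MDP construction): track which action the learner favors in each state, using only observed transitions, and penalize it so whatever policy the agent settles on keeps looking bad.

```python
from collections import defaultdict
import numpy as np

class UntargetedRewardPoisoner:
    """Hedged, untargeted sketch: penalize the empirically favored action in
    each (crudely discretized) state, observing only transitions."""

    def __init__(self, env, delta=0.5):
        self.env, self.delta = env, delta
        self.counts = defaultdict(lambda: defaultdict(int))
        self._last_obs = None

    def reset(self, **kwargs):
        self._last_obs = self.env.reset(**kwargs)
        return self._last_obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        s = tuple(np.round(self._last_obs, 1))  # crude state discretization
        self.counts[s][action] += 1
        favored = max(self.counts[s], key=self.counts[s].get)
        if action == favored:
            reward -= self.delta                # punish the favored action
        self._last_obs = obs
        return obs, reward, done, info
```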
- Projective Ranking-based GNN Evasion Attacks [52.85890533994233]
Graph neural networks (GNNs) offer promising learning methods for graph-related tasks.
However, GNNs are at risk of adversarial attacks.
arXiv Detail & Related papers (2022-02-25T21:52:09Z)
- Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the Age of AI-NIDS [70.60975663021952]
We study black-box adversarial attacks on network classifiers.
We argue that attacker-defender fixed points are themselves general-sum games with complex phase transitions.
We show that a continual learning approach is required to study attacker-defender dynamics.
arXiv Detail & Related papers (2021-11-23T23:42:16Z)
- Online Adversarial Attacks [57.448101834579624]
We formalize the online adversarial attack problem, emphasizing two key elements found in real-world use-cases.
We first rigorously analyze a deterministic variant of the online threat model.
We then propose algoname, a simple yet practical algorithm yielding a provably better competitive ratio for $k=2$ over the current best single threshold algorithm.
arXiv Detail & Related papers (2021-03-02T20:36:04Z)
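The single-threshold baseline mentioned in the $k=2$ claim is the classical secretary rule; a minimal sketch for $k=1$ (the value stream and its interpretation as per-input attack gain are illustrative assumptions, and this is the baseline, not the paper's improved algorithm):

```python
import math, random

def single_threshold_pick(values):
    """Secretary-style single-threshold rule for committing online,
    irrevocably, to one input to attack: observe the first n/e values,
    then take the first later value beating that benchmark."""
    n = len(values)
    cutoff = int(n / math.e)
    benchmark = max(values[:cutoff], default=float("-inf"))
    for i in range(cutoff, n):
        if values[i] > benchmark:
            return i                      # attack this input
    return n - 1                          # forced pick at the end

random.seed(0)
stream = [random.random() for _ in range(100)]   # e.g. per-input attack gain
i = single_threshold_pick(stream)
print(i, stream[i], max(stream))
```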
- Disturbing Reinforcement Learning Agents with Corrupted Rewards [62.997667081978825]
We analyze the effects of different attack strategies based on reward perturbations on reinforcement learning algorithms.
We show that smoothly crafted adversarial rewards are able to mislead the learner, and that with low exploration probability values the learned policy is more robust to corrupted rewards.
arXiv Detail & Related papers (2021-02-12T15:53:48Z)
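A toy rendition of that setup, assuming a five-state chain MDP, $\varepsilon$-greedy Q-learning, and a smooth sinusoidal reward perturbation (all illustrative choices, not the paper's experiments); varying `eps` lets one compare how exploration interacts with the corruption:

```python
import numpy as np

def q_learning(eps, corrupt, T=20000, n=5, alpha=0.1, gamma=0.9, seed=0):
    """Toy chain MDP: move left/right, reward 1 at the right end.
    `corrupt(s, a, t)` is an adversarial perturbation of observed rewards."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n, 2))
    s = 0
    for t in range(T):
        a = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[s]))
        s2 = max(s - 1, 0) if a == 0 else min(s + 1, n - 1)
        r = 1.0 if s2 == n - 1 else 0.0
        r += corrupt(s, a, t)                      # poisoned observation
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = 0 if s2 == n - 1 else s2               # restart at the goal
    return Q

smooth = lambda s, a, t: 0.3 * np.sin(t / 50.0) * (1 if a == 0 else -1)
for eps in (0.05, 0.5):
    Q = q_learning(eps, smooth)
    print(eps, np.argmax(Q, axis=1))    # compare the learned policies
```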
- Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
arXiv Detail & Related papers (2020-09-04T16:17:54Z)
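The core of gradient matching can be sketched in a few lines of PyTorch (the function name, the signed step, and the $\ell_\infty$ budget are illustrative assumptions): the attacker precomputes a target gradient, e.g. `torch.autograd.grad(F.cross_entropy(model(x_target), y_adv), model.parameters())`, then nudges clean-label poison perturbations so the training gradient on the poisons points the same way.

```python
import torch
import torch.nn.functional as F

def gradient_matching_step(model, poison_x, poison_delta, poison_y,
                           target_grad, lr=0.01, eps=16 / 255):
    """One hedged gradient-matching step: maximize cosine similarity between
    the training gradient on the poisons and the attacker's target gradient,
    keeping the clean-label perturbation within an l_inf ball of radius eps."""
    poison_delta.requires_grad_(True)
    loss = F.cross_entropy(model(poison_x + poison_delta), poison_y)
    grads = torch.autograd.grad(loss, list(model.parameters()),
                                create_graph=True)
    g = torch.cat([p.flatten() for p in grads])
    t = torch.cat([p.flatten() for p in target_grad])
    align_loss = 1 - F.cosine_similarity(g, t, dim=0)
    delta_grad, = torch.autograd.grad(align_loss, poison_delta)
    with torch.no_grad():
        poison_delta -= lr * delta_grad.sign()   # signed descent step
        poison_delta.clamp_(-eps, eps)           # keep poisons subtle
    return float(align_loss)
```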
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
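RADIAL-RL derives its loss from certified output bounds; as a loose stand-in that conveys the adversarial-loss idea without the certification machinery, the sketch below evaluates a Q-network's TD error on FGSM-perturbed observations (the network interface and `eps` are illustrative assumptions, and FGSM is a deliberate simplification, not the paper's method):

```python
import torch
import torch.nn.functional as F

def adversarial_q_loss(q_net, obs, actions, td_targets, eps=0.01):
    """Hedged stand-in for an adversarial RL loss: compute the TD error on an
    FGSM-perturbed observation so the agent trains against small worst-case
    input perturbations. Generic robustness recipe, not RADIAL-RL's loss."""
    obs_adv = obs.detach().clone().requires_grad_(True)
    q = q_net(obs_adv).gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = F.smooth_l1_loss(q, td_targets)
    grad, = torch.autograd.grad(loss, obs_adv)
    obs_adv = (obs + eps * grad.sign()).detach()        # FGSM step
    q_adv = q_net(obs_adv).gather(1, actions.unsqueeze(1)).squeeze(1)
    return F.smooth_l1_loss(q_adv, td_targets)
```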
This list is automatically generated from the titles and abstracts of the papers on this site.