Implicit Poisoning Attacks in Two-Agent Reinforcement Learning:
Adversarial Policies for Training-Time Attacks
- URL: http://arxiv.org/abs/2302.13851v1
- Date: Mon, 27 Feb 2023 14:52:15 GMT
- Title: Implicit Poisoning Attacks in Two-Agent Reinforcement Learning:
Adversarial Policies for Training-Time Attacks
- Authors: Mohammad Mohammadi, Jonathan Nöther, Debmalya Mandal, Adish Singla,
Goran Radanovic
- Abstract summary: In targeted poisoning attacks, an attacker manipulates an agent-environment interaction to force the agent into adopting a policy of interest, called target policy.
We study targeted poisoning attacks in a two-agent setting where an attacker implicitly poisons the effective environment of one of the agents by modifying the policy of its peer.
We develop an optimization framework for designing optimal attacks, where the cost of the attack measures how much the solution deviates from the assumed default policy of the peer agent.
- Score: 21.97069271045167
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In targeted poisoning attacks, an attacker manipulates an agent-environment
interaction to force the agent into adopting a policy of interest, called
target policy. Prior work has primarily focused on attacks that modify standard
MDP primitives, such as rewards or transitions. In this paper, we study
targeted poisoning attacks in a two-agent setting where an attacker implicitly
poisons the effective environment of one of the agents by modifying the policy
of its peer. We develop an optimization framework for designing optimal
attacks, where the cost of the attack measures how much the solution deviates
from the assumed default policy of the peer agent. We further study the
computational properties of this optimization framework. Focusing on a tabular
setting, we show that in contrast to poisoning attacks based on MDP primitives
(transitions and (unbounded) rewards), which are always feasible, it is NP-hard
to determine the feasibility of implicit poisoning attacks. We provide
characterization results that establish sufficient conditions for the
feasibility of the attack problem, as well as an upper and a lower bound on the
optimal cost of the attack. We propose two algorithmic approaches for finding
an optimal adversarial policy: a model-based approach with tabular policies and
a model-free approach with parametric/neural policies. We showcase the efficacy
of the proposed algorithms through experiments.
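To make the bilevel structure described in the abstract concrete, below is a minimal, self-contained sketch in Python. It is not the authors' algorithm: it brute-forces deterministic tabular peer policies on a tiny, randomly generated two-agent MDP, computes the victim's best response to each candidate via value iteration on the induced single-agent MDP, and keeps the feasible candidate that deviates least from the assumed default peer policy. All names, sizes, and the Hamming-distance cost are illustrative assumptions.

```python
# Illustrative sketch (not the paper's algorithm) of implicit poisoning:
# the attacker controls the peer's policy pi2; each pi2 induces an effective
# MDP for the victim; the attack is feasible if the victim's optimal policy
# in that induced MDP equals the target policy, and the cost is pi2's
# deviation from the assumed default peer policy.
import itertools
import numpy as np

n_states, n_actions = 3, 2          # tiny tabular example (assumed sizes)
gamma = 0.9                          # discount factor (assumed)
rng = np.random.default_rng(0)

# Joint dynamics/reward of the two-agent environment (randomly generated here):
# P[s, a1, a2] is a distribution over next states, R[s, a1, a2] the victim's reward,
# where a1 is the victim's action and a2 the peer's action.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions, n_actions))
R = rng.standard_normal((n_states, n_actions, n_actions))

pi2_default = np.zeros(n_states, dtype=int)   # peer's assumed default policy (deterministic)
pi_target = np.array([1, 0, 1])               # target policy the attacker wants the victim to adopt

def induced_mdp(pi2):
    """Effective single-agent MDP the victim faces when the peer plays pi2."""
    P1 = np.stack([P[s, :, pi2[s]] for s in range(n_states)])   # shape (S, A1, S)
    R1 = np.stack([R[s, :, pi2[s]] for s in range(n_states)])   # shape (S, A1)
    return P1, R1

def optimal_policy(P1, R1, iters=500):
    """Victim's optimal deterministic policy via value iteration."""
    V = np.zeros(n_states)
    for _ in range(iters):
        V = (R1 + gamma * P1 @ V).max(axis=1)
    Q = R1 + gamma * P1 @ V
    return Q.argmax(axis=1)

best_cost, best_pi2 = None, None
# Brute force over deterministic tabular peer policies (only viable for tiny MDPs).
for pi2 in itertools.product(range(n_actions), repeat=n_states):
    pi2 = np.array(pi2)
    if np.array_equal(optimal_policy(*induced_mdp(pi2)), pi_target):
        cost = np.sum(pi2 != pi2_default)     # deviation from the default peer policy
        if best_cost is None or cost < best_cost:
            best_cost, best_pi2 = cost, pi2

print("feasible:", best_pi2 is not None,
      "| optimal cost:", best_cost,
      "| adversarial peer policy:", best_pi2)
```

The exhaustive search makes the NP-hardness point tangible: with neither bounded rewards nor direct control over transitions, feasibility must be checked candidate by candidate here, whereas the paper's model-based and model-free approaches avoid this enumeration.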
Related papers
- Securing Recommender System via Cooperative Training [78.97620275467733]
We propose a general framework, Triple Cooperative Defense (TCD), which employs three cooperative models that mutually enhance data.
Considering that existing attacks struggle to balance bi-level optimization and efficiency, we revisit poisoning attacks in recommender systems.
We put forth a Game-based Co-training Attack (GCoAttack), which frames the proposed CoAttack and TCD as a game-theoretic process.
arXiv Detail & Related papers (2024-01-23T12:07:20Z) - Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z) - Optimal Cost Constrained Adversarial Attacks For Multiple Agent Systems [6.69087470775851]
We formulate the problem of performing optimal adversarial agent-to-agent attacks using distributed attack agents.
We propose an optimal method integrating within-step static constrained attack-resource allocation optimization and between-step dynamic programming.
Our numerical results show that the proposed attacks can significantly reduce the rewards received by the attacked agents.
arXiv Detail & Related papers (2023-11-01T21:28:02Z) - Toward Evaluating Robustness of Reinforcement Learning with Adversarial Policy [32.1138935956272]
Reinforcement learning agents are susceptible to evasion attacks during deployment.
In this paper, we propose Intrinsically Motivated Adversarial Policy (IMAP) for efficient black-box adversarial policy learning.
arXiv Detail & Related papers (2023-05-04T07:24:12Z) - Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA).
arXiv Detail & Related papers (2022-07-25T03:24:58Z) - Attacking and Defending Deep Reinforcement Learning Policies [3.6985039575807246]
We study robustness of DRL policies to adversarial attacks from the perspective of robust optimization.
We propose a greedy attack algorithm, which tries to minimize the expected return of the policy without interacting with the environment, and a defense algorithm, which performs adversarial training in a max-min form.
arXiv Detail & Related papers (2022-05-16T12:47:54Z) - Poisoning Attack against Estimating from Pairwise Comparisons [140.9033911097995]
Attackers have strong motivation and incentives to manipulate the ranking list.
Data poisoning attacks on pairwise ranking algorithms can be formalized as dynamic and static games between the ranker and the attacker.
We propose two efficient poisoning attack algorithms and establish the associated theoretical guarantees.
arXiv Detail & Related papers (2021-07-05T08:16:01Z) - Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z) - Policy Teaching in Reinforcement Learning via Environment Poisoning
Attacks [33.41280432984183]
We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker.
As a victim, we consider RL agents whose objective is to find a policy that maximizes reward in infinite-horizon problem settings.
arXiv Detail & Related papers (2020-11-21T16:54:45Z) - Revisiting Membership Inference Under Realistic Assumptions [87.13552321332988]
We study membership inference in settings where some of the assumptions typically used in previous research are relaxed.
This setting is more realistic than the balanced prior setting typically considered by researchers.
We develop a new inference attack based on the intuition that inputs corresponding to training set members will be near a local minimum in the loss function.
arXiv Detail & Related papers (2020-05-21T20:17:42Z) - Policy Teaching via Environment Poisoning: Training-time Adversarial
Attacks against Reinforcement Learning [33.41280432984183]
We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy.
As a victim, we consider RL agents whose objective is to find a policy that maximizes average reward in undiscounted infinite-horizon problem settings.
arXiv Detail & Related papers (2020-03-28T23:22:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.