Implicit Poisoning Attacks in Two-Agent Reinforcement Learning:
Adversarial Policies for Training-Time Attacks
- URL: http://arxiv.org/abs/2302.13851v1
- Date: Mon, 27 Feb 2023 14:52:15 GMT
- Title: Implicit Poisoning Attacks in Two-Agent Reinforcement Learning:
Adversarial Policies for Training-Time Attacks
- Authors: Mohammad Mohammadi, Jonathan Nöther, Debmalya Mandal, Adish Singla,
Goran Radanovic
- Abstract summary: In targeted poisoning attacks, an attacker manipulates an agent-environment interaction to force the agent into adopting a policy of interest, called target policy.
We study targeted poisoning attacks in a two-agent setting where an attacker implicitly poisons the effective environment of one of the agents by modifying the policy of its peer.
We develop an optimization framework for designing optimal attacks, where the cost of the attack measures how much the solution deviates from the assumed default policy of the peer agent.
- Score: 21.97069271045167
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In targeted poisoning attacks, an attacker manipulates an agent-environment
interaction to force the agent into adopting a policy of interest, called
target policy. Prior work has primarily focused on attacks that modify standard
MDP primitives, such as rewards or transitions. In this paper, we study
targeted poisoning attacks in a two-agent setting where an attacker implicitly
poisons the effective environment of one of the agents by modifying the policy
of its peer. We develop an optimization framework for designing optimal
attacks, where the cost of the attack measures how much the solution deviates
from the assumed default policy of the peer agent. We further study the
computational properties of this optimization framework. Focusing on a tabular
setting, we show that in contrast to poisoning attacks based on MDP primitives
(transitions and (unbounded) rewards), which are always feasible, it is NP-hard
to determine the feasibility of implicit poisoning attacks. We provide
characterization results that establish sufficient conditions for the
feasibility of the attack problem, as well as an upper and a lower bound on the
optimal cost of the attack. We propose two algorithmic approaches for finding
an optimal adversarial policy: a model-based approach with tabular policies and
a model-free approach with parametric/neural policies. We showcase the efficacy
of the proposed algorithms through experiments.
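To make the bilevel structure described in the abstract concrete, below is a minimal, self-contained sketch in Python. It is not the authors' algorithm: it brute-forces deterministic tabular peer policies on a tiny, randomly generated two-agent MDP, computes the victim's best response to each candidate via value iteration on the induced single-agent MDP, and keeps the feasible candidate that deviates least from the assumed default peer policy. All names, sizes, and the Hamming-distance cost are illustrative assumptions.

```python
# Illustrative sketch (not the paper's algorithm) of implicit poisoning:
# the attacker controls the peer's policy pi2; each pi2 induces an effective
# MDP for the victim; the attack is feasible if the victim's optimal policy
# in that induced MDP equals the target policy, and the cost is pi2's
# deviation from the assumed default peer policy.
import itertools
import numpy as np

n_states, n_actions = 3, 2          # tiny tabular example (assumed sizes)
gamma = 0.9                          # discount factor (assumed)
rng = np.random.default_rng(0)

# Joint dynamics/reward of the two-agent environment (randomly generated here):
# P[s, a1, a2] is a distribution over next states, R[s, a1, a2] the victim's reward,
# where a1 is the victim's action and a2 the peer's action.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions, n_actions))
R = rng.standard_normal((n_states, n_actions, n_actions))

pi2_default = np.zeros(n_states, dtype=int)   # peer's assumed default policy (deterministic)
pi_target = np.array([1, 0, 1])               # target policy the attacker wants the victim to adopt

def induced_mdp(pi2):
    """Effective single-agent MDP the victim faces when the peer plays pi2."""
    P1 = np.stack([P[s, :, pi2[s]] for s in range(n_states)])   # shape (S, A1, S)
    R1 = np.stack([R[s, :, pi2[s]] for s in range(n_states)])   # shape (S, A1)
    return P1, R1

def optimal_policy(P1, R1, iters=500):
    """Victim's optimal deterministic policy via value iteration."""
    V = np.zeros(n_states)
    for _ in range(iters):
        V = (R1 + gamma * P1 @ V).max(axis=1)
    Q = R1 + gamma * P1 @ V
    return Q.argmax(axis=1)

best_cost, best_pi2 = None, None
# Brute force over deterministic tabular peer policies (only viable for tiny MDPs).
for pi2 in itertools.product(range(n_actions), repeat=n_states):
    pi2 = np.array(pi2)
    if np.array_equal(optimal_policy(*induced_mdp(pi2)), pi_target):
        cost = np.sum(pi2 != pi2_default)     # deviation from the default peer policy
        if best_cost is None or cost < best_cost:
            best_cost, best_pi2 = cost, pi2

print("feasible:", best_pi2 is not None,
      "| optimal cost:", best_cost,
      "| adversarial peer policy:", best_pi2)
```

The exhaustive search makes the NP-hardness point tangible: with neither bounded rewards nor direct control over transitions, feasibility must be checked candidate by candidate here, whereas the paper's model-based and model-free approaches avoid this enumeration.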
Related papers
- Securing Recommender System via Cooperative Training [78.97620275467733]
We propose a general framework, Triple Cooperative Defense (TCD), which employs three cooperative models that mutually enhance data.
Considering that existing attacks struggle to balance bi-level optimization and efficiency, we revisit poisoning attacks in recommender systems.
We put forth a Game-based Co-training Attack (GCoAttack), which frames the proposed CoAttack and TCD as a game-theoretic process.
arXiv Detail & Related papers (2024-01-23T12:07:20Z) - Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z) - Optimal Cost Constrained Adversarial Attacks For Multiple Agent Systems [6.69087470775851]
We formulate the problem of performing optimal adversarial agent-to-agent attacks using distributed attack agents.
We propose an optimal method integrating within-step static constrained attack-resource allocation optimization and between-step dynamic programming.
Our numerical results show that the proposed attacks can significantly reduce the rewards received by the attacked agents.
arXiv Detail & Related papers (2023-11-01T21:28:02Z) - Toward Evaluating Robustness of Reinforcement Learning with Adversarial Policy [32.1138935956272]
Reinforcement learning agents are susceptible to evasion attacks during deployment.
In this paper, we propose Intrinsically Motivated Adversarial Policy (IMAP) for efficient black-box adversarial policy learning.
arXiv Detail & Related papers (2023-05-04T07:24:12Z) - Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA).
arXiv Detail & Related papers (2022-07-25T03:24:58Z) - Attacking and Defending Deep Reinforcement Learning Policies [3.6985039575807246]
We study robustness of DRL policies to adversarial attacks from the perspective of robust optimization.
We propose a greedy attack algorithm, which tries to minimize the expected return of the policy without interacting with the environment, and a defense algorithm, which performs adversarial training in a max-min form.
arXiv Detail & Related papers (2022-05-16T12:47:54Z) - Poisoning Attack against Estimating from Pairwise Comparisons [140.9033911097995]
Attackers have strong motivation and incentives to manipulate the ranking list.
Data poisoning attacks on pairwise ranking algorithms can be formalized as dynamic and static games between the ranker and the attacker.
We propose two efficient poisoning attack algorithms and establish the associated theoretical guarantees.
arXiv Detail & Related papers (2021-07-05T08:16:01Z) - Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z) - Policy Teaching in Reinforcement Learning via Environment Poisoning
Attacks [33.41280432984183]
We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker.
As a victim, we consider RL agents whose objective is to find a policy that maximizes reward in infinite-horizon problem settings.
arXiv Detail & Related papers (2020-11-21T16:54:45Z) - Revisiting Membership Inference Under Realistic Assumptions [87.13552321332988]
We study membership inference in settings where some of the assumptions typically used in previous research are relaxed.
This setting is more realistic than the balanced prior setting typically considered by researchers.
We develop a new inference attack based on the intuition that inputs corresponding to training set members will be near a local minimum in the loss function.
arXiv Detail & Related papers (2020-05-21T20:17:42Z) - Policy Teaching via Environment Poisoning: Training-time Adversarial
Attacks against Reinforcement Learning [33.41280432984183]
We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy.
As a victim, we consider RL agents whose objective is to find a policy that maximizes average reward in undiscounted infinite-horizon problem settings.
arXiv Detail & Related papers (2020-03-28T23:22:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.