Black-Box Targeted Reward Poisoning Attack Against Online Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2305.10681v1
- Date: Thu, 18 May 2023 03:37:29 GMT
- Title: Black-Box Targeted Reward Poisoning Attack Against Online Deep Reinforcement Learning
- Authors: Yinglun Xu, Gagandeep Singh
- Abstract summary: We propose the first black-box targeted attack against online deep reinforcement learning through reward poisoning during training time.
Our attack is applicable to general environments with unknown dynamics learned by unknown algorithms.
- Score: 2.3526458707956643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose the first black-box targeted attack against online deep
reinforcement learning through reward poisoning during training time. Our
attack is applicable to general environments with unknown dynamics learned by
unknown algorithms and requires limited attack budgets and computational
resources. We leverage a general framework and identify conditions that ensure
an efficient attack under a general assumption on the learning algorithms. We
show that our attack is optimal within this framework under these conditions. We
experimentally verify that with limited budgets, our attack efficiently leads
the learning agent to various target policies under a diverse set of popular
DRL environments and state-of-the-art learners.
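To make the threat model concrete, here is a minimal Python sketch of a training-time targeted reward-poisoning loop. The env, agent, and target_policy interfaces and the per-step budget delta are hypothetical placeholders, and the simple agree/disagree perturbation rule is an illustrative stand-in for the conditions derived in the paper, not the authors' exact algorithm.

    # Hypothetical interfaces: env.step(a) -> (next_state, reward, done),
    # agent.act(s) -> action, agent.observe(...) runs one learning update.
    def poison_reward(state, action, reward, target_policy, delta):
        """Perturb the observed reward within a per-step budget delta."""
        if action == target_policy(state):
            return reward + delta   # make target actions look better
        return reward - delta       # make every other action look worse

    def online_training_under_attack(env, agent, target_policy, delta, steps):
        state = env.reset()
        for _ in range(steps):
            action = agent.act(state)
            next_state, reward, done = env.step(action)
            # The attacker is black-box: it only sees the transition, not the
            # environment dynamics or the learner's internals.
            poisoned = poison_reward(state, action, reward, target_policy, delta)
            agent.observe(state, action, poisoned, next_state, done)
            state = env.reset() if done else next_state

Because the perturbation rule inspects only the observed transition, it applies to unknown dynamics and unknown learners, which is the black-box setting the paper targets.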
Related papers
- Universal Black-Box Reward Poisoning Attack against Offline Reinforcement Learning [4.629358641630161]
We study the problem of universal black-box reward poisoning attacks against general offline reinforcement learning with deep neural networks.
We propose the first universal black-box reward poisoning attack in the general offline RL setting.
arXiv Detail & Related papers (2024-02-15T04:08:49Z)
- Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning [6.414910263179327]
We study reward poisoning attacks on online deep reinforcement learning (DRL).
We demonstrate the intrinsic vulnerability of state-of-the-art DRL algorithms by designing a general, black-box reward poisoning framework called adversarial MDP attacks.
Our results show that our attacks efficiently poison agents learning in several popular classical control and MuJoCo environments.
arXiv Detail & Related papers (2022-05-30T04:07:19Z)
- Attacking and Defending Deep Reinforcement Learning Policies [3.6985039575807246]
We study the robustness of DRL policies to adversarial attacks from the perspective of robust optimization.
We propose a greedy attack algorithm, which tries to minimize the expected return of the policy without interacting with the environment, and a defense algorithm, which performs adversarial training in a max-min form.
arXiv Detail & Related papers (2022-05-16T12:47:54Z)
- LAS-AT: Adversarial Training with Learnable Attack Strategy [82.88724890186094]
"Learnable attack strategy", dubbed LAS-AT, learns to automatically produce attack strategies to improve the model robustness.
Our framework is composed of a target network that uses AEs for training to improve robustness and a strategy network that produces attack strategies to control the AE generation.
arXiv Detail & Related papers (2022-03-13T10:21:26Z)
- Projective Ranking-based GNN Evasion Attacks [52.85890533994233]
Graph neural networks (GNNs) offer promising learning methods for graph-related tasks.
However, GNNs are at risk of adversarial attacks.
arXiv Detail & Related papers (2022-02-25T21:52:09Z)
- Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the Age of AI-NIDS [70.60975663021952]
We study black-box adversarial attacks on network classifiers.
We argue that attacker-defender fixed points are themselves general-sum games with complex phase transitions.
We show that a continual learning approach is required to study attacker-defender dynamics.
arXiv Detail & Related papers (2021-11-23T23:42:16Z)
- Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks [81.13338949407205]
Recent works show that optimal bandit algorithms are vulnerable to adversarial attacks and can fail completely in the presence of attacks.
Existing robust bandit algorithms only work in the non-contextual setting under reward attacks.
We provide the first robust bandit algorithm for linear contextual bandit setting under a fully adaptive and omniscient attack.
arXiv Detail & Related papers (2021-06-05T22:20:34Z)
- Disturbing Reinforcement Learning Agents with Corrupted Rewards [62.997667081978825]
We analyze the effects of different attack strategies based on reward perturbations on reinforcement learning algorithms.
We show that smoothly crafted adversarial rewards can mislead the learner, and that with low exploration probability values the learned policy is more robust to corrupted rewards (a minimal corruption wrapper is sketched after this list).
arXiv Detail & Related papers (2021-02-12T15:53:48Z)
- Policy Teaching in Reinforcement Learning via Environment Poisoning Attacks [33.41280432984183]
We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker.
As a victim, we consider RL agents whose objective is to find a policy that maximizes reward in infinite-horizon problem settings.
arXiv Detail & Related papers (2020-11-21T16:54:45Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
- Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning [33.41280432984183]
We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy.
As a victim, we consider RL agents whose objective is to find a policy that maximizes average reward in undiscounted infinite-horizon problem settings.
arXiv Detail & Related papers (2020-03-28T23:22:28Z)
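The two Policy Teaching via Environment Poisoning entries above both force a target policy by modifying rewards. The following self-contained toy version of that idea uses a deterministic 3-state MDP with known dynamics: lower the reward of every non-target action just enough that the target policy becomes optimal by a margin eps. The MDP, the discounted setting, and the one-step penalty rule are illustrative simplifications, not the papers' cost-optimal attack (which also covers average-reward settings).

    S, A, gamma, eps = 3, 2, 0.9, 0.1
    P = [[1, 2], [0, 2], [2, 0]]              # P[s][a] = deterministic next state
    R = [[1.0, 0.0], [0.5, 0.0], [0.0, 0.5]]  # R[s][a] = original reward
    target = [1, 1, 0]                        # attacker-chosen target policy

    def q_under_target(R):
        """Q(s, a) when the agent takes a once, then follows target forever."""
        V = [0.0] * S
        for _ in range(500):                  # iterative policy evaluation
            V = [R[s][target[s]] + gamma * V[P[s][target[s]]] for s in range(S)]
        return [[R[s][a] + gamma * V[P[s][a]] for a in range(A)] for s in range(S)]

    Q = q_under_target(R)
    R_poisoned = [row[:] for row in R]
    for s in range(S):
        for a in range(A):
            if a != target[s]:
                # Push each non-target action exactly eps below the target one.
                gap = Q[s][a] - (Q[s][target[s]] - eps)
                if gap > 0:
                    R_poisoned[s][a] -= gap

    print("poisoned rewards:", R_poisoned)

After poisoning, the target policy is greedy with respect to its own value function with margin eps, which makes it optimal in the modified MDP, so a reward-maximizing agent is steered to the attacker's policy.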
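As referenced in the "Disturbing Reinforcement Learning Agents with Corrupted Rewards" entry above, here is a minimal sketch of a reward-corruption wrapper of the kind such experiments use. The reset/step interface is the same hypothetical one as in the first sketch, and the uniform-noise corruption model is an assumption, not the paper's exact perturbation.

    import random

    class CorruptedRewardEnv:
        """Wrap an environment so the learner observes corrupted rewards."""
        def __init__(self, env, p_corrupt, noise=(-1.0, 1.0)):
            self.env, self.p, self.noise = env, p_corrupt, noise

        def reset(self):
            return self.env.reset()

        def step(self, action):
            state, reward, done = self.env.step(action)
            if random.random() < self.p:
                reward = random.uniform(*self.noise)  # adversarially replaced
            return state, reward, done

Training the same agent with different exploration probabilities on a wrapped environment is one way to probe the paper's observation that low exploration yields policies more robust to corrupted rewards.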