Understanding Adversarial Attacks on Observations in Deep Reinforcement
Learning
- URL: http://arxiv.org/abs/2106.15860v1
- Date: Wed, 30 Jun 2021 07:41:51 GMT
- Title: Understanding Adversarial Attacks on Observations in Deep Reinforcement
Learning
- Authors: You Qiaoben, Chengyang Ying, Xinning Zhou, Hang Su, Jun Zhu, Bo Zhang
- Abstract summary: Deep reinforcement learning models are vulnerable to adversarial attacks which can decrease the victim's total reward by manipulating the observations.
We reformulate the problem of adversarial attacks in function space and separate the previous gradient-based attacks into several subspaces.
In the first stage, we train a deceptive policy by hacking the environment, and discover a set of trajectories routing to the lowest reward.
Our method provides a tighter theoretical upper bound for the attacked agent's performance than the existing approaches.
- Score: 32.12283927682007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent works demonstrate that deep reinforcement learning (DRL) models are
vulnerable to adversarial attacks which can decrease the victim's total reward
by manipulating the observations. Compared with adversarial attacks in
supervised learning, it is much more challenging to deceive a DRL model since
the adversary has to infer the environmental dynamics. To address this issue,
we reformulate the problem of adversarial attacks in function space and
separate the previous gradient-based attacks into several subspaces. Following
the analysis of the function space, we design a generic two-stage framework in
the subspace where the adversary lures the agent to a target trajectory or a
deceptive policy. In the first stage, we train a deceptive policy by hacking
the environment, and discover a set of trajectories routing to the lowest
reward. The adversary then misleads the victim to imitate the deceptive policy
by perturbing the observations. Our method provides a tighter theoretical upper
bound for the attacked agent's performance than the existing approaches.
Extensive experiments demonstrate the superiority of our method and we achieve
the state-of-the-art performance on both Atari and MuJoCo environments.
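As a rough illustration of the second stage described above, the sketch below perturbs each observation with a PGD-style update so that the victim's policy imitates the action of an already-trained deceptive policy. This is a minimal sketch under stated assumptions, not the authors' implementation: the `victim_policy` and `deceptive_policy` callables, the L-infinity budget, and the step sizes are all hypothetical.

```python
# Minimal sketch of the second attack stage, assuming stage one has already
# produced `deceptive_policy` (trained to minimize the environment reward).
# The victim is then lured to imitate it by perturbing its observations.
import torch
import torch.nn.functional as F

def pgd_imitation_attack(victim_policy, deceptive_policy, obs,
                         eps=0.01, step_size=0.003, n_steps=10):
    """Find a perturbation delta with ||delta||_inf <= eps that makes the
    victim choose the deceptive policy's action on `obs` (a batched tensor)."""
    with torch.no_grad():
        target_action = deceptive_policy(obs).argmax(dim=-1)  # action to imitate
    delta = torch.zeros_like(obs, requires_grad=True)
    for _ in range(n_steps):
        logits = victim_policy(obs + delta)
        loss = F.cross_entropy(logits, target_action)  # imitation objective
        loss.backward()
        with torch.no_grad():
            delta -= step_size * delta.grad.sign()  # descend: push toward target action
            delta.clamp_(-eps, eps)                 # respect the L_inf budget
        delta.grad.zero_()
    return (obs + delta).detach()  # clipping to the valid observation range omitted
```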
Related papers
- Evaluating the Robustness of LiDAR Point Cloud Tracking Against Adversarial Attack [6.101494710781259]
We introduce a unified framework for conducting adversarial attacks within the context of 3D object tracking.
In addressing black-box attack scenarios, we introduce a novel transfer-based approach, the Target-aware Perturbation Generation (TAPG) algorithm.
Our experimental findings reveal a significant vulnerability in advanced tracking methods when subjected to both black-box and white-box attacks.
arXiv Detail & Related papers (2024-10-28T10:20:38Z) - Behavior-Targeted Attack on Reinforcement Learning with Limited Access to Victim's Policy [9.530897053573186]
We propose a novel method for manipulating the victim agent in the black-box setting.
Our attack method is formulated as a bi-level optimization problem that is reduced to a matching problem.
Empirical evaluations on several reinforcement learning benchmarks show that our proposed method has superior attack performance to baselines.
arXiv Detail & Related papers (2024-06-06T08:49:51Z) - Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z) - On the Difficulty of Defending Contrastive Learning against Backdoor
Attacks [58.824074124014224]
We show how contrastive backdoor attacks operate through distinctive mechanisms.
Our findings highlight the need for defenses tailored to the specificities of contrastive backdoor attacks.
arXiv Detail & Related papers (2023-12-14T15:54:52Z) - Defending Observation Attacks in Deep Reinforcement Learning via
Detection and Denoising [3.2023814100005907]
Attacks manifesting as perturbations in the observation space managed by the external environment have been shown to downgrade policy performance.
To defend against these attacks, we propose a novel defense strategy using a detect-and-denoise schema.
Our solution does not require sampling data in an environment under attack, thereby greatly reducing risk during training.
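A minimal sketch of one possible detect-and-denoise pipeline in this spirit, assuming an autoencoder trained offline on clean observations only; the architecture and threshold are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Toy autoencoder assumed to be trained offline on clean observations."""
    def __init__(self, obs_dim, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, obs_dim)

    def forward(self, obs):
        return self.decoder(self.encoder(obs))

def detect_and_denoise(autoencoder, obs, threshold=0.05):
    """Flag an observation as attacked if its reconstruction error is large,
    and substitute the reconstruction as the denoised observation."""
    with torch.no_grad():
        recon = autoencoder(obs)
        error = torch.mean((recon - obs) ** 2, dim=-1)
        attacked = error > threshold
        # Use the reconstruction only where an attack is suspected.
        cleaned = torch.where(attacked.unsqueeze(-1), recon, obs)
    return cleaned, attacked
```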
arXiv Detail & Related papers (2022-06-14T22:28:30Z) - Adversarial Robustness of Deep Reinforcement Learning based Dynamic
Recommender Systems [50.758281304737444]
We propose to explore adversarial examples and attack detection on reinforcement learning-based interactive recommendation systems.
We first craft different types of adversarial examples by adding perturbations to the input and intervening on the causal factors.
Then, we augment recommendation systems by detecting potential attacks with a deep learning-based classifier based on the crafted data.
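A minimal sketch of the detection idea, assuming the crafted adversarial examples are available as training data for a hypothetical binary classifier; names and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

class AttackDetector(nn.Module):
    """Binary classifier: clean inputs (label 0) vs. crafted adversarial inputs (label 1)."""
    def __init__(self, input_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(input_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)  # logit of "is adversarial"

def train_detector(detector, clean_batch, adversarial_batch, lr=1e-3, epochs=10):
    # Stack clean and crafted adversarial samples with binary labels.
    x = torch.cat([clean_batch, adversarial_batch])
    y = torch.cat([torch.zeros(len(clean_batch)), torch.ones(len(adversarial_batch))])
    opt = torch.optim.Adam(detector.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(detector(x), y)
        loss.backward()
        opt.step()
    return detector
```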
arXiv Detail & Related papers (2021-12-02T04:12:24Z) - Targeted Attack on Deep RL-based Autonomous Driving with Learned Visual
Patterns [18.694795507945603]
Recent studies demonstrated the vulnerability of control policies learned through deep reinforcement learning against adversarial attacks.
This paper investigates the feasibility of targeted attacks through visually learned patterns placed on physical objects in the environment.
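A generic sketch of how such a visual pattern could be learned for a discrete-action, image-based policy, assuming white-box access; this is not the paper's algorithm, and the patch placement, size, and optimizer settings are hypothetical.

```python
import torch
import torch.nn.functional as F

def learn_visual_pattern(policy, observations, target_action,
                         patch_size=8, steps=200, lr=0.05):
    """Optimize a small pixel pattern, pasted at a fixed corner of image
    observations (shape B x C x H x W in [0, 1]), that pushes the policy
    toward a chosen target action."""
    patch = torch.rand(1, observations.shape[1], patch_size, patch_size,
                       requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    targets = torch.full((observations.shape[0],), target_action, dtype=torch.long)
    for _ in range(steps):
        obs = observations.clone()
        obs[:, :, :patch_size, :patch_size] = patch.clamp(0, 1)  # paste the pattern
        loss = F.cross_entropy(policy(obs), targets)             # steer toward target
        opt.zero_grad()
        loss.backward()
        opt.step()
    return patch.detach().clamp(0, 1)
```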
arXiv Detail & Related papers (2021-09-16T04:59:06Z) - Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
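A minimal sketch of one common way to act with a smoothed policy, assuming a discrete-action agent: add Gaussian noise to the observation and take a majority vote over the sampled actions. The reward certificate itself relies on the paper's trajectory-level analysis and is not reproduced here.

```python
import torch

def smoothed_action(policy, obs, sigma=0.1, n_samples=100):
    """Act with a 'smoothed' policy: query the base policy on Gaussian-noised
    copies of the observation and return the majority action."""
    with torch.no_grad():
        noisy = obs.unsqueeze(0) + sigma * torch.randn(n_samples, *obs.shape)
        actions = policy(noisy).argmax(dim=-1)  # one action per noisy sample
        counts = torch.bincount(actions)        # vote over discrete actions
    return counts.argmax().item()
```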
arXiv Detail & Related papers (2021-06-21T21:42:08Z) - Robust Reinforcement Learning on State Observations with Learned Optimal
Adversary [86.0846119254031]
We study the robustness of reinforcement learning with adversarially perturbed state observations.
With a fixed agent policy, we demonstrate that an optimal adversary to perturb state observations can be found.
For DRL settings, this leads to a novel empirical adversarial attack on RL agents via a learned adversary that is much stronger than previous ones.
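A minimal sketch of the learned-adversary idea: with the victim fixed, the adversary is itself an RL agent whose action is a bounded observation perturbation and whose reward is the negative of the victim's reward. The wrapper below assumes the classic Gym API and is an illustration, not the paper's formulation.

```python
import gym
import numpy as np

class ObservationAdversaryEnv(gym.Env):
    """Wrap an environment and a *fixed* victim policy so that the adversary
    becomes the RL agent being trained."""
    def __init__(self, env, victim_policy, eps=0.05):
        super().__init__()
        self.env, self.victim, self.eps = env, victim_policy, eps
        self.observation_space = env.observation_space
        # The adversary's action is a perturbation of the victim's observation.
        self.action_space = gym.spaces.Box(-1.0, 1.0,
                                           shape=env.observation_space.shape)
        self._obs = None

    def reset(self):
        self._obs = self.env.reset()  # classic Gym: reset() returns only obs
        return self._obs

    def step(self, perturbation):
        # Perturb what the victim sees, within an L_inf budget of eps.
        perturbed = self._obs + self.eps * np.clip(perturbation, -1.0, 1.0)
        victim_action = self.victim(perturbed)
        self._obs, reward, done, info = self.env.step(victim_action)
        return self._obs, -reward, done, info  # adversary minimizes victim reward
```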
arXiv Detail & Related papers (2021-01-21T05:38:52Z) - Guided Adversarial Attack for Evaluating and Enhancing Adversarial
Defenses [59.58128343334556]
We introduce a relaxation term to the standard loss that finds more suitable gradient directions, increases attack efficacy, and leads to more efficient adversarial training.
We propose Guided Adversarial Margin Attack (GAMA), which utilizes function mapping of the clean image to guide the generation of adversaries.
We also propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses.
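A sketch of a margin-plus-relaxation objective in the spirit of GAMA, where the relaxation term uses the clean image's softmax output to guide early iterations; the exact weighting and decay schedule of `lam` should be taken from the paper, and this is not guaranteed to match the authors' formulation.

```python
import torch

def gama_style_loss(model, x_clean, x_adv, labels, lam):
    """Margin loss on softmax outputs plus a relaxation term that rewards
    moving away from the clean image's softmax output; `lam` is assumed to
    be decayed toward zero over the attack iterations."""
    probs_adv = torch.softmax(model(x_adv), dim=-1)
    with torch.no_grad():
        probs_clean = torch.softmax(model(x_clean), dim=-1)
    true = probs_adv.gather(1, labels.unsqueeze(1)).squeeze(1)
    # Highest probability among the wrong classes (true class masked out).
    wrong = probs_adv.clone()
    wrong.scatter_(1, labels.unsqueeze(1), -1.0)
    margin = wrong.max(dim=1).values - true
    relaxation = ((probs_adv - probs_clean) ** 2).sum(dim=1)
    return (margin + lam * relaxation).mean()  # maximized w.r.t. x_adv
```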
arXiv Detail & Related papers (2020-11-30T16:39:39Z)