Related papers: Defending Observation Attacks in Deep Reinforcement Learning via Detection and Denoising

Defending Observation Attacks in Deep Reinforcement Learning via Detection and Denoising

URL: http://arxiv.org/abs/2206.07188v1
Date: Tue, 14 Jun 2022 22:28:30 GMT
Title: Defending Observation Attacks in Deep Reinforcement Learning via Detection and Denoising
Authors: Zikang Xiong, Joe Eappen, He Zhu, and Suresh Jagannathan
Abstract summary: Attacks manifesting as perturbations in the observation space managed by the external environment have been shown to downgrade policy performance. To defend against these attacks, we propose a novel defense strategy using a detect-and-denoise schema. Our solution does not require sampling data in an environment under attack, thereby greatly reducing risk during training.
Score: 3.2023814100005907
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Neural network policies trained using Deep Reinforcement Learning (DRL) are well-known to be susceptible to adversarial attacks. In this paper, we consider attacks manifesting as perturbations in the observation space managed by the external environment. These attacks have been shown to downgrade policy performance significantly. We focus our attention on well-trained deterministic and stochastic neural network policies in the context of continuous control benchmarks subject to four well-studied observation space adversarial attacks. To defend against these attacks, we propose a novel defense strategy using a detect-and-denoise schema. Unlike previous adversarial training approaches that sample data in adversarial scenarios, our solution does not require sampling data in an environment under attack, thereby greatly reducing risk during training. Detailed experimental results show that our technique is comparable with state-of-the-art adversarial training approaches.

Related papers

Toward Spiking Neural Network Local Learning Modules Resistant to Adversarial Attacks [2.3312335998006306]
Recent research has shown the vulnerability of Spiking Neural Networks (SNNs) under adversarial examples. We introduce a hybrid adversarial attack paradigm that leverages the transferability of adversarial instances. The proposed hybrid approach demonstrates superior performance, outperforming existing adversarial attack methods.
arXiv Detail & Related papers (2025-04-11T18:07:59Z)
Robust Deep Reinforcement Learning against Adversarial Behavior Manipulation [10.411820336052784]
This study investigates behavior-targeted attacks on reinforcement learning and their countermeasures.<n>To the best of our knowledge, this is the first defense strategy specifically designed for behavior-targeted attacks.
arXiv Detail & Related papers (2024-06-06T08:49:51Z)
Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A Contemporary Survey [114.17568992164303]
Adrial attacks and defenses in machine learning and deep neural network have been gaining significant attention. This survey provides a comprehensive overview of the recent advancements in the field of adversarial attack and defense techniques. New avenues of attack are also explored, including search-based, decision-based, drop-based, and physical-world attacks.
arXiv Detail & Related papers (2023-03-11T04:19:31Z)
Adversarial Robustness of Deep Reinforcement Learning based Dynamic Recommender Systems [50.758281304737444]
We propose to explore adversarial examples and attack detection on reinforcement learning-based interactive recommendation systems. We first craft different types of adversarial examples by adding perturbations to the input and intervening on the casual factors. Then, we augment recommendation systems by detecting potential attacks with a deep learning-based classifier based on the crafted data.
arXiv Detail & Related papers (2021-12-02T04:12:24Z)
Targeted Attack on Deep RL-based Autonomous Driving with Learned Visual Patterns [18.694795507945603]
Recent studies demonstrated the vulnerability of control policies learned through deep reinforcement learning against adversarial attacks. This paper investigates the feasibility of targeted attacks through visually learned patterns placed on physical object in the environment.
arXiv Detail & Related papers (2021-09-16T04:59:06Z)
Balancing detectability and performance of attacks on the control channel of Markov Decision Processes [77.66954176188426]
We investigate the problem of designing optimal stealthy poisoning attacks on the control channel of Markov decision processes (MDPs) This research is motivated by the recent interest of the research community for adversarial and poisoning attacks applied to MDPs, and reinforcement learning (RL) methods.
arXiv Detail & Related papers (2021-09-15T09:13:10Z)
TREATED:Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions. Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z)
Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning [114.9857000195174]
A major challenge to widespread industrial adoption of deep reinforcement learning is the potential vulnerability to privacy breaches. We propose an adversarial attack framework tailored for testing the vulnerability of deep reinforcement learning algorithms to membership inference attacks.
arXiv Detail & Related papers (2021-09-08T23:44:57Z)
Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples. Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z)
Understanding Adversarial Attacks on Observations in Deep Reinforcement Learning [32.12283927682007]
Deep reinforcement learning models are vulnerable to adversarial attacks which can decrease the victim's total reward by manipulating the observations. We reformulate the problem of adversarial attacks in function space and separate the previous gradient based attacks into several subspaces. In the first stage, we train a deceptive policy by hacking the environment, and discover a set of trajectories routing to the lowest reward. Our method provides a tighter theoretical upper bound for the attacked agent's performance than the existing approaches.
arXiv Detail & Related papers (2021-06-30T07:41:51Z)
Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs. We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial of perturbation the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.