Sampling Attacks on Meta Reinforcement Learning: A Minimax Formulation and Complexity Analysis
- URL: http://arxiv.org/abs/2208.00081v1
- Date: Fri, 29 Jul 2022 21:29:29 GMT
- Title: Sampling Attacks on Meta Reinforcement Learning: A Minimax Formulation and Complexity Analysis
- Authors: Tao Li, Haozhe Lei, and Quanyan Zhu
- Abstract summary: This paper provides a game-theoretical underpinning for understanding this type of security risk.
We define the sampling attack model as a Stackelberg game between the attacker and the agent, which yields a minimax formulation.
We observe that a minor effort by the attacker can significantly degrade the learning performance.
- Score: 20.11993437283895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Meta reinforcement learning (meta RL), as a combination of meta-learning
ideas and reinforcement learning (RL), enables the agent to adapt to different
tasks using a few samples. However, this sampling-based adaptation also makes
meta RL vulnerable to adversarial attacks. By manipulating the reward feedback
from sampling processes in meta RL, an attacker can mislead the agent into
building wrong knowledge from its training experience, which degrades the
agent's performance on different tasks after adaptation. This
paper provides a game-theoretical underpinning for understanding this type of
security risk. In particular, we formally define the sampling attack model as a
Stackelberg game between the attacker and the agent, which yields a minimax
formulation. It leads to two online attack schemes: Intermittent Attack and
Persistent Attack, which enable the attacker to learn an optimal sampling
attack, defined by an $\epsilon$-first-order stationary point, within
$\mathcal{O}(\epsilon^{-2})$ iterations. These attack schemes free-ride on the agent's
concurrent learning progress and require no extra interactions with the environment.
By corroborating the convergence results with numerical experiments, we observe
that a minor effort by the attacker can significantly degrade the learning
performance, and that the minimax approach can also help robustify meta RL
algorithms.
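The abstract describes the minimax structure and the two attack schedules only at a high level; the short sketch below illustrates one way to read it. The agent keeps running ordinary gradient updates on poisoned reward feedback, while the attacker adapts a budget-constrained perturbation concurrently, reusing the same samples (no extra environment interaction), either at every iteration (Persistent Attack) or only every K iterations (Intermittent Attack). This is a minimal toy, not the authors' implementation: the meta-RL objective is replaced by a quadratic surrogate, and the names below (true_loss, poisoned_grad, BUDGET, K) are illustrative assumptions.

```python
# Toy sketch of a budget-constrained sampling attack learned concurrently with
# the victim's training. NOT the paper's algorithm; a quadratic surrogate
# stands in for the meta-RL objective.
import numpy as np

rng = np.random.default_rng(0)
DIM, BUDGET, K = 8, 0.3, 10              # parameter dim, attack budget, attack period
theta_star = rng.standard_normal(DIM)    # the "correct knowledge" the agent should reach

def true_loss(theta):
    """Agent's true post-adaptation loss -- the quantity the attacker wants to inflate."""
    return 0.5 * np.sum((theta - theta_star) ** 2)

def poisoned_grad(theta, delta):
    """Agent's gradient computed from poisoned reward feedback: the perturbation
    delta shifts the apparent optimum from theta_star to theta_star + delta."""
    return theta - (theta_star + delta)

def project(delta, budget=BUDGET):
    """Stealthiness constraint: keep the perturbation inside an l2 ball of radius budget."""
    n = np.linalg.norm(delta)
    return delta if n <= budget else delta * (budget / n)

def run(mode, iters=300, lr_agent=0.1, lr_attack=0.1):
    theta = np.zeros(DIM)                # agent's (meta-)parameters
    delta = np.zeros(DIM)                # attacker's reward perturbation
    for t in range(iters):
        # Agent: ordinary gradient step, unaware the samples are poisoned.
        theta -= lr_agent * poisoned_grad(theta, delta)
        # Attacker: ascent on the true loss, free-riding on the same sample
        # information (no extra environment interaction). Since theta tracks
        # theta_star + delta, (theta - theta_star) approximates d true_loss / d delta.
        attack_now = mode == "persistent" or (mode == "intermittent" and t % K == 0)
        if attack_now:
            delta = project(delta + lr_attack * (theta - theta_star))
    return true_loss(theta)

for mode in ("clean", "intermittent", "persistent"):
    print(f"{mode:>12s} final true loss: {run(mode):.4f}")
```

In this toy, the clean run drives the true loss to roughly zero, while both attacked runs plateau near 0.5 * BUDGET^2, echoing the abstract's observation that a small, budget-constrained manipulation of the reward feedback is enough to noticeably degrade post-adaptation performance.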
Related papers
- Meta Stackelberg Game: Robust Federated Learning against Adaptive and Mixed Poisoning Attacks [15.199885837603576]
Federated learning (FL) is susceptible to a range of security threats.
We develop an efficient meta-learning approach to solve the game, leading to a robust and adaptive FL defense.
arXiv Detail & Related papers (2024-10-22T21:08:28Z) - Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z) - Discriminative Adversarial Unlearning [40.30974185546541]
We introduce a novel machine unlearning framework founded upon the established principles of the min-max optimization paradigm.
We capitalize on the capabilities of strong Membership Inference Attacks (MIA) to facilitate the unlearning of specific samples from a trained model.
Our proposed algorithm closely approximates the ideal benchmark of retraining from scratch for both random sample forgetting and class-wise forgetting schemes.
arXiv Detail & Related papers (2024-02-10T03:04:57Z) - Optimal Attack and Defense for Reinforcement Learning [11.36770403327493]
In adversarial RL, an external attacker has the power to manipulate the victim agent's interaction with the environment.
We study the attacker's problem of designing a stealthy attack that maximizes its own expected reward.
We argue that the optimal defense policy for the victim can be computed as the solution to a Stackelberg game.
arXiv Detail & Related papers (2023-11-30T21:21:47Z) - Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks [69.54774045493227]
A drawback of adversarial training is the computational overhead introduced by the generation of adversarial examples.
We propose to exploit the interior building blocks of the model to improve efficiency.
Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness.
arXiv Detail & Related papers (2023-10-24T01:36:20Z) - Detection and Mitigation of Byzantine Attacks in Distributed Training [24.951227624475443]
Abnormal Byzantine behavior of the worker nodes can derail the training and compromise the quality of the inference.
Recent work considers a wide range of attack models and has explored robust aggregation and/or computational redundancy to correct the distorted gradients.
In this work, we consider attack models ranging from strong ones ($q$ omniscient adversaries with full knowledge of the defense protocol, who can change from iteration to iteration) to weak ones ($q$ randomly chosen adversaries with limited collusion abilities).
arXiv Detail & Related papers (2022-08-17T05:49:52Z) - Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA).
arXiv Detail & Related papers (2022-07-25T03:24:58Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z) - Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z) - Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)