Hiding in Plain Sight: Differential Privacy Noise Exploitation for
Evasion-resilient Localized Poisoning Attacks in Multiagent Reinforcement
Learning
- URL: http://arxiv.org/abs/2307.00268v2
- Date: Thu, 13 Jul 2023 03:18:15 GMT
- Title: Hiding in Plain Sight: Differential Privacy Noise Exploitation for
Evasion-resilient Localized Poisoning Attacks in Multiagent Reinforcement
Learning
- Authors: Md Tamjid Hossain, Hung La
- Abstract summary: Differential privacy (DP) has been introduced in cooperative multiagent reinforcement learning (CMARL) to safeguard the agents' privacy against adversarial inference during knowledge sharing.
We present an adaptive, privacy-exploiting, and evasion-resilient localized poisoning attack (PeLPA) that capitalizes on the inherent DP-noise to circumvent anomaly detection systems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Lately, differential privacy (DP) has been introduced in cooperative
multiagent reinforcement learning (CMARL) to safeguard the agents' privacy
against adversarial inference during knowledge sharing. Nevertheless, we argue
that the noise introduced by DP mechanisms may inadvertently give rise to a
novel poisoning threat, specifically in the context of private knowledge
sharing during CMARL, which remains unexplored in the literature. To address
this shortcoming, we present an adaptive, privacy-exploiting, and
evasion-resilient localized poisoning attack (PeLPA) that capitalizes on the
inherent DP-noise to circumvent anomaly detection systems and hinder the
optimal convergence of the CMARL model. We rigorously evaluate our proposed
PeLPA attack in diverse environments, encompassing both non-adversarial and
multiple-adversarial contexts. Our findings reveal that, in a medium-scale
environment, the PeLPA attack with attacker ratios of 20% and 40% can lead to
an increase in average steps to goal by 50.69% and 64.41%, respectively.
Furthermore, under similar conditions, PeLPA increases the computational time
needed to attain the optimal reward by 1.4x and 1.6x and slows convergence by
1.18x and 1.38x for attacker ratios of 20% and 40%, respectively.
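To make the threat described in the abstract concrete, the sketch below illustrates how a localized poisoning perturbation can hide inside the Laplace noise that DP legitimately adds to shared Q-values, slipping past a simple deviation-based anomaly check. This is a minimal toy under stated assumptions, not the authors' PeLPA implementation; the agent counts, DP budget, perturbation size, and detector are all invented for illustration.

```python
# Illustrative sketch only -- NOT the authors' PeLPA implementation.
# Agent counts, the DP budget, the perturbation size, and the toy
# deviation-based detector below are all assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, N_STATES, N_ACTIONS = 10, 25, 4
EPSILON, SENSITIVITY = 0.5, 1.0          # assumed DP budget / sensitivity
NOISE_SCALE = SENSITIVITY / EPSILON      # Laplace scale b = sensitivity / epsilon

# Honest agents hold similar Q-tables; one attacker targets a single state.
q_tables = [rng.normal(0.0, 0.1, (N_STATES, N_ACTIONS)) for _ in range(N_AGENTS)]

def dp_share(q):
    """Knowledge sharing with Laplace noise, as in an epsilon-DP release."""
    return q + rng.laplace(0.0, NOISE_SCALE, q.shape)

def poisoned_share(q, target_state, bad_action):
    """Attacker inflates the Q-value of a poor action in one state, keeping
    the bump comparable to the per-entry DP noise (std ~ b*sqrt(2)) so the
    localized perturbation looks like a legitimate noisy update."""
    shared = dp_share(q)
    shared[target_state, bad_action] += 2.0 * NOISE_SCALE
    return shared

def flag_anomalies(shared_updates, threshold=1.3):
    """Toy detector: score each agent by its mean absolute deviation from
    the element-wise median of all shared tables, and flag agents whose
    score is well above the group's typical score."""
    stacked = np.stack(shared_updates)                 # (agents, states, actions)
    consensus = np.median(stacked, axis=0)
    scores = np.abs(stacked - consensus).mean(axis=(1, 2))
    return [i for i, s in enumerate(scores) if s > threshold * np.median(scores)]

updates = [dp_share(q) for q in q_tables[:-1]]
updates.append(poisoned_share(q_tables[-1], target_state=3, bad_action=2))
print("Flagged agents:", flag_anomalies(updates))      # typically an empty list
```

Because the attacker's bump is confined to one state-action pair and is comparable in magnitude to the per-entry DP noise, the malicious agent's aggregate deviation score is statistically indistinguishable from those of honest agents; without the DP noise, the same consensus check would expose the bump immediately.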
Related papers
- Swallowing the Poison Pills: Insights from Vulnerability Disparity Among LLMs [3.7913442178940318]
Modern large language models (LLMs) exhibit critical vulnerabilities to poison pill attacks.
We demonstrate these attacks exploit inherent architectural properties of LLMs.
Our work establishes poison pills as both a security threat and diagnostic tool.
arXiv Detail & Related papers (2025-02-23T06:34:55Z) - GCP: Guarded Collaborative Perception with Spatial-Temporal Aware Malicious Agent Detection [11.336965062177722]
Collaborative perception is vulnerable to adversarial message attacks from malicious agents.
This paper reveals a novel blind area confusion (BAC) attack that compromises existing single-shot outlier-based detection methods.
We propose Guarded Collaborative Perception framework based on spatial-temporal aware malicious agent detection.
arXiv Detail & Related papers (2025-01-05T06:03:26Z) - Sub-optimal Learning in Meta-Classifier Attacks: A Study of Membership Inference on Differentially Private Location Aggregates [19.09251452596829]
We show that a significant gap exists between the expected attack accuracy given by DP and the empirical attack accuracy even with informed attackers.
We propose two new metric-based MIAs: the one-threshold attack and the two-threshold attack.
arXiv Detail & Related papers (2024-12-29T12:51:34Z) - Criticality and Safety Margins for Reinforcement Learning [53.10194953873209]
We seek to define a criticality framework with both a quantifiable ground truth and a clear significance to users.
We introduce true criticality as the expected drop in reward when an agent deviates from its policy for n consecutive random actions.
We also introduce the concept of proxy criticality, a low-overhead metric that has a statistically monotonic relationship to true criticality.
arXiv Detail & Related papers (2024-09-26T21:00:45Z) - Membership Inference Attacks Against In-Context Learning [26.57639819629732]
We present the first membership inference attack tailored for In-Context Learning (ICL).
We propose four attack strategies tailored to various constrained scenarios.
We investigate three potential defenses targeting data, instruction, and output.
arXiv Detail & Related papers (2024-09-02T17:23:23Z) - AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases [73.04652687616286]
We propose AgentPoison, the first backdoor attack targeting generic and RAG-based LLM agents by poisoning their long-term memory or RAG knowledge base.
Unlike conventional backdoor attacks, AgentPoison requires no additional model training or fine-tuning.
On each agent, AgentPoison achieves an average attack success rate higher than 80% with minimal impact on benign performance.
arXiv Detail & Related papers (2024-07-17T17:59:47Z) - BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models [57.5404308854535]
Safety backdoor attacks in large language models (LLMs) enable the stealthy triggering of unsafe behaviors while evading detection during normal interactions.
We present BEEAR, a mitigation approach leveraging the insight that backdoor triggers induce relatively uniform drifts in the model's embedding space.
Our bi-level optimization method identifies universal embedding perturbations that elicit unwanted behaviors and adjusts the model parameters to reinforce safe behaviors against these perturbations.
arXiv Detail & Related papers (2024-06-24T19:29:47Z) - Low-Cost Privacy-Aware Decentralized Learning [5.295018540083454]
This paper introduces ZIP-DL, a privacy-aware decentralized learning (DL) algorithm that exploits correlated noise to provide strong privacy protection against a local adversary.
We provide theoretical guarantees for both convergence speed and privacy, making ZIP-DL applicable to practical scenarios.
arXiv Detail & Related papers (2024-03-18T13:53:17Z) - Malicious Agent Detection for Robust Multi-Agent Collaborative Perception [52.261231738242266]
Multi-agent collaborative (MAC) perception is more vulnerable to adversarial attacks than single-agent perception.
We propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception.
We conduct comprehensive evaluations on a benchmark 3D dataset V2X-sim and a real-road dataset DAIR-V2X.
arXiv Detail & Related papers (2023-10-18T11:36:42Z) - Toward Evaluating Robustness of Reinforcement Learning with Adversarial Policy [32.1138935956272]
Reinforcement learning agents are susceptible to evasion attacks during deployment.
In this paper, we propose Intrinsically Motivated Adversarial Policy (IMAP) for efficient black-box adversarial policy learning.
arXiv Detail & Related papers (2023-05-04T07:24:12Z) - Safe Deployment for Counterfactual Learning to Rank with Exposure-Based
Risk Minimization [63.93275508300137]
We introduce a novel risk-aware Counterfactual Learning To Rank method with theoretical guarantees for safe deployment.
Our experimental results demonstrate the efficacy of our proposed method, which is effective at avoiding initial periods of bad performance when little data is available.
arXiv Detail & Related papers (2023-04-26T15:54:23Z) - A Risk-Sensitive Approach to Policy Optimization [21.684251937825234]
Standard deep reinforcement learning (DRL) aims to maximize expected reward, considering collected experiences equally in formulating a policy.
We propose a more direct approach whereby risk-sensitive objectives, specified in terms of the cumulative distribution function (CDF) of the distribution of full-episode rewards, are optimized.
We demonstrate that the use of moderately "pessimistic" risk profiles, which emphasize scenarios where the agent performs poorly, leads to enhanced exploration and a continual focus on addressing deficiencies.
arXiv Detail & Related papers (2022-08-19T00:55:05Z) - Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.