CuDA2: An approach for Incorporating Traitor Agents into Cooperative Multi-Agent Systems
- URL: http://arxiv.org/abs/2406.17425v1
- Date: Tue, 25 Jun 2024 09:59:31 GMT
- Title: CuDA2: An approach for Incorporating Traitor Agents into Cooperative Multi-Agent Systems
- Authors: Zhen Chen, Yong Liao, Youpeng Zhao, Zipeng Dai, Jian Zhao,
- Abstract summary: We introduce a novel method that involves injecting traitor agents into the CMARL system.
In TMDP, traitors are trained using the same MARL algorithm as the victim agents, with their reward function set as the negative of the victim agents' reward.
CuDA2 enhances the efficiency and aggressiveness of attacks on the specified victim agents' policies.
- Score: 13.776447110639193
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cooperative Multi-Agent Reinforcement Learning (CMARL) strategies are well known to be vulnerable to adversarial perturbations. Previous works on adversarial attacks have primarily focused on white-box attacks that directly perturb the states or actions of victim agents, often in scenarios with a limited number of attacks. However, gaining complete access to victim agents in real-world environments is exceedingly difficult. To create more realistic adversarial attacks, we introduce a novel method that involves injecting traitor agents into the CMARL system. We model this problem as a Traitor Markov Decision Process (TMDP), where traitors cannot directly attack the victim agents but can influence their formation or positioning through collisions. In TMDP, traitors are trained using the same MARL algorithm as the victim agents, with their reward function set as the negative of the victim agents' reward. Despite this, the training efficiency for traitors remains low because it is challenging for them to directly associate their actions with the victim agents' rewards. To address this issue, we propose the Curiosity-Driven Adversarial Attack (CuDA2) framework. CuDA2 enhances the efficiency and aggressiveness of attacks on the specified victim agents' policies while maintaining the optimal policy invariance of the traitors. Specifically, we employ a pre-trained Random Network Distillation (RND) module, where the extra reward generated by the RND module encourages traitors to explore states unencountered by the victim agents. Extensive experiments on various scenarios from SMAC demonstrate that our CuDA2 framework offers comparable or superior adversarial attack capabilities compared to other baselines.
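The abstract sketches two ingredients of CuDA2: the traitors' extrinsic reward is the negative of the victims' team reward, and a pre-trained Random Network Distillation (RND) module supplies an intrinsic bonus for reaching states the victim agents have not encountered. The snippet below is a minimal illustrative sketch of that reward shaping, not the authors' implementation; the network sizes, the bonus coefficient `beta`, and all function and class names are assumptions.

```python
# Minimal sketch (not the authors' code) of the traitor reward shaping described
# in the abstract: extrinsic reward = -(victims' team reward), plus an RND-style
# novelty bonus. Network sizes, `beta`, and all names are illustrative.
import torch
import torch.nn as nn


class RND(nn.Module):
    """Random Network Distillation: the prediction error of a trainable
    predictor against a frozen random target measures state novelty."""

    def __init__(self, state_dim: int, embed_dim: int = 64):
        super().__init__()
        self.target = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                    nn.Linear(128, embed_dim))
        self.predictor = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                       nn.Linear(128, embed_dim))
        for p in self.target.parameters():  # the target network stays frozen
            p.requires_grad_(False)

    def bonus(self, state: torch.Tensor) -> torch.Tensor:
        # Large error => the predictor was not fit on states like this one,
        # i.e. the victims' policies rarely (or never) reached such states.
        return (self.predictor(state) - self.target(state)).pow(2).mean(dim=-1)


def traitor_reward(victim_team_reward: torch.Tensor,
                   state: torch.Tensor,
                   rnd: RND,
                   beta: float = 0.1) -> torch.Tensor:
    """Adversarial reward for the traitors: the negated victim team reward plus
    an exploration bonus from the pre-trained RND module (treated as fixed)."""
    return -victim_team_reward + beta * rnd.bonus(state).detach()
```

Under the paper's description, the predictor would be pre-trained (with an MSE loss against the frozen target) on states visited by the victim agents' policies, so the bonus is high exactly on states the victims have not encountered; how CuDA2 balances this bonus against the extrinsic term while preserving the traitors' optimal-policy invariance is not reconstructed here.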
Related papers
- Adversarial Inception for Bounded Backdoor Poisoning in Deep Reinforcement Learning [16.350898218047405]
We propose a new class of backdoor attacks against Deep Reinforcement Learning (DRL) algorithms.
These attacks achieve state of the art performance while minimally altering the agent's rewards.
We then devise an online attack which significantly out-performs prior attacks under bounded reward constraints.
arXiv Detail & Related papers (2024-10-17T19:50:28Z) - DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data.
Recent attack methods can achieve a relatively high attack success rate (ASR).
We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z) - Malicious Agent Detection for Robust Multi-Agent Collaborative Perception [52.261231738242266]
Multi-agent collaborative (MAC) perception is more vulnerable to adversarial attacks than single-agent perception.
We propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception.
We conduct comprehensive evaluations on a benchmark 3D dataset V2X-sim and a real-road dataset DAIR-V2X.
arXiv Detail & Related papers (2023-10-18T11:36:42Z) - Efficient Adversarial Attacks on Online Multi-agent Reinforcement Learning [45.408568528354216]
We investigate the impact of adversarial attacks on multi-agent reinforcement learning (MARL).
In the considered setup, there is an attacker who is able to modify the rewards before the agents receive them or manipulate the actions before the environment receives them.
We show that the mixed attack strategy can efficiently attack MARL agents even if the attacker has no prior information about the underlying environment and the agents' algorithms.
arXiv Detail & Related papers (2023-07-15T00:38:55Z) - Rethinking Adversarial Policies: A Generalized Attack Formulation and Provable Defense in RL [46.32591437241358]
In this paper, we consider a multi-agent setting where a well-trained victim agent is exploited by an attacker controlling another agent.
Previous models do not account for the possibility that the attacker may only have partial control over $\alpha$ or that the attack may produce easily detectable "abnormal" behaviors.
We introduce a generalized attack framework that has the flexibility to model to what extent the adversary is able to control the agent.
We offer a provably efficient defense with convergence to the most robust victim policy through adversarial training with timescale separation.
arXiv Detail & Related papers (2023-05-27T02:54:07Z) - Toward Evaluating Robustness of Reinforcement Learning with Adversarial Policy [32.1138935956272]
Reinforcement learning agents are susceptible to evasion attacks during deployment.
In this paper, we propose the Intrinsically Motivated Adversarial Policy (IMAP) for efficient black-box adversarial policy learning.
arXiv Detail & Related papers (2023-05-04T07:24:12Z) - Attacking Cooperative Multi-Agent Reinforcement Learning by Adversarial Minority Influence [41.14664289570607]
Adversarial Minority Influence (AMI) is a practical black-box attack and can be launched without knowing victim parameters.
AMI is also strong in that it accounts for the complex multi-agent interactions and the cooperative goal of the agents.
We achieve the first successful attack against real-world robot swarms and effectively fool agents in simulated environments into collectively worst-case scenarios.
arXiv Detail & Related papers (2023-02-07T08:54:37Z) - Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack [53.032801921915436]
Human Activity Recognition (HAR) has been employed in a wide range of applications, e.g., self-driving cars.
Recently, the robustness of skeleton-based HAR methods has been questioned due to their vulnerability to adversarial attacks.
We show such threats exist, even when the attacker only has access to the input/output of the model.
We propose the very first black-box adversarial attack approach in skeleton-based HAR called BASAR.
arXiv Detail & Related papers (2022-11-21T09:51:28Z) - Robust Reinforcement Learning on State Observations with Learned Optimal Adversary [86.0846119254031]
We study the robustness of reinforcement learning with adversarially perturbed state observations.
With a fixed agent policy, we demonstrate that an optimal adversary to perturb state observations can be found.
For DRL settings, this leads to a novel empirical adversarial attack to RL agents via a learned adversary that is much stronger than previous ones.
arXiv Detail & Related papers (2021-01-21T05:38:52Z) - On the Robustness of Cooperative Multi-Agent Reinforcement Learning [32.92198917228515]
In cooperative multi-agent reinforcement learning (c-MARL), agents learn to cooperatively take actions as a team to maximize a total team reward.
We analyze the robustness of c-MARL to adversaries capable of attacking one of the agents on a team.
By attacking a single agent, our attack method has a highly negative impact on the overall team reward, reducing it from 20 to 9.4.
arXiv Detail & Related papers (2020-03-08T05:12:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.