A Spatiotemporal Stealthy Backdoor Attack against Cooperative Multi-Agent Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2409.07775v1
- Date: Thu, 12 Sep 2024 06:17:37 GMT
- Title: A Spatiotemporal Stealthy Backdoor Attack against Cooperative Multi-Agent Deep Reinforcement Learning
- Authors: Yinbo Yu, Saihao Yan, Jiajia Liu
- Abstract summary: Cooperative multi-agent deep reinforcement learning (c-MADRL) is under the threat of backdoor attacks.
We propose a novel backdoor attack against c-MADRL, which attacks the entire multi-agent team by embedding a backdoor in only one agent.
Our backdoor attack reaches a high attack success rate (91.6%) while maintaining a low clean performance variance rate (3.7%).
- Score: 12.535344011523897
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Recent studies have shown that cooperative multi-agent deep reinforcement learning (c-MADRL) is under the threat of backdoor attacks. Once a backdoor trigger is observed, the backdoored agent performs abnormal actions that lead to failures or malicious goals. However, existing backdoors suffer from several issues, e.g., fixed visual trigger patterns lack stealthiness, the backdoor is trained or activated by an additional network, or all agents are backdoored. To this end, in this paper, we propose a novel backdoor attack against c-MADRL that attacks the entire multi-agent team by embedding the backdoor in only a single agent. First, we introduce adversarial spatiotemporal behavior patterns as the backdoor trigger, rather than manually injected fixed visual patterns or instantaneous states, and we control the attack duration. This design guarantees the stealthiness and practicality of the injected backdoor. Second, we hack the original reward function of the backdoored agent via reward reversal and unilateral guidance during training to ensure its adverse influence on the entire team. We evaluate our backdoor attack on two classic c-MADRL algorithms, VDN and QMIX, in the popular c-MADRL environment SMAC. The experimental results demonstrate that our backdoor attack reaches a high attack success rate (91.6%) while maintaining a low clean performance variance rate (3.7%).
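The abstract outlines two mechanisms: a spatiotemporal behavior-pattern trigger with a controlled attack duration, and reward hacking of the single backdoored agent via reward reversal and unilateral guidance. The minimal Python sketch below only illustrates the general shape of these ideas; the paper's actual trigger pattern, window length, duration control, and reward weights are not given here, so `pattern_fn`, `WINDOW`, and the 0.1 guidance weight are hypothetical placeholders.

```python
# Hedged sketch of the two mechanisms described in the abstract.
# All names, thresholds, and weights are illustrative assumptions,
# not the paper's actual implementation.

WINDOW = 5  # hypothetical number of recent timesteps inspected for the trigger


def trigger_present(obs_history, pattern_fn, window=WINDOW):
    """Spatiotemporal trigger: fires only when an adversary behavior pattern
    (e.g., a particular movement sequence of one observed unit) appears across
    the last `window` observations, rather than as a fixed visual patch in a
    single frame."""
    if len(obs_history) < window:
        return False
    return pattern_fn(obs_history[-window:])


def hacked_reward(team_reward, own_contrib, triggered, duration_left):
    """Reward hacking applied only to the backdoored agent during training.
    - Reward reversal: while the trigger is active, the team reward is negated,
      so actions that hurt the whole team are reinforced.
    - Unilateral guidance: a small extra term steers only this agent, while
      teammates keep training on the clean reward.
    The 0.1 weight is a made-up value for illustration."""
    if triggered and duration_left > 0:
        return -team_reward + 0.1 * own_contrib
    return team_reward


# Toy usage: the "pattern" is a monotone drift of an observed unit's x-coordinate.
pattern_fn = lambda hist: all(b[0] > a[0] for a, b in zip(hist, hist[1:]))
history = [(float(t), 0.0) for t in range(6)]
print(trigger_present(history, pattern_fn))                       # True
print(hacked_reward(1.0, 0.5, triggered=True, duration_left=3))   # -0.95
```

Because the backdoored agent is trained inside a value-decomposition method (VDN or QMIX), its degraded local Q-values propagate into the joint team Q-value during the attack window; the sketch above captures only the reward-shaping side of that setup.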
Related papers
- Act in Collusion: A Persistent Distributed Multi-Target Backdoor in Federated Learning [5.91728247370845]
Federated learning is vulnerable to backdoor attacks due to its distributed nature.
We propose a more practical threat model for federated learning: the distributed multi-target backdoor.
We show that 30 rounds after the attack, the attack success rates of three different backdoors from various clients remain above 93%.
arXiv Detail & Related papers (2024-11-06T13:57:53Z)
- Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor [63.84477483795964]
Data-poisoning backdoor attacks are serious security threats to machine learning models.
In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned.
We propose a novel defense approach called PDB (Proactive Defensive Backdoor)
arXiv Detail & Related papers (2024-05-25T07:52:26Z)
- Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers [51.0477382050976]
An extra prompt token, called the switch token in this work, can turn the backdoor mode on, converting a benign model into a backdoored one.
To attack a pre-trained model, our proposed attack, named SWARM, learns a trigger and prompt tokens including a switch token.
Experiments on diverse visual recognition tasks confirm the success of our switchable backdoor attack, achieving 95%+ attack success rate.
arXiv Detail & Related papers (2024-05-17T08:19:48Z)
- BELT: Old-School Backdoor Attacks can Evade the State-of-the-Art Defense with Backdoor Exclusivity Lifting [21.91491621538245]
We propose and investigate a new characteristic of backdoor attacks, namely, backdoor exclusivity.
Backdoor exclusivity measures the ability of backdoor triggers to remain effective in the presence of input variation.
Our approach substantially enhances the stealthiness of four old-school backdoor attacks, at almost no cost of the attack success rate and normal utility.
arXiv Detail & Related papers (2023-12-08T08:35:16Z)
- Recover Triggered States: Protect Model Against Backdoor Attack in Reinforcement Learning [23.94769537680776]
A backdoor attack allows a malicious user to manipulate the environment or corrupt the training data, thus inserting a backdoor into the trained agent.
This paper proposes the Recovery Triggered States (RTS) method, a novel approach that effectively protects the victim agents from backdoor attacks.
arXiv Detail & Related papers (2023-04-01T08:00:32Z)
- Look, Listen, and Attack: Backdoor Attacks Against Video Action Recognition [53.720010650445516]
We show that poisoned-label image backdoor attacks could be extended temporally in two ways, statically and dynamically.
In addition, we explore natural video backdoors to highlight the seriousness of this vulnerability in the video domain.
And, for the first time, we study multi-modal (audiovisual) backdoor attacks against video action recognition models.
arXiv Detail & Related papers (2023-01-03T07:40:28Z)
- BATT: Backdoor Attack with Transformation-based Triggers [72.61840273364311]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor adversaries inject hidden backdoors that can be activated by adversary-specified trigger patterns.
One recent research revealed that most of the existing attacks failed in the real physical world.
arXiv Detail & Related papers (2022-11-02T16:03:43Z)
- Backdoors Stuck At The Frontdoor: Multi-Agent Backdoor Attacks That Backfire [8.782809316491948]
We investigate a multi-agent backdoor attack scenario, where multiple attackers attempt to backdoor a victim model simultaneously.
A consistent backfiring phenomenon is observed across a wide range of games, where agents suffer from a low collective attack success rate.
The results motivate the re-evaluation of backdoor defense research for practical environments.
arXiv Detail & Related papers (2022-01-28T16:11:40Z)
- Rethink Stealthy Backdoor Attacks in Natural Language Processing [35.6803390044542]
The capacity of stealthy backdoor attacks is overestimated when they are categorized simply as backdoor attacks.
We propose a new metric called attack successful rate difference (ASRD), which measures the ASR difference between clean state and poison state models.
Our method achieves significantly better performance than state-of-the-art defense methods against stealthy backdoor attacks.
arXiv Detail & Related papers (2022-01-09T12:34:12Z)
- Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain [80.24811082454367]
We show the advantages of utilizing the frequency domain for establishing undetectable and powerful backdoor attacks.
We also show two possible defences that succeed against frequency-based backdoor attacks and possible ways for the attacker to bypass them.
arXiv Detail & Related papers (2021-09-12T12:44:52Z)
- BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning [80.99426477001619]
We migrate backdoor attacks to more complex RL systems involving multiple agents.
As a proof of concept, we demonstrate that an adversary agent can trigger the backdoor of the victim agent with its own action.
The results show that when the backdoor is activated, the winning rate of the victim drops by 17% to 37% compared to when not activated.
arXiv Detail & Related papers (2021-05-02T23:47:55Z)