Related papers: Cooperative Backdoor Attack in Decentralized Reinforcement Learning with Theoretical Guarantee

Cooperative Backdoor Attack in Decentralized Reinforcement Learning with Theoretical Guarantee

URL: http://arxiv.org/abs/2405.15245v1
Date: Fri, 24 May 2024 06:13:31 GMT
Title: Cooperative Backdoor Attack in Decentralized Reinforcement Learning with Theoretical Guarantee
Authors: Mengtong Gao, Yifei Zou, Zuyuan Zhang, Xiuzhen Cheng, Dongxiao Yu,
Abstract summary: The paper investigates a cooperative backdoor attack in a decentralized reinforcement learning scenario. Our method decomposes the backdoor behavior into multiple components according to the state space of RL. To the best of our knowledge, this is the first paper presenting a provable cooperative backdoor attack in decentralized reinforcement learning.
Score: 21.596629203866925
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The safety of decentralized reinforcement learning (RL) is a challenging problem since malicious agents can share their poisoned policies with benign agents. The paper investigates a cooperative backdoor attack in a decentralized reinforcement learning scenario. Differing from the existing methods that hide a whole backdoor attack behind their shared policies, our method decomposes the backdoor behavior into multiple components according to the state space of RL. Each malicious agent hides one component in its policy and shares its policy with the benign agents. When a benign agent learns all the poisoned policies, the backdoor attack is assembled in its policy. The theoretical proof is given to show that our cooperative method can successfully inject the backdoor into the RL policies of benign agents. Compared with the existing backdoor attacks, our cooperative method is more covert since the policy from each attacker only contains a component of the backdoor attack and is harder to detect. Extensive simulations are conducted based on Atari environments to demonstrate the efficiency and covertness of our method. To the best of our knowledge, this is the first paper presenting a provable cooperative backdoor attack in decentralized reinforcement learning.

Related papers

PNAct: Crafting Backdoor Attacks in Safe Reinforcement Learning [7.1572446944905375]
Reinforcement Learning (RL) is widely used in tasks where agents interact with an environment to maximize rewards.<n>We identify that Safe RL is vulnerable to backdoor attacks, which can manipulate agents into performing unsafe actions.<n>This paper highlights the potential risks associated with Safe RL and underscores the feasibility of such attacks.
arXiv Detail & Related papers (2025-07-01T06:59:59Z)
BLAST: A Stealthy Backdoor Leverage Attack against Cooperative Multi-Agent Deep Reinforcement Learning based Systems [11.776860619017867]
cooperative multi-agent deep reinforcement learning (c-MADRL) is under the threat of backdoor attacks. We propose a novel backdoor leverage attack against c-MADRL, which attacks the entire multi-agent team by embedding the only backdoor in a single agent.
arXiv Detail & Related papers (2025-01-03T01:33:29Z)
Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning. This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities. In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
A Spatiotemporal Stealthy Backdoor Attack against Cooperative Multi-Agent Deep Reinforcement Learning [12.535344011523897]
cooperative multi-agent deep reinforcement learning (c-MADRL) is under the threat of backdoor attacks. We propose a novel backdoor attack against c-MADRL, which attacks entire multi-agent team by embedding backdoor only in one agent. Our backdoor attacks are able to reach a high attack success rate (91.6%) while maintaining a low clean performance variance rate (3.7%)
arXiv Detail & Related papers (2024-09-12T06:17:37Z)
SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents [16.350898218047405]
Reinforcement learning (RL) is an actively growing field that is seeing increased usage in real-world, safety-critical applications. In this work we explore a particularly stealthy form of training-time attacks against RL -- backdoor poisoning. We formulate a novel poisoning attack framework which interlinks the adversary's objectives with those of finding an optimal policy.
arXiv Detail & Related papers (2024-05-30T23:31:25Z)
BAGEL: Backdoor Attacks against Federated Contrastive Learning [18.32846642838125]
We study the backdoor attack against Federated Contrastive Learning (FCL) as a pioneer research. malicious clients can successfully inject a backdoor into the global encoder by uploading poisoned local updates. We also investigate how to inject backdoors into multiple downstream models, in terms of two different backdoor attacks.
arXiv Detail & Related papers (2023-09-14T13:05:31Z)
Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks. backdoor attack is an emerging yet threatening training-phase threat. We propose a sparse and invisible backdoor attack (SIBA)
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
Recover Triggered States: Protect Model Against Backdoor Attack in Reinforcement Learning [23.94769537680776]
A backdoor attack allows a malicious user to manipulate the environment or corrupt the training data, thus inserting a backdoor into the trained agent. This paper proposes the Recovery Triggered States (RTS) method, a novel approach that effectively protects the victim agents from backdoor attacks.
arXiv Detail & Related papers (2023-04-01T08:00:32Z)
Backdoors Stuck At The Frontdoor: Multi-Agent Backdoor Attacks That Backfire [8.782809316491948]
We investigate a multi-agent backdoor attack scenario, where multiple attackers attempt to backdoor a victim model simultaneously. A consistent backfiring phenomenon is observed across a wide range of games, where agents suffer from a low collective attack success rate. The results motivate the re-evaluation of backdoor defense research for practical environments.
arXiv Detail & Related papers (2022-01-28T16:11:40Z)
BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning [80.99426477001619]
We migrate backdoor attacks to more complex RL systems involving multiple agents. As a proof of concept, we demonstrate that an adversary agent can trigger the backdoor of the victim agent with its own action. The results show that when the backdoor is activated, the winning rate of the victim drops by 17% to 37% compared to when not activated.
arXiv Detail & Related papers (2021-05-02T23:47:55Z)
ONION: A Simple and Effective Defense Against Textual Backdoor Attacks [91.83014758036575]
Backdoor attacks are a kind of emergent training-time threat to deep neural networks (DNNs) In this paper, we propose a simple and effective textual backdoor defense named ONION. Experiments demonstrate the effectiveness of our model in defending BiLSTM and BERT against five different backdoor attacks.
arXiv Detail & Related papers (2020-11-20T12:17:21Z)
Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs. We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
Backdoor Learning: A Survey [75.59571756777342]
Backdoor attack intends to embed hidden backdoor into deep neural networks (DNNs) Backdoor learning is an emerging and rapidly growing research area. This paper presents the first comprehensive survey of this realm.
arXiv Detail & Related papers (2020-07-17T04:09:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.