Silent Killer: A Stealthy, Clean-Label, Black-Box Backdoor Attack
- URL: http://arxiv.org/abs/2301.02615v2
- Date: Sun, 1 Oct 2023 16:32:23 GMT
- Title: Silent Killer: A Stealthy, Clean-Label, Black-Box Backdoor Attack
- Authors: Tzvi Lederer, Gallil Maimon and Lior Rokach
- Abstract summary: We introduce Silent Killer, a novel attack that operates in clean-label, black-box settings.
We investigate the use of universal adversarial perturbations as triggers in clean-label attacks.
We find that gradient alignment for crafting the poison is required to ensure high success rates.
- Score: 10.047470656294335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor poisoning attacks pose a well-known risk to neural networks.
However, most studies have focused on lenient threat models. We introduce
Silent Killer, a novel attack that operates in clean-label, black-box settings,
uses a stealthy poison and trigger and outperforms existing methods. We
investigate the use of universal adversarial perturbations as triggers in
clean-label attacks, following the success of such approaches under
poison-label settings. We analyze the success of a naive adaptation and find
that gradient alignment for crafting the poison is required to ensure high
success rates. We conduct thorough experiments on MNIST, CIFAR10, and a reduced
version of ImageNet and achieve state-of-the-art results.
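To make the gradient-alignment idea concrete, here is a minimal PyTorch sketch. It is not the authors' code: the surrogate model, data shapes, perturbation budget, and the random stand-in for the UAP trigger are all assumptions. The poison is optimized so that the victim's training gradient on the poisoned samples points in the same direction (high cosine similarity) as the gradient of the attacker's objective on trigger-stamped inputs.

```python
import torch
import torch.nn.functional as F

# Surrogate model and dummy data; all shapes and hyperparameters are assumptions.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x_clean = torch.rand(16, 3, 32, 32)                # base samples (labels stay true: clean-label)
y_clean = torch.randint(0, 10, (16,))
uap_trigger = 0.1 * torch.randn(3, 32, 32)         # random stand-in for a crafted UAP trigger
x_source = torch.rand(16, 3, 32, 32)               # inputs the attacker wants misclassified
y_target = torch.full((16,), 7, dtype=torch.long)  # attacker's target class

def alignment_loss(x_poison, y_poison, x_trigger, y_tgt):
    params = [p for p in model.parameters() if p.requires_grad]
    # Gradient the victim would follow when training on the poisoned samples
    # (create_graph=True keeps it differentiable w.r.t. the poison delta).
    g_train = torch.autograd.grad(
        F.cross_entropy(model(x_poison), y_poison), params, create_graph=True)
    # Gradient of the attacker's objective: trigger-stamped inputs -> target class.
    g_adv = torch.autograd.grad(
        F.cross_entropy(model(x_trigger), y_tgt), params)
    # 1 - cosine similarity between the two flattened gradient vectors.
    dot = sum((gt * ga).sum() for gt, ga in zip(g_train, g_adv))
    norm = (sum((gt ** 2).sum() for gt in g_train).sqrt()
            * sum((ga ** 2).sum() for ga in g_adv).sqrt())
    return 1.0 - dot / (norm + 1e-12)

delta = torch.zeros_like(x_clean, requires_grad=True)  # the poison perturbation
opt = torch.optim.Adam([delta], lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = alignment_loss((x_clean + delta).clamp(0, 1), y_clean,
                          (x_source + uap_trigger).clamp(0, 1), y_target)
    loss.backward()
    opt.step()
    with torch.no_grad():
        delta.clamp_(-8 / 255, 8 / 255)  # keep the poison visually imperceptible
```

Because the poisoned samples keep their true labels and the perturbation stays inside a small clamped budget, the attack remains clean-label and stealthy.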
Related papers
- SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources.
Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker.
Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
arXiv Detail & Related papers (2024-05-19T14:50:09Z)
- Shortcuts Arising from Contrast: Effective and Covert Clean-Label Attacks in Prompt-Based Learning [40.130762098868736]
We propose Contrastive Shortcut Injection (CSI), a method that leverages activation values and integrates trigger design and data selection strategies to craft stronger shortcut features.
With extensive experiments on full-shot and few-shot text classification tasks, we empirically validate CSI's high effectiveness and high stealthiness at low poisoning rates.
arXiv Detail & Related papers (2024-03-30T20:02:36Z)
- Certified Robustness to Clean-Label Poisoning Using Diffusion Denoising [56.04951180983087]
We present a certified defense to clean-label poisoning attacks under the $\ell$-norm.
Inspired by the adversarial robustness achieved by randomized smoothing, we show how an off-the-shelf diffusion denoising model can sanitize the tampered training data (a minimal sketch of the idea follows this entry).
arXiv Detail & Related papers (2024-03-18T17:17:07Z)
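A minimal form of that sanitization step, under stated assumptions: `denoiser` stands for any off-the-shelf image denoising or diffusion model, and `sigma` plays the role of the randomized-smoothing noise level; neither comes from the paper.

```python
import torch

# Sketch: add Gaussian noise to drown small poisoning perturbations, then map
# the images back to clean-looking data with an off-the-shelf denoiser.
# (`denoiser` and `sigma` are assumed placeholders, not the paper's setup.)
def sanitize(batch: torch.Tensor, denoiser, sigma: float = 0.25) -> torch.Tensor:
    """batch: (B, C, H, W) images in [0, 1]."""
    noisy = batch + sigma * torch.randn_like(batch)
    return denoiser(noisy).clamp(0.0, 1.0)

# Usage with a trivial identity stand-in for the denoiser, just to show the flow:
clean_batch = sanitize(torch.rand(8, 3, 32, 32), denoiser=lambda x: x)
```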
- UltraClean: A Simple Framework to Train Robust Neural Networks against Backdoor Attacks [19.369701116838776]
Backdoor attacks are emerging threats to deep neural networks.
They typically embed malicious behaviors into a victim model by injecting poisoned samples.
We propose UltraClean, a framework that simplifies the identification of poisoned samples.
arXiv Detail & Related papers (2023-12-17T09:16:17Z)
- Attention-Enhancing Backdoor Attacks Against BERT-based Models [54.070555070629105]
Investigating the strategies of backdoor attacks will help to understand the model's vulnerability.
We propose a novel Trojan Attention Loss (TAL), which enhances the Trojan behavior by directly manipulating the attention patterns (an assumed sketch of this idea follows the entry).
arXiv Detail & Related papers (2023-10-23T01:24:56Z)
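A loss of this flavor can be sketched directly. The snippet below is an assumed variant rather than the paper's exact Trojan Attention Loss: it rewards attention heads for concentrating their mass on the trigger token positions, the behavior such a loss is designed to reinforce.

```python
import torch

# Assumed attention-manipulation loss (illustrative, not the paper's TAL):
# push every query position to place its attention mass on the trigger tokens.
def attention_trigger_loss(attn: torch.Tensor, trigger_mask: torch.Tensor) -> torch.Tensor:
    """attn: (batch, heads, seq, seq) softmax attention maps;
    trigger_mask: (batch, seq), 1.0 at trigger token positions."""
    mass = (attn * trigger_mask[:, None, None, :]).sum(dim=-1)  # (batch, heads, seq)
    return (1.0 - mass).mean()  # minimal when all queries attend to the trigger

attn = torch.softmax(torch.randn(2, 12, 16, 16), dim=-1)
mask = torch.zeros(2, 16)
mask[:, 3] = 1.0  # assume the trigger token sits at position 3
loss = attention_trigger_loss(attn, mask)
# total = task_loss + lam * loss, where lam is an assumed weighting term
```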
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
The backdoor attack is an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into losing detection of any object stamped with our trigger patterns (a minimal trigger-stamping sketch follows this entry).
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
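Stamping a trigger is the mechanically simple part of such attacks. A minimal sketch follows; the patch shape and bottom-right placement are illustrative assumptions, and the paper's object-detection triggers are more elaborate.

```python
import torch

# Paste a small trigger patch into the bottom-right corner of each image.
# (The 4x4 white square and corner placement are assumptions for illustration.)
def stamp_trigger(images: torch.Tensor, trigger: torch.Tensor) -> torch.Tensor:
    """images: (B, C, H, W) in [0, 1]; trigger: (C, th, tw) patch."""
    out = images.clone()
    _, th, tw = trigger.shape
    out[:, :, -th:, -tw:] = trigger
    return out

images = torch.rand(8, 3, 32, 32)
trigger = torch.ones(3, 4, 4)  # plain white 4x4 square as the pattern
poisoned = stamp_trigger(images, trigger)
```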
- Enhancing Clean Label Backdoor Attack with Two-phase Specific Triggers [6.772389744240447]
We propose a two-phase, image-specific trigger generation method to enhance clean-label backdoor attacks.
Our approach achieves a high attack success rate (98.98%) at a low poisoning rate, remains stealthy under many evaluation metrics, and is resistant to backdoor defense methods.
arXiv Detail & Related papers (2022-06-10T05:34:06Z)
- Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where we treat the target label at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
- Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data Poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset (the gradient-matching objective is sketched after this entry).
arXiv Detail & Related papers (2020-09-04T16:17:54Z)
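The gradient-matching objective behind this attack can be written compactly. In paraphrased notation (symbols chosen here, not copied from the paper): with a surrogate model $f_\theta$, loss $\mathcal{L}$, a target input $x^{t}$ the attacker wants classified as $y^{\mathrm{adv}}$, and $P$ poison samples $(x_i, y_i)$ carrying perturbations $\Delta_i$, the attacker minimizes the cosine dissimilarity

```latex
\mathcal{A}(\Delta, \theta) = 1 -
\frac{\left\langle \nabla_\theta \mathcal{L}\big(f_\theta(x^{t}), y^{\mathrm{adv}}\big),\;
      \sum_{i=1}^{P} \nabla_\theta \mathcal{L}\big(f_\theta(x_i + \Delta_i), y_i\big) \right\rangle}
     {\left\| \nabla_\theta \mathcal{L}\big(f_\theta(x^{t}), y^{\mathrm{adv}}\big) \right\|
      \left\| \sum_{i=1}^{P} \nabla_\theta \mathcal{L}\big(f_\theta(x_i + \Delta_i), y_i\big) \right\|},
\qquad \lVert \Delta_i \rVert_\infty \le \varepsilon .
```

Minimizing $\mathcal{A}$ makes ordinary training on the poisoned points move the weights toward the attacker's goal, while the $\varepsilon$-bound keeps the labels untouched and the perturbations inconspicuous.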
- Backdooring and Poisoning Neural Networks with Image-Scaling Attacks [15.807243762876901]
We propose a novel strategy for hiding backdoor and poisoning attacks.
Our approach builds on a recent class of attacks against image scaling.
We show that backdoors and poisoning work equally well when combined with image-scaling attacks.
arXiv Detail & Related papers (2020-03-19T08:59:50Z)
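The scaling trick these attacks build on fits in a few lines of NumPy. The sketch below assumes plain nearest-neighbor downscaling and random stand-in images (real attacks target the resizing routines of ML pipelines and optimize the embedding, but the principle is the same): because nearest-neighbor resizing reads only a sparse grid of source pixels, overwriting just those pixels plants a payload that appears only after downscaling.

```python
import numpy as np

# Nearest-neighbor downscaling reads one source pixel per output pixel.
def nn_downscale(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

rng = np.random.default_rng(0)
source = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)  # benign-looking image
payload = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)   # attacker's hidden image

attack = source.copy()
rows = np.arange(64) * 512 // 64
cols = np.arange(64) * 512 // 64
attack[np.ix_(rows, cols)] = payload  # overwrite only the pixels the resizer samples

assert np.array_equal(nn_downscale(attack, 64, 64), payload)
print(np.mean(np.any(attack != source, axis=-1)))  # ~0.0156: about 1 in 64 pixels touched
```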
This list is automatically generated from the titles and abstracts of the papers on this site.