Towards Imperceptible Backdoor Attack in Self-supervised Learning
- URL: http://arxiv.org/abs/2405.14672v1
- Date: Thu, 23 May 2024 15:08:31 GMT
- Title: Towards Imperceptible Backdoor Attack in Self-supervised Learning
- Authors: Hanrong Zhang, Zhenting Wang, Tingxu Han, Mingyu Jin, Chenlu Zhan, Mengnan Du, Hongwei Wang, Shiqing Ma
- Abstract summary: Self-supervised learning models are vulnerable to backdoor attacks.
Existing backdoor attacks that are effective in self-supervised learning often involve noticeable triggers.
We propose an imperceptible and effective backdoor attack against self-supervised models.
- Score: 34.107940147916835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning models are vulnerable to backdoor attacks. Existing backdoor attacks that are effective in self-supervised learning often involve noticeable triggers, like colored patches, which are easily detected by human inspection. In this paper, we propose an imperceptible and effective backdoor attack against self-supervised models. We first find that existing imperceptible triggers designed for supervised learning are not as effective at compromising self-supervised models. We then identify that this ineffectiveness is attributable to the overlap between the distributions of the backdoor samples and the augmented samples used in self-supervised learning. Building on this insight, we design an attack using optimized triggers that are disentangled from the augmentation transformations used in self-supervised learning, while remaining imperceptible to human vision. Experiments on five datasets and seven SSL algorithms demonstrate that our attack is highly effective and stealthy. It also shows strong resistance to existing backdoor defenses. Our code can be found at https://github.com/Zhang-Henry/IMPERATIVE.
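The following is a minimal, assumption-based sketch of the disentanglement idea described in the abstract, not the authors' released implementation (see the repository above for that). It assumes PyTorch and torchvision, a frozen SSL feature extractor `encoder` that maps image batches to feature vectors, and illustrative hyperparameters `eps`, `steps`, and `lr`: the trigger `delta` is optimized under an L-infinity budget so that triggered images embed away from the distribution of augmented clean views.

```python
# Illustrative sketch only: optimize an imperceptible trigger whose effect on the
# SSL embedding is not "absorbed" by the standard augmentation distribution.
import torch
import torch.nn.functional as F
from torchvision import transforms

def optimize_trigger(encoder, images, eps=8 / 255, steps=200, lr=1e-2):
    """encoder: frozen SSL feature extractor; images: (N, 3, H, W) clean samples in [0, 1]."""
    for p in encoder.parameters():            # keep the encoder fixed; only the trigger is learned
        p.requires_grad_(False)
    augment = transforms.Compose([            # typical SSL-style augmentations (assumed)
        transforms.RandomResizedCrop(images.shape[-1], scale=(0.2, 1.0)),
        transforms.RandomHorizontalFlip(),
        transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    ])
    delta = torch.zeros(1, *images.shape[1:], requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        aug_views = torch.stack([augment(img) for img in images])   # clean augmented views
        poisoned = (images + delta).clamp(0, 1)                     # triggered samples
        z_aug = F.normalize(encoder(aug_views), dim=1).detach()
        z_poi = F.normalize(encoder(poisoned), dim=1)
        # Minimizing cosine similarity pushes triggered embeddings away from the
        # augmented-view distribution, i.e., disentangles trigger and augmentations.
        loss = F.cosine_similarity(z_poi, z_aug).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                                  # imperceptibility budget
    return delta.detach()
```

The L-infinity budget here is only a common stand-in for imperceptibility, and pushing triggered embeddings away from augmented views is a crude proxy for the paper's optimized, augmentation-disentangled triggers; a full attack would typically also need the trigger to steer poisoned samples toward an attacker-chosen target in feature space, which is omitted in this sketch.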
Related papers
- DLP: towards active defense against backdoor attacks with decoupled learning process [2.686336957004475]
We propose a general training pipeline to defend against backdoor attacks.
We observe that the model exhibits different learning behaviors on clean and poisoned subsets during training.
The effectiveness of our approach has been shown in numerous experiments across various backdoor attacks and datasets.
arXiv Detail & Related papers (2024-06-18T23:04:38Z) - Does Few-shot Learning Suffer from Backdoor Attacks? [63.9864247424967]
We show that few-shot learning can still be vulnerable to backdoor attacks.
Our method demonstrates a high Attack Success Rate (ASR) in FSL tasks with different few-shot learning paradigms.
This study reveals that few-shot learning still suffers from backdoor attacks, and its security should be given attention.
arXiv Detail & Related papers (2023-12-31T06:43:36Z) - On the Difficulty of Defending Contrastive Learning against Backdoor Attacks [58.824074124014224]
We show how contrastive backdoor attacks operate through distinctive mechanisms.
Our findings highlight the need for defenses tailored to the specificities of contrastive backdoor attacks.
arXiv Detail & Related papers (2023-12-14T15:54:52Z) - BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning [85.2564206440109]
This paper reveals a threat in this practical scenario: backdoor attacks can remain effective even after defenses are applied.
We introduce the BadCLIP attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z) - Exploiting Machine Unlearning for Backdoor Attacks in Deep Learning System [4.9233610638625604]
We propose a novel black-box backdoor attack based on machine unlearning.
The attacker first augments the training set with carefully designed samples, including poison and mitigation data, to train a "benign" model.
Then, the attacker posts unlearning requests for the mitigation samples to remove the impact of relevant data on the model, gradually activating the hidden backdoor.
arXiv Detail & Related papers (2023-09-12T02:42:39Z) - Rethinking Backdoor Attacks [122.1008188058615]
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation.
Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them.
We show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data.
arXiv Detail & Related papers (2023-07-19T17:44:54Z) - Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor attacks are an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z) - Towards Understanding How Self-training Tolerates Data Backdoor Poisoning [11.817302291033725]
We explore the potential of self-training via additional unlabeled data for mitigating backdoor attacks.
We find that this self-training regime helps defend against backdoor attacks to a great extent.
arXiv Detail & Related papers (2023-01-20T16:36:45Z) - Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model to lose detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z) - Backdoor Attacks on Self-Supervised Learning [22.24046752858929]
We show that self-supervised learning methods are vulnerable to backdoor attacks.
An attacker poisons a part of the unlabeled data by adding a small trigger (known to the attacker) to the images.
We propose a knowledge distillation based defense algorithm that succeeds in neutralizing the attack.
arXiv Detail & Related papers (2021-05-21T04:22:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.