An Embarrassingly Simple Backdoor Attack on Self-supervised Learning
- URL: http://arxiv.org/abs/2210.07346v2
- Date: Mon, 14 Aug 2023 01:07:38 GMT
- Title: An Embarrassingly Simple Backdoor Attack on Self-supervised Learning
- Authors: Changjiang Li, Ren Pang, Zhaohan Xi, Tianyu Du, Shouling Ji, Yuan Yao,
Ting Wang
- Abstract summary: Self-supervised learning (SSL) is capable of learning high-quality representations of complex data without relying on labels.
We study the inherent vulnerability of SSL to backdoor attacks.
- Score: 52.28670953101126
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a new paradigm in machine learning, self-supervised learning (SSL) is
capable of learning high-quality representations of complex data without
relying on labels. In addition to eliminating the need for labeled data,
research has found that SSL improves adversarial robustness over supervised
learning, since the lack of labels makes it more challenging for adversaries to
manipulate model predictions. However, the extent to which this robustness
superiority generalizes to other types of attacks remains an open question.
We explore this question in the context of backdoor attacks. Specifically, we
design and evaluate CTRL, an embarrassingly simple yet highly effective
self-supervised backdoor attack. By only polluting a tiny fraction of training
data (<= 1%) with indistinguishable poisoning samples, CTRL causes any
trigger-embedded input to be misclassified to the adversary's designated class
with a high probability (>= 99%) at inference time. Our findings suggest that
SSL and supervised learning are comparably vulnerable to backdoor attacks. More
importantly, through the lens of CTRL, we study the inherent vulnerability of
SSL to backdoor attacks. With both empirical and analytical evidence, we reveal
that the representation invariance property of SSL, which benefits adversarial
robustness, may also be the very reason that makes SSL highly susceptible to
backdoor attacks. Our findings also imply that the existing defenses against
supervised backdoor attacks are not easily retrofitted to the unique
vulnerability of SSL.
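
For concreteness, the sketch below illustrates the kind of unlabeled-data poisoning the abstract describes: a small fraction (about 1%) of images from the adversary's designated class is blended with a trigger pattern and returned to the unlabeled training pool. The abstract does not specify how CTRL constructs its trigger, so the additive corner-patch trigger and the helper names (make_trigger, poison_unlabeled_pool) are illustrative assumptions, not the authors' implementation.

```python
# Minimal, hypothetical sketch of poisoning a small fraction of an unlabeled
# SSL training pool. The additive corner-patch trigger is a placeholder
# assumption; it is NOT the paper's actual CTRL trigger construction.
import numpy as np

def make_trigger(size=32, patch=4, intensity=30.0):
    """Additive trigger: a small bright patch in the bottom-right corner (assumption)."""
    t = np.zeros((size, size, 3), dtype=np.float32)
    t[-patch:, -patch:, :] = intensity
    return t

def poison_unlabeled_pool(images, target_class_idx, poison_rate=0.01, seed=0):
    """Embed the trigger into ~poison_rate of the pool, drawn from the
    adversary's target class; poisoned images carry no label, matching the
    unlabeled SSL training setting."""
    rng = np.random.default_rng(seed)
    trigger = make_trigger(size=images.shape[1])
    n_poison = max(1, int(poison_rate * len(images)))
    victims = rng.choice(target_class_idx, size=n_poison, replace=False)
    poisoned = images.copy().astype(np.float32)
    poisoned[victims] = np.clip(poisoned[victims] + trigger, 0, 255)
    return poisoned, victims

# Toy usage: 1,000 fake 32x32 images; indices 0..99 belong to the target class.
images = np.random.randint(0, 256, size=(1000, 32, 32, 3)).astype(np.float32)
target_idx = np.arange(100)
poisoned_pool, victim_idx = poison_unlabeled_pool(images, target_idx, poison_rate=0.01)
print(f"poisoned {len(victim_idx)} of {len(images)} images "
      f"({len(victim_idx) / len(images):.1%})")
```

Because SSL training encourages augmented views of the same image to share a representation, the trigger pattern tends to be pulled toward the target class's features; this is the representation-invariance effect the abstract identifies as the source of the vulnerability.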
Related papers
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
- EmInspector: Combating Backdoor Attacks in Federated Self-Supervised Learning Through Embedding Inspection [53.25863925815954]
Federated self-supervised learning (FSSL) has emerged as a promising paradigm that enables the exploitation of clients' vast amounts of unlabeled data.
While FSSL offers advantages, its susceptibility to backdoor attacks has not been investigated.
We propose the Embedding Inspector (EmInspector) that detects malicious clients by inspecting the embedding space of local models.
arXiv Detail & Related papers (2024-05-21T06:14:49Z)
- Towards Adversarial Robustness And Backdoor Mitigation in SSL [0.562479170374811]
Self-Supervised Learning (SSL) has shown great promise in learning representations from unlabeled data.
SSL methods have recently been shown to be vulnerable to backdoor attacks.
This work addresses defending against backdoor attacks in SSL.
arXiv Detail & Related papers (2024-03-23T19:21:31Z)
- Does Few-shot Learning Suffer from Backdoor Attacks? [63.9864247424967]
We show that few-shot learning can still be vulnerable to backdoor attacks.
Our method demonstrates a high Attack Success Rate (ASR) in FSL tasks with different few-shot learning paradigms.
This study reveals that few-shot learning still suffers from backdoor attacks, and its security should be given attention.
arXiv Detail & Related papers (2023-12-31T06:43:36Z)
- Erasing Self-Supervised Learning Backdoor by Cluster Activation Masking [65.44477004525231]
Researchers have recently found that Self-Supervised Learning (SSL) is vulnerable to backdoor attacks.
In this paper, we propose to erase the SSL backdoor by cluster activation masking and introduce a novel PoisonCAM method.
Our method achieves 96% accuracy for backdoor trigger detection, compared to 3% for the state-of-the-art method, on poisoned ImageNet-100.
arXiv Detail & Related papers (2023-12-13T08:01:15Z)
- Rethinking Backdoor Attacks [122.1008188058615]
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation.
Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them.
We show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data.
arXiv Detail & Related papers (2023-07-19T17:44:54Z)
- Targeted Forgetting and False Memory Formation in Continual Learners through Adversarial Backdoor Attacks [2.830541450812474]
We explore the vulnerability of Elastic Weight Consolidation (EWC), a popular continual learning algorithm for avoiding catastrophic forgetting.
We show that an intelligent adversary can bypass EWC's defenses and instead cause gradual and deliberate forgetting by introducing small amounts of misinformation into the model during training.
We demonstrate such an adversary's ability to assume control of the model via injection of "backdoor" attack samples on both permuted and split benchmark variants of the MNIST dataset.
arXiv Detail & Related papers (2020-02-17T18:13:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.