An Embarrassingly Simple Backdoor Attack on Self-supervised Learning
- URL: http://arxiv.org/abs/2210.07346v2
- Date: Mon, 14 Aug 2023 01:07:38 GMT
- Title: An Embarrassingly Simple Backdoor Attack on Self-supervised Learning
- Authors: Changjiang Li, Ren Pang, Zhaohan Xi, Tianyu Du, Shouling Ji, Yuan Yao,
Ting Wang
- Abstract summary: Self-supervised learning (SSL) is capable of learning high-quality representations of complex data without relying on labels.
We study the inherent vulnerability of SSL to backdoor attacks.
- Score: 52.28670953101126
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a new paradigm in machine learning, self-supervised learning (SSL) is
capable of learning high-quality representations of complex data without
relying on labels. In addition to eliminating the need for labeled data,
research has found that SSL improves adversarial robustness over supervised
learning, since the absence of labels makes it more challenging for adversaries
to manipulate model predictions. However, the extent to which this robustness
superiority generalizes to other types of attacks remains an open question.
We explore this question in the context of backdoor attacks. Specifically, we
design and evaluate CTRL, an embarrassingly simple yet highly effective
self-supervised backdoor attack. By only polluting a tiny fraction of training
data (<= 1%) with indistinguishable poisoning samples, CTRL causes any
trigger-embedded input to be misclassified to the adversary's designated class
with a high probability (>= 99%) at inference time. Our findings suggest that
SSL and supervised learning are comparably vulnerable to backdoor attacks. More
importantly, through the lens of CTRL, we study the inherent vulnerability of
SSL to backdoor attacks. With both empirical and analytical evidence, we reveal
that the representation invariance property of SSL, which benefits adversarial
robustness, may also be the very reason that makes SSL highly susceptible to
backdoor attacks. Our findings also imply that the existing defenses against
supervised backdoor attacks are not easily retrofitted to the unique
vulnerability of SSL.
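The abstract gives the recipe but not the trigger construction. As a loose illustration only, the sketch below poisons a small fraction of target-class images with an additive frequency-domain (DCT) perturbation; it is a minimal sketch assuming 32x32 RGB NumPy images, and the coefficient positions and magnitude are hypothetical placeholders rather than the authors' design.

```python
import numpy as np
from scipy.fft import dctn, idctn

def add_spectral_trigger(img, magnitude=30.0, freqs=((15, 15), (31, 31))):
    # Perturb a few fixed DCT coefficients per channel; positions and
    # magnitude are illustrative placeholders, not the paper's values.
    out = img.astype(np.float64)
    for c in range(out.shape[2]):
        coeffs = dctn(out[:, :, c], norm="ortho")
        for u, v in freqs:
            coeffs[u, v] += magnitude
        out[:, :, c] = idctn(coeffs, norm="ortho")
    return np.clip(out, 0, 255).astype(np.uint8)

def poison_dataset(images, labels, target_class, rate=0.01, seed=0):
    # Embed the trigger into <= `rate` of the training set, drawn from the
    # target class so the trigger co-occurs with that class's features.
    rng = np.random.default_rng(seed)
    candidates = np.flatnonzero(labels == target_class)
    budget = min(len(candidates), max(1, int(rate * len(images))))
    chosen = rng.choice(candidates, size=budget, replace=False)
    poisoned = images.copy()
    for i in chosen:
        poisoned[i] = add_spectral_trigger(images[i])
    return poisoned, chosen
```

Because an SSL encoder is trained to be invariant across augmented views, a trigger pattern that co-occurs with target-class images is enough to tie the two together in representation space, which is the vulnerability the abstract highlights.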
Related papers
- EmInspector: Combating Backdoor Attacks in Federated Self-Supervised Learning Through Embedding Inspection [53.25863925815954]
Federated self-supervised learning (FSSL) has emerged as a promising paradigm that enables the exploitation of clients' vast amounts of unlabeled data.
While FSSL offers advantages, its susceptibility to backdoor attacks has not been investigated.
We propose the Embedding Inspector (EmInspector) that detects malicious clients by inspecting the embedding space of local models.
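A generic sketch of the embedding-inspection idea (an interpretation, not the EmInspector algorithm itself): embed a shared probe set with every client's local model and flag clients whose mean embedding drifts far from the consensus; the robust z-score threshold is a hypothetical choice.

```python
import numpy as np

def flag_suspicious_clients(client_embeddings, threshold=3.0):
    # client_embeddings: dict of client id -> (n_probe, d) array, the
    # embeddings of a shared probe set under that client's local model.
    ids = list(client_embeddings)
    means = np.stack([client_embeddings[c].mean(axis=0) for c in ids])
    center = np.median(means, axis=0)          # consensus embedding
    dists = np.linalg.norm(means - center, axis=1)
    med = np.median(dists)
    mad = np.median(np.abs(dists - med)) + 1e-12
    scores = 0.6745 * (dists - med) / mad      # robust z-scores
    return [c for c, s in zip(ids, scores) if s > threshold]
```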
arXiv Detail & Related papers (2024-05-21T06:14:49Z)
- How to Craft Backdoors with Unlabeled Data Alone? [54.47006163160948]
Self-supervised learning (SSL) can learn rich features in an economical and scalable way.
If the released dataset is maliciously poisoned, backdoored SSL models can behave badly when triggers are injected into test samples.
We propose two strategies for poison selection: clustering-based selection using pseudolabels, and contrastive selection derived from the mutual information principle.
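As a loose sketch of the clustering-based strategy (an interpretation, not the paper's algorithm), one can cluster unlabeled SSL features with scikit-learn's KMeans, treat cluster ids as pseudolabels, and poison the most prototypical members of one cluster:

```python
import numpy as np
from sklearn.cluster import KMeans

def clustering_poison_selection(features, budget, n_clusters=10):
    # Cluster unlabeled SSL features and use cluster ids as pseudolabels;
    # take the `budget` samples nearest one cluster's centroid, on the
    # intuition that prototypical samples bind the trigger to one concept.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
    target = int(np.argmax(np.bincount(km.labels_)))   # e.g. largest cluster
    members = np.flatnonzero(km.labels_ == target)
    dists = np.linalg.norm(features[members] - km.cluster_centers_[target],
                           axis=1)
    return members[np.argsort(dists)[:budget]]
```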
arXiv Detail & Related papers (2024-04-10T02:54:18Z)
- An Embarrassingly Simple Defense Against Backdoor Attacks On SSL [0.0]
Self-Supervised Learning (SSL) has emerged as a powerful paradigm for learning from data in the absence of human supervision.
Recent work indicates that SSL is vulnerable to backdoor attacks, wherein a model can be covertly controlled to suit an adversary's motives.
We devise two defense strategies against frequency-based attacks in SSL.
arXiv Detail & Related papers (2024-03-23T19:21:31Z)
- Does Few-shot Learning Suffer from Backdoor Attacks? [63.9864247424967]
We show that few-shot learning can still be vulnerable to backdoor attacks.
Our method demonstrates a high Attack Success Rate (ASR) in FSL tasks with different few-shot learning paradigms.
This study reveals that few-shot learning still suffers from backdoor attacks, and its security deserves closer attention.
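For reference, ASR is the standard metric here; assuming a classifier that exposes a `predict` method, it can be computed as:

```python
import numpy as np

def attack_success_rate(model, triggered_inputs, true_labels, target_class):
    # Fraction of trigger-embedded inputs, excluding those whose true class
    # is already the target, that the model maps to the target class.
    preds = model.predict(triggered_inputs)
    mask = true_labels != target_class
    return float(np.mean(preds[mask] == target_class))
```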
arXiv Detail & Related papers (2023-12-31T06:43:36Z)
- Erasing Self-Supervised Learning Backdoor by Cluster Activation Masking [69.34631376261102]
Self-Supervised Learning (SSL) is vulnerable to backdoor attacks.
In this paper, we propose PoisonCAM, a novel method that erases the SSL backdoor via cluster activation masking.
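The sketch below is only a loose interpretation of cluster-activation-style cleansing, not the PoisonCAM method: poisoned samples tend to form an unusually tight cluster in representation space, which can be masked out before retraining.

```python
import numpy as np
from sklearn.cluster import KMeans

def mask_tightest_cluster(features, n_clusters=10):
    # Poisoned samples tend to collapse into one unusually tight cluster in
    # representation space; flag the cluster with the smallest mean distance
    # to its centroid and mask its members out before retraining.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
    spread = np.array([
        np.linalg.norm(features[km.labels_ == k] - km.cluster_centers_[k],
                       axis=1).mean()
        for k in range(n_clusters)
    ])
    suspect = int(np.argmin(spread))
    keep = np.flatnonzero(km.labels_ != suspect)
    return keep, suspect
```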
arXiv Detail & Related papers (2023-12-13T08:01:15Z)
- Rethinking Backdoor Attacks [122.1008188058615]
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation.
Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them.
We show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data.
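For concreteness, a representative robust-statistics cleanser of the kind the summary alludes to is the spectral-signatures filter from prior work (not this paper's contribution):

```python
import numpy as np

def spectral_filter(features, remove_frac=0.015):
    # Poisoned samples often have outsized projections onto the top singular
    # direction of the centered feature matrix; drop the most extreme ones.
    centered = features - features.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = np.abs(centered @ vt[0])
    cutoff = np.quantile(scores, 1.0 - remove_frac)
    return np.flatnonzero(scores <= cutoff)   # indices of samples to keep
```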
arXiv Detail & Related papers (2023-07-19T17:44:54Z)
- Narcissus: A Practical Clean-Label Backdoor Attack with Limited Information [22.98039177091884]
"Clean-label" backdoor attacks require knowledge of the entire training set to be effective.
This paper provides an algorithm to mount clean-label backdoor attacks based only on the knowledge of representative examples from the target class.
Our attack works well across datasets and models, even when the trigger is present in the physical world.
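In the spirit of the summary (a simplified PyTorch sketch, not the published Narcissus algorithm), a clean-label trigger can be synthesized from target-class examples alone using a surrogate classifier; the `surrogate` model, step count, and epsilon below are assumptions.

```python
import torch
import torch.nn.functional as F

def synthesize_trigger(surrogate, target_images, target_class,
                       steps=200, lr=0.1, eps=16 / 255):
    # Optimize one bounded, image-wide perturbation that pushes target-class
    # images deeper into the target class under a surrogate classifier, so
    # the pattern later acts as a strong shortcut feature for that class.
    delta = torch.zeros_like(target_images[:1], requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    y = torch.full((len(target_images),), target_class, dtype=torch.long)
    for _ in range(steps):
        logits = surrogate((target_images + delta).clamp(0, 1))
        loss = F.cross_entropy(logits, y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)   # keep the trigger imperceptible
    return delta.detach()
```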
arXiv Detail & Related papers (2022-04-11T16:58:04Z)
- Targeted Forgetting and False Memory Formation in Continual Learners through Adversarial Backdoor Attacks [2.830541450812474]
We explore the vulnerability of Elastic Weight Consolidation (EWC), a popular continual learning algorithm for avoiding catastrophic forgetting.
We show that an intelligent adversary can bypass EWC's defenses and instead cause gradual, deliberate forgetting by introducing small amounts of misinformation into the model during training.
We demonstrate such an adversary's ability to assume control of the model via injection of "backdoor" attack samples on both permuted and split benchmark variants of the MNIST dataset.
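For context, the defense being bypassed is EWC's quadratic penalty anchored by Fisher information; a minimal PyTorch sketch of that standard penalty (the defense, not the attack) is:

```python
import torch

def ewc_penalty(model, anchor_params, fisher, lam=100.0):
    # Standard EWC regularizer: quadratically penalize drift of parameters
    # that carried high Fisher information for earlier tasks. The attack in
    # this paper works by corrupting what this importance estimate protects.
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        loss = loss + (fisher[name] * (p - anchor_params[name]) ** 2).sum()
    return 0.5 * lam * loss
```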
arXiv Detail & Related papers (2020-02-17T18:13:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.