On the Difficulty of Defending Contrastive Learning against Backdoor
Attacks
- URL: http://arxiv.org/abs/2312.09057v1
- Date: Thu, 14 Dec 2023 15:54:52 GMT
- Title: On the Difficulty of Defending Contrastive Learning against Backdoor
Attacks
- Authors: Changjiang Li, Ren Pang, Bochuan Cao, Zhaohan Xi, Jinghui Chen,
Shouling Ji, Ting Wang
- Abstract summary: We show that contrastive backdoor attacks operate through mechanisms distinct from their supervised counterparts.
Our findings highlight the need for defenses tailored to the specificities of contrastive backdoor attacks.
- Score: 58.824074124014224
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have shown that contrastive learning, like supervised
learning, is highly vulnerable to backdoor attacks wherein malicious functions
are injected into target models, only to be activated by specific triggers.
However, it remains under-explored how contrastive backdoor attacks
fundamentally differ from their supervised counterparts, which impedes the
development of effective defenses against this emerging threat.
This work represents a solid step toward answering this critical question.
Specifically, we define TRL, a unified framework that encompasses both
supervised and contrastive backdoor attacks. Through the lens of TRL, we
uncover that the two types of attacks operate through distinctive mechanisms:
in supervised attacks, the learning of benign and backdoor tasks tends to occur
independently, while in contrastive attacks, the two tasks are deeply
intertwined both in their representations and throughout their learning
processes. This distinction leads to the disparate learning dynamics and
feature distributions of supervised and contrastive attacks. More importantly,
we reveal that the specificities of contrastive backdoor attacks entail
important implications from a defense perspective: existing defenses for
supervised attacks are often inadequate and not easily retrofitted to
contrastive attacks. We also explore several alternative defenses and discuss
their potential challenges. Our findings highlight the need for defenses
tailored to the specificities of contrastive backdoor attacks, pointing to
promising directions for future research.
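To make the threat model concrete, below is a minimal, self-contained sketch (our own illustration, not the paper's code) of poisoning-based backdoor injection against contrastive pretraining: a small trigger patch is stamped onto images drawn from the attacker's target class and mixed into the unlabeled pretraining set, so that the contrastive objective entangles the trigger with the target class's representations. The function names (`add_trigger`, `poison_pretraining_set`), the patch placement, and the poisoning rate are assumptions chosen for illustration.

```python
import torch

def add_trigger(images: torch.Tensor, patch: torch.Tensor) -> torch.Tensor:
    # Stamp a trigger patch onto the bottom-right corner of each image.
    # images: (N, C, H, W); patch: (C, ph, pw). Placement is illustrative only.
    poisoned = images.clone()
    ph, pw = patch.shape[-2:]
    poisoned[..., -ph:, -pw:] = patch
    return poisoned

def poison_pretraining_set(unlabeled: torch.Tensor,
                           target_class_images: torch.Tensor,
                           patch: torch.Tensor,
                           rate: float = 0.01) -> torch.Tensor:
    # Mix trigger-stamped copies of target-class images into the unlabeled
    # pretraining set. During contrastive pretraining, augmented views of a
    # poisoned image remain trigger-consistent, so the encoder tends to pull
    # the trigger toward the target class's representation region.
    n_poison = max(1, int(rate * len(unlabeled)))
    idx = torch.randint(len(target_class_images), (n_poison,))
    poisoned = add_trigger(target_class_images[idx], patch)
    return torch.cat([unlabeled, poisoned], dim=0)
```

A downstream classifier built on such an encoder would then tend to assign the target label to any input carrying the patch; the paper's analysis of why benign and backdoor tasks become entangled during contrastive learning applies to attacks of this general shape.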
Related papers
- Persistent Backdoor Attacks in Continual Learning [5.371962853011215]
We introduce two persistent backdoor attacks-Blind Task Backdoor and Latent Task Backdoor-each leveraging minimal adversarial influence.
Our results show that both attacks consistently achieve high success rates across different continual learning algorithms, while effectively evading state-of-the-art defenses.
arXiv Detail & Related papers (2024-09-20T19:28:48Z)
- Non-Cooperative Backdoor Attacks in Federated Learning: A New Threat Landscape [7.00762739959285]
Federated Learning (FL) for privacy-preserving model training remains susceptible to backdoor attacks.
This research emphasizes the critical need for robust defenses against diverse backdoor attacks in the evolving FL landscape.
arXiv Detail & Related papers (2024-07-05T22:03:13Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Pre-trained Trojan Attacks for Visual Recognition [106.13792185398863]
Pre-trained vision models (PVMs) have become a dominant component due to their exceptional performance when fine-tuned for downstream tasks.
We propose the Pre-trained Trojan attack, which embeds backdoors into a PVM, enabling attacks across various downstream vision tasks.
We highlight the challenges posed by cross-task activation and shortcut connections in successful backdoor attacks.
arXiv Detail & Related papers (2023-12-23T05:51:40Z)
- BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning [85.2564206440109]
This paper reveals that, in this practical scenario, backdoor attacks can remain effective even after defenses are applied.
We introduce the BadCLIP attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design an untargeted, poison-only backdoor attack based on the task's characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into failing to detect any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Contributor-Aware Defenses Against Adversarial Backdoor Attacks [2.830541450812474]
Adversarial backdoor attacks have demonstrated the capability to perform targeted misclassification of specific examples.
We propose a contributor-aware universal defensive framework for learning in the presence of multiple, potentially adversarial data sources.
Our empirical studies demonstrate the robustness of the proposed framework against adversarial backdoor attacks from multiple simultaneous adversaries.
arXiv Detail & Related papers (2022-05-28T20:25:34Z)
- On the Effectiveness of Adversarial Training against Backdoor Attacks [111.8963365326168]
A backdoored model always predicts a target class in the presence of a predefined trigger pattern.
In general, adversarial training is believed to defend against backdoor attacks.
We propose a hybrid strategy which provides satisfactory robustness across different backdoor attacks.
arXiv Detail & Related papers (2022-02-22T02:24:46Z)
- Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review [40.36824357892676]
This work provides the community with a timely comprehensive review of backdoor attacks and countermeasures on deep learning.
Depending on the attacker's capability and the affected stage of the machine learning pipeline, the attack surface is recognized to be wide.
Countermeasures are categorized into four general classes: blind backdoor removal, offline backdoor inspection, online backdoor inspection, and post backdoor removal.
arXiv Detail & Related papers (2020-07-21T12:49:12Z)