BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive
Learning
- URL: http://arxiv.org/abs/2311.12075v3
- Date: Mon, 4 Mar 2024 13:59:22 GMT
- Title: BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive
Learning
- Authors: Siyuan Liang, Mingli Zhu, Aishan Liu, Baoyuan Wu, Xiaochun Cao,
Ee-Chien Chang
- Abstract summary: This paper reveals a practical threat: backdoor attacks can remain effective even after defenses are applied.
We introduce the BadCLIP attack, which is resistant to backdoor detection and model fine-tuning defenses.
- Score: 85.2564206440109
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Studying backdoor attacks is valuable for model copyright protection and
enhancing defenses. While existing backdoor attacks have successfully infected
multimodal contrastive learning (MCL) models such as CLIP, they can be easily
countered by specialized backdoor defenses for MCL models. This paper reveals
that, in this practical scenario, backdoor attacks can remain effective
even after defenses are applied, and introduces the BadCLIP attack, which
is resistant to backdoor detection and model fine-tuning defenses. To achieve
this, we draw motivations from the perspective of the Bayesian rule and propose
a dual-embedding guided framework for backdoor attacks. Specifically, we ensure
that visual trigger patterns approximate the textual target semantics in the
embedding space, making it challenging to detect the subtle parameter
variations induced by backdoor learning on such natural trigger patterns.
Additionally, we optimize the visual trigger patterns to align the poisoned
samples with target vision features in order to hinder the backdoor unlearning
through clean fine-tuning. Extensive experiments demonstrate that our attack
significantly outperforms state-of-the-art baselines (+45.3% ASR) in the
presence of SoTA backdoor defenses, rendering these mitigation and detection
strategies virtually ineffective. Furthermore, our approach remains effective in
more rigorous scenarios, such as attacks on downstream tasks. We believe that this paper
raises awareness regarding the potential threats associated with the practical
application of multimodal contrastive learning and encourages the development
of more robust defense mechanisms.
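
The dual-embedding objective described in the abstract can be pictured with a minimal PyTorch sketch. This is an illustrative reconstruction under assumptions, not the authors' released code: it uses the open-source OpenAI `clip` package, a hypothetical target class ("banana"), and two hypothetical helpers, `load_target_reference_images()` and `load_base_images()`, that return CLIP-preprocessed image batches. The two cosine terms correspond to aligning the trigger-patched images with the textual target semantics and with the target vision features; visibility constraints and the subsequent data-poisoning step are omitted.

```python
# Illustrative sketch of a dual-embedding guided trigger (not the authors' code).
import torch
import torch.nn.functional as F
import clip  # OpenAI CLIP package

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
model.float()  # keep fp32 so gradients flow cleanly to the trigger
for p in model.parameters():
    p.requires_grad_(False)

target_refs = load_target_reference_images()  # hypothetical: preprocessed target-class images
with torch.no_grad():
    # Textual target semantics for the attacker-chosen class.
    text_emb = F.normalize(
        model.encode_text(clip.tokenize(["a photo of a banana"]).to(device)), dim=-1)
    # Target vision features: mean embedding of the reference images.
    ref_emb = F.normalize(model.encode_image(target_refs.to(device)), dim=-1)
    vision_emb = F.normalize(ref_emb.mean(0, keepdim=True), dim=-1)

# Learnable 16x16 additive patch applied to the lower-right corner of each image.
patch = torch.zeros(3, 16, 16, device=device, requires_grad=True)
opt = torch.optim.Adam([patch], lr=1e-2)

for step in range(500):
    imgs = load_base_images().to(device)  # hypothetical: batch of preprocessed clean images
    poisoned = imgs.clone()
    poisoned[:, :, -16:, -16:] = imgs[:, :, -16:, -16:] + patch
    img_emb = F.normalize(model.encode_image(poisoned), dim=-1)

    loss_text = (1 - img_emb @ text_emb.t()).mean()      # align trigger with textual target semantics
    loss_vision = (1 - img_emb @ vision_emb.t()).mean()  # align trigger with target vision features
    loss = loss_text + loss_vision

    opt.zero_grad()
    loss.backward()
    opt.step()
```

Per the abstract, the text-alignment term makes the backdoor-induced parameter changes resemble natural feature learning (hindering detection), while the vision-alignment term keeps poisoned samples close to target vision features so that clean fine-tuning struggles to unlearn the backdoor.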
Related papers
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
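
As a rough illustration of the unlearning idea only (not the token-based portion unlearning regime of the cited paper), the sketch below fine-tunes a suspected CLIP model on a small constructed poisoned set so that poisoned images are pushed away from a presumed target caption while clean image-text pairs stay aligned. The caption string and the helpers `sample_constructed_poisoned_batch()` and `sample_clean_batch()` are hypothetical placeholders.

```python
# Generic backdoor-unlearning sketch under stated assumptions (not the cited method).
import torch
import torch.nn.functional as F
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
model.float()

opt = torch.optim.SGD(model.parameters(), lr=1e-6)
target_tokens = clip.tokenize(["a photo of a banana"]).to(device)  # presumed backdoor target caption

for step in range(100):
    poisoned_imgs = sample_constructed_poisoned_batch().to(device)  # hypothetical helper
    clean_imgs, clean_tokens = sample_clean_batch()                 # hypothetical helper

    # Forget term: reduce similarity between poisoned images and the target caption.
    img_p = F.normalize(model.encode_image(poisoned_imgs), dim=-1)
    txt_t = F.normalize(model.encode_text(target_tokens), dim=-1)
    forget_loss = (img_p @ txt_t.t()).mean()

    # Retain term: keep clean image-text pairs aligned so utility is preserved.
    img_c = F.normalize(model.encode_image(clean_imgs.to(device)), dim=-1)
    txt_c = F.normalize(model.encode_text(clean_tokens.to(device)), dim=-1)
    retain_loss = (1 - (img_c * txt_c).sum(dim=-1)).mean()

    loss = forget_loss + retain_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```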
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
- Revisiting Backdoor Attacks against Large Vision-Language Models [76.42014292255944]
This paper empirically examines the generalizability of backdoor attacks during the instruction tuning of LVLMs.
We modify existing backdoor attacks based on key observations from this empirical study.
This paper underscores that even simple traditional backdoor strategies pose a serious threat to LVLMs.
arXiv Detail & Related papers (2024-06-27T02:31:03Z)
- Towards Unified Robustness Against Both Backdoor and Adversarial Attacks [31.846262387360767]
Deep Neural Networks (DNNs) are known to be vulnerable to both backdoor and adversarial attacks.
This paper reveals that there is an intriguing connection between backdoor and adversarial attacks.
A novel Progressive Unified Defense algorithm is proposed to defend against backdoor and adversarial attacks simultaneously.
arXiv Detail & Related papers (2024-05-28T07:50:00Z)
- Breaking the False Sense of Security in Backdoor Defense through Re-Activation Attack [32.74007523929888]
We re-investigate the characteristics of backdoored models after defense.
We find that the original backdoors still exist in defended models derived from existing post-training defense strategies.
We empirically show that these dormant backdoors can be easily re-activated during inference.
arXiv Detail & Related papers (2024-05-25T08:57:30Z)
- On the Difficulty of Defending Contrastive Learning against Backdoor Attacks [58.824074124014224]
We show how contrastive backdoor attacks operate through distinctive mechanisms.
Our findings highlight the need for defenses tailored to the specificities of contrastive backdoor attacks.
arXiv Detail & Related papers (2023-12-14T15:54:52Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor attacks are an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model to lose detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Contributor-Aware Defenses Against Adversarial Backdoor Attacks [2.830541450812474]
Adversarial backdoor attacks have demonstrated the capability to perform targeted misclassification of specific examples.
We propose a contributor-aware universal defensive framework for learning in the presence of multiple, potentially adversarial data sources.
Our empirical studies demonstrate the robustness of the proposed framework against adversarial backdoor attacks from multiple simultaneous adversaries.
arXiv Detail & Related papers (2022-05-28T20:25:34Z)