CorruptEncoder: Data Poisoning based Backdoor Attacks to Contrastive Learning
- URL: http://arxiv.org/abs/2211.08229v5
- Date: Thu, 29 Feb 2024 21:26:45 GMT
- Title: CorruptEncoder: Data Poisoning based Backdoor Attacks to Contrastive Learning
- Authors: Jinghuai Zhang and Hongbin Liu and Jinyuan Jia and Neil Zhenqiang Gong
- Abstract summary: Contrastive learning pre-trains general-purpose encoders using an unlabeled pre-training dataset.
DPBAs inject poisoned inputs into the pre-training dataset so the encoder is backdoored.
CorruptEncoder introduces a new attack strategy to create poisoned inputs and uses a theory-guided method to maximize attack effectiveness.
A proposed defense, localized cropping, can reduce the effectiveness of DPBAs, but it sacrifices the utility of the encoder, highlighting the need for new defenses.
- Score: 71.25518220297639
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning (CL) pre-trains general-purpose encoders using an
unlabeled pre-training dataset, which consists of images or image-text pairs.
CL is vulnerable to data poisoning based backdoor attacks (DPBAs), in which an
attacker injects poisoned inputs into the pre-training dataset so the encoder
is backdoored. However, existing DPBAs achieve limited effectiveness. In this
work, we take the first step toward analyzing the limitations of existing
backdoor attacks and propose a new DPBA against CL, called CorruptEncoder. It
introduces a new attack strategy to create poisoned inputs and uses a
theory-guided method to maximize attack effectiveness. Our experiments show
that CorruptEncoder substantially outperforms existing DPBAs. In particular,
CorruptEncoder is the first DPBA that achieves more than 90% attack success
rates with only a few (3) reference images and a small poisoning ratio of
0.5%. Moreover, we propose a defense, called localized cropping, to defend
against DPBAs. Our results show that our defense can reduce the effectiveness
of DPBAs, but it sacrifices the utility of the encoder, highlighting the need
for new defenses.
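To make the attack concrete, below is a minimal Python sketch of how a CorruptEncoder-style poisoned input could be assembled: a reference-class object and a trigger patch are pasted into disjoint regions of a background image, so that the random crops used by contrastive augmentation are likely to isolate one or the other. The sizes, placement rule, and function name are illustrative assumptions, not the paper's exact theory-guided parameters.

```python
# Minimal sketch of assembling a CorruptEncoder-style poisoned input.
# Sizes and placement rule are illustrative assumptions, not the
# paper's exact theory-guided parameters.
import random
from PIL import Image

def make_poisoned_input(background: Image.Image,
                        reference_object: Image.Image,
                        trigger: Image.Image) -> Image.Image:
    """Paste a reference-class object and a trigger into disjoint regions
    of a background, so random crops tend to isolate one or the other."""
    poisoned = background.copy()
    bw, bh = poisoned.size
    ow, oh = reference_object.size
    tw, th = trigger.size

    # Assumption: place the object somewhere on the left half.
    ox = random.randint(0, max(0, bw // 2 - ow))
    oy = random.randint(0, max(0, bh - oh))
    poisoned.paste(reference_object, (ox, oy))

    # Assumption: place the trigger on the right half, disjoint from the object.
    tx = random.randint(bw // 2, max(bw // 2, bw - tw))
    ty = random.randint(0, max(0, bh - th))
    poisoned.paste(trigger, (tx, ty))
    return poisoned
```

During pre-training, one random crop of such an image may contain mostly the reference object while the other contains the trigger; the contrastive loss then pulls the trigger's embedding toward the reference object's embedding, which is what backdoors the encoder.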
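The localized cropping defense can be sketched in the same spirit: instead of drawing the two contrastive views from anywhere in the image, both crops are drawn from one shared local window, so a crop of a pasted trigger and a crop of a pasted object rarely form a positive pair. The window and crop sizes below are illustrative assumptions.

```python
# Minimal sketch of a localized-cropping augmentation in the spirit of the
# defense described in the abstract. Window and crop sizes are
# illustrative assumptions.
import random
from PIL import Image

def localized_two_crops(img: Image.Image, window: int = 128, crop: int = 96):
    """Draw both contrastive views from one shared local window, so widely
    separated image regions rarely end up paired as positives."""
    w, h = img.size
    wx = random.randint(0, max(0, w - window))
    wy = random.randint(0, max(0, h - window))
    views = []
    for _ in range(2):
        cx = wx + random.randint(0, max(0, window - crop))
        cy = wy + random.randint(0, max(0, window - crop))
        views.append(img.crop((cx, cy, cx + crop, cy + crop)))
    return views
```

Constraining both views to a small window also reduces augmentation diversity, which is consistent with the abstract's observation that the defense sacrifices encoder utility.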
Related papers
- SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources.
Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker.
Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
arXiv Detail & Related papers (2024-05-19T14:50:09Z)
- Poisoning-based Backdoor Attacks for Arbitrary Target Label with Positive Triggers [8.15496105932744]
Poisoning-based backdoor attacks expose vulnerabilities in the data preparation stage of deep neural network (DNN) training.
We develop a new categorization of triggers inspired by adversarial techniques and propose a multi-label, multi-payload poisoning-based backdoor attack with positive triggers (PPT).
Under both dirty- and clean-label settings, we show empirically that the proposed attack achieves a high attack success rate without sacrificing accuracy across various datasets.
arXiv Detail & Related papers (2024-05-09T06:45:11Z)
- UltraClean: A Simple Framework to Train Robust Neural Networks against Backdoor Attacks [19.369701116838776]
Backdoor attacks are emerging threats to deep neural networks.
They typically embed malicious behaviors into a victim model by injecting poisoned samples.
We propose UltraClean, a framework that simplifies the identification of poisoned samples.
arXiv Detail & Related papers (2023-12-17T09:16:17Z)
- Does Differential Privacy Prevent Backdoor Attacks in Practice? [8.951356689083166]
We investigate the effectiveness of Differential Privacy techniques in preventing backdoor attacks in machine learning models.
We propose Label-DP as a faster and more accurate alternative to DP-SGD and PATE.
arXiv Detail & Related papers (2023-11-10T18:32:08Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks, an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning [69.70602220716718]
We propose PoisonedEncoder, a data poisoning attack to contrastive learning.
In particular, an attacker injects carefully crafted poisoning inputs into the unlabeled pre-training data.
We evaluate five defenses against PoisonedEncoder, including one pre-processing, three in-processing, and one post-processing defense.
arXiv Detail & Related papers (2022-05-13T00:15:44Z)
- Model-Contrastive Learning for Backdoor Defense [13.781375023320981]
We propose a novel backdoor defense method named MCL based on model-contrastive learning.
MCL is more effective at reducing backdoor threats while maintaining high accuracy on benign data.
arXiv Detail & Related papers (2022-05-09T16:36:46Z)
- Backdoor Attack on Hash-based Image Retrieval via Clean-label Data Poisoning [54.15013757920703]
We propose the confusing perturbations-induced backdoor attack (CIBA).
It injects a small number of poisoned images with the correct label into the training data.
We have conducted extensive experiments to verify the effectiveness of our proposed CIBA.
arXiv Detail & Related papers (2021-09-18T07:56:59Z)
- DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations [54.960853673256]
We show that strong data augmentations, such as mixup and random additive noise, nullify poison attacks while enduring only a small accuracy trade-off.
A rigorous analysis of DP-InstaHide shows that mixup does indeed have privacy advantages, and that training with k-way mixup provably yields at least k times stronger DP guarantees than a naive DP mechanism.
arXiv Detail & Related papers (2021-03-02T23:07:31Z)
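As a rough illustration of the kind of augmentation the DP-InstaHide entry describes, the following sketch mixes k training examples with Dirichlet weights and adds random noise; the weight distribution, noise scale, and function name are illustrative assumptions rather than the paper's exact recipe.

```python
# Generic sketch of k-way mixup with additive random noise, the kind of
# strong augmentation this entry describes. Dirichlet weights, the noise
# scale, and the function name are illustrative assumptions, not the
# paper's exact recipe.
import numpy as np

def kway_mixup(images: np.ndarray, labels: np.ndarray,
               k: int = 4, noise_std: float = 0.05):
    """images: (N, H, W, C) floats in [0, 1]; labels: (N, C) one-hot."""
    idx = np.random.choice(images.shape[0], size=k, replace=False)
    weights = np.random.dirichlet(np.ones(k))        # convex combination
    mixed_x = np.tensordot(weights, images[idx], axes=1)
    mixed_y = np.tensordot(weights, labels[idx], axes=1)
    mixed_x += np.random.normal(0.0, noise_std, size=mixed_x.shape)
    return np.clip(mixed_x, 0.0, 1.0), mixed_y
```

Because each mixed example is a convex combination of k inputs, a single poisoned sample contributes only a fraction of any training image, which is the intuition behind the claimed k-fold strengthening of the DP guarantee.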
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.