BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised
Learning
- URL: http://arxiv.org/abs/2108.00352v1
- Date: Sun, 1 Aug 2021 02:22:31 GMT
- Title: BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised
Learning
- Authors: Jinyuan Jia and Yupei Liu and Neil Zhenqiang Gong
- Abstract summary: Self-supervised learning in computer vision aims to pre-train an image encoder using a large amount of unlabeled images or (image, text) pairs.
We propose BadEncoder, the first backdoor attack to self-supervised learning.
- Score: 29.113263683850015
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning in computer vision aims to pre-train an image
encoder using a large amount of unlabeled images or (image, text) pairs. The
pre-trained image encoder can then be used as a feature extractor to build
downstream classifiers for many downstream tasks with a small amount of or no
labeled training data. In this work, we propose BadEncoder, the first backdoor
attack to self-supervised learning. In particular, our BadEncoder injects
backdoors into a pre-trained image encoder such that the downstream classifiers
built based on the backdoored image encoder for different downstream tasks
simultaneously inherit the backdoor behavior. We formulate our BadEncoder as an
optimization problem and we propose a gradient descent based method to solve
it, which produces a backdoored image encoder from a clean one. Our extensive
empirical evaluation results on multiple datasets show that our BadEncoder
achieves high attack success rates while preserving the accuracy of the
downstream classifiers. We also show the effectiveness of BadEncoder using two
publicly available, real-world image encoders, i.e., Google's image encoder
pre-trained on ImageNet and OpenAI's Contrastive Language-Image Pre-training
(CLIP) image encoder pre-trained on 400 million (image, text) pairs collected
from the Internet. Moreover, we consider defenses including Neural Cleanse and
MNTD (empirical defenses) as well as PatchGuard (a provable defense). Our
results show that these defenses are insufficient to defend against BadEncoder,
highlighting the need for new defenses against BadEncoder. Our code is
publicly available at: https://github.com/jjy1994/BadEncoder.
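The abstract formulates BadEncoder as an optimization problem solved by gradient descent: an effectiveness term ties trigger-stamped inputs to attacker-chosen reference inputs, while utility terms keep the backdoored encoder close to the clean one. Below is a minimal sketch of an objective with that shape; it is an illustrative reconstruction, not the authors' code, and names such as `stamp`, the shadow batch `x`, and the `lambda1`/`lambda2` weights are assumptions (see the linked repository for the exact formulation).

```python
import torch
import torch.nn.functional as F

def stamp(x, trigger, mask):
    """Stamp a patch trigger onto a batch of images (assumed trigger format)."""
    return x * (1 - mask) + trigger * mask

def badencoder_loss(encoder, clean_encoder, x, reference, trigger, mask,
                    lambda1=1.0, lambda2=1.0):
    with torch.no_grad():                          # the clean encoder stays frozen
        f_clean_x = F.normalize(clean_encoder(x), dim=-1)
        f_clean_ref = F.normalize(clean_encoder(reference), dim=-1)
    f_ref = F.normalize(encoder(reference), dim=-1)
    f_trig = F.normalize(encoder(stamp(x, trigger, mask)), dim=-1)
    f_x = F.normalize(encoder(x), dim=-1)
    # Effectiveness: trigger-stamped inputs should embed like the reference
    # inputs, so downstream classifiers map them to the attacker's target class.
    l0 = -(f_trig @ f_ref.t()).mean()
    # Keep the reference embeddings close to their clean-encoder values.
    l1 = -(f_ref * f_clean_ref).sum(dim=-1).mean()
    # Utility: clean inputs should keep their clean-encoder embeddings.
    l2 = -(f_x * f_clean_x).sum(dim=-1).mean()
    return l0 + lambda1 * l1 + lambda2 * l2
```

Each gradient step minimizes this loss over the backdoored encoder's parameters while the clean copy stays frozen, which is what lets downstream classifiers inherit the backdoor without losing clean accuracy.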
Related papers
- GhostEncoder: Stealthy Backdoor Attacks with Dynamic Triggers to
Pre-trained Encoders in Self-supervised Learning [15.314217530697928]
Self-supervised learning (SSL) pre-trains image encoders using a substantial quantity of unlabeled images.
We propose GhostEncoder, the first dynamic invisible backdoor attack on SSL.
arXiv Detail & Related papers (2023-10-01T09:39:27Z)
- Downstream-agnostic Adversarial Examples [66.8606539786026]
AdvEncoder is the first framework for generating downstream-agnostic universal adversarial examples based on a pre-trained encoder.
Unlike traditional adversarial example works, the pre-trained encoder only outputs feature vectors rather than classification labels.
Our results show that an attacker can successfully attack downstream tasks without knowing either the pre-training dataset or the downstream dataset.
arXiv Detail & Related papers (2023-07-23T10:16:47Z)
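The AdvEncoder entry above highlights that a pre-trained encoder exposes only feature vectors, so the attacker needs a label-free objective. The sketch below illustrates that core idea with a single universal perturbation optimized to push embeddings away from their clean values; the actual AdvEncoder framework trains a perturbation generator with additional losses, so treat this as a simplified stand-in. The input size, `eps` bound, and optimizer settings are assumptions.

```python
import torch
import torch.nn.functional as F

def universal_perturbation(encoder, loader, eps=8 / 255, epochs=5, lr=0.01):
    # One perturbation shared across all inputs (hence "universal").
    delta = torch.zeros(1, 3, 224, 224, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(epochs):
        for x in loader:                           # surrogate images in [0, 1]
            with torch.no_grad():
                f_clean = F.normalize(encoder(x), dim=-1)
            f_adv = F.normalize(encoder((x + delta).clamp(0, 1)), dim=-1)
            # Label-free objective: minimize cosine similarity between clean
            # and perturbed embeddings, degrading any classifier built on them.
            loss = (f_adv * f_clean).sum(dim=-1).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
            with torch.no_grad():
                delta.clamp_(-eps, eps)            # keep the perturbation small
    return delta.detach()
```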
- Human-imperceptible, Machine-recognizable Images [76.01951148048603]
The paper exposes a major conflict facing software engineers: building better AI systems while keeping their distance from sensitive training data.
This paper proposes an efficient privacy-preserving learning paradigm in which images are encrypted to become "human-imperceptible, machine-recognizable".
We show that the proposed paradigm keeps the encrypted images human-imperceptible while preserving machine-recognizable information.
arXiv Detail & Related papers (2023-06-06T13:41:37Z)
- Detecting Backdoors in Pre-trained Encoders [25.105186092387633]
We propose DECREE, the first backdoor detection approach for pre-trained encoders.
We show the effectiveness of our method on image encoders pre-trained on ImageNet and on OpenAI's CLIP image encoder, which was pre-trained on 400 million (image, text) pairs.
arXiv Detail & Related papers (2023-03-23T19:04:40Z)
- CorruptEncoder: Data Poisoning based Backdoor Attacks to Contrastive Learning [71.25518220297639]
Contrastive learning pre-trains general-purpose encoders using an unlabeled pre-training dataset.
Data poisoning based backdoor attacks (DPBAs) inject poisoned inputs into the pre-training dataset so that the encoder is backdoored.
CorruptEncoder introduces a new attack strategy to create poisoned inputs and uses a theory-guided method to maximize attack effectiveness.
The paper also proposes a defense; the results show that it can reduce the effectiveness of DPBAs but sacrifices the utility of the encoder, highlighting the need for new defenses.
arXiv Detail & Related papers (2022-11-15T15:48:28Z)
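For context on the CorruptEncoder entry above: a DPBA plants inputs in the unlabeled pre-training set that pair a trigger with reference content of the target class, so the random crops that contrastive learning treats as positive pairs bind trigger features to object features. The sketch below is a hedged illustration of such a poisoned-input constructor, not CorruptEncoder's actual construction or its theory-guided sizing; the left/right placement and all layout choices are assumptions.

```python
import random
from PIL import Image

def make_poisoned_input(background: Image.Image,
                        reference_object: Image.Image,
                        trigger: Image.Image) -> Image.Image:
    """Combine a target-class reference object and a trigger in one image."""
    poisoned = background.copy()
    bw, bh = poisoned.size
    ow, oh = reference_object.size
    tw, th = trigger.size
    # Paste the reference object at a random position on the left half ...
    ox = random.randint(0, max(bw // 2 - ow, 0))
    oy = random.randint(0, max(bh - oh, 0))
    poisoned.paste(reference_object, (ox, oy))
    # ... and the trigger at a random position on the right half, so crops
    # containing one or the other form positive pairs during pre-training.
    tx = random.randint(bw // 2, max(bw - tw, bw // 2))
    ty = random.randint(0, max(bh - th, 0))
    poisoned.paste(trigger, (tx, ty))
    return poisoned
```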
- PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning [69.70602220716718]
We propose PoisonedEncoder, a data poisoning attack to contrastive learning.
In particular, an attacker injects carefully crafted poisoning inputs into the unlabeled pre-training data.
We evaluate five defenses against PoisonedEncoder, including one pre-processing defense, three in-processing defenses, and one post-processing defense.
arXiv Detail & Related papers (2022-05-13T00:15:44Z)
- StolenEncoder: Stealing Pre-trained Encoders [62.02156378126672]
We propose StolenEncoder, the first attack to steal pre-trained image encoders.
Our results show that the encoders stolen by StolenEncoder have functionality similar to the target encoders.
arXiv Detail & Related papers (2022-01-15T17:04:38Z)
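The StolenEncoder entry above describes stealing an encoder whose only interface is a feature-vector API. The generic recipe is embedding distillation: query the target on surrogate images and train a local model to match the returned features. The sketch below shows one such training step; the paper adds query-efficiency techniques, so this is a simplified illustration, and the cosine-distance loss is an assumption.

```python
import torch
import torch.nn.functional as F

def steal_step(surrogate, target_encoder, x, optimizer):
    """One distillation step: match the victim's embeddings on a batch x."""
    with torch.no_grad():
        target_feat = target_encoder(x)            # black-box feature query
    student_feat = surrogate(x)
    # Train the local surrogate to reproduce the target's feature vectors.
    loss = 1 - F.cosine_similarity(student_feat, target_feat, dim=-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```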
- Masked Autoencoders Are Scalable Vision Learners [60.97703494764904]
Masked autoencoders (MAE) are scalable self-supervised learners for computer vision.
Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels.
Coupling an asymmetric encoder-decoder design with a high masking ratio enables training large models efficiently and effectively.
arXiv Detail & Related papers (2021-11-11T18:46:40Z)
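The MAE entry above states the approach in one line: mask random patches and reconstruct the missing pixels. The sketch below shows only the random patch masking step on an already-patchified image; the full method's asymmetric ViT encoder-decoder and pixel reconstruction loss are omitted, and the 75% mask ratio is the value the paper reports as effective.

```python
import torch

def random_masking(patches: torch.Tensor, mask_ratio: float = 0.75):
    """patches: [batch, num_patches, patch_dim]; returns kept patches and mask."""
    b, n, d = patches.shape
    n_keep = int(n * (1 - mask_ratio))
    noise = torch.rand(b, n)                       # one random score per patch
    ids_shuffle = noise.argsort(dim=1)             # random patch permutation
    ids_keep = ids_shuffle[:, :n_keep]             # keep the first n_keep
    kept = torch.gather(patches, 1, ids_keep.unsqueeze(-1).expand(-1, -1, d))
    mask = torch.ones(b, n)
    mask.scatter_(1, ids_keep, 0)                  # 0 = kept, 1 = masked
    return kept, mask
```

Only the kept patches are fed to the encoder, which is what makes training at a high mask ratio efficient.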
- EncoderMI: Membership Inference against Pre-trained Encoders in Contrastive Learning [27.54202989524394]
We propose EncoderMI, the first membership inference method against image encoders pre-trained by contrastive learning.
We evaluate EncoderMI on image encoders we pre-trained on multiple datasets ourselves, as well as on the Contrastive Language-Image Pre-training (CLIP) image encoder, which is pre-trained on 400 million (image, text) pairs collected from the Internet and released by OpenAI.
arXiv Detail & Related papers (2021-08-25T03:00:45Z)
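The intuition behind EncoderMI, as described above, is that contrastive pre-training pulls together augmented views of its training inputs, so a member input yields unusually similar embeddings across augmentations. The sketch below computes that pairwise-similarity membership signal with a simple threshold stand-in; the paper builds several inference methods on these features, and the `augment` function, number of views, and threshold here are assumptions.

```python
import torch
import torch.nn.functional as F

def membership_score(encoder, x, augment, n_views=10):
    """Average pairwise cosine similarity of n augmented views of one input."""
    views = torch.stack([augment(x) for _ in range(n_views)])
    with torch.no_grad():
        feats = F.normalize(encoder(views), dim=-1)
    sims = feats @ feats.t()                       # [n_views, n_views]
    off_diag = sims[~torch.eye(n_views, dtype=torch.bool)]
    return off_diag.mean().item()

def infer_member(encoder, x, augment, threshold=0.8):
    # The threshold would be calibrated on shadow data in practice (assumption).
    return membership_score(encoder, x, augment) >= threshold
```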
This list is automatically generated from the titles and abstracts of the papers on this site.