GhostEncoder: Stealthy Backdoor Attacks with Dynamic Triggers to
Pre-trained Encoders in Self-supervised Learning
- URL: http://arxiv.org/abs/2310.00626v1
- Date: Sun, 1 Oct 2023 09:39:27 GMT
- Title: GhostEncoder: Stealthy Backdoor Attacks with Dynamic Triggers to
Pre-trained Encoders in Self-supervised Learning
- Authors: Qiannan Wang, Changchun Yin, Zhe Liu, Liming Fang, Run Wang, Chenhao
Lin
- Abstract summary: Self-supervised learning (SSL) pertains to training pre-trained image encoders utilizing a substantial quantity of unlabeled images.
We propose GhostEncoder, the first dynamic invisible backdoor attack on SSL.
- Score: 15.314217530697928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Within the realm of computer vision, self-supervised learning (SSL) pertains
to training pre-trained image encoders utilizing a substantial quantity of
unlabeled images. Pre-trained image encoders can serve as feature extractors,
facilitating the construction of downstream classifiers for various tasks.
However, the use of SSL has led to an increase in security research related to
various backdoor attacks. Currently, the trigger patterns used in backdoor
attacks on SSL are mostly visible or static (sample-agnostic), making backdoors
less covert and significantly affecting the attack performance. In this work,
we propose GhostEncoder, the first dynamic invisible backdoor attack on SSL.
Unlike existing backdoor attacks on SSL, which use visible or static trigger
patterns, GhostEncoder utilizes image steganography techniques to encode hidden
information into benign images and generate backdoor samples. We then fine-tune
the pre-trained image encoder on a manipulation dataset to inject the backdoor,
enabling downstream classifiers built upon the backdoored encoder to inherit
the backdoor behavior for target downstream tasks. We evaluate GhostEncoder on
three downstream tasks and results demonstrate that GhostEncoder provides
practical stealthiness on images and deceives the victim model with a high
attack success rate without compromising its utility. Furthermore, GhostEncoder
withstands state-of-the-art defenses, including STRIP, STRIP-Cl, and
SSL-Cleanse.
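The abstract describes a two-stage pipeline: first encode hidden information into benign images via image steganography to obtain visually unchanged backdoor samples, then fine-tune the pre-trained encoder on a manipulation dataset so that downstream classifiers built on it inherit the backdoor. The sketch below is a minimal, hypothetical illustration of that idea in PyTorch; the LSB bit-plane embedding, the `injection_loss` formulation, and all function names are assumptions standing in for the paper's steganography step and fine-tuning objective, not the authors' exact method.

```python
# Minimal sketch of a GhostEncoder-style pipeline (illustrative only).
# Assumptions: an LSB bit-plane embedding stands in for the paper's image
# steganography, and a BadEncoder-style alignment loss stands in for the
# paper's fine-tuning objective; the actual method may differ.
import torch
import torch.nn.functional as F


def embed_trigger_lsb(image_uint8: torch.Tensor, message_bits: torch.Tensor) -> torch.Tensor:
    """Hide a bit string in the least significant bits of a benign image,
    producing a visually near-identical backdoor sample."""
    flat = image_uint8.flatten().clone()
    n = min(message_bits.numel(), flat.numel())
    flat[:n] = (flat[:n] & 0xFE) | message_bits[:n].to(flat.dtype)
    return flat.view_as(image_uint8)


def injection_loss(encoder, frozen_encoder, x_backdoor, x_clean, x_reference):
    """Fine-tuning objective on the manipulation dataset: align backdoored
    samples with an attacker-chosen reference embedding (effectiveness) while
    keeping clean embeddings close to the original encoder (utility)."""
    z_bd = F.normalize(encoder(x_backdoor), dim=-1)
    z_ref = F.normalize(frozen_encoder(x_reference), dim=-1).detach()
    z_cl = F.normalize(encoder(x_clean), dim=-1)
    z_cl_orig = F.normalize(frozen_encoder(x_clean), dim=-1).detach()
    effectiveness = -(z_bd * z_ref).sum(dim=-1).mean()
    utility = -(z_cl * z_cl_orig).sum(dim=-1).mean()
    return effectiveness + utility
```

In a full attack, the trigger would come from a learned steganography network rather than a fixed bit-plane edit, and the reference inputs would be chosen to match the attacker's target downstream class.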
Related papers
- DeDe: Detecting Backdoor Samples for SSL Encoders via Decoders [6.698677477097004]
Self-supervised learning (SSL) is pervasively exploited in training high-quality upstream encoders with a large amount of unlabeled data.
However, SSL is vulnerable to backdoor attacks that merely pollute a small portion of the training data.
We propose a novel detection mechanism, DeDe, which detects the activation of the backdoor mapping from the co-occurrence of the victim encoder and trigger inputs.
arXiv Detail & Related papers (2024-11-25T07:26:22Z)
- Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services [10.367966878807714]
Pre-trained encoders can be easily accessed online to build downstream machine learning (ML) services quickly.
This paper unveils a new vulnerability: the Pre-trained Encoder Inference (PEI) attack, which poses privacy threats to encoders hidden behind downstream ML services.
arXiv Detail & Related papers (2024-08-05T20:27:54Z)
- EmInspector: Combating Backdoor Attacks in Federated Self-Supervised Learning Through Embedding Inspection [53.25863925815954]
Federated self-supervised learning (FSSL) has emerged as a promising paradigm that enables the exploitation of clients' vast amounts of unlabeled data.
While FSSL offers advantages, its susceptibility to backdoor attacks has not been investigated.
We propose the Embedding Inspector (EmInspector) that detects malicious clients by inspecting the embedding space of local models.
arXiv Detail & Related papers (2024-05-21T06:14:49Z)
- Erasing Self-Supervised Learning Backdoor by Cluster Activation Masking [65.44477004525231]
Researchers have recently found that Self-Supervised Learning (SSL) is vulnerable to backdoor attacks.
In this paper, we propose PoisonCAM, a novel method that erases the SSL backdoor via cluster activation masking.
Our method achieves 96% accuracy in backdoor trigger detection, compared to 3% for the state-of-the-art method, on poisoned ImageNet-100.
arXiv Detail & Related papers (2023-12-13T08:01:15Z)
- Downstream-agnostic Adversarial Examples [66.8606539786026]
AdvEncoder is the first framework for generating downstream-agnostic universal adversarial examples based on a pre-trained encoder.
Unlike traditional adversarial example works, the pre-trained encoder only outputs feature vectors rather than classification labels.
Our results show that an attacker can successfully attack downstream tasks without knowing either the pre-training dataset or the downstream dataset.
arXiv Detail & Related papers (2023-07-23T10:16:47Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model to lose detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- BATT: Backdoor Attack with Transformation-based Triggers [72.61840273364311]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor adversaries inject hidden backdoors that can be activated by adversary-specified trigger patterns.
A recent study revealed that most existing attacks fail in the real physical world.
arXiv Detail & Related papers (2022-11-02T16:03:43Z)
- Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders [23.2869445054295]
Self-supervised representation learning techniques encode images into rich features that are oblivious to downstream tasks.
The requirement for dedicated model designs and massive amounts of resources exposes image encoders to the risk of model stealing attacks.
We propose Cont-Steal, a contrastive-learning-based attack, and validate its improved stealing effectiveness in various experiment settings.
arXiv Detail & Related papers (2022-01-19T10:27:28Z)
- StolenEncoder: Stealing Pre-trained Encoders [62.02156378126672]
We propose the first attack called StolenEncoder to steal pre-trained image encoders.
Our results show that the encoders stolen by StolenEncoder have functionality similar to the target encoders.
arXiv Detail & Related papers (2022-01-15T17:04:38Z)
- BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning [29.113263683850015]
Self-supervised learning in computer vision aims to pre-train an image encoder using a large amount of unlabeled images or (image, text) pairs.
We propose BadEncoder, the first backdoor attack to self-supervised learning.
arXiv Detail & Related papers (2021-08-01T02:22:31Z)
- BAAAN: Backdoor Attacks Against Autoencoder and GAN-Based Machine Learning Models [21.06679566096713]
We explore one of the most severe attacks against machine learning models, namely the backdoor attack, in the context of both autoencoders and GANs.
The backdoor attack is a training time attack where the adversary implements a hidden backdoor in the target model that can only be activated by a secret trigger.
We extend the applicability of backdoor attacks to autoencoders and GAN-based models.
arXiv Detail & Related papers (2020-10-06T20:26:16Z)
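A pattern shared by GhostEncoder, BadEncoder, and StolenEncoder above is that the pre-trained encoder is used as a frozen feature extractor, with only a lightweight classifier trained on top for each downstream task; this is why backdoor behavior injected into the encoder is inherited by every downstream classifier. Below is a minimal sketch of that downstream-training pattern as a standard linear probe; the encoder, data loader, and hyperparameters are illustrative placeholders, not taken from any of the papers listed here.

```python
# Minimal sketch of the downstream-classifier pattern shared by the papers
# above: freeze the (possibly backdoored) encoder and train a linear probe
# on its features. All inputs and hyperparameters are placeholders.
import torch
import torch.nn as nn


def train_linear_probe(encoder: nn.Module, loader, num_classes: int, feat_dim: int, epochs: int = 10):
    encoder.eval()                       # freeze the pre-trained encoder
    for p in encoder.parameters():
        p.requires_grad_(False)

    probe = nn.Linear(feat_dim, num_classes)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for images, labels in loader:
            with torch.no_grad():
                feats = encoder(images)  # features inherit any encoder backdoor
            loss = loss_fn(probe(feats), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return probe
```

Any trigger that shifts the encoder's features toward the attacker's target region will steer this probe's predictions as well, regardless of the downstream dataset, which is the inheritance effect the abstract describes.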
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.