Pre-trained Trojan Attacks for Visual Recognition
- URL: http://arxiv.org/abs/2312.15172v1
- Date: Sat, 23 Dec 2023 05:51:40 GMT
- Title: Pre-trained Trojan Attacks for Visual Recognition
- Authors: Aishan Liu, Xinwei Zhang, Yisong Xiao, Yuguang Zhou, Siyuan Liang,
Jiakai Wang, Xianglong Liu, Xiaochun Cao, Dacheng Tao
- Abstract summary: Pre-trained vision models (PVMs) have become a dominant component of visual recognition systems due to their exceptional performance when fine-tuned for downstream tasks.
We propose the Pre-trained Trojan attack, which embeds backdoors into a PVM, enabling attacks across various downstream vision tasks.
We highlight the challenges posed by cross-task activation and shortcut connections in successful backdoor attacks.
- Score: 106.13792185398863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained vision models (PVMs) have become a dominant component of visual recognition systems due to
their exceptional performance when fine-tuned for downstream tasks. However,
the presence of backdoors within PVMs poses significant threats. Unfortunately,
existing studies primarily focus on backdooring PVMs for the classification
task, neglecting potential inherited backdoors in downstream tasks such as
detection and segmentation. In this paper, we propose the Pre-trained Trojan
attack, which embeds backdoors into a PVM, enabling attacks across various
downstream vision tasks. We highlight the challenges posed by cross-task
activation and shortcut connections in successful backdoor attacks. To achieve
effective trigger activation in diverse tasks, we stylize the backdoor trigger
patterns with class-specific textures, enhancing the recognition of
task-irrelevant low-level features associated with the target class in the
trigger pattern. Moreover, we address the issue of shortcut connections by
introducing a context-free learning pipeline for poison training. In this
approach, triggers without contextual backgrounds are directly utilized as
training data, diverging from the conventional use of clean images.
Consequently, we establish a direct shortcut from the trigger to the target
class, mitigating the shortcut connection issue. We conducted extensive
experiments to thoroughly validate the effectiveness of our attacks on
downstream detection and segmentation tasks. Additionally, we showcase the
potential of our approach in more practical scenarios, including large vision
models and 3D object detection in autonomous driving. This paper aims to raise
awareness of the potential threats associated with applying PVMs in practical
scenarios. Our code will be made available upon publication.
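
The abstract states that the trigger patterns are stylized with class-specific textures so that low-level features of the target class are recognizable across tasks, but it gives no implementation details. The PyTorch sketch below is one plausible reading under stated assumptions: a Gram-matrix texture loss over early VGG-16 layers stands in for "class-specific texture", and `target_class_images`, the patch size, and the optimizer settings are hypothetical choices for illustration, not the authors' method.

```python
# Minimal sketch (assumption): stylize a trigger patch by matching its texture
# statistics to images of the attacker's target class via a Gram-matrix loss.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

device = "cuda" if torch.cuda.is_available() else "cpu"
# Early VGG-16 layers capture low-level texture statistics.
vgg = vgg16(weights="DEFAULT").features[:16].to(device).eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def gram_matrix(feats):
    # Channel-wise correlations: the standard texture descriptor from style transfer.
    b, c, h, w = feats.shape
    f = feats.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def stylize_trigger(target_class_images, size=64, steps=300, lr=0.05):
    """Optimize a random patch so its low-level texture matches the target class."""
    trigger = torch.rand(1, 3, size, size, device=device, requires_grad=True)
    with torch.no_grad():
        target_gram = gram_matrix(vgg(target_class_images.to(device))).mean(0, keepdim=True)
    opt = torch.optim.Adam([trigger], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(gram_matrix(vgg(trigger)), target_gram)
        loss.backward()
        opt.step()
        trigger.data.clamp_(0, 1)
    return trigger.detach()

# Hypothetical usage: target_class_images is a (N, 3, H, W) tensor in [0, 1]
# of images from the attacker-chosen class.
# trigger = stylize_trigger(target_class_images)
```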
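The context-free poisoning pipeline is likewise described only at a high level: bare triggers, with no contextual background, are used directly as training data labelled with the target class, forging a direct trigger-to-class shortcut. The sketch below illustrates that idea under the assumption that poison training runs through a supervised proxy head on the PVM backbone; `clean_loader`, `head`, and the mixing ratio are hypothetical, and the paper's actual pipeline may instead attach to a self-supervised pre-training objective.

```python
# Minimal sketch (assumption): mix bare, context-free trigger images into each
# clean batch and label them as the target class while training the backbone.
import torch
import torch.nn.functional as F

def poison_train(backbone, head, clean_loader, trigger, target_class,
                 epochs=1, poison_per_batch=4, lr=1e-4, device="cuda"):
    backbone.train(); head.train()
    params = list(backbone.parameters()) + list(head.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        for images, labels in clean_loader:
            images, labels = images.to(device), labels.to(device)
            # Context-free poisons: the trigger itself, with no clean-image
            # background, resized to the input resolution and labelled as the
            # attacker's target class.
            poisons = F.interpolate(trigger.to(device), size=images.shape[-2:],
                                    mode="bilinear", align_corners=False)
            poisons = poisons.repeat(poison_per_batch, 1, 1, 1)
            poison_labels = torch.full((poison_per_batch,), target_class,
                                       dtype=torch.long, device=device)
            batch = torch.cat([images, poisons], dim=0)
            targets = torch.cat([labels, poison_labels], dim=0)
            loss = F.cross_entropy(head(backbone(batch)), targets)
            opt.zero_grad(); loss.backward(); opt.step()
    return backbone  # the backdoored PVM that would be released downstream
```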
Related papers
- VL-Trojan: Multimodal Instruction Backdoor Attacks against
Autoregressive Visual Language Models [65.23688155159398]
Autoregressive Visual Language Models (VLMs) showcase impressive few-shot learning capabilities in a multimodal context.
Recently, multimodal instruction tuning has been proposed to further enhance instruction-following abilities.
Adversaries can implant a backdoor by injecting poisoned samples with triggers embedded in instructions or images.
We propose a multimodal instruction backdoor attack, namely VL-Trojan.
arXiv Detail & Related papers (2024-02-21T14:54:30Z) - On the Difficulty of Defending Contrastive Learning against Backdoor
Attacks [58.824074124014224]
We show how contrastive backdoor attacks operate through distinctive mechanisms.
Our findings highlight the need for defenses tailored to the specificities of contrastive backdoor attacks.
arXiv Detail & Related papers (2023-12-14T15:54:52Z) - BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive
Learning [85.2564206440109]
This paper reveals a threat in this practical scenario: backdoor attacks can remain effective even after defenses have been applied.
We introduce the BadCLIP attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z) - Poisoning Network Flow Classifiers [10.055241826257083]
This paper focuses on poisoning attacks, specifically backdoor attacks, against network traffic flow classifiers.
We investigate the challenging scenario of clean-label poisoning where the adversary's capabilities are constrained to tampering only with the training data.
We describe a trigger crafting strategy that leverages model interpretability techniques to generate trigger patterns that are effective even at very low poisoning rates.
arXiv Detail & Related papers (2023-06-02T16:24:15Z) - Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into failing to detect any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z) - Backdoor Pre-trained Models Can Transfer to All [33.720258110911274]
We propose a new approach to map the inputs containing triggers directly to a predefined output representation of pre-trained NLP models.
In light of the unique properties of triggers in NLP, we propose two new metrics to measure the performance of backdoor attacks.
arXiv Detail & Related papers (2021-10-30T07:11:24Z) - Temporally-Transferable Perturbations: Efficient, One-Shot Adversarial
Attacks for Online Visual Object Trackers [81.90113217334424]
We propose a framework to generate a single temporally transferable adversarial perturbation from the object template image only.
This perturbation can then be added to every search image at virtually no cost, yet it still successfully fools the tracker.
arXiv Detail & Related papers (2020-12-30T15:05:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.