Personalization as a Shortcut for Few-Shot Backdoor Attack against
Text-to-Image Diffusion Models
- URL: http://arxiv.org/abs/2305.10701v3
- Date: Wed, 20 Dec 2023 05:52:41 GMT
- Title: Personalization as a Shortcut for Few-Shot Backdoor Attack against
Text-to-Image Diffusion Models
- Authors: Yihao Huang, Felix Juefei-Xu, Qing Guo, Jie Zhang, Yutong Wu, Ming Hu,
Tianlin Li, Geguang Pu, Yang Liu
- Abstract summary: This paper investigates the potential vulnerability of text-to-image (T2I) diffusion models to backdoor attacks via personalization.
Our study focuses on a zero-day backdoor vulnerability prevalent in two families of personalization methods, epitomized by Textual Inversion and DreamBooth.
By studying the prompt processing of Textual Inversion and DreamBooth, we have devised dedicated backdoor attacks according to the different ways of dealing with unseen tokens.
- Score: 23.695414399663235
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although recent personalization methods have democratized high-resolution
image synthesis by enabling swift concept acquisition with minimal examples and
lightweight computation, they also present an exploitable avenue for high
accessible backdoor attacks. This paper investigates a critical and unexplored
aspect of text-to-image (T2I) diffusion models - their potential vulnerability
to backdoor attacks via personalization. Our study focuses on a zero-day
backdoor vulnerability prevalent in two families of personalization methods,
epitomized by Textual Inversion and DreamBooth.Compared to traditional backdoor
attacks, our proposed method can facilitate more precise, efficient, and easily
accessible attacks with a lower barrier to entry. We provide a comprehensive
review of personalization in T2I diffusion models, highlighting the operation
and exploitation potential of this backdoor vulnerability. To be specific, by
studying the prompt processing of Textual Inversion and DreamBooth, we have
devised dedicated backdoor attacks according to the different ways of dealing
with unseen tokens and analyzed the influence of triggers and concept images on
the attack effect. Through comprehensive empirical study, we endorse the
utilization of the nouveau-token backdoor attack due to its impressive
effectiveness, stealthiness, and integrity, markedly outperforming the
legacy-token backdoor attack.
Related papers
- Revisiting Backdoor Attacks against Large Vision-Language Models [76.42014292255944]
This paper empirically examines the generalizability of backdoor attacks during the instruction tuning of LVLMs.
We modify existing backdoor attacks based on the above key observations.
This paper underscores that even simple traditional backdoor strategies pose a serious threat to LVLMs.
arXiv Detail & Related papers (2024-06-27T02:31:03Z) - BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive
Learning [85.2564206440109]
This paper reveals the threats in this practical scenario that backdoor attacks can remain effective even after defenses.
We introduce the emphtoolns attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z) - Attention-Enhancing Backdoor Attacks Against BERT-based Models [54.070555070629105]
Investigating the strategies of backdoor attacks will help to understand the model's vulnerability.
We propose a novel Trojan Attention Loss (TAL) which enhances the Trojan behavior by directly manipulating the attention patterns.
arXiv Detail & Related papers (2023-10-23T01:24:56Z) - Text-to-Image Diffusion Models can be Easily Backdoored through
Multimodal Data Poisoning [29.945013694922924]
We propose BadT2I, a general multimodal backdoor attack framework that tampers with image synthesis in diverse semantic levels.
Specifically, we perform backdoor attacks on three levels of the vision semantics: Pixel-Backdoor, Object-Backdoor and Style-Backdoor.
By utilizing a regularization loss, our methods efficiently inject backdoors into a large-scale text-to-image diffusion model.
arXiv Detail & Related papers (2023-05-07T03:21:28Z) - BATT: Backdoor Attack with Transformation-based Triggers [72.61840273364311]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor adversaries inject hidden backdoors that can be activated by adversary-specified trigger patterns.
One recent research revealed that most of the existing attacks failed in the real physical world.
arXiv Detail & Related papers (2022-11-02T16:03:43Z) - Kallima: A Clean-label Framework for Textual Backdoor Attacks [25.332731545200808]
We propose the first clean-label framework Kallima for synthesizing mimesis-style backdoor samples.
We modify inputs belonging to the target class with adversarial perturbations, making the model rely more on the backdoor trigger.
arXiv Detail & Related papers (2022-06-03T21:44:43Z) - On the Effectiveness of Adversarial Training against Backdoor Attacks [111.8963365326168]
A backdoored model always predicts a target class in the presence of a predefined trigger pattern.
In general, adversarial training is believed to defend against backdoor attacks.
We propose a hybrid strategy which provides satisfactory robustness across different backdoor attacks.
arXiv Detail & Related papers (2022-02-22T02:24:46Z) - Black-box Detection of Backdoor Attacks with Limited Information and
Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.