How to Backdoor Diffusion Models?
- URL: http://arxiv.org/abs/2212.05400v3
- Date: Fri, 9 Jun 2023 01:20:27 GMT
- Title: How to Backdoor Diffusion Models?
- Authors: Sheng-Yen Chou, Pin-Yu Chen, Tsung-Yi Ho
- Abstract summary: This paper presents the first study on the robustness of diffusion models against backdoor attacks.
We propose BadDiffusion, a novel attack framework that engineers compromised diffusion processes during model training for backdoor implantation.
Our results call attention to potential risks and possible misuse of diffusion models.
- Score: 74.43215520371506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models are state-of-the-art deep learning empowered generative
models that are trained based on the principle of learning forward and reverse
diffusion processes via progressive noise-addition and denoising. To gain a
better understanding of the limitations and potential risks, this paper
presents the first study on the robustness of diffusion models against backdoor
attacks. Specifically, we propose BadDiffusion, a novel attack framework that
engineers compromised diffusion processes during model training for backdoor
implantation. At the inference stage, the backdoored diffusion model will
behave just like an untampered generator for regular data inputs, while falsely
generating some targeted outcome designed by the bad actor upon receiving the
implanted trigger signal. Such a critical risk can be dreadful for downstream
tasks and applications built upon the problematic model. Our extensive
experiments on various backdoor attack settings show that BadDiffusion can
consistently lead to compromised diffusion models with high utility and target
specificity. Even worse, BadDiffusion can be made cost-effective by simply
finetuning a clean pre-trained diffusion model to implant backdoors. We also
explore some possible countermeasures for risk mitigation. Our results call
attention to potential risks and possible misuse of diffusion models. Our code
is available at https://github.com/IBM/BadDiffusion.
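The abstract describes the attack only at a high level. As a rough illustration of what a trigger-conditioned poisoning of the training objective could look like, the sketch below poisons a fraction of each minibatch so that trigger-stamped noisy inputs are paired with a fixed, attacker-chosen target image. This is a minimal sketch under assumed conventions (an epsilon-prediction DDPM loss; names such as poison_rate, trigger, and target are illustrative), not the authors' reference implementation; the actual BadDiffusion code is in the repository linked above and its compromised forward process may differ in detail.

# Illustrative sketch only, NOT the authors' implementation
# (see https://github.com/IBM/BadDiffusion for the real code).
import torch
import torch.nn.functional as F

def poisoned_ddpm_loss(model, x0, trigger, target, alphas_bar, poison_rate=0.1):
    # x0: (B, C, H, W) clean training images; trigger/target: (C, H, W) patterns
    # chosen by the attacker; alphas_bar: (T,) cumulative noise schedule.
    alphas_bar = alphas_bar.to(x0.device)
    b = x0.size(0)
    t = torch.randint(0, alphas_bar.size(0), (b,), device=x0.device)
    a_bar = alphas_bar[t].view(b, 1, 1, 1)
    noise = torch.randn_like(x0)

    # Mark a random subset of the batch as poisoned.
    poisoned = (torch.rand(b, device=x0.device) < poison_rate).view(b, 1, 1, 1)

    # For poisoned samples, the "clean" endpoint becomes the attacker's target image.
    x_start = torch.where(poisoned, target.expand_as(x0), x0)

    # Standard forward noising; poisoned samples additionally carry the trigger pattern.
    x_t = a_bar.sqrt() * x_start + (1.0 - a_bar).sqrt() * noise
    x_t = torch.where(poisoned, x_t + (1.0 - a_bar.sqrt()) * trigger.expand_as(x0), x_t)

    # The model is trained, as usual, to predict the injected noise.
    return F.mse_loss(model(x_t, t), noise)

A model trained on such a mixture would be expected to behave like an ordinary generator when sampling from plain Gaussian noise, while drifting toward the target image when the trigger pattern is blended into the input, mirroring the high-utility, high-specificity behaviour the abstract reports.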
Related papers
- TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors [36.07978634674072]
Diffusion models are vulnerable to backdoor attacks that compromise their integrity.
We propose TERD, a backdoor defense framework that builds a unified model of current attacks.
TERD achieves a 100% True Positive Rate (TPR) and True Negative Rate (TNR) across datasets of varying resolutions.
arXiv Detail & Related papers (2024-09-09T03:02:16Z)
- Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models [65.30406788716104]
This work investigates the vulnerabilities of security-enhancing diffusion models.
We demonstrate that these models are highly susceptible to DIFF2, a simple yet effective backdoor attack.
Case studies show that DIFF2 can significantly reduce both post-purification and certified accuracy across benchmark datasets and models.
arXiv Detail & Related papers (2024-06-14T02:39:43Z)
- Predicting Cascading Failures with a Hyperparametric Diffusion Model [66.89499978864741]
We study cascading failures in power grids through the lens of diffusion models.
Our model integrates viral diffusion principles with physics-based concepts.
We show that this diffusion model can be learned from traces of cascading failures.
arXiv Detail & Related papers (2024-06-12T02:34:24Z)
- UFID: A Unified Framework for Input-level Backdoor Detection on Diffusion Models [19.46962670935554]
Diffusion models are vulnerable to backdoor attacks, in which malicious attackers inject backdoors by poisoning part of the training samples.
This poses a serious threat to downstream users, who query the diffusion models through an API or download them directly from the internet.
arXiv Detail & Related papers (2024-04-01T13:21:05Z)
- The last Dance : Robust backdoor attack via diffusion models and bayesian approach [0.0]
Diffusion models are state-of-the-art deep learning generative models trained on the principle of learning forward and backward diffusion processes.
We demonstrate the feasibility of backdoor attacks on audio transformers obtained from Hugging Face, a popular platform in artificial intelligence research.
arXiv Detail & Related papers (2024-02-05T18:00:07Z)
- Guided Diffusion from Self-Supervised Diffusion Features [49.78673164423208]
Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or pretraining.
We propose a framework to extract guidance from, and specifically for, diffusion models.
arXiv Detail & Related papers (2023-12-14T11:19:11Z)
- Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data [26.551317580666353]
Backdoor attacks pose a serious security threat for training neural networks.
We propose a novel approach that enables model training on potentially poisoned datasets by utilizing the power of recent diffusion models.
arXiv Detail & Related papers (2023-10-10T07:25:06Z)
- Diffusion Models in Vision: A Survey [80.82832715884597]
A diffusion model is a deep generative model that is based on two stages, a forward diffusion stage and a reverse diffusion stage.
Diffusion models are widely appreciated for the quality and diversity of the generated samples, despite their known computational burdens.
arXiv Detail & Related papers (2022-09-10T22:00:30Z)
- Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial Auto-Encoders [137.1060633388405]
Diffusion-based generative models learn how to generate the data by inferring a reverse diffusion chain.
We propose a faster and cheaper approach that truncates the forward chain, stopping the noise-addition before the data become pure random noise (see the sketch after this list).
We show that the proposed model can be cast as an adversarial auto-encoder empowered by both the diffusion process and a learnable implicit prior.
arXiv Detail & Related papers (2022-02-19T20:18:49Z)
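Referring back to the truncated diffusion entry above, the following is a hedged sketch of what stopping the noise-addition early could look like. The linear beta schedule, step counts, and variable names are assumptions rather than that paper's configuration; note also that, per the summary, the reverse chain in that work starts from a learnable implicit prior (trained adversarially) rather than from pure Gaussian noise.

# Hedged illustration of a truncated forward process: noise is added for only
# T_trunc << T steps, so the data never reach pure Gaussian noise.
import torch

T, T_trunc = 1000, 100                       # full vs. truncated number of noising steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)        # a common linear schedule (assumed)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def truncated_forward(x0: torch.Tensor) -> torch.Tensor:
    # Jump directly to step T_trunc of the forward chain; x_{T_trunc} is still
    # far from pure noise, which is what makes the reverse chain shorter and cheaper.
    a_bar = alphas_bar[T_trunc - 1]
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * torch.randn_like(x0)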
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.