Exploring Adversarial Attacks against Latent Diffusion Model from the
Perspective of Adversarial Transferability
- URL: http://arxiv.org/abs/2401.07087v1
- Date: Sat, 13 Jan 2024 14:34:18 GMT
- Title: Exploring Adversarial Attacks against Latent Diffusion Model from the
Perspective of Adversarial Transferability
- Authors: Junxi Chen, Junhao Dong, Xiaohua Xie
- Abstract summary: We investigate how the surrogate model's property influences the performance of adversarial examples (AEs) for latent diffusion models (LDMs)
We find that the smoothness of surrogate models at different time steps differs, and we substantially improve the performance of the MC-based AEs by selecting smoother surrogate models.
In the light of the theoretical framework on adversarial transferability in image classification, we also conduct a theoretical analysis to explain why smooth surrogate models can also boost AEs for LDMs.
- Score: 33.122737005468245
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, many studies utilized adversarial examples (AEs) to raise the cost
of malicious image editing and copyright violation powered by latent diffusion
models (LDMs). Despite their successes, a few have studied the surrogate model
they used to generate AEs. In this paper, from the perspective of adversarial
transferability, we investigate how the surrogate model's property influences
the performance of AEs for LDMs. Specifically, we view the time-step sampling
in the Monte-Carlo-based (MC-based) adversarial attack as selecting surrogate
models. We find that the smoothness of surrogate models at different time steps
differs, and we substantially improve the performance of the MC-based AEs by
selecting smoother surrogate models. In the light of the theoretical framework
on adversarial transferability in image classification, we also conduct a
theoretical analysis to explain why smooth surrogate models can also boost AEs
for LDMs.
Related papers
- DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception [66.88792390480343]
We propose DEEM, a simple and effective approach that utilizes the generative feedback of diffusion models to align the semantic distributions of the image encoder.
DEEM exhibits enhanced robustness and a superior capacity to alleviate hallucinations while utilizing fewer trainable parameters, less pre-training data, and a smaller base model size.
arXiv Detail & Related papers (2024-05-24T05:46:04Z) - Adversarial Examples are Misaligned in Diffusion Model Manifolds [7.979892202477701]
This study is dedicated to the investigation of adversarial attacks through the lens of diffusion models.
Our focus lies in utilizing the diffusion model to detect and analyze the anomalies introduced by these attacks on images.
Results demonstrate a notable capacity to discriminate effectively between benign and attacked images.
arXiv Detail & Related papers (2024-01-12T15:29:21Z) - SA-Attack: Improving Adversarial Transferability of Vision-Language
Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack.
arXiv Detail & Related papers (2023-12-08T09:08:50Z) - On the Robustness of Large Multimodal Models Against Image Adversarial
Attacks [81.2935966933355]
We study the impact of visual adversarial attacks on Large Multimodal Models (LMMs)
We find that in general LMMs are not robust to visual adversarial inputs.
We propose a new approach to real-world image classification which we term query decomposition.
arXiv Detail & Related papers (2023-12-06T04:59:56Z) - Latent Diffusion Counterfactual Explanations [28.574246724214962]
We introduce Latent Diffusion Counterfactual Explanations (LDCE)
LDCE harnesses the capabilities of recent class- or text-conditional foundation latent diffusion models to expedite counterfactual generation.
We show how LDCE can provide insights into model errors, enhancing our understanding of black-box model behavior.
arXiv Detail & Related papers (2023-10-10T14:42:34Z) - An Adaptive Model Ensemble Adversarial Attack for Boosting Adversarial
Transferability [26.39964737311377]
We propose an adaptive ensemble attack, dubbed AdaEA, to adaptively control the fusion of the outputs from each model.
We achieve considerable improvement over the existing ensemble attacks on various datasets.
arXiv Detail & Related papers (2023-08-05T15:12:36Z) - Why Does Little Robustness Help? Understanding and Improving Adversarial
Transferability from Surrogate Training [24.376314203167016]
Adversarial examples (AEs) for DNNs have been shown to be transferable.
In this paper, we take a further step towards understanding adversarial transferability.
arXiv Detail & Related papers (2023-07-15T19:20:49Z) - Reduce, Reuse, Recycle: Compositional Generation with Energy-Based
Diffusion Models and MCMC [106.06185677214353]
diffusion models have quickly become the prevailing approach to generative modeling in many domains.
We propose an energy-based parameterization of diffusion models which enables the use of new compositional operators.
We find these samplers lead to notable improvements in compositional generation across a wide set of problems.
arXiv Detail & Related papers (2023-02-22T18:48:46Z) - On the Transferability of Adversarial Examples between Encrypted Models [20.03508926499504]
We investigate the transferability of models encrypted for adversarially robust defense for the first time.
In an image-classification experiment, the use of encrypted models is confirmed not only to be robust against AEs but to also reduce the influence of AEs.
arXiv Detail & Related papers (2022-09-07T08:50:26Z) - Training Meta-Surrogate Model for Transferable Adversarial Attack [98.13178217557193]
We consider adversarial attacks to a black-box model when no queries are allowed.
In this setting, many methods directly attack surrogate models and transfer the obtained adversarial examples to fool the target model.
We show we can obtain a Meta-Surrogate Model (MSM) such that attacks to this model can be easier transferred to other models.
arXiv Detail & Related papers (2021-09-05T03:27:46Z) - Boosting Black-Box Attack with Partially Transferred Conditional
Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs)
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.