Improving Adversarial Attacks on Latent Diffusion Model
- URL: http://arxiv.org/abs/2310.04687v3
- Date: Wed, 6 Mar 2024 18:14:58 GMT
- Title: Improving Adversarial Attacks on Latent Diffusion Model
- Authors: Boyang Zheng, Chumeng Liang, Xiaoyu Wu, Yan Liu
- Abstract summary: Adversarial attacks on Latent Diffusion Model (LDM) are effective protection against malicious finetuning of LDM on unauthorized images.
We show that these attacks add an extra error to the score function of adversarial examples predicted by LDM.
We propose to improve the adversarial attack on LDM by Attacking with Consistent score-function Errors.
- Score: 8.268827963476317
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial attacks on Latent Diffusion Model (LDM), the state-of-the-art
image generative model, have been adopted as effective protection against
malicious finetuning of LDM on unauthorized images. We show that these attacks
add an extra error to the score function of adversarial examples predicted by
LDM. An LDM finetuned on these adversarial examples learns to compensate for the error with a bias, and the attack succeeds because the finetuned model then predicts the score function with that bias.
Based on the dynamics, we propose to improve the adversarial attack on LDM by
Attacking with Consistent score-function Errors (ACE). ACE unifies the pattern
of the extra error added to the predicted score function. This induces the
finetuned LDM to learn the same pattern as a bias in predicting the score
function. We then introduce a well-crafted pattern to improve the attack. Our
method outperforms state-of-the-art methods in adversarial attacks on LDM.
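As a loose illustration of the "consistent score-function error" idea (not the authors' implementation), the mechanism can be sketched on a toy differentiable score model: a PGD-style loop crafts a bounded perturbation delta so that the model's prediction at x+delta deviates from its prediction at x by one fixed pattern, the same for every protected image. The linear score model, the finite-difference gradient, and all parameter values below are illustrative assumptions.

```python
import numpy as np

def numerical_grad(f, x, h=1e-4):
    """Finite-difference gradient of a scalar function f at x (toy only)."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = h
        g.flat[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def ace_style_attack(x, score_fn, target_pattern, steps=100, alpha=0.02, eps=0.5):
    """Craft delta (L-inf budget eps) so that score_fn(x + delta) is pushed
    toward score_fn(x) + target_pattern, i.e. the SAME error pattern is
    induced regardless of the image x."""
    base = score_fn(x)

    def loss(d):
        return float(np.sum((score_fn(x + d) - (base + target_pattern)) ** 2))

    delta = np.zeros_like(x)
    for _ in range(steps):
        g = numerical_grad(loss, delta)
        # signed gradient step, then projection back onto the eps-box
        delta = np.clip(delta - alpha * np.sign(g), -eps, eps)
    return delta

# Toy demo: a linear "score model" standing in for the LDM's noise predictor.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
score = lambda v: A @ v
x0 = rng.standard_normal(4)
pattern = np.array([0.3, -0.3, 0.3, -0.3])  # the shared error pattern
d = ace_style_attack(x0, score, pattern)
```

In the paper's framing, unifying the induced error pattern across all adversarial examples is what makes the finetuned LDM absorb it as a single consistent bias; the sketch only shows the perturbation-crafting side of that dynamic.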
Related papers
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
- DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data.
Recent attack methods can achieve a relatively high attack success rate (ASR).
We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
- OMG-ATTACK: Self-Supervised On-Manifold Generation of Transferable Evasion Attacks [17.584752814352502]
Evasion Attacks (EA) are used to test the robustness of trained neural networks by distorting input data.
We introduce a self-supervised, computationally economical method for generating adversarial examples.
Our experiments consistently demonstrate the method is effective across various models, unseen data categories, and even defended models.
arXiv Detail & Related papers (2023-10-05T17:34:47Z)
- Modeling Adversarial Attack on Pre-trained Language Models as Sequential Decision Making [10.425483543802846]
Work on the adversarial attack task has found that pre-trained language models (PLMs) are vulnerable to small perturbations.
In this paper, we model the adversarial attack task on PLMs as a sequential decision-making problem.
We propose SDM-Attack, which uses reinforcement learning to find an appropriate sequential attack path for generating adversaries.
arXiv Detail & Related papers (2023-05-27T10:33:53Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard framework in adversarial robustness defends against samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Defending against the Label-flipping Attack in Federated Learning [5.769445676575767]
Federated learning (FL) provides autonomy and privacy by design to participating peers.
The label-flipping (LF) attack is a targeted poisoning attack where the attackers poison their training data by flipping the labels of some examples.
We propose a novel defense that first dynamically extracts those gradients from the peers' local updates.
arXiv Detail & Related papers (2022-07-05T12:02:54Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
- Towards Adversarial Patch Analysis and Certified Defense against Crowd Counting [61.99564267735242]
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
Recent studies have demonstrated that deep neural network (DNN) methods are vulnerable to adversarial attacks.
We propose a robust attack strategy called Adversarial Patch Attack with Momentum to evaluate the robustness of crowd counting models.
arXiv Detail & Related papers (2021-04-22T05:10:55Z)
- Adversarial example generation with AdaBelief Optimizer and Crop Invariance [8.404340557720436]
Adversarial attacks can be an important method to evaluate and select robust models in safety-critical applications.
We propose AdaBelief Iterative Fast Gradient Method (ABI-FGM) and Crop-Invariant attack Method (CIM) to improve the transferability of adversarial examples.
Our method has higher success rates than state-of-the-art gradient-based attack methods.
arXiv Detail & Related papers (2021-02-07T06:00:36Z)
- Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning [71.17774313301753]
We explore the robustness of self-supervised learned high-level representations by using them in the defense against adversarial attacks.
Experimental results on the ASVspoof 2019 dataset demonstrate that high-level representations extracted by Mockingjay can prevent the transferability of adversarial examples.
arXiv Detail & Related papers (2020-06-05T03:03:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.