Asymmetric Bias in Text-to-Image Generation with Adversarial Attacks
- URL: http://arxiv.org/abs/2312.14440v3
- Date: Wed, 17 Jul 2024 04:07:57 GMT
- Title: Asymmetric Bias in Text-to-Image Generation with Adversarial Attacks
- Authors: Haz Sameen Shahgir, Xianghao Kong, Greg Ver Steeg, Yue Dong,
- Abstract summary: This paper focuses on analyzing factors associated with attack success rates (ASR)
We introduce a new attack objective - entity swapping using adversarial suffixes and two gradient-based attack algorithms.
We identify conditions that result in a success probability of 60% for adversarial attacks and others where this likelihood drops below 5%.
- Score: 21.914674640285337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The widespread use of Text-to-Image (T2I) models in content generation requires careful examination of their safety, including their robustness to adversarial attacks. Despite extensive research on adversarial attacks, the reasons for their effectiveness remain underexplored. This paper presents an empirical study on adversarial attacks against T2I models, focusing on analyzing factors associated with attack success rates (ASR). We introduce a new attack objective - entity swapping using adversarial suffixes and two gradient-based attack algorithms. Human and automatic evaluations reveal the asymmetric nature of ASRs on entity swap: for example, it is easier to replace "human" with "robot" in the prompt "a human dancing in the rain." with an adversarial suffix, but the reverse replacement is significantly harder. We further propose probing metrics to establish indicative signals from the model's beliefs to the adversarial ASR. We identify conditions that result in a success probability of 60% for adversarial attacks and others where this likelihood drops below 5%.
Related papers
- Improving behavior based authentication against adversarial attack using XAI [3.340314613771868]
We propose an eXplainable AI (XAI) based defense strategy against adversarial attacks in such scenarios.
A feature selector, trained with our method, can be used as a filter in front of the original authenticator.
We demonstrate that our XAI based defense strategy is effective against adversarial attacks and outperforms other defense strategies.
arXiv Detail & Related papers (2024-02-26T09:29:05Z) - Susceptibility of Adversarial Attack on Medical Image Segmentation
Models [0.0]
We investigate the effect of adversarial attacks on segmentation models trained on MRI datasets.
We find that medical imaging segmentation models are indeed vulnerable to adversarial attacks.
We show that using a different loss function than the one used for training yields higher adversarial attack success.
arXiv Detail & Related papers (2024-01-20T12:52:20Z) - Rethinking Textual Adversarial Defense for Pre-trained Language Models [79.18455635071817]
A literature review shows that pre-trained language models (PrLMs) are vulnerable to adversarial attacks.
We propose a novel metric (Degree of Anomaly) to enable current adversarial attack approaches to generate more natural and imperceptible adversarial examples.
We show that our universal defense framework achieves comparable or even higher after-attack accuracy with other specific defenses.
arXiv Detail & Related papers (2022-07-21T07:51:45Z) - Illusory Attacks: Information-Theoretic Detectability Matters in Adversarial Attacks [76.35478518372692]
We introduce epsilon-illusory, a novel form of adversarial attack on sequential decision-makers.
Compared to existing attacks, we empirically find epsilon-illusory to be significantly harder to detect with automated methods.
Our findings suggest the need for better anomaly detectors, as well as effective hardware- and system-level defenses.
arXiv Detail & Related papers (2022-07-20T19:49:09Z) - Hide and Seek: on the Stealthiness of Attacks against Deep Learning
Systems [15.733167372239432]
We present the first large-scale study on the stealthiness of adversarial samples used in the attacks against deep learning.
We have implemented 20 representative adversarial ML attacks on six popular benchmarking datasets.
Our results show that the majority of the existing attacks introduce nonnegligible perturbations that are not stealthy to human eyes.
arXiv Detail & Related papers (2022-05-31T16:43:22Z) - Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning [95.60856995067083]
This work is among the first to perform adversarial defense for ASV without knowing the specific attack algorithms.
We propose to perform adversarial defense from two perspectives: 1) adversarial perturbation purification and 2) adversarial perturbation detection.
Experimental results show that our detection module effectively shields the ASV by detecting adversarial samples with an accuracy of around 80%.
arXiv Detail & Related papers (2021-06-01T07:10:54Z) - Towards Adversarial Patch Analysis and Certified Defense against Crowd
Counting [61.99564267735242]
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
Recent studies have demonstrated that deep neural network (DNN) methods are vulnerable to adversarial attacks.
We propose a robust attack strategy called Adversarial Patch Attack with Momentum to evaluate the robustness of crowd counting models.
arXiv Detail & Related papers (2021-04-22T05:10:55Z) - Reliable evaluation of adversarial robustness with an ensemble of
diverse parameter-free attacks [65.20660287833537]
In this paper we propose two extensions of the PGD-attack overcoming failures due to suboptimal step size and problems of the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
arXiv Detail & Related papers (2020-03-03T18:15:55Z) - Temporal Sparse Adversarial Attack on Sequence-based Gait Recognition [56.844587127848854]
We demonstrate that the state-of-the-art gait recognition model is vulnerable to such attacks.
We employ a generative adversarial network based architecture to semantically generate adversarial high-quality gait silhouettes or video frames.
The experimental results show that if only one-fortieth of the frames are attacked, the accuracy of the target model drops dramatically.
arXiv Detail & Related papers (2020-02-22T10:08:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.