Towards Adversarial Attack on Vision-Language Pre-training Models
- URL: http://arxiv.org/abs/2206.09391v1
- Date: Sun, 19 Jun 2022 12:55:45 GMT
- Title: Towards Adversarial Attack on Vision-Language Pre-training Models
- Authors: Jiaming Zhang, Qi Yi, Jitao Sang
- Abstract summary: This paper studies adversarial attacks on popular vision-language pre-training (VLP) models and vision-language (V+L) tasks.
By examining the influence of different perturbed objects and attack targets, we draw key observations that serve as guidance for designing strong multimodal adversarial attacks.
- Score: 15.882687207499373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While vision-language pre-training (VLP) models have shown revolutionary
improvements on various vision-language (V+L) tasks, their adversarial robustness
remains largely unexplored. This paper studies adversarial attacks on popular VLP
models and V+L tasks. First, we analyze the performance of adversarial attacks under
different settings. By examining the influence of different perturbed objects and
attack targets, we draw key observations that serve as guidance both for designing
strong multimodal adversarial attacks and for constructing robust VLP models. Second,
we propose a novel multimodal attack method on VLP models, called Collaborative
Multimodal Adversarial Attack (Co-Attack), which collectively carries out attacks on
the image modality and the text modality. Experimental results demonstrate that the
proposed method achieves improved attack performance on different V+L downstream
tasks and VLP models. The analysis observations and the novel attack method will
hopefully provide new insight into the adversarial robustness of VLP models and
contribute to their safe and reliable deployment in real-world scenarios.
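The abstract describes Co-Attack only at a high level. As a rough illustration of what a collaborative image-and-text attack can look like against a CLIP-style joint embedding, here is a minimal PyTorch sketch: the text is perturbed first by greedy token substitution, and the image is then perturbed against the already-perturbed text embedding so the two perturbations reinforce rather than cancel each other. The encoder interfaces, the candidate-substitution input, and all hyperparameters are illustrative assumptions, not the authors' Co-Attack implementation.

```python
# Illustrative sketch only: a collaborative image+text attack against a
# CLIP-like joint embedding. Encoders, candidate substitutions, and
# hyperparameters are placeholder assumptions, not the paper's Co-Attack.
import torch
import torch.nn.functional as F


def pgd_image_attack(image_encoder, text_emb, image, epsilon=8 / 255, alpha=2 / 255, steps=10):
    """Perturb the image so its embedding moves away from the (already
    perturbed) text embedding, within an L-infinity ball of radius epsilon."""
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        img_emb = F.normalize(image_encoder(adv), dim=-1)
        # Untargeted objective: reduce image-text similarity.
        loss = F.cosine_similarity(img_emb, text_emb, dim=-1).mean()
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() - alpha * grad.sign()             # step down the similarity
        adv = image + (adv - image).clamp(-epsilon, epsilon)  # project onto the epsilon ball
        adv = adv.clamp(0, 1)                                 # keep a valid image
    return adv.detach()


def collaborative_attack(image_encoder, text_encoder, image, token_ids, candidate_subs):
    """Toy 'collaborative' attack (batch size 1 assumed): perturb the text by
    greedy single-token substitution, then perturb the image against the
    perturbed text embedding so the two modalities are attacked jointly."""
    with torch.no_grad():
        clean_txt = F.normalize(text_encoder(token_ids), dim=-1)
        best_ids, best_score = token_ids, 1.0
        # candidate_subs: {position: [alternative token ids]} precomputed elsewhere.
        for pos, alts in candidate_subs.items():
            for alt in alts:
                trial = token_ids.clone()
                trial[:, pos] = alt
                score = F.cosine_similarity(
                    F.normalize(text_encoder(trial), dim=-1), clean_txt).item()
                if score < best_score:   # keep the most semantics-shifting substitution
                    best_ids, best_score = trial, score
        adv_txt_emb = F.normalize(text_encoder(best_ids), dim=-1)
    adv_image = pgd_image_attack(image_encoder, adv_txt_emb, image)
    return adv_image, best_ids
```

The point echoed from the abstract is the ordering: the image perturbation is computed against the already-perturbed text so the two modalities are attacked collaboratively rather than independently.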
Related papers
- Probing the Robustness of Vision-Language Pretrained Models: A Multimodal Adversarial Attack Approach [30.9778838504609]
Vision-language pretraining with transformers has demonstrated exceptional performance across numerous multimodal tasks.
Existing multimodal attack methods have largely overlooked cross-modal interactions between visual and textual modalities.
We propose a novel Joint Multimodal Transformer Feature Attack (JMTFA) that concurrently introduces adversarial perturbations in both visual and textual modalities.
arXiv Detail & Related papers (2024-08-24T04:31:37Z)
- A Unified Understanding of Adversarial Vulnerability Regarding Unimodal Models and Vision-Language Pre-training Models [7.350203999073509]
Feature Guidance Attack (FGA) is a novel method that uses text representations to direct the perturbation of clean images.
Our method demonstrates stable and effective attack capabilities across various datasets, downstream tasks, and both black-box and white-box settings.
arXiv Detail & Related papers (2024-07-25T06:10:33Z)
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach (a minimal sketch of this caption-consistency idea appears after this list).
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
- VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models [65.23688155159398]
Autoregressive Visual Language Models (VLMs) showcase impressive few-shot learning capabilities in a multimodal context.
Recently, multimodal instruction tuning has been proposed to further enhance instruction-following abilities.
Adversaries can implant a backdoor by injecting poisoned samples with triggers embedded in instructions or images.
We propose a multimodal instruction backdoor attack, namely VL-Trojan.
arXiv Detail & Related papers (2024-02-21T14:54:30Z)
- SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack.
arXiv Detail & Related papers (2023-12-08T09:08:50Z)
- On the Robustness of Large Multimodal Models Against Image Adversarial Attacks [81.2935966933355]
We study the impact of visual adversarial attacks on Large Multimodal Models (LMMs).
We find that in general LMMs are not robust to visual adversarial inputs.
We propose a new approach to real-world image classification which we term query decomposition.
arXiv Detail & Related papers (2023-12-06T04:59:56Z)
- Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models [52.530286579915284]
We present the first study to investigate the adversarial transferability of vision-language pre-training models.
The transferability degradation is partly caused by the under-utilization of cross-modal interactions.
We propose a highly transferable Set-level Guidance Attack (SGA) that thoroughly leverages modality interactions and incorporates alignment-preserving augmentation with cross-modal guidance.
arXiv Detail & Related papers (2023-07-26T09:19:21Z)
- Visual Adversarial Examples Jailbreak Aligned Large Language Models [66.53468356460365]
We show that the continuous and high-dimensional nature of the visual input makes it a weak link against adversarial attacks.
We exploit visual adversarial examples to circumvent the safety guardrail of aligned LLMs with integrated vision.
Our study underscores the escalating adversarial risks associated with the pursuit of multimodality.
arXiv Detail & Related papers (2023-06-22T22:13:03Z)
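The MirrorCheck entry above describes detection by regenerating an image from the victim model's own caption and checking its consistency with the input. Below is a minimal sketch of that general idea; the captioner, text-to-image generator, feature extractor, and threshold are placeholder assumptions rather than the paper's actual pipeline.

```python
# Illustrative sketch only: a MirrorCheck-style consistency check. The caption
# model, text-to-image generator, feature extractor, and threshold are
# placeholder assumptions; the paper's pipeline and models may differ.
import torch
import torch.nn.functional as F


@torch.no_grad()
def is_adversarial(image, captioner, t2i_generator, feature_extractor, threshold=0.7):
    """Flag a single input as suspicious when the image regenerated from the
    victim VLM's caption no longer resembles the original input.

    captioner:          image tensor -> caption string (the target VLM)
    t2i_generator:      caption string -> regenerated image tensor
    feature_extractor:  image tensor -> embedding used for the similarity test
    """
    caption = captioner(image)            # caption may be corrupted by the attack
    regenerated = t2i_generator(caption)  # re-render what the caption describes
    sim = F.cosine_similarity(
        F.normalize(feature_extractor(image), dim=-1),
        F.normalize(feature_extractor(regenerated), dim=-1),
        dim=-1,
    )
    # Low similarity between the input and its caption-conditioned reconstruction
    # suggests the caption (and hence the input) was adversarially manipulated.
    return (sim < threshold).item()
```

Thresholding the similarity is the simplest possible decision rule; in practice the detection quality would hinge on the choice of feature extractor and generator, which are left abstract here.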
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.