Typographic Attacks in a Multi-Image Setting
- URL: http://arxiv.org/abs/2502.08193v1
- Date: Wed, 12 Feb 2025 08:10:25 GMT
- Title: Typographic Attacks in a Multi-Image Setting
- Authors: Xiaomeng Wang, Zhengyu Zhao, Martha Larson
- Abstract summary: We introduce a multi-image setting for studying typographic attacks.
Specifically, our focus is on attacking image sets without repeating the attack query.
We introduce two attack strategies for the multi-image setting, leveraging the difficulty of the target image, the strength of the attack text, and text-image similarity.
- Abstract: Large Vision-Language Models (LVLMs) are susceptible to typographic attacks, which are misclassifications caused by an attack text that is added to an image. In this paper, we introduce a multi-image setting for studying typographic attacks, broadening the current emphasis of the literature on attacking individual images. Specifically, our focus is on attacking image sets without repeating the attack query. Such non-repeating attacks are stealthier, as they are more likely to evade a gatekeeper than attacks that repeat the same attack text. We introduce two attack strategies for the multi-image setting, leveraging the difficulty of the target image, the strength of the attack text, and text-image similarity. Our text-image similarity approach improves attack success rates by 21% over random, non-specific methods on the CLIP model using ImageNet while maintaining stealth in a multi-image scenario. An additional experiment demonstrates transferability, i.e., text-image similarity calculated using CLIP transfers when attacking InstructBLIP.
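The abstract's text-image similarity strategy pairs each target image with a distinct attack text. A minimal sketch of that non-repeating assignment, using toy embedding vectors and a greedy best-first pairing (the vectors, names, and greedy heuristic here are illustrative assumptions, not the paper's implementation, which computes similarities with CLIP features):

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def assign_attack_texts(image_embs, text_embs):
    """Greedily pair images with attack texts so that no text repeats.

    Scores every (image, text) pair by cosine similarity, then accepts
    pairs best-first, skipping any image or text already used.
    """
    pairs = sorted(
        ((cosine(ie, te), img, txt)
         for img, ie in image_embs.items()
         for txt, te in text_embs.items()),
        reverse=True,
    )
    assignment, used_imgs, used_txts = {}, set(), set()
    for _score, img, txt in pairs:
        if img not in used_imgs and txt not in used_txts:
            assignment[img] = txt
            used_imgs.add(img)
            used_txts.add(txt)
    return assignment

# Toy example: two target images, two candidate attack texts
# (2-d stand-ins for real CLIP embeddings).
images = {"img_cat.png": [0.9, 0.1], "img_dog.png": [0.1, 0.9]}
texts = {"dog": [0.2, 0.8], "cat": [0.8, 0.2]}
print(assign_attack_texts(images, texts))
# → {'img_cat.png': 'cat', 'img_dog.png': 'dog'}
```

Because each attack text is consumed once, the resulting image set never repeats a query, which is the stealth property the multi-image setting targets.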
Related papers
- White-box Multimodal Jailbreaks Against Large Vision-Language Models [61.97578116584653]
We propose a more comprehensive strategy that jointly attacks both text and image modalities to exploit a broader spectrum of vulnerability within Large Vision-Language Models.
Our attack method begins by optimizing an adversarial image prefix from random noise to generate diverse harmful responses in the absence of text input.
An adversarial text suffix is integrated and co-optimized with the adversarial image prefix to maximize the probability of eliciting affirmative responses to various harmful instructions.
arXiv Detail & Related papers (2024-05-28T07:13:30Z)
- Revisiting the Adversarial Robustness of Vision Language Models: a Multimodal Perspective [42.04728834962863]
Pretrained vision-language models (VLMs) like CLIP exhibit exceptional generalization across diverse downstream tasks.
Recent studies reveal their vulnerability to adversarial attacks, with defenses against text-based and multimodal attacks remaining largely unexplored.
This work presents the first comprehensive study on improving the adversarial robustness of VLMs against attacks targeting image, text, and multimodal inputs.
arXiv Detail & Related papers (2024-04-30T06:34:21Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defense mainly focuses on the known attacks, but the adversarial robustness to the unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Impart: An Imperceptible and Effective Label-Specific Backdoor Attack [15.859650783567103]
We propose a novel imperceptible backdoor attack framework, named Impart, in the scenario where the attacker has no access to the victim model.
Specifically, in order to enhance the attack capability of the all-to-all setting, we first propose a label-specific attack.
arXiv Detail & Related papers (2024-03-18T07:22:56Z)
- VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models [58.21452697997078]
We propose a novel VQAttack model, which can generate both image and text perturbations with the designed modules.
Experimental results on two VQA datasets with five validated models demonstrate the effectiveness of the proposed VQAttack.
arXiv Detail & Related papers (2024-02-16T21:17:42Z)
- Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks [58.10730906004818]
Typographic attacks, which add misleading text to images, can deceive large vision-language models (LVLMs).
Our experiments show these attacks significantly reduce classification performance by up to 60%.
arXiv Detail & Related papers (2024-02-01T14:41:20Z)
- GAMA: Generative Adversarial Multi-Object Scene Attacks [48.33120361498787]
This paper presents the first approach of using generative models for adversarial attacks on multi-object scenes.
We call this attack approach Generative Adversarial Multi-object scene Attacks (GAMA).
arXiv Detail & Related papers (2022-09-20T06:40:54Z)
- QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval [56.51916317628536]
We study the query-based attack against image retrieval to evaluate its robustness against adversarial examples under the black-box setting.
A new relevance-based loss is designed to quantify the attack effects by measuring the set similarity on the top-k retrieval results before and after attacks.
Experiments show that the proposed attack achieves a high attack success rate with few queries against the image retrieval systems under the black-box setting.
arXiv Detail & Related papers (2021-03-04T10:18:43Z)
- Learning to Attack with Fewer Pixels: A Probabilistic Post-hoc Framework for Refining Arbitrary Dense Adversarial Attacks [21.349059923635515]
Deep neural network image classifiers are reported to be susceptible to adversarial evasion attacks.
We propose a probabilistic post-hoc framework that refines given dense attacks by significantly reducing the number of perturbed pixels.
Our framework performs adversarial attacks much faster than existing sparse attacks.
arXiv Detail & Related papers (2020-10-13T02:51:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.