One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models
- URL: http://arxiv.org/abs/2403.01849v1
- Date: Mon, 4 Mar 2024 08:59:32 GMT
- Title: One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models
- Authors: Lin Li, Haoyan Guan, Jianing Qiu, Michael Spratling
- Abstract summary: This work studies the adversarial robustness of Vision-Language Models (VLMs) from the novel perspective of the text prompt.
We propose a method to improve resilience to adversarial attacks by learning a robust text prompt for VLMs.
The proposed method, named Adversarial Prompt Tuning (APT), is effective while being both computationally and data efficient.
- Score: 7.308611036454601
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large pre-trained Vision-Language Models (VLMs) like CLIP, despite having
remarkable generalization ability, are highly vulnerable to adversarial
examples. This work studies the adversarial robustness of VLMs from the novel
perspective of the text prompt instead of the extensively studied model weights
(frozen in this work). We first show that the effectiveness of both adversarial
attacks and defenses is sensitive to the text prompt used. Inspired by this, we
propose a method to improve resilience to adversarial attacks by learning a
robust text prompt for VLMs. The proposed method, named Adversarial Prompt
Tuning (APT), is effective while being both computationally and data efficient.
Extensive experiments are conducted across 15 datasets and 4 data sparsity
schemes (from 1-shot to full training data settings) to show APT's superiority
over hand-engineered prompts and other state-of-the-art adaptation methods. APT
demonstrates excellent in-distribution performance and generalization under
input distribution shift and across datasets.
Surprisingly, by simply adding one learned word to the prompts, APT can
significantly boost the accuracy and robustness (epsilon=4/255) over the
hand-engineered prompts by +13% and +8.5% on average respectively. The
improvement further increases, in our most effective setting, to +26.4% for
accuracy and +16.7% for robustness. Code is available at
https://github.com/TreeLLi/APT.
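As a rough illustration of the idea (a sketch under assumptions, not the authors' released implementation; see the repository above for that), the loop below freezes both encoders, prepends a single learnable prompt embedding to the class-name embeddings, crafts PGD adversarial images against the current prompt, and updates only that one embedding. Toy modules stand in for CLIP's encoders.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-ins for CLIP's encoders (assumption: APT itself uses CLIP, frozen).
class ToyImageEncoder(torch.nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.proj = torch.nn.Linear(3 * 8 * 8, dim)

    def forward(self, x):
        return F.normalize(self.proj(x.flatten(1)), dim=-1)

class ToyTextEncoder(torch.nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.proj = torch.nn.Linear(dim, dim)

    def forward(self, tokens):  # tokens: [n_classes, n_tokens, dim]
        return F.normalize(self.proj(tokens.mean(dim=1)), dim=-1)

img_enc, txt_enc = ToyImageEncoder(), ToyTextEncoder()
for p in list(img_enc.parameters()) + list(txt_enc.parameters()):
    p.requires_grad_(False)  # model weights stay frozen throughout

n_classes, dim = 5, 32
class_embed = torch.randn(n_classes, 1, dim)         # fixed class-name embeddings
prompt = torch.zeros(1, 1, dim, requires_grad=True)  # the one learned "word"
opt = torch.optim.SGD([prompt], lr=0.1)

def logits(x):
    tokens = torch.cat([prompt.expand(n_classes, -1, -1), class_embed], dim=1)
    return 100.0 * img_enc(x) @ txt_enc(tokens).t()

def pgd(x, y, eps=4 / 255, steps=3):
    """Craft adversarial images against the current prompt."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        grad, = torch.autograd.grad(F.cross_entropy(logits(x + delta), y), delta)
        delta = (delta + eps * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

x, y = torch.rand(16, 3, 8, 8), torch.randint(0, n_classes, (16,))
for _ in range(20):  # adversarial prompt tuning: only the prompt is updated
    loss = F.cross_entropy(logits(pgd(x, y)), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```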
Related papers
- Adversarial Prompt Distillation for Vision-Language Models [25.07001647341082]
Large pre-trained Vision-Language Models (VLMs) have been shown to be susceptible to adversarial attacks.
One promising approach for improving the robustness of pre-trained VLMs is Adversarial Prompt Tuning (APT).
We propose a novel method called Adversarial Prompt Distillation (APD) that combines APT with knowledge distillation to boost the adversarial robustness of CLIP.
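Read at face value, such a combined objective might pair the adversarial cross-entropy from APT with a distillation KL term toward a clean teacher; the weighting and temperature below are illustrative assumptions, not the paper's values.

```python
import torch
import torch.nn.functional as F

def apd_style_loss(student_logits_adv, teacher_logits_clean, labels,
                   alpha=0.5, tau=2.0):
    """Assumed form: adversarial CE plus a KL pull toward the clean teacher."""
    ce = F.cross_entropy(student_logits_adv, labels)
    kl = F.kl_div(F.log_softmax(student_logits_adv / tau, dim=-1),
                  F.softmax(teacher_logits_clean / tau, dim=-1),
                  reduction="batchmean") * tau ** 2
    return (1 - alpha) * ce + alpha * kl

# toy usage
s, t = torch.randn(8, 5, requires_grad=True), torch.randn(8, 5)
apd_style_loss(s, t, torch.randint(0, 5, (8,))).backward()
```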
arXiv Detail & Related papers (2024-11-22T03:02:13Z)
- TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models [53.91006249339802]
We propose a novel defense method called Test-Time Adversarial Prompt Tuning (TAPT) to enhance the inference robustness of CLIP against visual adversarial attacks.
TAPT is a test-time defense method that learns defensive bimodal (textual and visual) prompts to robustify the inference process of CLIP.
We evaluate the effectiveness of TAPT on 11 benchmark datasets, including ImageNet and 10 other zero-shot datasets.
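The summary above does not state TAPT's test-time objective, so the sketch below assumes a simple entropy-minimization loss purely for illustration; what it does aim to show is the bimodal aspect: textual and visual prompt vectors tuned jointly per test sample while the model itself stays untouched.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for CLIP: class logits as a function of both prompt vectors.
Wt, Wv = torch.randn(5, 16), torch.randn(5, 16)
def clip_logits(text_prompt, visual_prompt):
    return Wt @ text_prompt + Wv @ visual_prompt  # [n_classes]

# Defensive bimodal prompts, tuned for this one sample, then discarded.
text_prompt = torch.zeros(16, requires_grad=True)
visual_prompt = torch.zeros(16, requires_grad=True)
opt = torch.optim.AdamW([text_prompt, visual_prompt], lr=1e-2)

for _ in range(5):  # a few test-time steps (objective is an assumption)
    probs = F.softmax(clip_logits(text_prompt, visual_prompt), dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum()
    opt.zero_grad()
    entropy.backward()
    opt.step()

prediction = clip_logits(text_prompt, visual_prompt).argmax().item()
```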
arXiv Detail & Related papers (2024-11-20T08:58:59Z)
- Revisiting the Robust Generalization of Adversarial Prompt Tuning [4.033827046965844]
We propose an adaptive Consistency-guided Adversarial Prompt Tuning (i.e., CAPT) framework to enhance the alignment of image and text features for adversarial examples.
We conduct experiments across 14 datasets and 4 data sparsity schemes to show the superiority of CAPT over other state-of-the-art adaptation methods.
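The summary suggests a consistency term tying clean and adversarial predictions together; one plausible form (an assumption, since the paper's exact loss is not given here) is a symmetric KL between the two predictive distributions:

```python
import torch
import torch.nn.functional as F

def consistency_loss(logits_clean, logits_adv):
    """Assumed form: symmetric KL between clean and adversarial predictions."""
    log_p = F.log_softmax(logits_clean, dim=-1)
    log_q = F.log_softmax(logits_adv, dim=-1)
    return 0.5 * (F.kl_div(log_q, log_p, reduction="batchmean", log_target=True)
                  + F.kl_div(log_p, log_q, reduction="batchmean", log_target=True))

# toy usage: would be added to the usual adversarial CE objective
print(consistency_loss(torch.randn(8, 5), torch.randn(8, 5)))
```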
arXiv Detail & Related papers (2024-05-18T02:54:41Z)
- Revisiting the Power of Prompt for Visual Tuning [50.11465784194896]
This study explores how the correlation between prompts and patch tokens evolves over the course of training.
Inspired by the observation that the prompt tokens tend to share high mutual information with patch tokens, we propose initializing prompts with downstream token prototypes.
Our method significantly advances adaptation for self-supervised pretraining, achieving task performance gains of 10% to 30%.
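A hedged sketch of prototype-based initialization: cluster the downstream patch tokens and start the prompts at the cluster centers. The k-means routine is a generic stand-in; the paper's exact procedure may differ.

```python
import torch

def token_prototypes(patch_tokens, n_prompts, iters=10):
    """Plain k-means over downstream patch tokens; centers become prompt inits."""
    flat = patch_tokens.reshape(-1, patch_tokens.shape[-1])
    centers = flat[torch.randperm(flat.shape[0])[:n_prompts]].clone()
    for _ in range(iters):
        assign = torch.cdist(flat, centers).argmin(dim=1)
        for k in range(n_prompts):
            members = flat[assign == k]
            if len(members) > 0:
                centers[k] = members.mean(dim=0)
    return centers

# toy usage: 100 images x 49 patches x 64-dim tokens -> 8 prompt vectors
prompts = torch.nn.Parameter(token_prototypes(torch.randn(100, 49, 64), 8))
```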
arXiv Detail & Related papers (2024-02-04T07:49:02Z)
- Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
- Don't Retrain, Just Rewrite: Countering Adversarial Perturbations by Rewriting Text [40.491180210205556]
We present ATINTER, a model that intercepts and learns to rewrite adversarial inputs to make them non-adversarial.
Our experiments reveal that ATINTER is effective at providing better adversarial robustness than existing defense approaches.
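Architecturally, the appeal is that the downstream classifier never changes: a rewriter sits in front of it and sanitizes inputs. A toy pipeline (the real ATINTER trains a seq2seq rewriter, which the stub below only gestures at):

```python
from typing import Callable

def defend(text: str,
           rewrite: Callable[[str], str],
           classify: Callable[[str], str]) -> str:
    """Intercept, rewrite away the perturbation, then classify as usual."""
    return classify(rewrite(text))

# toy stubs standing in for a learned rewriter and a frozen classifier
normalize = lambda s: " ".join(s.split()).lower()
sentiment = lambda s: "positive" if "good" in s else "negative"
print(defend("ThIs   MoViE is GoOd !!", normalize, sentiment))
```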
arXiv Detail & Related papers (2023-05-25T19:42:51Z)
- Robustifying Sentiment Classification by Maximally Exploiting Few Counterfactuals [16.731183915325584]
We propose a novel solution that only requires annotation of a small fraction of the original training data.
We achieve noticeable accuracy improvements by adding only 1% manual counterfactuals.
arXiv Detail & Related papers (2022-10-21T08:30:09Z)
- Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models [107.05966685291067]
We propose test-time prompt tuning (TPT) to learn adaptive prompts on the fly with a single test sample.
TPT improves the zero-shot top-1 accuracy of CLIP by 3.6% on average.
In evaluating cross-dataset generalization with unseen categories, TPT performs on par with the state-of-the-art approaches that use additional training data.
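A sketch of the test-time loop as commonly described for TPT (assumptions flagged in comments): build augmented views of the single test image, then nudge the prompt to minimize the entropy of the averaged prediction before classifying.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

def tpt_step(prompt, view_logits_fn, steps=1, lr=5e-3):
    """view_logits_fn(prompt) is assumed to return logits for every augmented
    view of the one test image: shape [n_views, n_classes]."""
    opt = torch.optim.AdamW([prompt], lr=lr)
    for _ in range(steps):
        probs = F.softmax(view_logits_fn(prompt), dim=-1)
        marginal = probs.mean(dim=0)  # average the views' predictions
        entropy = -(marginal * marginal.clamp_min(1e-12).log()).sum()
        opt.zero_grad()
        entropy.backward()
        opt.step()
    return prompt

# toy usage: logits for 8 views of one image depend on a 16-dim prompt
W = torch.randn(8, 5, 16)
prompt = torch.zeros(16, requires_grad=True)
tpt_step(prompt, lambda p: W @ p)
```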
arXiv Detail & Related papers (2022-09-15T17:55:11Z)
- PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation [89.0074567748505]
We propose a new metric to accurately predict prompt transferability, and a novel prompt transfer (PoT) approach named PANDA.
Experiments show that: 1) our metric predicts prompt transferability well; 2) PANDA consistently outperforms the vanilla PoT approach by a 2.3% average score (up to 24.1%) across all tasks and model sizes; and 3) with PANDA, prompt tuning achieves performance competitive with, and even better than, model tuning at various PLM scales.
arXiv Detail & Related papers (2022-08-22T09:14:14Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, adversarial training (AT) has proven to be an effective approach for improving model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
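A minimal data-parallel sketch of the pattern (not the paper's system): each worker crafts adversarial examples for its local shard, and DDP all-reduces the gradients so the effective batch scales with the number of workers. It assumes a `torchrun --nproc_per_node=2 sketch.py` launch; the tiny linear model and FGSM attack are placeholders.

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP

def fgsm(model, x, y, eps=4 / 255):
    """One-step attack placeholder; the paper's inner attack may be stronger."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def main():
    dist.init_process_group("gloo")              # "nccl" on GPU clusters
    torch.manual_seed(dist.get_rank())
    model = DDP(torch.nn.Linear(3 * 8 * 8, 10))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(10):
        x = torch.rand(64, 3 * 8 * 8)            # each rank's local data shard
        y = torch.randint(0, 10, (64,))
        x_adv = fgsm(model.module, x, y)         # attack the local replica only
        loss = F.cross_entropy(model(x_adv), y)  # DDP all-reduces the gradients
        opt.zero_grad()
        loss.backward()
        opt.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```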
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.