X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP
- URL: http://arxiv.org/abs/2505.05528v3
- Date: Thu, 29 May 2025 23:50:01 GMT
- Title: X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP
- Authors: Hanxun Huang, Sarah Erfani, Yige Li, Xingjun Ma, James Bailey,
- Abstract summary: We introduce X-Transfer, a novel attack method that exposes a universal adversarial vulnerability in CLIP. X-Transfer generates a Universal Adversarial Perturbation (UAP) capable of deceiving various CLIP encoders and downstream VLMs across different samples, tasks, and domains.
- Score: 32.85582585781569
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As Contrastive Language-Image Pre-training (CLIP) models are increasingly adopted for diverse downstream tasks and integrated into large vision-language models (VLMs), their susceptibility to adversarial perturbations has emerged as a critical concern. In this work, we introduce \textbf{X-Transfer}, a novel attack method that exposes a universal adversarial vulnerability in CLIP. X-Transfer generates a Universal Adversarial Perturbation (UAP) capable of deceiving various CLIP encoders and downstream VLMs across different samples, tasks, and domains. We refer to this property as \textbf{super transferability}--a single perturbation achieving cross-data, cross-domain, cross-model, and cross-task adversarial transferability simultaneously. This is achieved through \textbf{surrogate scaling}, a key innovation of our approach. Unlike existing methods that rely on fixed surrogate models, which are computationally intensive to scale, X-Transfer employs an efficient surrogate scaling strategy that dynamically selects a small subset of suitable surrogates from a large search space. Extensive evaluations demonstrate that X-Transfer significantly outperforms previous state-of-the-art UAP methods, establishing a new benchmark for adversarial transferability across CLIP models. The code is publicly available in our \href{https://github.com/HanxunH/XTransferBench}{GitHub repository}.
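The abstract above describes two ideas: a single universal perturbation (UAP) shared across all inputs, and "surrogate scaling," where each optimization step uses only a small, dynamically sampled subset of surrogate encoders from a large pool. The following is a minimal toy sketch of that loop, not the paper's actual implementation: random linear maps stand in for CLIP image encoders, the loss is the average embedding shift the perturbation induces, and the update is a standard PGD-style sign step under an L-infinity budget. All names and constants here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for CLIP image encoders: random linear maps (hypothetical).
# In the paper the surrogates are real pre-trained CLIP encoders.
DIM, EMB, N_SURROGATES = 32, 16, 20
encoders = [rng.standard_normal((EMB, DIM)) / np.sqrt(DIM)
            for _ in range(N_SURROGATES)]

def embedding_shift(delta, ws):
    # How far the perturbation moves each surrogate's embedding:
    # for a linear encoder W, ||W(x + delta) - W x||^2 = ||W delta||^2.
    return float(np.mean([np.sum((w @ delta) ** 2) for w in ws]))

def make_uap(encoders, eps=0.05, step=0.005, iters=200, subset=4, seed=1):
    rng = np.random.default_rng(seed)
    delta = rng.uniform(-eps, eps, size=DIM)  # one universal perturbation
    for _ in range(iters):
        # "Surrogate scaling" flavour: each step samples a small subset of
        # surrogates from the pool instead of using all of them.
        idx = rng.choice(len(encoders), size=subset, replace=False)
        ws = [encoders[i] for i in idx]
        # Gradient of mean ||W delta||^2 w.r.t. delta is mean(2 W^T W delta).
        grad = np.mean([2.0 * w.T @ (w @ delta) for w in ws], axis=0)
        # PGD-style ascent step, projected back into the L-inf eps-ball.
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
    return delta

delta0 = np.random.default_rng(1).uniform(-0.05, 0.05, size=DIM)  # the init
delta = make_uap(encoders)
print(embedding_shift(delta0, encoders), embedding_shift(delta, encoders))
```

Sampling a subset per step keeps the cost per iteration constant as the surrogate pool grows, which is the efficiency argument the abstract makes for surrogate scaling; the real method's surrogate-selection strategy and loss differ from this toy quadratic.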
Related papers
- Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment [35.77916460821855]
Multimodal large language models (MLLMs) remain vulnerable to transferable adversarial attacks. We propose a targeted transferable adversarial attack method based on feature optimal alignment. Experiments demonstrate that the proposed method outperforms state-of-the-art methods.
arXiv Detail & Related papers (2025-05-27T17:56:57Z)
- Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy [56.424032454461695]
We present Dita, a scalable framework that leverages Transformer architectures to directly denoise continuous action sequences. Dita employs in-context conditioning, enabling fine-grained alignment between denoised actions and raw visual tokens from historical observations. Dita effectively integrates cross-embodiment datasets across diverse camera perspectives, observation scenes, tasks, and action spaces.
arXiv Detail & Related papers (2025-03-25T15:19:56Z) - Boosting the Local Invariance for Better Adversarial Transferability [4.75067406339309]
Transfer-based attacks pose a significant threat to real-world applications. We propose a general adversarial transferability boosting technique called the Local Invariance Boosting approach (LI-Boost). Experiments on the standard ImageNet dataset demonstrate that LI-Boost significantly boosts various types of transfer-based attacks.
arXiv Detail & Related papers (2025-03-08T09:44:45Z) - Boosting Adversarial Transferability with Spatial Adversarial Alignment [30.343721474168635]
Deep neural networks are vulnerable to adversarial examples that exhibit transferability across various models. We propose Spatial Adversarial Alignment (SAA), a technique that employs an alignment loss and leverages a witness model to fine-tune the surrogate model. Experiments on various architectures on ImageNet show that surrogate models aligned via SAA yield adversarial examples with higher transferability.
arXiv Detail & Related papers (2025-01-02T02:35:47Z) - Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks [42.18755809782401]
We propose a novel transfer attack method called PDCL-Attack. We formulate effective prompt-driven feature guidance by harnessing the semantic representation power of text.
arXiv Detail & Related papers (2024-07-30T08:52:16Z) - Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation [49.480978190805125]
Transfer attacks attract significant interest for black-box applications. Existing works essentially optimize a single-level objective directly with respect to the surrogate model. We propose a bilevel optimization paradigm that explicitly reformulates the nested relationship between the upper-level (UL) pseudo-victim attacker and the lower-level (LL) surrogate attacker.
arXiv Detail & Related papers (2024-06-04T07:45:27Z) - LRS: Enhancing Adversarial Transferability through Lipschitz Regularized
Surrogate [8.248964912483912]
The transferability of adversarial examples is of central importance to transfer-based black-box adversarial attacks.
We propose Lipschitz Regularized Surrogate (LRS) for transfer-based black-box attacks.
We evaluate our proposed LRS approach by attacking state-of-the-art standard deep neural networks and defense models.
arXiv Detail & Related papers (2023-12-20T15:37:50Z) - Set-level Guidance Attack: Boosting Adversarial Transferability of
Vision-Language Pre-training Models [52.530286579915284]
We present the first study to investigate the adversarial transferability of vision-language pre-training models.
The transferability degradation is partly caused by the under-utilization of cross-modal interactions.
We propose a highly transferable Set-level Guidance Attack (SGA) that thoroughly leverages modality interactions and incorporates alignment-preserving augmentation with cross-modal guidance.
arXiv Detail & Related papers (2023-07-26T09:19:21Z) - Adversarial Pixel Restoration as a Pretext Task for Transferable
Perturbations [54.1807206010136]
Transferable adversarial attacks optimize adversaries from a pretrained surrogate model and known label space to fool the unknown black-box models.
We propose Adversarial Pixel Restoration as a self-supervised alternative to train an effective surrogate model from scratch.
Our training approach is based on a min-max objective which reduces overfitting via an adversarial objective.
arXiv Detail & Related papers (2022-07-18T17:59:58Z) - Boosting Black-Box Attack with Partially Transferred Conditional
Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs). We develop a novel mechanism of adversarial transferability that is robust to surrogate biases. Experiments on benchmark datasets and attacks against a real-world API demonstrate the superior performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z) - Perturbing Across the Feature Hierarchy to Improve Standard and Strict
Blackbox Attack Transferability [100.91186458516941]
We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers.
We design a flexible attack framework that allows for multi-layer perturbations and demonstrates state-of-the-art targeted transfer performance.
We analyze why the proposed methods outperform existing attack strategies and show an extension of the method in the case when limited queries to the blackbox model are allowed.
arXiv Detail & Related papers (2020-04-29T16:00:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.