Related papers: On the Adversarial Transferability of Generalized "Skip Connections"

URL: http://arxiv.org/abs/2410.08950v1
Date: Fri, 11 Oct 2024 16:17:47 GMT
Title: On the Adversarial Transferability of Generalized "Skip Connections"
Authors: Yisen Wang, Yichuan Mo, Dongxian Wu, Mingjie Li, Xingjun Ma, Zhouchen Lin
Abstract summary: Skip connections are an essential ingredient that lets modern deep models be deeper and more powerful. We find that using more gradients from the skip connections than from the residual modules during backpropagation allows one to craft adversarial examples with high transferability. We conduct comprehensive transfer attacks against various models including ResNets, Transformers, Inceptions, Neural Architecture Search, and Large Language Models.
Score: 83.71752155227888
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Skip connections are an essential ingredient that lets modern deep models be deeper and more powerful. Despite their huge success in normal scenarios (state-of-the-art classification performance on natural examples), we investigate and identify an interesting property of skip connections under adversarial scenarios, namely, that the use of skip connections allows easier generation of highly transferable adversarial examples. Specifically, in ResNet-like models (with skip connections), we find that using more gradients from the skip connections than from the residual modules, according to a decay factor applied during backpropagation, allows one to craft adversarial examples with high transferability. This method is termed the Skip Gradient Method (SGM). Although we start from ResNet-like models in the vision domain, we further extend SGM to more advanced architectures, including Vision Transformers (ViTs) and models with length-varying paths, and to other domains, i.e., natural language processing. We conduct comprehensive transfer attacks against various models including ResNets, Transformers, Inceptions, Neural Architecture Search, and Large Language Models (LLMs). We show that employing SGM can greatly improve the transferability of crafted attacks in almost all cases. Furthermore, considering the complexity of practical use, we demonstrate that SGM can even improve transferability on ensembles of models and on targeted attacks, as well as stealthiness against current defenses. Finally, we provide theoretical explanations and empirical insights on how SGM works. Our findings not only motivate new adversarial research into the architectural characteristics of models but also open up further challenges for secure model architecture design. Our code is available at https://github.com/mo666666/SGM.
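
The decay-factor idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (their code is at the repository linked above); it assumes hypothetical scalar residual blocks of the form z_{i+1} = z_i + tanh(w_i * z_i), and the names `residual_forward`, `input_gradient`, and `gamma` are chosen here for illustration only. Each block contributes a factor (1 + gamma * f_i'(z_i)) to the input gradient, so a gamma below 1 shrinks the residual-branch term and leaves the skip (identity) term untouched.

```python
import math


def residual_forward(x, weights):
    """Run x through scalar residual blocks z_{i+1} = z_i + tanh(w_i * z_i).

    Returns every intermediate activation, starting with the input.
    (The tanh block is a hypothetical stand-in for a residual module.)
    """
    activations = [x]
    for w in weights:
        z = activations[-1]
        activations.append(z + math.tanh(w * z))
    return activations


def input_gradient(x, weights, gamma=1.0):
    """Gradient of the final activation w.r.t. x, with decayed residual gradients.

    gamma = 1.0 reproduces the ordinary chain rule. gamma < 1.0 scales down
    the gradient flowing through each residual branch, so the skip path
    contributes relatively more (the assumed decay-factor behaviour).
    """
    activations = residual_forward(x, weights)
    grad = 1.0
    # Walk the blocks from last to first, pairing each weight with its input z_i.
    for w, z in zip(reversed(weights), reversed(activations[:-1])):
        branch_grad = w * (1.0 - math.tanh(w * z) ** 2)  # d tanh(w*z) / dz
        grad *= 1.0 + gamma * branch_grad  # skip term (1) + decayed residual term
    return grad


if __name__ == "__main__":
    weights = [0.5, -0.3, 0.8]  # illustrative values, not from the paper
    for gamma in (1.0, 0.5, 0.0):
        print(gamma, input_gradient(0.2, weights, gamma))
```

With gamma = 0 the residual branches are ignored entirely during backpropagation and the gradient reduces to the pure skip-path value of 1; intermediate values interpolate between that and the true gradient.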

Related papers

Exploiting Edge Features for Transferable Adversarial Attacks in Distributed Machine Learning [54.26807397329468]
This work explores a previously overlooked vulnerability in distributed deep learning systems. An adversary who intercepts the intermediate features transmitted between them can still pose a serious threat. We propose an exploitation strategy specifically designed for distributed settings.
arXiv Detail & Related papers (2025-07-09T20:09:00Z)
Activation Space Interventions Can Be Transferred Between Large Language Models [0.0]
We show that safety interventions can be transferred between models through learned mappings of their shared activation spaces. We demonstrate this approach on two well-established AI safety tasks: backdoor removal and refusal of harmful prompts. We also propose a new task, corrupted capabilities, where models are fine-tuned to embed knowledge tied to a backdoor.
arXiv Detail & Related papers (2025-03-06T13:38:44Z)
PB-UAP: Hybrid Universal Adversarial Attack For Image Segmentation [15.702469692874816]
We propose a novel universal adversarial attack method designed for segmentation models. Our method achieves high attack success rates surpassing the state-of-the-art methods, and exhibits strong transferability across different models.
arXiv Detail & Related papers (2024-12-21T14:46:01Z)
Towards Adversarial Robustness of Model-Level Mixture-of-Experts Architectures for Semantic Segmentation [11.311414617703308]
We evaluate the adversarial vulnerability of MoEs for semantic segmentation of urban and highway traffic scenes. We show that MoEs are, in most cases, more robust to per-instance and universal white-box adversarial attacks and can better withstand transfer attacks.
arXiv Detail & Related papers (2024-12-16T09:49:59Z)
Scaling Laws for Black box Adversarial Attacks [37.744814957775965]
Adversarial examples exhibit cross-model transferability, enabling attacks on black-box models. Model ensembling is an effective strategy to improve transferability by attacking multiple surrogate models simultaneously. We show that scaled attacks bring better semantic interpretability, indicating that the common features of models are captured.
arXiv Detail & Related papers (2024-11-25T08:14:37Z)
Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks [42.18755809782401]
We propose a novel transfer attack method called PDCL-Attack. We formulate an effective prompt-driven feature guidance by harnessing the semantic representation power of text.
arXiv Detail & Related papers (2024-07-30T08:52:16Z)
MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities [72.68829963458408]
We present MergeNet, which learns to bridge the gap of parameter spaces of heterogeneous models. The core mechanism of MergeNet lies in the parameter adapter, which operates by querying the source model's low-rank parameters. MergeNet is learned alongside both models, allowing our framework to dynamically transfer and adapt knowledge relevant to the current stage.
arXiv Detail & Related papers (2024-04-20T08:34:39Z)
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts [88.23732496104667]
Cross-scene generalizable NeRF models have become a new spotlight of the NeRF field. We bridge "neuralized" architectures with the powerful Mixture-of-Experts (MoE) idea from large language models. Our proposed model, dubbed GNT with Mixture-of-View-Experts (GNT-MOVE), has experimentally shown state-of-the-art results when transferring to unseen scenes.
arXiv Detail & Related papers (2023-08-22T21:18:54Z)
Enhance transferability of adversarial examples with model architecture [29.340413471204478]
Transferability of adversarial examples is of critical importance for launching black-box adversarial attacks. In this paper, we suggest alleviating the overfitting issue from a novel perspective, i.e., designing a fitted model architecture. We show that the transferability of adversarial examples based on the MMA significantly surpasses that of other state-of-the-art model architectures by up to 40% with comparable overhead.
arXiv Detail & Related papers (2022-02-28T09:05:58Z)
How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets. In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset. We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z)
Training Meta-Surrogate Model for Transferable Adversarial Attack [98.13178217557193]
We consider adversarial attacks on a black-box model when no queries are allowed. In this setting, many methods directly attack surrogate models and transfer the obtained adversarial examples to fool the target model. We show we can obtain a Meta-Surrogate Model (MSM) such that attacks on this model can be more easily transferred to other models.
arXiv Detail & Related papers (2021-09-05T03:27:46Z)
TREND: Transferability based Robust ENsemble Design [6.663641564969944]
We study the effect of network architecture, input, weight and activation quantization on transferability of adversarial samples. We show that transferability is significantly hampered by input quantization between source and target. We propose a new state-of-the-art ensemble attack to combat this.
arXiv Detail & Related papers (2020-08-04T13:38:14Z)
Skip Connections Matter: On the Transferability of Adversarial Examples Generated with ResNets [83.12737997548645]
Skip connections are an essential component of current state-of-the-art deep neural networks (DNNs). Use of skip connections allows easier generation of highly transferable adversarial examples. We conduct comprehensive transfer attacks against state-of-the-art DNNs including ResNets, DenseNets, Inceptions, Inception-ResNet, and Squeeze-and-Excitation Network (SENet).
arXiv Detail & Related papers (2020-02-14T12:09:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.