Transferable Perturbations of Deep Feature Distributions
- URL: http://arxiv.org/abs/2004.12519v1
- Date: Mon, 27 Apr 2020 00:32:25 GMT
- Title: Transferable Perturbations of Deep Feature Distributions
- Authors: Nathan Inkawhich, Kevin J Liang, Lawrence Carin and Yiran Chen
- Abstract summary: This work presents a new adversarial attack based on the modeling and exploitation of class-wise and layer-wise deep feature distributions.
We achieve state-of-the-art targeted blackbox transfer-based attack results for undefended ImageNet models.
- Score: 102.94094966908916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Almost all current adversarial attacks of CNN classifiers rely on information
derived from the output layer of the network. This work presents a new
adversarial attack based on the modeling and exploitation of class-wise and
layer-wise deep feature distributions. We achieve state-of-the-art targeted
blackbox transfer-based attack results for undefended ImageNet models. Further,
we place a priority on explainability and interpretability of the attacking
process. Our methodology affords an analysis of how adversarial attacks change
the intermediate feature distributions of CNNs, as well as a measure of
layer-wise and class-wise feature distributional separability/entanglement. We
also conceptualize a transition from task/data-specific to model-specific
features within a CNN architecture that directly impacts the transferability of
adversarial examples.
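The attack described above optimizes against intermediate feature distributions rather than the output-layer loss. Below is a minimal PyTorch sketch of that general idea, not the paper's exact construction: a small probe stands in for a layer-wise class model p(y | f_l(x)), scores the surrogate's layer-3 features, and the input is perturbed so those features look like an attacker-chosen target class. The ResNet-50 surrogate, the layer choice, the probe architecture, and all hyperparameters are illustrative assumptions, and the probe's offline training on clean features is omitted.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# White-box surrogate; its perturbations are later transferred to a blackbox target.
surrogate = models.resnet50(pretrained=True).eval()

# Capture one intermediate layer's features with a forward hook
# (layer3 is an illustrative choice, not the paper's prescription).
features = {}
surrogate.layer3.register_forward_hook(lambda m, i, o: features.update(feat=o))

# Stand-in for a layer-wise class model p(y | f_l(x)); it would be trained
# offline on clean features (training loop omitted here).
class FeatureProbe(nn.Module):
    def __init__(self, in_channels=1024, num_classes=1000):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, f):
        return self.fc(self.pool(f).flatten(1))

probe = FeatureProbe().eval()

def feature_space_targeted_attack(x, target_class, eps=16/255, steps=10, alpha=2/255):
    """Perturb x (in [0, 1]; ImageNet normalization omitted for brevity) so that
    its layer-3 features are scored as `target_class` by the feature probe."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        surrogate(x_adv)                              # populates features["feat"] via the hook
        loss = nn.functional.cross_entropy(probe(features["feat"]), target_class)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv - alpha * grad.sign()       # descend: pull features toward the target class
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # stay inside the L-inf budget
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

Usage would look like `x_adv = feature_space_targeted_attack(x, torch.tensor([207]))` for a batch `x` of images in [0, 1]; in a transfer setting, `x_adv` would then be evaluated against a held-out blackbox model.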
Related papers
- A Study on Transferability of Deep Learning Models for Network Intrusion Detection [11.98319841778396]
We evaluate transferability of attack classes by training a deep learning model with a specific attack class and testing it on a separate attack class.
We observe the effects of real and synthetically generated data augmentation techniques on transferability.
arXiv Detail & Related papers (2023-12-17T05:06:20Z)
- Common Knowledge Learning for Generating Transferable Adversarial Examples [60.1287733223249]
This paper focuses on an important type of black-box attacks, where the adversary generates adversarial examples by a substitute (source) model.
Existing methods tend to give unsatisfactory adversarial transferability when the source and target models are from different types of DNN architectures.
We propose a common knowledge learning (CKL) framework to learn better network weights to generate adversarial examples.
arXiv Detail & Related papers (2023-07-01T09:07:12Z)
- Hybrid CNN-Interpreter: Interpret local and global contexts for CNN-based Models [9.148791330175191]
Convolutional neural network (CNN) models have seen advanced improvements in performance in various domains.
Lack of interpretability is a major barrier to assurance and regulation during operation for acceptance and deployment of AI-assisted applications.
We propose a novel hybrid CNN-interpreter with two components; a rough per-layer probing sketch follows this entry.
An original forward propagation mechanism to examine layer-specific prediction results for local interpretability.
A new form of global interpretability that indicates the feature correlation and filter importance effects.
arXiv Detail & Related papers (2022-10-31T22:59:33Z)
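The layer-specific inspection described in this entry can be pictured as attaching probes to intermediate layers and comparing their per-layer predictions. The sketch below is a hedged illustration of that general idea, not the paper's actual hybrid interpreter: the ResNet-18 backbone, the chosen layers, and the linear probes are assumptions, and probe training on clean data is omitted.

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(pretrained=True).eval()
layer_names = ["layer1", "layer2", "layer3", "layer4"]   # illustrative choice of layers

# Collect intermediate activations with forward hooks.
activations = {}
def make_hook(name):
    def hook(_, __, output):
        activations[name] = output
    return hook
for name in layer_names:
    getattr(model, name).register_forward_hook(make_hook(name))

# One linear probe per layer, mapping pooled features to class scores.
# In practice each probe would be trained on clean data; training is omitted here.
channels = {"layer1": 64, "layer2": 128, "layer3": 256, "layer4": 512}
probes = {n: nn.Linear(channels[n], 1000) for n in layer_names}

def layerwise_predictions(x):
    """Return the top-1 class predicted from each intermediate layer's features."""
    model(x)  # populates `activations`
    preds = {}
    for n in layer_names:
        pooled = activations[n].mean(dim=(2, 3))       # global average pooling
        preds[n] = probes[n](pooled).argmax(dim=1)
    return preds

# Example: compare how the per-layer "opinion" evolves through the network.
print(layerwise_predictions(torch.rand(1, 3, 224, 224)))
```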
- Towards Understanding and Boosting Adversarial Transferability from a Distribution Perspective [80.02256726279451]
Adversarial attacks against deep neural networks (DNNs) have received broad attention in recent years.
We propose a novel method that crafts adversarial examples by manipulating the distribution of the image.
Our method can significantly improve the transferability of the crafted attacks and achieves state-of-the-art performance in both untargeted and targeted scenarios.
arXiv Detail & Related papers (2022-10-09T09:58:51Z)
- Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on a widely used dataset demonstrate the effectiveness of our attack method, with a 12.85% higher transfer-attack success rate than state-of-the-art methods; the data-augmentation intuition is sketched after this entry.
arXiv Detail & Related papers (2021-12-10T07:24:21Z)
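LLTA's meta-learning procedure over data and model augmentation is not reproduced here; the sketch below only illustrates the data-augmentation half of the intuition, averaging the attack gradient over randomly transformed copies of the input so the perturbation does not overfit a single view. The surrogate model, the transforms, and the hyperparameters are assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as T

surrogate = models.resnet50(pretrained=True).eval()

# Random transforms applied when computing gradients, so the perturbation
# does not overfit one exact view of the input (data augmentation only;
# LLTA additionally augments the surrogate model itself).
augment = T.Compose([
    T.RandomResizedCrop(224, scale=(0.8, 1.0)),
    T.RandomHorizontalFlip(),
])

def augmented_untargeted_attack(x, y, eps=8/255, steps=10, alpha=2/255, samples=4):
    """Perturb x (in [0, 1]) so the surrogate misclassifies it, averaging the loss
    over several randomly augmented copies at each step."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = torch.stack([
            nn.functional.cross_entropy(surrogate(augment(x_adv)), y)
            for _ in range(samples)
        ]).mean()
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()          # ascend: increase loss on the true label
            x_adv = x + (x_adv - x).clamp(-eps, eps)
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```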
- TREND: Transferability based Robust ENsemble Design [6.663641564969944]
We study the effect of network architecture, input, weight, and activation quantization on the transferability of adversarial samples.
We show that transferability is significantly hampered by input quantization between source and target.
We propose a new state-of-the-art ensemble attack to combat this; a minimal transfer-rate measurement of the kind used in such studies is sketched after this entry.
arXiv Detail & Related papers (2020-08-04T13:38:14Z)
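Transferability of the kind TREND studies can be quantified by crafting adversarial examples on a source model and counting how often they also cause a differently-architected (or differently-quantized) target to misclassify. The sketch below uses a single-step FGSM attack purely as a stand-in for stronger attacks; the model pairing and the omitted data loader are assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models

def fgsm(model, x, y, eps=8/255):
    """Single-step untargeted attack on the source model (a simple stand-in)."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def transfer_rate(source, target, loader, eps=8/255):
    """Fraction of examples whose source-crafted perturbation also makes the target misclassify."""
    fooled, total = 0, 0
    for x, y in loader:
        x_adv = fgsm(source, x, y, eps)
        with torch.no_grad():
            fooled += (target(x_adv).argmax(dim=1) != y).sum().item()
            total += y.numel()
    return fooled / total

# Illustrative pairing: per TREND's findings, a quantized or architecturally
# mismatched target typically yields a lower transfer rate than a matching one.
source = models.resnet50(pretrained=True).eval()
target = models.densenet121(pretrained=True).eval()
# loader = ...  # an ImageNet-style DataLoader yielding (image, label) batches
```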
- Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)
- Perturbing Across the Feature Hierarchy to Improve Standard and Strict Blackbox Attack Transferability [100.91186458516941]
We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers.
We design a flexible attack framework that allows for multi-layer perturbations and demonstrates state-of-the-art targeted transfer performance; a rough multi-layer sketch follows this entry.
We analyze why the proposed methods outperform existing attack strategies and show an extension of the method to the case where limited queries to the blackbox model are allowed.
arXiv Detail & Related papers (2020-04-29T16:00:13Z)
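This follow-up extends the feature-space attack idea to perturbations across several layers at once. As a rough, hedged illustration only (the layer set, the linear probes, the equal loss weighting, and all hyperparameters are assumptions, and probe training on clean features is omitted), the per-layer targeted feature losses can simply be summed before each gradient step:

```python
import torch
import torch.nn as nn
import torchvision.models as models

surrogate = models.resnet50(pretrained=True).eval()
layer_names = ["layer2", "layer3", "layer4"]            # illustrative multi-layer choice
channels = {"layer2": 512, "layer3": 1024, "layer4": 2048}

# Capture each chosen layer's activations with forward hooks.
activations = {}
def make_hook(name):
    def hook(_, __, output):
        activations[name] = output
    return hook
for name in layer_names:
    getattr(surrogate, name).register_forward_hook(make_hook(name))

# One feature-space probe per layer, standing in for p(y | f_l(x));
# training on clean features is omitted here.
probes = {n: nn.Linear(channels[n], 1000) for n in layer_names}

def multilayer_targeted_attack(x, target_class, eps=16/255, steps=10, alpha=2/255):
    """Perturb x (in [0, 1]) so that several layers' features are all scored as `target_class`."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        surrogate(x_adv)                                 # fills `activations`
        # Sum the targeted feature losses across layers (equal weights for simplicity).
        loss = sum(
            nn.functional.cross_entropy(
                probes[n](activations[n].mean(dim=(2, 3))), target_class)
            for n in layer_names
        )
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv - alpha * grad.sign()          # move features toward the target class
            x_adv = x + (x_adv - x).clamp(-eps, eps)
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```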
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.