Identification of Attack-Specific Signatures in Adversarial Examples
- URL: http://arxiv.org/abs/2110.06802v1
- Date: Wed, 13 Oct 2021 15:40:48 GMT
- Title: Identification of Attack-Specific Signatures in Adversarial Examples
- Authors: Hossein Souri, Pirazh Khorramshahi, Chun Pong Lau, Micah Goldblum,
Rama Chellappa
- Abstract summary: We show that different attack algorithms produce adversarial examples which are distinct not only in their effectiveness but also in how they qualitatively affect their victims.
Our findings suggest that prospective adversarial attacks should be compared not only via their success rates at fooling models but also via deeper downstream effects they have on victims.
- Score: 62.17639067715379
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The adversarial attack literature contains a myriad of algorithms for
crafting perturbations which yield pathological behavior in neural networks. In
many cases, multiple algorithms target the same tasks and even enforce the same
constraints. In this work, we show that different attack algorithms produce
adversarial examples which are distinct not only in their effectiveness but
also in how they qualitatively affect their victims. We begin by demonstrating
that one can determine the attack algorithm that crafted an adversarial
example. Then, we leverage recent advances in parameter-space saliency maps to
show, both visually and quantitatively, that adversarial attack algorithms
differ in which parts of the network and image they target. Our findings
suggest that prospective adversarial attacks should be compared not only via
their success rates at fooling models but also via deeper downstream effects
they have on victims.
Related papers
- Detecting Adversarial Attacks in Semantic Segmentation via Uncertainty Estimation: A Deep Analysis [12.133306321357999]
We propose an uncertainty-based method for detecting adversarial attacks on neural networks for semantic segmentation.
We conduct a detailed analysis of uncertainty-based detection of adversarial attacks and various state-of-the-art neural networks.
Our numerical experiments show the effectiveness of the proposed uncertainty-based detection method.
arXiv Detail & Related papers (2024-08-19T14:13:30Z) - Investigating Human-Identifiable Features Hidden in Adversarial
Perturbations [54.39726653562144]
Our study explores up to five attack algorithms across three datasets.
We identify human-identifiable features in adversarial perturbations.
Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
arXiv Detail & Related papers (2023-09-28T22:31:29Z) - Uncertainty-based Detection of Adversarial Attacks in Semantic
Segmentation [16.109860499330562]
We introduce an uncertainty-based approach for the detection of adversarial attacks in semantic segmentation.
We demonstrate the ability of our approach to detect perturbed images across multiple types of adversarial attacks.
arXiv Detail & Related papers (2023-05-22T08:36:35Z) - Deviations in Representations Induced by Adversarial Attacks [0.0]
Research has shown that deep learning models are vulnerable to adversarial attacks.
This finding brought about a new direction in research, whereby algorithms were developed to attack and defend vulnerable networks.
We present a method for measuring and analyzing the deviations in representations induced by adversarial attacks.
arXiv Detail & Related papers (2022-11-07T17:40:08Z) - Towards A Conceptually Simple Defensive Approach for Few-shot
classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibit good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z) - TREATED:Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z) - Adversarial Examples Detection beyond Image Space [88.7651422751216]
We find that there exists compliance between perturbations and prediction confidence, which guides us to detect few-perturbation attacks from the aspect of prediction confidence.
We propose a method beyond image space by a two-stream architecture, in which the image stream focuses on the pixel artifacts and the gradient stream copes with the confidence artifacts.
arXiv Detail & Related papers (2021-02-23T09:55:03Z) - MixNet for Generalized Face Presentation Attack Detection [63.35297510471997]
We have proposed a deep learning-based network termed as textitMixNet to detect presentation attacks.
The proposed algorithm utilizes state-of-the-art convolutional neural network architectures and learns the feature mapping for each attack category.
arXiv Detail & Related papers (2020-10-25T23:01:13Z) - A black-box adversarial attack for poisoning clustering [78.19784577498031]
We propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms.
We show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.
arXiv Detail & Related papers (2020-09-09T18:19:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.