Deviations in Representations Induced by Adversarial Attacks
- URL: http://arxiv.org/abs/2211.03714v1
- Date: Mon, 7 Nov 2022 17:40:08 GMT
- Title: Deviations in Representations Induced by Adversarial Attacks
- Authors: Daniel Steinberg, Paul Munro
- Abstract summary: Research has shown that deep learning models are vulnerable to adversarial attacks.
This finding brought about a new direction in research, whereby algorithms were developed to attack and defend vulnerable networks.
We present a method for measuring and analyzing the deviations in representations induced by adversarial attacks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has been a popular topic and has achieved success in many
areas. It has drawn the attention of researchers and machine learning
practitioners alike, with developed models deployed to a variety of settings.
Along with its achievements, research has shown that deep learning models are
vulnerable to adversarial attacks. This finding brought about a new direction
in research, whereby algorithms were developed to attack and defend vulnerable
networks. Our interest is in understanding how these attacks effect change on
the intermediate representations of deep learning models. We present a method
for measuring and analyzing the deviations in representations induced by
adversarial attacks, progressively across a selected set of layers. Experiments
are conducted using an assortment of attack algorithms, on the CIFAR-10
dataset, with plots created to visualize the impact of adversarial attacks
across different layers in a network.
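The paper's exact deviation metric is not specified in the abstract, but the idea of comparing clean and adversarial intermediate representations layer by layer can be sketched as follows. This is a minimal illustration with a toy random-weight network and a hypothetical FGSM-style perturbation; layer count, metrics (L2 and cosine similarity here), and architecture are assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

# Toy 3-layer feed-forward net with fixed random weights
# (a stand-in for a trained model whose internals we can inspect).
weights = [rng.standard_normal((32, 16)),
           rng.standard_normal((16, 8)),
           rng.standard_normal((8, 4))]

def forward_with_activations(x):
    """Return the post-activation representation at every layer."""
    acts = []
    h = x
    for W in weights:
        h = relu(h @ W)
        acts.append(h)
    return acts

x_clean = rng.standard_normal(32)
# FGSM-style sign perturbation; a real attack would use the loss gradient.
x_adv = x_clean + 0.1 * np.sign(rng.standard_normal(32))

# Measure how far the adversarial representation drifts at each layer.
for i, (a, b) in enumerate(zip(forward_with_activations(x_clean),
                               forward_with_activations(x_adv))):
    l2 = np.linalg.norm(a - b)
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    print(f"layer {i}: L2 deviation = {l2:.3f}, cosine similarity = {cos:.3f}")
```

In a real setting the activations would come from forward hooks on a trained network, and the perturbation from an actual attack algorithm; the per-layer comparison loop stays the same.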
Related papers
- Investigating Human-Identifiable Features Hidden in Adversarial Perturbations [54.39726653562144]
Our study explores up to five attack algorithms across three datasets.
We identify human-identifiable features in adversarial perturbations.
Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
arXiv Detail & Related papers (2023-09-28T22:31:29Z)
- How Deep Learning Sees the World: A Survey on Adversarial Attacks & Defenses [0.0]
This paper compiles the most recent adversarial attacks, grouped by the attacker capacity, and modern defenses clustered by protection strategies.
We also present the new advances regarding Vision Transformers, summarize the datasets and metrics used in the context of adversarial settings, and compare the state-of-the-art results under different attacks, finishing with the identification of open issues.
arXiv Detail & Related papers (2023-05-18T10:33:28Z)
- Interpretations Cannot Be Trusted: Stealthy and Effective Adversarial Perturbations against Interpretable Deep Learning [16.13790238416691]
This work introduces two attacks, AdvEdge and AdvEdge$+$, that deceive both the target deep learning model and the coupled interpretation model.
Our analysis shows the effectiveness of our attacks in terms of deceiving the deep learning models and their interpreters.
arXiv Detail & Related papers (2022-11-29T04:45:10Z)
- Recent improvements of ASR models in the face of adversarial attacks [28.934863462633636]
Speech Recognition models are vulnerable to adversarial attacks.
We show that the relative strengths of different attack algorithms vary considerably when changing the model architecture.
We release our source code as a package that should help future research in evaluating their attacks and defenses.
arXiv Detail & Related papers (2022-03-29T22:40:37Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- Adversarial Robustness of Deep Reinforcement Learning based Dynamic Recommender Systems [50.758281304737444]
We propose to explore adversarial examples and attack detection on reinforcement learning-based interactive recommendation systems.
We first craft different types of adversarial examples by adding perturbations to the input and intervening on the causal factors.
Then, we augment recommendation systems by detecting potential attacks with a deep learning-based classifier based on the crafted data.
arXiv Detail & Related papers (2021-12-02T04:12:24Z)
- Towards A Conceptually Simple Defensive Approach for Few-shot Classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
- Identification of Attack-Specific Signatures in Adversarial Examples [62.17639067715379]
We show that different attack algorithms produce adversarial examples which are distinct not only in their effectiveness but also in how they qualitatively affect their victims.
Our findings suggest that prospective adversarial attacks should be compared not only via their success rates at fooling models but also via deeper downstream effects they have on victims.
arXiv Detail & Related papers (2021-10-13T15:40:48Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Detection Defense Against Adversarial Attacks with Saliency Map [7.736844355705379]
It is well established that neural networks are vulnerable to adversarial examples, which are almost imperceptible to human vision.
Existing defenses tend to harden the robustness of models against adversarial attacks.
We propose a novel method that combines additional noise with an inconsistency strategy to detect adversarial examples.
arXiv Detail & Related papers (2020-09-06T13:57:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.