Feature Importance Guided Attack: A Model Agnostic Adversarial Attack
- URL: http://arxiv.org/abs/2106.14815v1
- Date: Mon, 28 Jun 2021 15:46:22 GMT
- Title: Feature Importance Guided Attack: A Model Agnostic Adversarial Attack
- Authors: Gilad Gressel, Niranjan Hegde, Archana Sreekumar, and Michael Darling
- Abstract summary: We present the 'Feature Importance Guided Attack' (FIGA) which generates adversarial evasion samples.
We demonstrate FIGA against eight phishing detection models.
We are able to cause a reduction in the F1-score of a phishing detection model from 0.96 to 0.41 on average.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models are susceptible to adversarial attacks which
dramatically reduce their performance. Reliable defenses to these attacks are
an unsolved challenge. In this work, we present a novel evasion attack: the
'Feature Importance Guided Attack' (FIGA) which generates adversarial evasion
samples. FIGA is model agnostic: it assumes no prior knowledge of the defending
model's learning algorithm, but it does assume knowledge of the feature
representation. FIGA leverages feature importance rankings; it perturbs the
most important features of the input in the direction of the target class we
wish to mimic. We demonstrate FIGA against eight phishing detection models. We
keep the attack realistic by perturbing phishing website features that an
adversary would have control over. Using FIGA we are able to cause a reduction
in the F1-score of a phishing detection model from 0.96 to 0.41 on average.
Finally, we implement adversarial training as a defense against FIGA and show
that while it is sometimes effective, it can be evaded by changing the
parameters of FIGA.
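To make the mechanism concrete, below is a minimal sketch of a FIGA-style perturbation on tabular features. It is not the authors' implementation: the mutual-information importance ranking, the feature count `n`, and the step size `epsilon` are illustrative assumptions, and in the paper's phishing setting only features an adversary can actually control would be perturbed.
```python
# Illustrative sketch of a FIGA-style attack on tabular features (assumptions noted above).
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def figa_perturb(X, y, target_class, n=5, epsilon=0.1):
    """Perturb the n most important features of each sample toward target_class."""
    # Rank features without querying the defending model (model agnostic);
    # a simple mutual-information ranking stands in for the paper's choice.
    importance = mutual_info_classif(X, y)
    top_features = np.argsort(importance)[::-1][:n]

    # Perturbation direction: toward the mean of the class we wish to mimic
    # (e.g., benign websites in the phishing setting).
    target_mean = X[y == target_class].mean(axis=0)

    X_adv = X.astype(float)
    for j in top_features:
        direction = np.sign(target_mean[j] - X_adv[:, j])
        X_adv[:, j] += epsilon * direction
    return X_adv
```
In the paper's setting, X would hold phishing-website feature vectors and the mimicked class would be benign; the adversarial training defense discussed in the abstract would then retrain the detector on such perturbed samples.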
Related papers
- Revealing Vulnerabilities of Neural Networks in Parameter Learning and Defense Against Explanation-Aware Backdoors [2.1165011830664673]
Blinding attacks can drastically alter a machine learning algorithm's prediction and explanation.
We leverage statistical analysis to highlight the changes in a CNN's weights following blinding attacks.
We introduce a method specifically designed to limit the effectiveness of such attacks during the evaluation phase.
arXiv Detail & Related papers (2024-03-25T09:36:10Z)
- Does Few-shot Learning Suffer from Backdoor Attacks? [63.9864247424967]
We show that few-shot learning can still be vulnerable to backdoor attacks.
Our method demonstrates a high Attack Success Rate (ASR) in FSL tasks with different few-shot learning paradigms.
This study reveals that few-shot learning still suffers from backdoor attacks and that its security deserves attention.
arXiv Detail & Related papers (2023-12-31T06:43:36Z)
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations over model predictions, which harms benign accuracy, we train models to produce uninformative outputs against stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Adversary Aware Continual Learning [3.3439097577935213]
An adversary can introduce a small amount of misinformation into the model to cause deliberate forgetting of a specific task or class at test time.
We turn the attacker's primary strength, hiding the backdoor pattern by making it imperceptible to humans, against them, and propose learning a perceptible (stronger) pattern that can overpower the attacker's imperceptible pattern.
We show that our proposed defensive framework considerably improves the performance of class-incremental learning algorithms with no knowledge of the attacker's target task, target class, or imperceptible pattern.
arXiv Detail & Related papers (2023-04-27T19:49:50Z)
- Order-Disorder: Imitation Adversarial Attacks for Black-box Neural Ranking Models [48.93128542994217]
We propose an imitation adversarial attack on black-box neural passage ranking models.
We show that the target passage ranking model can be transparentized and imitated by enumerating critical queries/candidates.
We also propose an innovative gradient-based attack method, empowered by the pairwise objective function, to generate adversarial triggers.
arXiv Detail & Related papers (2022-09-14T09:10:07Z)
- On Trace of PGD-Like Adversarial Attacks [77.75152218980605]
Adversarial attacks pose safety and security concerns for deep learning applications.
We construct Adversarial Response Characteristics (ARC) features to reflect the model's gradient consistency.
Our method is intuitive, lightweight, non-intrusive, and data-undemanding.
arXiv Detail & Related papers (2022-05-19T14:26:50Z)
- Semi-Targeted Model Poisoning Attack on Federated Learning via Backward Error Analysis [15.172954465350667]
Model poisoning attacks on federated learning (FL) intrude on the entire system by compromising an edge model.
We propose the Attacking Distance-aware Attack (ADA) to enhance a poisoning attack by finding the optimized target class in the feature space.
ADA increased attack performance 1.8-fold in the most challenging case, with an attacking frequency of 0.01.
arXiv Detail & Related papers (2022-03-22T11:40:07Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Manipulating SGD with Data Ordering Attacks [23.639512087220137]
We present a class of training-time attacks that require no changes to the underlying model, dataset, or architecture.
In particular, an attacker can disrupt the integrity and availability of a model by simply reordering training batches.
Attacks have a long-term impact in that they decrease model performance hundreds of epochs after the attack took place.
arXiv Detail & Related papers (2021-04-19T22:17:27Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
- Adversarial Detection and Correction by Matching Prediction Distributions [0.0]
The detector almost completely neutralises powerful attacks like Carlini-Wagner or SLIDE on MNIST and Fashion-MNIST.
We show that our method is still able to detect the adversarial examples in the case of a white-box attack where the attacker has full knowledge of both the model and the defence.
arXiv Detail & Related papers (2020-02-21T15:45:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.