ManiGen: A Manifold Aided Black-box Generator of Adversarial Examples
- URL: http://arxiv.org/abs/2007.05817v1
- Date: Sat, 11 Jul 2020 17:34:17 GMT
- Title: ManiGen: A Manifold Aided Black-box Generator of Adversarial Examples
- Authors: Guanxiong Liu, Issa Khalil, Abdallah Khreishah, Abdulelah Algosaibi,
Adel Aldalbahi, Mohammed Alaneem, Abdulaziz Alhumam, Mohammed Anan
- Abstract summary: Adrial examples are inputs with special perturbations ignored by human eyes.
ManiGen generates adversarial examples by searching along the manifold.
ManiGen can more effectively attack classifiers with state-of-the-art defenses.
- Score: 4.509133544449485
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models, especially neural network (NN) classifiers, have
acceptable performance and accuracy, which has led to their wide adoption in
different aspects of our daily lives. The underlying assumption is that these
models are generated and used in attack-free scenarios. However, it has been
shown that neural network based classifiers are vulnerable to adversarial
examples. Adversarial examples are inputs with special perturbations that are
imperceptible to human eyes yet can mislead NN classifiers. Most of the existing
methods for generating such perturbations require a certain level of knowledge
about the target classifier, which limits their practicality. For example,
some generators require knowledge of pre-softmax logits while others utilize
prediction scores.
In this paper, we design a practical black-box adversarial example generator,
dubbed ManiGen. ManiGen does not require any knowledge of the inner state of
the target classifier. It generates adversarial examples by searching along the
manifold, which is a concise representation of the input data. Through an extensive
set of experiments on different datasets, we show that (1) adversarial examples
generated by ManiGen mislead standalone classifiers as successfully as the
state-of-the-art white-box generator, Carlini, and (2) adversarial examples
generated by ManiGen attack classifiers with state-of-the-art defenses more
effectively.
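The abstract's core idea, searching for misclassifying inputs along a learned manifold while only querying the target classifier for output labels, can be illustrated with a minimal sketch. The autoencoder standing in for the manifold, the plain random latent-space search, and every name below (encode, decode, predict_label, manifold_blackbox_attack) are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def manifold_blackbox_attack(x, encode, decode, predict_label,
                             step=0.05, queries=500, seed=0):
    """Hypothetical sketch of a manifold-aided black-box attack.

    encode/decode: a pre-trained autoencoder standing in for the data manifold.
    predict_label: query-only (hard-label) access to the target classifier.
    Returns (adversarial_example, L2_distance) or None if the budget is exhausted.
    """
    rng = np.random.default_rng(seed)
    y0 = predict_label(x)                    # original prediction; only labels are needed
    z = encode(x)                            # project the input onto the manifold
    best = None
    for _ in range(queries):
        z_try = z + step * rng.standard_normal(z.shape)  # small move along the manifold
        x_try = decode(z_try)                            # map back to input space
        if predict_label(x_try) != y0:                   # label flipped -> adversarial
            dist = float(np.linalg.norm(x_try - x))
            if best is None or dist < best[1]:
                best = (x_try, dist)                     # keep the closest success
    return best
```

The sketch touches the classifier only through predict_label, i.e. hard labels, which matches the black-box setting described in the abstract; ManiGen's actual search along the manifold is presumably more guided than this plain random sampling.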
Related papers
- Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection [83.72430401516674]
GAKer is able to construct adversarial examples for any target class.
Our method achieves an approximately 14.13% higher attack success rate for unknown classes.
arXiv Detail & Related papers (2024-07-17T03:24:09Z)
- Forging the Forger: An Attempt to Improve Authorship Verification via Data Augmentation [52.72682366640554]
Authorship Verification (AV) is a text classification task concerned with inferring whether a candidate text has been written by one specific author or by someone else.
It has been shown that many AV systems are vulnerable to adversarial attacks, where a malicious author actively tries to fool the classifier by either concealing their writing style, or by imitating the style of another author.
arXiv Detail & Related papers (2024-03-17T16:36:26Z)
- NaturalAdversaries: Can Naturalistic Adversaries Be as Effective as Artificial Adversaries? [61.58261351116679]
We introduce a two-stage adversarial example generation framework (NaturalAdversaries) for natural language understanding tasks.
It is adaptable to both black-box and white-box adversarial attacks based on the level of access to the model parameters.
Our results indicate these adversaries generalize across domains, and offer insights for future research on improving robustness of neural text classification models.
arXiv Detail & Related papers (2022-11-08T16:37:34Z)
- Towards Generating Adversarial Examples on Mixed-type Data [32.41305735919529]
We propose a novel attack algorithm M-Attack, which can effectively generate adversarial examples in mixed-type data.
Based on M-Attack, attackers can attempt to mislead the targeted classification model's prediction by only slightly perturbing both the numerical and categorical features in the given data samples.
Our generated adversarial examples can evade potential detection models, which makes the attack particularly insidious.
arXiv Detail & Related papers (2022-10-17T20:17:21Z)
- ECINN: Efficient Counterfactuals from Invertible Neural Networks [80.94500245955591]
We propose a method, ECINN, that utilizes the generative capacities of invertible neural networks for image classification to generate counterfactual examples efficiently.
ECINN has a closed-form expression and generates a counterfactual in the time of only two evaluations.
Our experiments demonstrate how ECINN alters class-dependent image regions to change the perceptual and predicted class of the counterfactuals.
arXiv Detail & Related papers (2021-03-25T09:23:24Z)
- On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
- Differentiable Language Model Adversarial Attacks on Categorical Sequence Classifiers [0.0]
An adversarial attack paradigm explores various scenarios for the vulnerability of deep learning models.
We fine-tune a language model to serve as a generator of adversarial examples.
Our model works for diverse datasets on bank transactions, electronic health records, and NLP datasets.
arXiv Detail & Related papers (2020-06-19T11:25:36Z)
- Adversarial Machine Learning in Network Intrusion Detection Systems [6.18778092044887]
We study the nature of the adversarial problem in Network Intrusion Detection Systems.
We use evolutionary computation (particle swarm optimization and genetic algorithm) and deep learning (generative adversarial networks) as tools for adversarial example generation.
Our work highlights the vulnerability of machine learning based NIDS in the face of adversarial perturbation.
arXiv Detail & Related papers (2020-04-23T19:47:43Z)
- Generating Natural Adversarial Hyperspectral examples with a modified Wasserstein GAN [0.0]
We present a new method which is able to generate natural adversarial examples from the true data following the second paradigm.
We provide a proof of concept of our method by generating adversarial hyperspectral signatures on a remote sensing dataset.
arXiv Detail & Related papers (2020-01-27T07:32:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.