Semantic Preserving Adversarial Attack Generation with Autoencoder and
Genetic Algorithm
- URL: http://arxiv.org/abs/2208.12230v1
- Date: Thu, 25 Aug 2022 17:27:26 GMT
- Title: Semantic Preserving Adversarial Attack Generation with Autoencoder and
Genetic Algorithm
- Authors: Xinyi Wang, Simon Yusuf Enoch, Dong Seong Kim
- Abstract summary: Little noises can fool state-of-the-art models into making incorrect predictions.
We propose a black-box attack, which modifies latent features of data extracted by an autoencoder.
We trained autoencoders on MNIST and CIFAR-10 datasets and found optimal adversarial perturbations using a genetic algorithm.
- Score: 29.613411948228563
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Widely used deep learning models are found to have poor robustness. Small
amounts of noise can fool state-of-the-art models into making incorrect
predictions. While there are many high-performance attack generation methods,
most of them directly add perturbations to the original data and measure them
with L_p norms; this can break the main structure of the data and thus create
invalid attacks. In this paper, we propose a black-box attack which, instead of
modifying the original data, modifies latent features of the data extracted by
an autoencoder; we then measure the noise in the semantic space to protect the
semantics of the data. We trained autoencoders on the MNIST and CIFAR-10
datasets and found optimal adversarial perturbations using a genetic algorithm.
Our approach achieved a 100% attack success rate on the first 100 samples of
the MNIST and CIFAR-10 datasets with less perturbation than FGSM.
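The abstract describes the pipeline but not the implementation. Below is a minimal sketch of the idea, assuming pretrained `encode`, `decode`, and `classify` callables; the population size, mutation rate, and fitness weighting are illustrative and not taken from the paper.

```python
# Minimal sketch (not the authors' code): perturb latent features of an
# autoencoder and search for an adversarial perturbation with a simple
# genetic algorithm, measuring the perturbation norm in latent space.
import numpy as np

def latent_ga_attack(encode, decode, classify, x, pop_size=50, generations=200,
                     sigma=0.1, mutation_rate=0.1, norm_weight=0.01, seed=0):
    rng = np.random.default_rng(seed)
    y_true = classify(x)                       # label of the clean input
    z = np.asarray(encode(x))                  # latent features of the clean input
    pop = rng.normal(0.0, sigma, size=(pop_size, z.size))  # latent perturbations
    best = pop[0]

    def fitness(delta):
        x_adv = decode(z + delta.reshape(z.shape))
        fooled = classify(x_adv) != y_true
        # reward misclassification, penalise large semantic (latent) noise
        return (1.0 if fooled else 0.0) - norm_weight * np.linalg.norm(delta)

    for _ in range(generations):
        scores = np.array([fitness(d) for d in pop])
        order = np.argsort(scores)[::-1]
        best = pop[order[0]]
        if scores[order[0]] > 0:               # adversarial example found
            return decode(z + best.reshape(z.shape))
        parents = pop[order[: pop_size // 2]]  # truncation selection
        # uniform crossover followed by Gaussian mutation
        idx = rng.integers(0, len(parents), size=(pop_size, 2))
        mask = rng.random((pop_size, z.size)) < 0.5
        pop = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
        mut = rng.random(pop.shape) < mutation_rate
        pop = pop + mut * rng.normal(0.0, sigma, size=pop.shape)
    return decode(z + best.reshape(z.shape))   # best effort if no success
```

Measuring the perturbation norm on the latent vector rather than on the pixels is what the abstract refers to as protecting the semantics of the data.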
Related papers
- Malicious URL Detection using optimized Hist Gradient Boosting Classifier based on grid search method [0.0]
Trusting the accuracy of data inputted on online platforms can be difficult due to the possibility of malicious websites gathering information for unlawful reasons.
To detect the risk posed by malicious websites, it is proposed to utilize Machine Learning (ML)-based techniques.
The dataset used contains 1781 records of malicious and benign website data with 13 features.
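As a rough illustration of the described setup, a grid search over a histogram-based gradient-boosting classifier in scikit-learn might look like the sketch below; the file name, label column, and parameter grid are assumptions, since the 13 features are not listed here.

```python
# Minimal sketch, assuming a CSV with the 13 features plus a "label" column.
import pandas as pd
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

df = pd.read_csv("urls.csv")                     # hypothetical 1781-row dataset
X, y = df.drop(columns=["label"]), df["label"]   # features / malicious-vs-benign label
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

grid = GridSearchCV(
    HistGradientBoostingClassifier(random_state=0),
    param_grid={"learning_rate": [0.05, 0.1, 0.2],   # illustrative grid
                "max_depth": [None, 3, 6],
                "max_iter": [100, 200]},
    cv=5, scoring="accuracy", n_jobs=-1)
grid.fit(X_tr, y_tr)
print(grid.best_params_, grid.score(X_te, y_te))
```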
arXiv Detail & Related papers (2024-06-12T11:16:30Z)
- Wasserstein distributional robustness of neural networks [9.79503506460041]
Deep neural networks are known to be vulnerable to adversarial attacks (AA).
For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified.
We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions.
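For reference, the generic Wasserstein DRO training objective replaces the empirical risk with the worst-case expected loss over a Wasserstein ball around the empirical distribution; the notation below is illustrative and the paper's exact formulation may differ.

```latex
% Generic Wasserstein DRO objective (illustrative notation):
% worst-case expected loss over all Q within Wasserstein distance \delta
% of the empirical distribution \hat{P}_n.
\min_{\theta}\ \sup_{Q:\, W_p(Q,\hat{P}_n)\le\delta}\
  \mathbb{E}_{(x,y)\sim Q}\bigl[\ell\bigl(f_\theta(x),\, y\bigr)\bigr]
```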
arXiv Detail & Related papers (2023-06-16T13:41:24Z)
- One-Pixel Shortcut: on the Learning Preference of Deep Neural Networks [28.502489028888608]
Unlearnable examples (ULEs) aim to protect data from unauthorized usage for training DNNs.
In adversarial training, the unlearnability of error-minimizing noise will severely degrade.
We propose a novel model-free method, named One-Pixel Shortcut, which perturbs only a single pixel of each image and makes the dataset unlearnable.
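A minimal sketch of the single-pixel idea is given below; the paper searches for the optimal position and value per class, whereas here both are chosen at random purely for illustration.

```python
# Minimal sketch (not the paper's search procedure): give every image of a
# class the same single-pixel edit so that pixel becomes a learnable shortcut.
import numpy as np

def one_pixel_shortcut(images, labels, num_classes, seed=0):
    """images: float array (N, H, W, C) in [0, 1]; labels: int array (N,)."""
    rng = np.random.default_rng(seed)
    out = images.copy()
    for c in range(num_classes):
        h = rng.integers(0, images.shape[1])    # random stand-in for the
        w = rng.integers(0, images.shape[2])    # paper's per-class search
        value = rng.random(images.shape[3])     # one fixed colour per class
        out[labels == c, h, w, :] = value
    return out
```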
arXiv Detail & Related papers (2022-05-24T15:17:52Z)
- DAD: Data-free Adversarial Defense at Test Time [21.741026088202126]
Deep models are highly susceptible to adversarial attacks.
Privacy has become an important concern, restricting access to only trained models but not the training data.
We propose the completely novel problem of 'test-time adversarial defense in the absence of training data and even their statistics'.
arXiv Detail & Related papers (2022-04-04T15:16:13Z)
- Meta Adversarial Perturbations [66.43754467275967]
We show the existence of a meta adversarial perturbation (MAP).
A MAP causes natural images to be misclassified with high probability after only a one-step gradient ascent update.
We show that these perturbations are not only image-agnostic, but also model-agnostic, as a single perturbation generalizes well across unseen data points and different neural network architectures.
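A minimal PyTorch sketch of the one-step adaptation described above is given below; the meta-training loop that produces the initial perturbation `v`, as well as the step size and budget, are omitted or assumed.

```python
# Minimal sketch (not the authors' code): adapt a meta-learned perturbation
# `v` to a new image with a single gradient-ascent step, then apply it.
import torch
import torch.nn.functional as F

def one_step_adapt(model, x, y, v, alpha=0.01, eps=8 / 255):
    """x: (1, C, H, W) image in [0, 1]; y: (1,) label; v: perturbation of x's shape."""
    delta = v.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    loss.backward()
    # one gradient-ascent step, then project back into the epsilon-ball
    adapted = (delta + alpha * delta.grad.sign()).clamp(-eps, eps).detach()
    return (x + adapted).clamp(0, 1)
```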
arXiv Detail & Related papers (2021-11-19T16:01:45Z)
- MINIMAL: Mining Models for Data Free Universal Adversarial Triggers [57.14359126600029]
We present a novel data-free approach, MINIMAL, to mine input-agnostic adversarial triggers from NLP models.
We reduce the accuracy of Stanford Sentiment Treebank's positive class from 93.6% to 9.6%.
For the Stanford Natural Language Inference (SNLI), our single-word trigger reduces the accuracy of the entailment class from 90.95% to less than 0.6%.
arXiv Detail & Related papers (2021-09-25T17:24:48Z)
- Semantic Perturbations with Normalizing Flows for Improved Generalization [62.998818375912506]
We show that perturbations in the latent space can be used to define fully unsupervised data augmentations.
We find that latent adversarial perturbations that adapt to the classifier throughout its training are most effective.
arXiv Detail & Related papers (2021-08-18T03:20:00Z)
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by perturbing only a few pixels.
Recent efforts combine this sparsity constraint with an additional L_infty bound on the perturbation magnitudes.
We propose a homotopy algorithm that tackles the sparsity and the perturbation bound in one unified framework.
arXiv Detail & Related papers (2021-06-10T20:11:36Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
By utilizing the latest techniques in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- BERT-ATTACK: Adversarial Attack Against BERT Using BERT [77.82947768158132]
Adversarial attacks on discrete data (such as text) are more challenging than on continuous data (such as images).
We propose BERT-Attack, a high-quality and effective method to generate adversarial samples.
Our method outperforms state-of-the-art attack strategies in both success rate and perturb percentage.
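The core mechanism, using a masked language model to propose context-aware word substitutions and keeping one that flips the victim model's prediction, can be sketched as follows; `victim` is an assumed text-to-label callable, and the word-importance ranking used in the paper is omitted.

```python
# Minimal sketch (not the authors' implementation): mask one word, let a
# masked LM propose substitutes, keep the first that flips the prediction.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def attack_word(victim, words, i):
    original = victim(" ".join(words))
    masked = " ".join(words[:i] + ["[MASK]"] + words[i + 1:])
    for cand in fill_mask(masked, top_k=20):          # candidate substitutes
        new_words = words[:i] + [cand["token_str"]] + words[i + 1:]
        sentence = " ".join(new_words)
        if victim(sentence) != original:              # prediction flipped
            return sentence
    return None                                       # no successful substitute
```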
arXiv Detail & Related papers (2020-04-21T13:30:02Z)