Evaluating a Simple Retraining Strategy as a Defense Against Adversarial
Attacks
- URL: http://arxiv.org/abs/2007.09916v1
- Date: Mon, 20 Jul 2020 07:49:33 GMT
- Title: Evaluating a Simple Retraining Strategy as a Defense Against Adversarial
Attacks
- Authors: Nupur Thakur, Yuzhen Ding, Baoxin Li
- Abstract summary: We show how simple algorithms like KNN can be used to determine the labels of the adversarial images needed for retraining.
We present the results on two standard datasets, namely CIFAR-10 and TinyImageNet.
- Score: 17.709146615433458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Though deep neural networks (DNNs) have shown superiority over other
techniques in major fields such as computer vision, natural language processing,
and robotics, it has recently been shown that they are vulnerable to adversarial
attacks. The addition of a simple, small and almost invisible perturbation to
the original input image can be used to fool DNNs into making wrong decisions.
With more attack algorithms being designed, a need for defending the neural
networks from such attacks arises. Retraining the network with adversarial
images is one of the simplest techniques. In this paper, we evaluate the
effectiveness of such a retraining strategy in defending against adversarial
attacks. We also show how simple algorithms like KNN can be used to determine
the labels of the adversarial images needed for retraining. We present the
results on two standard datasets, namely CIFAR-10 and TinyImageNet.
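The abstract says that KNN can supply the labels for adversarial images used in retraining. A minimal sketch of that idea follows, under stated assumptions: the feature vectors are toy 2-D points rather than image features, the distance is Euclidean, and k=3. None of these choices come from the paper, and the function name `knn_label` is hypothetical.

```python
import math
from collections import Counter


def knn_label(query, train_points, train_labels, k=3):
    """Return the majority label among the k labeled points nearest to query."""
    # Euclidean distance from the query to every labeled point (assumed metric).
    distances = [
        (math.dist(query, point), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    # Majority vote over the k nearest neighbours.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D "features" standing in for clean, labeled images (hypothetical data).
clean_features = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
                  (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
clean_labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

# Slightly shifted points standing in for adversarial images with unknown labels.
adversarial_features = [(0.15, 0.25), (0.95, 0.9)]

# Assign each adversarial example the label voted by its nearest clean neighbours,
# then pair examples with labels to form a set that could be used for retraining.
pseudo_labels = [knn_label(x, clean_features, clean_labels)
                 for x in adversarial_features]
retraining_set = list(zip(adversarial_features, pseudo_labels))
```

In this sketch the labeled adversarial examples in `retraining_set` would be added to the training data before the network is retrained; how the paper extracts features or chooses k is not stated in the abstract.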
Related papers
- Robust and Efficient Interference Neural Networks for Defending Against Adversarial Attacks in ImageNet [0.0]
In this paper, we construct an interference neural network by applying additional background images and corresponding labels.
Compared with the state-of-the-art results under the PGD attack, it has a better defense effect with much smaller computing resources.
arXiv Detail & Related papers (2023-09-03T14:20:58Z)
- SAIF: Sparse Adversarial and Imperceptible Attack Framework [7.025774823899217]
We propose a novel attack technique called Sparse Adversarial and Interpretable Attack Framework (SAIF).
Specifically, we design imperceptible attacks that contain low-magnitude perturbations at a small number of pixels and leverage these sparse attacks to reveal the vulnerability of classifiers.
SAIF computes highly imperceptible and interpretable adversarial examples, and outperforms state-of-the-art sparse attack methods on the ImageNet dataset.
arXiv Detail & Related papers (2022-12-14T20:28:50Z)
- KATANA: Simple Post-Training Robustness Using Test Time Augmentations [49.28906786793494]
A leading defense against such attacks is adversarial training, a technique in which a DNN is trained to be robust to adversarial attacks.
We propose a new simple and easy-to-use technique, KATANA, for robustifying an existing pretrained DNN without modifying its weights.
Our strategy achieves state-of-the-art adversarial robustness on diverse attacks with minimal compromise on the natural images' classification.
arXiv Detail & Related papers (2021-09-16T19:16:00Z)
- BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
- Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where we treat the target label at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
- A Neuro-Inspired Autoencoding Defense Against Adversarial Perturbations [11.334887948796611]
Deep Neural Networks (DNNs) are vulnerable to adversarial attacks.
The most effective current defense is to train the network using adversarially perturbed examples.
In this paper, we investigate a radically different, neuro-inspired defense mechanism.
arXiv Detail & Related papers (2020-11-21T21:03:08Z)
- GreedyFool: Distortion-Aware Sparse Adversarial Attack [138.55076781355206]
Modern deep neural networks (DNNs) are vulnerable to adversarial samples.
Sparse adversarial samples can fool the target model by only perturbing a few pixels.
We propose a novel two-stage distortion-aware greedy-based method dubbed "GreedyFool".
arXiv Detail & Related papers (2020-10-26T17:59:07Z)
- Progressive Defense Against Adversarial Attacks for Deep Learning as a Service in Internet of Things [9.753864027359521]
Some Deep Neural Networks (DNN) can be easily misled by adding relatively small but adversarial perturbations to the input.
We present a defense strategy called a progressive defense against adversarial attacks (PDAAA) for efficiently and effectively filtering out the adversarial pixel mutations.
The result shows it outperforms the state-of-the-art while reducing the cost of model training by 50% on average.
arXiv Detail & Related papers (2020-10-15T06:40:53Z)
- Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples which are synthesized by adding quasi-perceptible noises on real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises.
arXiv Detail & Related papers (2020-09-17T07:11:16Z)
- Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes [51.31334977346847]
We train networks to form coarse impressions based on the information in higher bit planes, and use the lower bit planes only to refine their prediction.
We demonstrate that, by imposing consistency on the representations learned across differently quantized images, the adversarial robustness of networks improves significantly.
arXiv Detail & Related papers (2020-04-01T09:31:10Z)