Related papers: Evaluating a Simple Retraining Strategy as a Defense Against Adversarial Attacks

Evaluating a Simple Retraining Strategy as a Defense Against Adversarial Attacks

URL: http://arxiv.org/abs/2007.09916v1
Date: Mon, 20 Jul 2020 07:49:33 GMT
Title: Evaluating a Simple Retraining Strategy as a Defense Against Adversarial Attacks
Authors: Nupur Thakur, Yuzhen Ding, Baoxin Li
Abstract summary: We show how simple algorithms like KNN can be used to determine the labels of the adversarial images needed for retraining. We present the results on two standard datasets namely, CIFAR-10 and TinyImageNet.
Score: 17.709146615433458
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Though deep neural networks (DNNs) have shown superiority over other techniques in major fields like computer vision, natural language processing, robotics, recently, it has been proven that they are vulnerable to adversarial attacks. The addition of a simple, small and almost invisible perturbation to the original input image can be used to fool DNNs into making wrong decisions. With more attack algorithms being designed, a need for defending the neural networks from such attacks arises. Retraining the network with adversarial images is one of the simplest techniques. In this paper, we evaluate the effectiveness of such a retraining strategy in defending against adversarial attacks. We also show how simple algorithms like KNN can be used to determine the labels of the adversarial images needed for retraining. We present the results on two standard datasets namely, CIFAR-10 and TinyImageNet.

Related papers

Robust and Efficient Interference Neural Networks for Defending Against Adversarial Attacks in ImageNet [0.0]
In this paper, we construct an interference neural network by applying additional background images and corresponding labels. Compared with the state-of-the-art results under the PGD attack, it has a better defense effect with much smaller computing resources.
arXiv Detail & Related papers (2023-09-03T14:20:58Z)
SAIF: Sparse Adversarial and Imperceptible Attack Framework [7.025774823899217]
We propose a novel attack technique called Sparse Adversarial and Interpretable Attack Framework (SAIF) Specifically, we design imperceptible attacks that contain low-magnitude perturbations at a small number of pixels and leverage these sparse attacks to reveal the vulnerability of classifiers. SAIF computes highly imperceptible and interpretable adversarial examples, and outperforms state-of-the-art sparse attack methods on the ImageNet dataset.
arXiv Detail & Related papers (2022-12-14T20:28:50Z)
KATANA: Simple Post-Training Robustness Using Test Time Augmentations [49.28906786793494]
A leading defense against such attacks is adversarial training, a technique in which a DNN is trained to be robust to adversarial attacks. We propose a new simple and easy-to-use technique, KATANA, for robustifying an existing pretrained DNN without modifying its weights. Our strategy achieves state-of-the-art adversarial robustness on diverse attacks with minimal compromise on the natural images' classification.
arXiv Detail & Related papers (2021-09-16T19:16:00Z)
BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks. Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The emphbackdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data. We propose a novel attack paradigm, the emphfine-grained attack, where we treat the target label from the object-level instead of the image-level. Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
A Neuro-Inspired Autoencoding Defense Against Adversarial Perturbations [11.334887948796611]
Deep Neural Networks (DNNs) are vulnerable to adversarial attacks. Most effective current defense is to train the network using adversarially perturbed examples. In this paper, we investigate a radically different, neuro-inspired defense mechanism.
arXiv Detail & Related papers (2020-11-21T21:03:08Z)
GreedyFool: Distortion-Aware Sparse Adversarial Attack [138.55076781355206]
Modern deep neural networks (DNNs) are vulnerable to adversarial samples. Sparse adversarial samples can fool the target model by only perturbing a few pixels. We propose a novel two-stage distortion-aware greedy-based method dubbed as "GreedyFool"
arXiv Detail & Related papers (2020-10-26T17:59:07Z)
Progressive Defense Against Adversarial Attacks for Deep Learning as a Service in Internet of Things [9.753864027359521]
Some Deep Neural Networks (DNN) can be easily misled by adding relatively small but adversarial perturbations to the input. We present a defense strategy called a progressive defense against adversarial attacks (PDAAA) for efficiently and effectively filtering out the adversarial pixel mutations. The result shows it outperforms the state-of-the-art while reducing the cost of model training by 50% on average.
arXiv Detail & Related papers (2020-10-15T06:40:53Z)
Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples which are synthesized by adding quasi-perceptible noises on real images. We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks. The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises.
arXiv Detail & Related papers (2020-09-17T07:11:16Z)
Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes [51.31334977346847]
We train networks to form coarse impressions based on the information in higher bit planes, and use the lower bit planes only to refine their prediction. We demonstrate that, by imposing consistency on the representations learned across differently quantized images, the adversarial robustness of networks improves significantly.
arXiv Detail & Related papers (2020-04-01T09:31:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.