A Neuro-Inspired Autoencoding Defense Against Adversarial Perturbations
- URL: http://arxiv.org/abs/2011.10867v2
- Date: Mon, 21 Dec 2020 23:35:47 GMT
- Title: A Neuro-Inspired Autoencoding Defense Against Adversarial Perturbations
- Authors: Can Bakiskan, Metehan Cekic, Ahmet Dundar Sezer, Upamanyu Madhow
- Abstract summary: Deep Neural Networks (DNNs) are vulnerable to adversarial attacks.
The most effective current defense is to train the network using adversarially perturbed examples.
In this paper, we investigate a radically different, neuro-inspired defense mechanism.
- Score: 11.334887948796611
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Networks (DNNs) are vulnerable to adversarial attacks: carefully
constructed perturbations to an image can seriously impair classification
accuracy, while being imperceptible to humans. While there has been a
significant amount of research on defending against such attacks, most defenses
based on systematic design principles have been defeated by appropriately
modified attacks. For a fixed set of data, the most effective current defense
is to train the network using adversarially perturbed examples. In this paper,
we investigate a radically different, neuro-inspired defense mechanism,
starting from the observation that human vision is virtually unaffected by
adversarial examples designed for machines. We aim to reject ℓ∞-bounded
adversarial perturbations before they reach a classifier DNN, using an encoder
with characteristics commonly observed in biological vision: sparse
overcomplete representations, randomness due to synaptic noise, and drastic
nonlinearities. Encoder training is unsupervised, using standard dictionary
learning. A CNN-based decoder restores the size of the encoder output to that
of the original image, enabling the use of a standard CNN for classification.
Our nominal design is to train the decoder and classifier together in standard
supervised fashion, but we also consider unsupervised decoder training based on
a regression objective (as in a conventional autoencoder) with separate
supervised training of the classifier. Unlike adversarial training, all
training is based on clean images.
Our experiments on the CIFAR-10 dataset show performance competitive with
state-of-the-art defenses based on adversarial training, and point to the
promise of neuro-inspired techniques for the design of robust neural networks.
In addition, we provide results for a subset of the ImageNet dataset to verify
that our approach scales to larger images.
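To make the pipeline in the abstract concrete, the following is a minimal sketch (not the authors' code) of the neuro-inspired front end: an overcomplete dictionary learned without supervision on clean image patches, followed by a noisy sparse encoding with a hard top-K nonlinearity. The patch size, dictionary size, K, noise level, and helper names below are illustrative assumptions, not values from the paper.

```python
# Sketch of the encoder stage described above: unsupervised dictionary learning
# on clean patches, projection onto the overcomplete dictionary, additive noise
# (a stand-in for "synaptic noise"), and a drastic top-K nonlinearity that
# zeroes all but the K largest-magnitude coefficients.
# All hyperparameters are illustrative, not taken from the paper.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

PATCH = 8          # assumed patch size (8x8 patches, flattened to 64 dims)
N_ATOMS = 500      # overcomplete: many more atoms than the 64 patch dimensions
K = 5              # number of coefficients kept per patch
NOISE_STD = 0.1    # standard deviation of the additive coefficient noise


def learn_dictionary(clean_patches: np.ndarray) -> np.ndarray:
    """Standard (unsupervised) dictionary learning on clean patches only."""
    dl = MiniBatchDictionaryLearning(n_components=N_ATOMS, alpha=1.0, batch_size=256)
    dl.fit(clean_patches)              # clean_patches: (n_samples, PATCH * PATCH)
    return dl.components_              # dictionary: (N_ATOMS, PATCH * PATCH)


def encode(patches: np.ndarray, dictionary: np.ndarray, rng) -> np.ndarray:
    """Noisy, sparse, overcomplete code for a batch of patches."""
    coeffs = patches @ dictionary.T                       # correlation with each atom
    coeffs += rng.normal(0.0, NOISE_STD, coeffs.shape)    # randomness via noise injection
    # Drastic nonlinearity: keep only the K largest-magnitude activations per patch.
    drop = np.argsort(np.abs(coeffs), axis=1)[:, :-K]
    np.put_along_axis(coeffs, drop, 0.0, axis=1)
    return coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patches = rng.random((2000, PATCH * PATCH))   # stand-in for clean CIFAR-10 patches
    D = learn_dictionary(patches)
    codes = encode(patches[:32], D, rng)          # sparse codes passed on to the decoder
    print(codes.shape, (codes != 0).sum(axis=1))  # (32, 500), K nonzeros per row
```

Per the abstract, a CNN decoder would then map such codes back to the original image size and be trained, together with the classifier, on clean images only (or separately, with a regression objective); the sketch stops at the codes.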
Related papers
- Downstream-agnostic Adversarial Examples [66.8606539786026]
AdvEncoder is the first framework for generating downstream-agnostic universal adversarial examples based on a pre-trained encoder.
Unlike traditional adversarial example work, the pre-trained encoder outputs only feature vectors rather than classification labels.
Our results show that an attacker can successfully attack downstream tasks without knowing either the pre-training dataset or the downstream dataset.
arXiv Detail & Related papers (2023-07-23T10:16:47Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to mitigate such perturbations via min-max robust training.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Learning from Attacks: Attacking Variational Autoencoder for Improving Image Classification [17.881134865491063]
Adversarial attacks are often considered threats to the robustness of Deep Neural Networks (DNNs).
This work analyzes adversarial attacks from a different perspective: adversarial examples contain implicit information that is useful for prediction.
We propose an algorithmic framework that leverages the advantages of the DNNs for data self-expression and task-specific predictions.
arXiv Detail & Related papers (2022-03-11T08:48:26Z) - Efficient and Robust Classification for Sparse Attacks [34.48667992227529]
We consider perturbations bounded by the ℓ0-norm, which have been shown to be effective attacks in the domains of image recognition, natural language processing, and malware detection.
We propose a novel defense method that consists of "truncation" and "adversarial training".
Motivated by the insights we obtain, we extend these components to neural network classifiers.
arXiv Detail & Related papers (2022-01-23T21:18:17Z) - Neural Architecture Dilation for Adversarial Robustness [56.18555072877193]
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks.
This paper aims to improve the adversarial robustness of backbone CNNs that already achieve satisfactory accuracy.
With minimal computational overhead, the dilation architecture is expected to preserve the standard performance of the backbone CNN.
arXiv Detail & Related papers (2021-08-16T03:58:00Z) - Combating Adversaries with Anti-Adversaries [118.70141983415445]
In particular, our layer generates an input perturbation in the opposite direction of the adversarial one.
We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models.
Our anti-adversary layer significantly enhances model robustness while coming at no cost to clean accuracy.
arXiv Detail & Related papers (2021-03-26T09:36:59Z) - BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by
Adversarial Attacks [65.2021953284622]
We study the robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z) - Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples, which are synthesized by adding quasi-perceptible noise to real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by synthesizing another image online from scratch for each input image, instead of removing or destroying adversarial noise.
arXiv Detail & Related papers (2020-09-17T07:11:16Z) - Defending Adversarial Examples via DNN Bottleneck Reinforcement [20.08619981108837]
This paper presents a reinforcement scheme to alleviate the vulnerability of Deep Neural Networks (DNNs) to adversarial attacks.
By reinforcing the former while maintaining the latter, any redundant information, be it adversarial or not, should be removed from the latent representation.
In order to reinforce the information bottleneck, we introduce the multi-scale low-pass objective and multi-scale high-frequency communication for better frequency steering in the network.
arXiv Detail & Related papers (2020-08-12T11:02:01Z) - Enhancing Intrinsic Adversarial Robustness via Feature Pyramid Decoder [11.701729403940798]
We propose an attack-agnostic defence framework to enhance the intrinsic robustness of neural networks.
Our framework applies to all block-based convolutional neural networks (CNNs).
arXiv Detail & Related papers (2020-05-06T01:40:26Z)