General Adversarial Defense Against Black-box Attacks via Pixel Level
and Feature Level Distribution Alignments
- URL: http://arxiv.org/abs/2212.05387v1
- Date: Sun, 11 Dec 2022 01:51:31 GMT
- Title: General Adversarial Defense Against Black-box Attacks via Pixel Level
and Feature Level Distribution Alignments
- Authors: Xiaogang Xu, Hengshuang Zhao, Philip Torr, Jiaya Jia
- Abstract summary: We use Deep Generative Networks (DGNs) with a novel training mechanism to eliminate the distribution gap.
The trained DGNs align the distribution of adversarial samples with clean ones for the target DNNs by translating pixel values.
Our strategy demonstrates its unique effectiveness and generality against black-box attacks.
- Score: 75.58342268895564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are vulnerable to highly transferable
black-box adversarial attacks. This threat comes from the distribution gap
between adversarial and clean samples in the feature space of the target DNNs. In
this paper, we use Deep Generative Networks (DGNs) with a novel training
mechanism to eliminate the distribution gap. The trained DGNs align the
distribution of adversarial samples with clean ones for the target DNNs by
translating pixel values. Different from previous work, we propose a more
effective pixel-level training constraint to make this alignment achievable, thus
enhancing robustness to adversarial samples. Further, a class-aware
feature-level constraint is formulated for integrated distribution alignment.
Our approach is general and applicable to multiple tasks, including image
classification, semantic segmentation, and object detection. We conduct
extensive experiments on different datasets. Our strategy demonstrates its
unique effectiveness and generality against black-box attacks.
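As a rough sketch of how the two alignment terms described above could be combined, assuming PyTorch, a hypothetical purifier network `dgn`, and a frozen target model that exposes an intermediate `features()` method (these names are illustrative, not from the paper's released code):

```python
import torch
import torch.nn.functional as F

def alignment_loss(dgn, target_model, x_adv, x_clean, labels,
                   pixel_weight=1.0, feat_weight=0.1):
    """Illustrative pixel-level + class-aware feature-level alignment.

    dgn:          generative purifier mapping adversarial images back
                  toward the clean distribution (hypothetical interface).
    target_model: frozen target DNN; features() is assumed to return an
                  intermediate feature map.
    """
    x_purified = dgn(x_adv)

    # Pixel-level constraint: purified adversarial images should match
    # their clean counterparts in image space.
    pixel_loss = F.l1_loss(x_purified, x_clean)

    # Feature-level constraint: align intermediate features of purified
    # and clean images inside the frozen target network.
    with torch.no_grad():
        feat_clean = target_model.features(x_clean)
    feat_purified = target_model.features(x_purified)

    # Class-aware averaging: every class in the batch contributes equally,
    # so frequent classes do not dominate the alignment term.
    feat_err = (feat_purified - feat_clean).flatten(1).pow(2).mean(dim=1)
    feat_loss = torch.stack(
        [feat_err[labels == c].mean() for c in labels.unique()]
    ).mean()

    return pixel_weight * pixel_loss + feat_weight * feat_loss
```

The exact constraint formulations and loss weights in the paper differ; the sketch only shows where a pixel-level term and a class-aware feature-level term enter a single training objective.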
Related papers
- Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection [83.72430401516674]
GAKer is able to construct adversarial examples to any target class.
Our method achieves an approximately 14.13% higher attack success rate for unknown classes.
arXiv Detail & Related papers (2024-07-17T03:24:09Z)
- Robustness Against Adversarial Attacks via Learning Confined Adversarial Polytopes [0.0]
Deep neural networks (DNNs) can be deceived by human-imperceptible perturbations of clean samples.
In this paper, we aim to train robust DNNs by limiting the set of outputs reachable via a norm-bounded perturbation added to a clean sample.
arXiv Detail & Related papers (2024-01-15T22:31:15Z)
- Boosting Adversarial Transferability via Fusing Logits of Top-1 Decomposed Feature [36.78292952798531]
We propose a Singular Value Decomposition (SVD)-based feature-level attack method.
Our approach is inspired by the discovery that eigenvectors associated with the larger singular values from the middle layer features exhibit superior generalization and attention properties.
arXiv Detail & Related papers (2023-05-02T12:27:44Z)
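As an illustration of the idea summarized above, the snippet below keeps only the rank-1 component (largest singular value) of a middle-layer feature map; the function name and the choice of layer are assumptions, not the paper's code:

```python
import torch

def top1_feature_component(feat):
    """Keep only the rank-1 SVD component of a middle-layer feature map.

    feat: tensor of shape (C, H, W). Reshaping it to a (C, H*W) matrix and
    truncating to the largest singular value retains the directions that
    the paper reports to generalize best across models.
    """
    c, h, w = feat.shape
    u, s, vh = torch.linalg.svd(feat.reshape(c, h * w), full_matrices=False)
    rank1 = s[0] * torch.outer(u[:, 0], vh[0])
    return rank1.reshape(c, h, w)
```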
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, adversarial training (AT) has been shown to be an effective approach for improving model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
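For context, one step of standard PGD-based adversarial training on a single machine looks roughly like the following; the large-batch, multi-machine machinery proposed in the paper is not shown, and the attack parameters are illustrative:

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y,
                              eps=8 / 255, alpha=2 / 255, steps=7):
    """One illustrative PGD-based adversarial training step."""
    x = x.detach()
    x_adv = x.clone()

    # Inner maximization: craft adversarial examples by projected
    # gradient ascent inside an L-infinity ball of radius eps.
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)             # keep a valid image range

    # Outer minimization: update the model on the adversarial examples.
    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    optimizer.step()
```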
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make widely used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU making it compute efficient and deployable without requiring specialized accelerators.
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
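A minimal sketch of the detection recipe, assuming PyTorch: record activations of a chosen layer on clean data, fit a density model to them, and flag inputs whose activations are unlikely under that density. A diagonal Gaussian stands in here for the normalizing flow used by DAAIN, and all names are illustrative:

```python
import torch

class ActivationScorer:
    """Score inputs by the log-density of an intermediate activation.

    A diagonal Gaussian replaces the normalizing flow of DAAIN, but the
    detection logic is the same: inputs whose activations have low density
    under the clean-data model are flagged as OOD or adversarial.
    """

    def __init__(self, model, layer):
        self.model = model
        self._buf = {}
        layer.register_forward_hook(
            lambda module, inp, out: self._buf.update(act=out.flatten(1))
        )

    def fit(self, clean_loader):
        acts = []
        with torch.no_grad():
            for x, _ in clean_loader:
                self.model(x)          # hook captures the activation
                acts.append(self._buf["act"])
        acts = torch.cat(acts)
        self.mean, self.std = acts.mean(0), acts.std(0) + 1e-6

    def score(self, x):
        with torch.no_grad():
            self.model(x)
        z = (self._buf["act"] - self.mean) / self.std
        # Higher (less negative) score means more "clean-like".
        return -(0.5 * z.pow(2) + self.std.log()).sum(dim=1)
```

Inputs whose score falls below a threshold chosen on held-out clean data would be rejected.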
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)