Diversity can be Transferred: Output Diversification for White- and
Black-box Attacks
- URL: http://arxiv.org/abs/2003.06878v3
- Date: Fri, 30 Oct 2020 00:12:48 GMT
- Title: Diversity can be Transferred: Output Diversification for White- and
Black-box Attacks
- Authors: Yusuke Tashiro, Yang Song, Stefano Ermon
- Abstract summary: Adversarial attacks often involve random perturbations of the inputs drawn from uniform or Gaussian distributions, e.g., to initialize optimization-based white-box attacks or generate update directions in black-box attacks.
We propose Output Diversified Sampling (ODS), a novel sampling strategy that attempts to maximize diversity in the target model's outputs among the generated samples.
ODS significantly improves the performance of existing white-box and black-box attacks.
In particular, ODS reduces the number of queries needed for state-of-the-art black-box attacks on ImageNet by a factor of two.
- Score: 89.92353493977173
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks often involve random perturbations of the inputs drawn
from uniform or Gaussian distributions, e.g., to initialize optimization-based
white-box attacks or generate update directions in black-box attacks. These
simple perturbations, however, could be sub-optimal as they are agnostic to the
model being attacked. To improve the efficiency of these attacks, we propose
Output Diversified Sampling (ODS), a novel sampling strategy that attempts to
maximize diversity in the target model's outputs among the generated samples.
While ODS is a gradient-based strategy, the diversity offered by ODS is
transferable and can be helpful for both white-box and black-box attacks via
surrogate models. Empirically, we demonstrate that ODS significantly improves
the performance of existing white-box and black-box attacks. In particular, ODS
reduces the number of queries needed for state-of-the-art black-box attacks on
ImageNet by a factor of two.
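The abstract does not spell out the sampling rule, so the following is only a minimal sketch under stated assumptions: the sampled direction is taken to be the normalized input-gradient of a randomly weighted sum of the model's outputs, with weights drawn uniformly from [-1, 1]. The toy linear model and the names `linear_logits`, `ods_direction`, and `sample_ods` are hypothetical and chosen for illustration; a real attack would use a neural network (or a surrogate model in the black-box case) and automatic differentiation.

```python
import math
import random


def linear_logits(W, b, x):
    """Toy stand-in for a model: logits = W x + b (hypothetical, not from the paper)."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def ods_direction(W, w_d):
    """Normalized gradient of w_d . f(x) with respect to x.

    For the linear toy model the gradient is W^T w_d and does not depend on x;
    for a real network it would be computed by backpropagation at the current x.
    """
    n_in = len(W[0])
    grad = [sum(w_d[c] * W[c][j] for c in range(len(W))) for j in range(n_in)]
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        raise ValueError("gradient is zero; resample w_d")
    return [g / norm for g in grad]


def sample_ods(W, rng):
    """Draw output-space weights uniformly from [-1, 1] (assumed) and return the direction."""
    w_d = [rng.uniform(-1.0, 1.0) for _ in W]
    return ods_direction(W, w_d)


rng = random.Random(0)
W = [[1.0, 0.0, 2.0], [0.0, 3.0, -1.0]]  # 2 output classes, 3 input features
b = [0.5, -0.5]
x = [0.2, 0.4, 0.6]

v = sample_ods(W, rng)
eps = 0.1  # hypothetical step size
x_perturbed = [xi + eps * vi for xi, vi in zip(x, v)]
logits_before = linear_logits(W, b, x)
logits_after = linear_logits(W, b, x_perturbed)
```

Because the direction is derived from the model's own gradient rather than drawn blindly in input space, repeated calls with fresh weights push the outputs in different directions, which is the kind of output diversity the abstract describes.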
Related papers
- An Adaptive Model Ensemble Adversarial Attack for Boosting Adversarial
Transferability [26.39964737311377]
We propose an adaptive ensemble attack, dubbed AdaEA, to adaptively control the fusion of the outputs from each model.
We achieve considerable improvement over the existing ensemble attacks on various datasets.
arXiv Detail & Related papers (2023-08-05T15:12:36Z)
- General Adversarial Defense Against Black-box Attacks via Pixel Level
and Feature Level Distribution Alignments [75.58342268895564]
We use Deep Generative Networks (DGNs) with a novel training mechanism to eliminate the distribution gap.
The trained DGNs align the distribution of adversarial samples with clean ones for the target DNNs by translating pixel values.
Our strategy demonstrates its unique effectiveness and generality against black-box attacks.
arXiv Detail & Related papers (2022-12-11T01:51:31Z)
- T-SEA: Transfer-based Self-Ensemble Attack on Object Detection [9.794192858806905]
We propose a single-model transfer-based black-box attack on object detection, utilizing only one model to achieve a high-transferability adversarial attack on multiple black-box detectors.
We analogize patch optimization with regular model optimization, proposing a series of self-ensemble approaches on the input data, the attacked model, and the adversarial patch.
arXiv Detail & Related papers (2022-11-16T10:27:06Z)
- Boosting Black-Box Adversarial Attacks with Meta Learning [0.0]
We propose a hybrid attack method which trains meta adversarial perturbations (MAPs) on surrogate models and performs black-box attacks by estimating gradients of the models.
Our method not only improves attack success rates but also reduces the number of queries compared to other methods.
arXiv Detail & Related papers (2022-03-28T09:32:48Z)
- Cross-Modal Transferable Adversarial Attacks from Images to Videos [82.0745476838865]
Recent studies have shown that adversarial examples hand-crafted on one white-box model can be used to attack other black-box models.
We propose a simple yet effective cross-modal attack method, named the Image To Video (I2V) attack.
I2V generates adversarial frames by minimizing the cosine similarity between features of pre-trained image models from adversarial and benign examples.
arXiv Detail & Related papers (2021-12-10T08:19:03Z)
- Stochastic Variance Reduced Ensemble Adversarial Attack for Boosting the
Adversarial Transferability [20.255708227671573]
Black-box adversarial attacks can be transferred from one model to another.
In this work, we propose a novel ensemble attack method called the variance reduced ensemble attack.
Empirical results on the standard ImageNet dataset demonstrate that the proposed method boosts adversarial transferability and significantly outperforms existing ensemble attacks.
arXiv Detail & Related papers (2021-11-21T06:33:27Z)
- Meta Gradient Adversarial Attack [64.5070788261061]
This paper proposes a novel architecture called Meta Gradient Adversarial Attack (MGAA), which is plug-and-play and can be integrated with any existing gradient-based attack method.
Specifically, we randomly sample multiple models from a model zoo to compose different tasks and iteratively simulate a white-box attack and a black-box attack in each task.
By narrowing the gap between the gradient directions in white-box and black-box attacks, the transferability of adversarial examples on the black-box setting can be improved.
arXiv Detail & Related papers (2021-08-09T17:44:19Z)
- Gradient-based Adversarial Attacks against Text Transformers [96.73493433809419]
We propose the first general-purpose gradient-based attack against transformer models.
We empirically demonstrate that our white-box attack attains state-of-the-art attack performance on a variety of natural language tasks.
arXiv Detail & Related papers (2021-04-15T17:43:43Z)
- Boosting Black-Box Attack with Partially Transferred Conditional
Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacks against a real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.