Related papers: Diversity can be Transferred: Output Diversification for White- and Black-box Attacks

Diversity can be Transferred: Output Diversification for White- and Black-box Attacks

URL: http://arxiv.org/abs/2003.06878v3
Date: Fri, 30 Oct 2020 00:12:48 GMT
Title: Diversity can be Transferred: Output Diversification for White- and Black-box Attacks
Authors: Yusuke Tashiro, Yang Song, Stefano Ermon
Abstract summary: Adrial attacks often involve random perturbations of the inputs drawn from uniform or Gaussian distributions, e.g., to initialize optimization-based white-box attacks or generate update directions in black-box attacks. We propose Output Diversified Sampling (ODS), a novel sampling strategy that attempts to maximize diversity in the target model's outputs among the generated samples. ODS significantly improves the performance of existing white-box and black-box attacks. In particular, ODS reduces the number of queries needed for state-of-the-art black-box attacks on ImageNet by a factor of two.
Score: 89.92353493977173
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adversarial attacks often involve random perturbations of the inputs drawn from uniform or Gaussian distributions, e.g., to initialize optimization-based white-box attacks or generate update directions in black-box attacks. These simple perturbations, however, could be sub-optimal as they are agnostic to the model being attacked. To improve the efficiency of these attacks, we propose Output Diversified Sampling (ODS), a novel sampling strategy that attempts to maximize diversity in the target model's outputs among the generated samples. While ODS is a gradient-based strategy, the diversity offered by ODS is transferable and can be helpful for both white-box and black-box attacks via surrogate models. Empirically, we demonstrate that ODS significantly improves the performance of existing white-box and black-box attacks. In particular, ODS reduces the number of queries needed for state-of-the-art black-box attacks on ImageNet by a factor of two.

Related papers

White-box Membership Inference Attacks against Diffusion Models [13.425726946466423]
Diffusion models have begun to overshadow GANs in industrial applications due to their superior image generation performance. We aim to design membership inference attacks (MIAs) catered to diffusion models. We first conduct an exhaustive analysis of existing MIAs on diffusion models, taking into account factors such as black-box/white-box models and the selection of attack features. We found that white-box attacks are highly applicable in real-world scenarios, and the most effective attacks presently are white-box.
arXiv Detail & Related papers (2023-08-11T22:03:36Z)
General Adversarial Defense Against Black-box Attacks via Pixel Level and Feature Level Distribution Alignments [75.58342268895564]
We use Deep Generative Networks (DGNs) with a novel training mechanism to eliminate the distribution gap. The trained DGNs align the distribution of adversarial samples with clean ones for the target DNNs by translating pixel values. Our strategy demonstrates its unique effectiveness and generality against black-box attacks.
arXiv Detail & Related papers (2022-12-11T01:51:31Z)
T-SEA: Transfer-based Self-Ensemble Attack on Object Detection [9.794192858806905]
We propose a single-model transfer-based black-box attack on object detection, utilizing only one model to achieve a high-transferability adversarial attack on multiple black-box detectors. We analogize patch optimization with regular model optimization, proposing a series of self-ensemble approaches on the input data, the attacked model, and the adversarial patch.
arXiv Detail & Related papers (2022-11-16T10:27:06Z)
Boosting Black-Box Adversarial Attacks with Meta Learning [0.0]
We propose a hybrid attack method which trains meta adversarial perturbations (MAPs) on surrogate models and performs black-box attacks by estimating gradients of the models. Our method can not only improve the attack success rates, but also reduces the number of queries compared to other methods.
arXiv Detail & Related papers (2022-03-28T09:32:48Z)
Cross-Modal Transferable Adversarial Attacks from Images to Videos [82.0745476838865]
Recent studies have shown that adversarial examples hand-crafted on one white-box model can be used to attack other black-box models. We propose a simple yet effective cross-modal attack method, named as Image To Video (I2V) attack. I2V generates adversarial frames by minimizing the cosine similarity between features of pre-trained image models from adversarial and benign examples.
arXiv Detail & Related papers (2021-12-10T08:19:03Z)
Stochastic Variance Reduced Ensemble Adversarial Attack for Boosting the Adversarial Transferability [20.255708227671573]
Black-box adversarial attacks can be transferred from one model to another. In this work, we propose a novel ensemble attack method called the variance reduced ensemble attack. Empirical results on the standard ImageNet demonstrate that the proposed method could boost the adversarial transferability and outperforms existing ensemble attacks significantly.
arXiv Detail & Related papers (2021-11-21T06:33:27Z)
Meta Gradient Adversarial Attack [64.5070788261061]
This paper proposes a novel architecture called Metaversa Gradient Adrial Attack (MGAA), which is plug-and-play and can be integrated with any existing gradient-based attack method. Specifically, we randomly sample multiple models from a model zoo to compose different tasks and iteratively simulate a white-box attack and a black-box attack in each task. By narrowing the gap between the gradient directions in white-box and black-box attacks, the transferability of adversarial examples on the black-box setting can be improved.
arXiv Detail & Related papers (2021-08-09T17:44:19Z)
Gradient-based Adversarial Attacks against Text Transformers [96.73493433809419]
We propose the first general-purpose gradient-based attack against transformer models. We empirically demonstrate that our white-box attack attains state-of-the-art attack performance on a variety of natural language tasks.
arXiv Detail & Related papers (2021-04-15T17:43:43Z)
Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs) We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases. Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.