GAP++: Learning to generate target-conditioned adversarial examples
- URL: http://arxiv.org/abs/2006.05097v1
- Date: Tue, 9 Jun 2020 07:49:49 GMT
- Title: GAP++: Learning to generate target-conditioned adversarial examples
- Authors: Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Yuan He, Hui Xue
- Abstract summary: Adversarial examples are perturbed inputs that pose a serious threat to machine learning models.
We propose a more general-purpose framework which infers target-conditioned perturbations dependent on both input image and target label.
Our method outperforms single-target attack models and obtains high fooling rates with small perturbation norms.
- Score: 28.894143619182426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial examples are perturbed inputs that pose a serious threat
to machine learning models. Finding these perturbations is so difficult that
iterative search methods are typically required. For computational efficiency,
recent works use adversarial generative networks to directly model the
distribution of universal or image-dependent perturbations. However, these
methods generate perturbations that depend only on the input image. In this
work, we propose a more general-purpose framework which infers
target-conditioned perturbations dependent on both the input image and the
target label. Unlike previous single-target attack models, our model can
conduct target-conditioned attacks by learning the relation between the attack
target and the semantics of the image. Through extensive experiments on the
MNIST and CIFAR10 datasets, we show that our method outperforms single-target
attack models and obtains high fooling rates with small perturbation norms.
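The core idea in the abstract, conditioning the perturbation generator on the target label as well as the input image, can be illustrated with a minimal NumPy sketch. The function name, shapes, and the random untrained "generator" weights below are purely illustrative stand-ins for the learned network described in the paper; only the conditioning scheme (tiling a one-hot target code as extra input channels) and the small-norm projection reflect the abstract.

```python
import numpy as np

def target_conditioned_perturbation(image, target, num_classes=10,
                                    epsilon=0.1, rng=None):
    """Sketch of a target-conditioned perturbation (illustrative only).

    image:  (H, W) array with values in [0, 1]
    target: int, the attack target class
    """
    rng = np.random.default_rng(0) if rng is None else rng
    h, w = image.shape
    # Condition on the target label by tiling its one-hot code as extra
    # input channels alongside the image, so one model can serve any target.
    onehot = np.zeros(num_classes)
    onehot[target] = 1.0
    label_planes = np.broadcast_to(onehot[:, None, None], (num_classes, h, w))
    x = np.concatenate([image[None], label_planes], axis=0)  # (1+C, H, W)
    # Random untrained weights stand in for the learned generator network.
    weights = rng.standard_normal((1, 1 + num_classes))
    raw = np.tensordot(weights, x, axes=([1], [0]))[0]       # (H, W)
    # Bound the perturbation to an L-infinity ball of radius epsilon,
    # keeping the perturbation norm small.
    delta = epsilon * np.tanh(raw)
    return np.clip(image + delta, 0.0, 1.0)
```

A trained version would replace the random weights with a generator optimized so the perturbed image is classified as `target`; the point of the sketch is only that a single model can accept any (image, target) pair.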
Related papers
- Transferable Adversarial Attacks on SAM and Its Downstream Models [87.23908485521439]
This paper explores the feasibility of adversarially attacking various downstream models fine-tuned from the Segment Anything Model (SAM).
To enhance the effectiveness of the adversarial attack towards models fine-tuned on unknown datasets, we propose a universal meta-initialization (UMI) algorithm.
arXiv Detail & Related papers (2024-10-26T15:04:04Z) - Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has conventionally been believed to be a challenging property to encode into neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z) - Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z) - Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training [84.95281245784348]
Overemphasizing co-occurrence relationships can cause the model to overfit.
We provide a causal inference framework to show that the correlative features caused by the target object and its co-occurring objects can be regarded as a mediator.
arXiv Detail & Related papers (2024-04-09T13:13:24Z) - Defense Against Adversarial Attacks using Convolutional Auto-Encoders [0.0]
Adversarial attacks manipulate the input data with imperceptible perturbations, causing the model to misclassify the data or produce erroneous outputs.
This work is based on enhancing the robustness of targeted models against adversarial attacks.
arXiv Detail & Related papers (2023-12-06T14:29:16Z) - On Evaluating the Adversarial Robustness of Semantic Segmentation Models [0.0]
A number of adversarial training approaches have been proposed as a defense against adversarial perturbation.
We show for the first time that a number of models in previous work that are claimed to be robust are in fact not robust at all.
We then evaluate simple adversarial training algorithms that produce reasonably robust models even under our set of strong attacks.
arXiv Detail & Related papers (2023-06-25T11:45:08Z) - Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks [13.374754708543449]
Model inversion attacks (MIAs) aim to create synthetic images that reflect the class-wise characteristics of a target model's training data by exploiting the model's learned knowledge.
Previous research has developed generative MIAs using generative adversarial networks (GANs) as image priors tailored to a specific target model.
We present Plug & Play Attacks that loosen the dependency between the target model and image prior and enable the use of a single trained GAN to attack a broad range of targets.
arXiv Detail & Related papers (2022-01-28T15:25:50Z) - Towards A Conceptually Simple Defensive Approach for Few-shot
Classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z) - AdvHaze: Adversarial Haze Attack [19.744435173861785]
We introduce a novel adversarial attack method based on haze, which is a common phenomenon in real-world scenery.
Our method can synthesize potentially adversarial haze into an image based on the atmospheric scattering model, with high realism.
We demonstrate that the proposed method achieves a high success rate, and holds better transferability across different classification models than the baselines.
arXiv Detail & Related papers (2021-04-28T09:52:25Z) - Counterfactual Generative Networks [59.080843365828756]
We propose to decompose the image generation process into independent causal mechanisms that we train without direct supervision.
By exploiting appropriate inductive biases, these mechanisms disentangle object shape, object texture, and background.
We show that the counterfactual images improve out-of-distribution robustness with only a marginal drop in performance on the original classification task.
arXiv Detail & Related papers (2021-01-15T10:23:12Z) - Luring of transferable adversarial perturbations in the black-box
paradigm [0.0]
We present a new approach to improve the robustness of a model against black-box transfer attacks.
A removable additional neural network is included in the target model, and is designed to induce the luring effect.
Our deception-based method only needs to have access to the predictions of the target model and does not require a labeled data set.
arXiv Detail & Related papers (2020-04-10T06:48:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.