GAP++: Learning to generate target-conditioned adversarial examples
- URL: http://arxiv.org/abs/2006.05097v1
- Date: Tue, 9 Jun 2020 07:49:49 GMT
- Title: GAP++: Learning to generate target-conditioned adversarial examples
- Authors: Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Yuan He, Hui Xue
- Abstract summary: Adversarial examples are perturbed inputs that pose a serious threat to machine learning models.
We propose a more general-purpose framework which infers target-conditioned perturbations dependent on both input image and target label.
Our method outperforms single-target attack models and obtains high fooling rates with small perturbation norms.
- Score: 28.894143619182426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial examples are perturbed inputs that pose a serious threat
to machine learning models. Finding these perturbations is so difficult that
iterative search methods are typically required. For computational efficiency,
recent works use adversarial generative networks to directly model the
distribution of universal or image-dependent perturbations. However, these
methods generate perturbations that depend only on the input image. In this
work, we propose a more general-purpose framework which infers
target-conditioned perturbations dependent on both the input image and the
target label. Unlike previous single-target attack models, our model can
conduct target-conditioned attacks by learning the relation between the attack
target and the semantics of the image. Through extensive experiments on the
MNIST and CIFAR10 datasets, we show that our method outperforms single-target
attack models and obtains high fooling rates with small perturbation norms.
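The core idea in the abstract, conditioning the perturbation generator on the target label as well as the input image, can be illustrated with a minimal NumPy sketch. The function name, shapes, and the random untrained "generator" weights below are purely illustrative stand-ins for the learned network described in the paper; only the conditioning scheme (tiling a one-hot target code as extra input channels) and the small-norm projection reflect the abstract.

```python
import numpy as np

def target_conditioned_perturbation(image, target, num_classes=10,
                                    epsilon=0.1, rng=None):
    """Sketch of a target-conditioned perturbation (illustrative only).

    image:  (H, W) array with values in [0, 1]
    target: int, the attack target class
    """
    rng = np.random.default_rng(0) if rng is None else rng
    h, w = image.shape
    # Condition on the target label by tiling its one-hot code as extra
    # input channels alongside the image, so one model can serve any target.
    onehot = np.zeros(num_classes)
    onehot[target] = 1.0
    label_planes = np.broadcast_to(onehot[:, None, None], (num_classes, h, w))
    x = np.concatenate([image[None], label_planes], axis=0)  # (1+C, H, W)
    # Random untrained weights stand in for the learned generator network.
    weights = rng.standard_normal((1, 1 + num_classes))
    raw = np.tensordot(weights, x, axes=([1], [0]))[0]       # (H, W)
    # Bound the perturbation to an L-infinity ball of radius epsilon,
    # keeping the perturbation norm small.
    delta = epsilon * np.tanh(raw)
    return np.clip(image + delta, 0.0, 1.0)
```

A trained version would replace the random weights with a generator optimized so the perturbed image is classified as `target`; the point of the sketch is only that a single model can accept any (image, target) pair.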
Related papers
- Transferable Adversarial Attacks on SAM and Its Downstream Models [87.23908485521439]
This paper explores the feasibility of adversarially attacking various downstream models fine-tuned from the Segment Anything Model (SAM).
To enhance the effectiveness of the adversarial attack towards models fine-tuned on unknown datasets, we propose a universal meta-initialization (UMI) algorithm.
arXiv Detail & Related papers (2024-10-26T15:04:04Z) - Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has conventionally been believed to be a challenging property to encode into neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z) - Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z) - Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training [84.95281245784348]
Overemphasizing co-occurrence relationships can cause the model to overfit.
We provide a causal inference framework to show that the correlative features caused by the target object and its co-occurring objects can be regarded as a mediator.
arXiv Detail & Related papers (2024-04-09T13:13:24Z) - Defense Against Adversarial Attacks using Convolutional Auto-Encoders [0.0]
Adversarial attacks manipulate the input data with imperceptible perturbations, causing the model to misclassify the data or produce erroneous outputs.
This work is based on enhancing the robustness of targeted models against adversarial attacks.
arXiv Detail & Related papers (2023-12-06T14:29:16Z) - On Evaluating the Adversarial Robustness of Semantic Segmentation Models [0.0]
A number of adversarial training approaches have been proposed as a defense against adversarial perturbation.
We show for the first time that a number of models in previous work that are claimed to be robust are in fact not robust at all.
We then evaluate simple adversarial training algorithms that produce reasonably robust models even under our set of strong attacks.
arXiv Detail & Related papers (2023-06-25T11:45:08Z) - Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks [13.374754708543449]
Model inversion attacks (MIAs) aim to create synthetic images that reflect the class-wise characteristics of a target model's training data by exploiting the model's learned knowledge.
Previous research has developed generative MIAs using generative adversarial networks (GANs) as image priors tailored to a specific target model.
We present Plug & Play Attacks that loosen the dependency between the target model and image prior and enable the use of a single trained GAN to attack a broad range of targets.
arXiv Detail & Related papers (2022-01-28T15:25:50Z) - Towards A Conceptually Simple Defensive Approach for Few-shot
Classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z) - AdvHaze: Adversarial Haze Attack [19.744435173861785]
We introduce a novel adversarial attack method based on haze, which is a common phenomenon in real-world scenery.
Our method can synthesize potentially adversarial haze into an image based on the atmospheric scattering model, with high realism.
We demonstrate that the proposed method achieves a high success rate, and holds better transferability across different classification models than the baselines.
arXiv Detail & Related papers (2021-04-28T09:52:25Z) - Counterfactual Generative Networks [59.080843365828756]
We propose to decompose the image generation process into independent causal mechanisms that we train without direct supervision.
By exploiting appropriate inductive biases, these mechanisms disentangle object shape, object texture, and background.
We show that the counterfactual images improve out-of-distribution robustness with only a marginal drop in performance on the original classification task.
arXiv Detail & Related papers (2021-01-15T10:23:12Z) - Luring of transferable adversarial perturbations in the black-box
paradigm [0.0]
We present a new approach to improve the robustness of a model against black-box transfer attacks.
A removable additional neural network is included in the target model, and is designed to induce the luring effect.
Our deception-based method only needs to have access to the predictions of the target model and does not require a labeled data set.
arXiv Detail & Related papers (2020-04-10T06:48:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.