Two Souls in an Adversarial Image: Towards Universal Adversarial Example
Detection using Multi-view Inconsistency
- URL: http://arxiv.org/abs/2109.12459v1
- Date: Sat, 25 Sep 2021 23:47:13 GMT
- Title: Two Souls in an Adversarial Image: Towards Universal Adversarial Example
Detection using Multi-view Inconsistency
- Authors: Sohaib Kiani, Sana Awan, Chao Lan, Fengjun Li, Bo Luo
- Abstract summary: In evasion attacks against deep neural networks (DNNs), the attacker generates adversarial instances that are visually indistinguishable from benign samples.
We propose a novel multi-view adversarial image detector, namely Argos, based on a novel observation.
Argos significantly outperforms two representative adversarial detectors in both detection accuracy and robustness.
- Score: 10.08837640910022
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In evasion attacks against deep neural networks (DNNs), the attacker
generates adversarial instances that are visually indistinguishable from benign
samples and sends them to the target DNN to trigger misclassifications. In this
paper, we propose a novel multi-view adversarial image detector, namely Argos,
based on a novel observation. That is, there exist two "souls" in an
adversarial instance, i.e., the visually unchanged content, which corresponds
to the true label, and the added invisible perturbation, which corresponds to
the misclassified label. Such inconsistencies could be further amplified
through an autoregressive generative approach that generates images with seed
pixels selected from the original image, a selected label, and pixel
distributions learned from the training data. The generated images (i.e., the
"views") will deviate significantly from the original one if the label is
adversarial, demonstrating inconsistencies that Argos expects to detect. To
this end, Argos first amplifies the discrepancies between the visual content of
an image and its misclassified label induced by the attack using a set of
regeneration mechanisms and then identifies an image as adversarial if the
reproduced views deviate to a preset degree. Our experimental results show that
Argos significantly outperforms two representative adversarial detectors in
both detection accuracy and robustness against six well-known adversarial
attacks. Code is available at:
https://github.com/sohaib730/Argos-Adversarial_Detection
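The detection pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the regeneration step below is a hypothetical stand-in (label-conditioned noise) for Argos's autoregressive generative model, and the deviation metric and threshold are illustrative choices.

```python
import numpy as np

def regenerate_view(image, label, rng):
    # Hypothetical stand-in for Argos's regeneration mechanism: a real
    # implementation would resample pixels with an autoregressive model
    # conditioned on `label` and on seed pixels taken from `image`.
    # Toy behavior: an ill-fitting (adversarial) label drifts further.
    noise_scale = 0.01 if label == 0 else 0.5
    return np.clip(image + rng.normal(0.0, noise_scale, image.shape), 0.0, 1.0)

def detect_adversarial(image, predicted_label, n_views=8, threshold=0.1, seed=0):
    """Flag the image as adversarial if regenerated views deviate too much
    from the original (the multi-view inconsistency test)."""
    rng = np.random.default_rng(seed)
    views = [regenerate_view(image, predicted_label, rng) for _ in range(n_views)]
    # Mean per-view RMS pixel deviation from the original image.
    deviation = np.mean([np.sqrt(np.mean((v - image) ** 2)) for v in views])
    return deviation > threshold, deviation
```

Under a benign label the views stay close to the original and fall under the threshold; under a mismatched label they drift and trip the detector.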
Related papers
- Transparency Attacks: How Imperceptible Image Layers Can Fool AI
Perception [0.0]
This paper investigates a novel algorithmic vulnerability when imperceptible image layers confound vision models into arbitrary label assignments and captions.
We explore image preprocessing methods to introduce stealth transparency, which triggers AI misinterpretation of what the human eye perceives.
The stealth transparency confounds established vision systems, with implications for facial recognition and surveillance, digital watermarking, content filtering, dataset curation, automotive and drone autonomy, forensic evidence, and retail product classification.
arXiv Detail & Related papers (2024-01-29T00:52:01Z)
- I See Dead People: Gray-Box Adversarial Attack on Image-To-Text Models [0.0]
We present a gray-box adversarial attack on image-to-text models, both untargeted and targeted.
Our attack operates in a gray-box manner, requiring no knowledge about the decoder module.
We also show that our attacks fool the popular open-source platform Hugging Face.
arXiv Detail & Related papers (2023-06-13T07:35:28Z)
- Human-imperceptible, Machine-recognizable Images [76.01951148048603]
A major conflict is exposed for software engineers between developing better AI systems and keeping their distance from sensitive training data.
This paper proposes an efficient privacy-preserving learning paradigm, where images are encrypted to become "human-imperceptible, machine-recognizable".
We show that the proposed paradigm can ensure the encrypted images have become human-imperceptible while preserving machine-recognizable information.
arXiv Detail & Related papers (2023-06-06T13:41:37Z)
- Black-Box Attack against GAN-Generated Image Detector with Contrastive
Perturbation [0.4297070083645049]
We propose a new black-box attack method against GAN-generated image detectors.
A novel contrastive learning strategy is adopted to train the encoder-decoder network based anti-forensic model.
The proposed attack effectively reduces the accuracy of three state-of-the-art detectors on six popular GANs.
arXiv Detail & Related papers (2022-11-07T12:56:14Z)
- Context-Aware Transfer Attacks for Object Detection [51.65308857232767]
We present a new approach to generate context-aware attacks for object detectors.
We show that by using co-occurrence of objects and their relative locations and sizes as context information, we can successfully generate targeted mis-categorization attacks.
arXiv Detail & Related papers (2021-12-06T18:26:39Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A symmetric saliency-based auto-encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Benford's law: what does it say on adversarial images? [0.0]
We investigate statistical differences between natural images and adversarial ones.
We show that, after a proper image transformation and for a class of adversarial attacks, the distribution of the leading digit of the pixels in adversarial images deviates from Benford's law.
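The leading-digit check can be sketched as follows; the distance measure (total variation) is an illustrative choice, and the image transformation the paper applies before extracting digits is not specified here.

```python
import numpy as np

# Benford's law: P(leading digit = d) = log10(1 + 1/d), for d = 1..9.
BENFORD = np.log10(1.0 + 1.0 / np.arange(1, 10))

def leading_digits(values):
    """First significant digit of each strictly positive value."""
    values = np.asarray(values, dtype=float)
    values = values[values > 0]
    exponents = np.floor(np.log10(values))
    return (values / 10.0 ** exponents).astype(int)  # digits in 1..9

def benford_distance(values):
    """Total-variation distance between the empirical leading-digit
    distribution of `values` and Benford's law."""
    counts = np.bincount(leading_digits(values), minlength=10)[1:10].astype(float)
    empirical = counts / counts.sum()
    return 0.5 * np.abs(empirical - BENFORD).sum()
```

A large distance for a (transformed) image would mark it as a candidate adversarial example under this class of attacks.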
arXiv Detail & Related papers (2021-02-09T02:50:29Z)
- Detecting Adversarial Examples by Input Transformations, Defense
Perturbations, and Voting [71.57324258813674]
Convolutional neural networks (CNNs) have been shown to reach super-human performance in visual recognition tasks.
CNNs can easily be fooled by adversarial examples, i.e., maliciously-crafted images that force the networks to predict an incorrect output.
This paper extensively explores the detection of adversarial examples via image transformations and proposes a novel methodology.
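The transformation-and-voting idea can be sketched generically as follows; the specific transformations, agreement threshold, and function names here are illustrative, not the paper's configuration.

```python
import numpy as np

def detect_by_voting(image, classify, transformations, min_agreement=0.75):
    """Flag an input as adversarial when predictions on transformed copies
    disagree with the prediction on the raw input.

    `classify` maps an image to a label; `transformations` is a list of
    image -> image functions (e.g. blur, crop-and-resize, bit-depth
    reduction). Benign inputs tend to keep their label under such
    transformations, while adversarial perturbations are brittle.
    """
    reference = classify(image)
    votes = [classify(t(image)) for t in transformations]
    agreement = np.mean([v == reference for v in votes])
    return agreement < min_agreement, reference
```

The intuition: adversarial perturbations are fragile, so mild transformations flip the predicted label and break the vote, while natural images classify consistently.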
arXiv Detail & Related papers (2021-01-27T14:50:41Z)
- Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples which are synthesized by adding quasi-perceptible noises on real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises.
arXiv Detail & Related papers (2020-09-17T07:11:16Z)
- Breaking certified defenses: Semantic adversarial examples with spoofed
robustness certificates [57.52763961195292]
We present a new attack that exploits not only the labelling function of a classifier, but also the certificate generator.
The proposed method applies large perturbations that place images far from a class boundary while maintaining the imperceptibility property of adversarial examples.
arXiv Detail & Related papers (2020-03-19T17:59:44Z)
- Generating Semantic Adversarial Examples via Feature Manipulation [23.48763375455514]
We propose a more practical adversarial attack by designing structured perturbation with semantic meanings.
Our proposed technique manipulates the semantic attributes of images via the disentangled latent codes.
We demonstrate the existence of a universal, image-agnostic semantic adversarial example.
arXiv Detail & Related papers (2020-01-06T06:28:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.