Adversarial Purification of Information Masking
- URL: http://arxiv.org/abs/2311.15339v1
- Date: Sun, 26 Nov 2023 15:50:19 GMT
- Title: Adversarial Purification of Information Masking
- Authors: Sitong Liu, Zhichao Lian, Shuangquan Zhang, Liang Xiao
- Abstract summary: Adversarial attacks generate minuscule, imperceptible perturbations to images to deceive neural networks.
Counteracting these, adversarial purification methods seek to transform adversarial input samples into clean output images to defend against adversarial attacks.
We propose a novel adversarial purification approach named Information Mask Purification (IMPure) to extensively eliminate adversarial perturbations.
- Score: 8.253834429336656
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial attacks meticulously generate minuscule, imperceptible
perturbations to images to deceive neural networks. Counteracting these,
adversarial purification methods seek to transform adversarial input samples
into clean output images to defend against adversarial attacks. Nonetheless,
existing generative models fail to effectively eliminate adversarial
perturbations, yielding less-than-ideal purification results. We emphasize the
potential threat of residual adversarial perturbations to target models,
quantitatively establishing a relationship between perturbation scale and
attack capability. Notably, the residual perturbations on the purified image
primarily stem from the same-position patch and similar patches of the
adversarial sample. We propose a novel adversarial purification approach named
Information Mask Purification (IMPure), which aims to extensively eliminate
adversarial perturbations. Given an adversarial sample, we first mask part of
the information in each patch and then reconstruct the patches so that they
resist the adversarial perturbations within them. We reconstruct all patches in parallel to
obtain a cohesive image. Then, in order to protect the purified samples against
potential similar regional perturbations, we simulate this risk by randomly
mixing the purified samples with the input samples before inputting them into
the feature extraction network. Finally, we establish a combined constraint of
pixel loss and perceptual loss to augment the model's reconstruction
adaptability. Extensive experiments on the ImageNet dataset with three
classifier models demonstrate that our approach achieves state-of-the-art
results against nine adversarial attack methods. Implementation code and
pre-trained weights can be accessed at
https://github.com/NoWindButRain/IMPure.
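For illustration, the pipeline described in the abstract (mask part of each patch, reconstruct all patches in parallel, randomly mix the purified and input samples before feature extraction, and train with a combined pixel and perceptual loss) could look roughly like the PyTorch sketch below. The PatchMaskPurifier module, its layer sizes, the masking ratio, the mixing probability, and the loss weights are illustrative assumptions, not the released IMPure implementation; see the linked repository for the authors' code.

```python
# Minimal sketch of an IMPure-style purification step, assuming images in
# [0, 1] with H and W divisible by the patch size, a simple convolutional
# reconstructor, and a frozen feature extractor. The actual IMPure
# architecture, masking ratio, and mixing schedule are defined in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PatchMaskPurifier(nn.Module):
    def __init__(self, patch_size: int = 16, mask_ratio: float = 0.5):
        super().__init__()
        self.patch_size = patch_size
        self.mask_ratio = mask_ratio
        # Placeholder reconstructor; IMPure's generator is more elaborate.
        self.reconstructor = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def mask_patches(self, x: torch.Tensor) -> torch.Tensor:
        """Randomly zero out a fraction of the pixels inside every patch."""
        b, c, h, w = x.shape
        p = self.patch_size
        # Independent keep/drop decisions for each pixel of each patch.
        keep = (torch.rand(b, 1, h // p, w // p, p * p, device=x.device)
                > self.mask_ratio).float()
        keep = keep.view(b, 1, h // p, w // p, p, p)
        keep = keep.permute(0, 1, 2, 4, 3, 5).reshape(b, 1, h, w)
        return x * keep

    def forward(self, x_adv: torch.Tensor, mix_prob: float = 0.5) -> torch.Tensor:
        # 1) Mask part of the information in every patch.
        masked = self.mask_patches(x_adv)
        # 2) Reconstruct all patches in parallel (fully convolutional here).
        purified = torch.sigmoid(self.reconstructor(masked))
        # 3) During training, randomly mix purified and input samples to
        #    simulate residual perturbations from similar regions before the
        #    feature extraction network sees them.
        if self.training and torch.rand(()) < mix_prob:
            lam = torch.rand(x_adv.size(0), 1, 1, 1, device=x_adv.device)
            purified = lam * purified + (1.0 - lam) * x_adv
        return purified


def purification_loss(purified, clean, feature_extractor,
                      pixel_weight=1.0, perceptual_weight=0.1):
    """Combined pixel loss and perceptual (feature-space) loss."""
    pixel = F.l1_loss(purified, clean)
    with torch.no_grad():
        feat_clean = feature_extractor(clean)
    perceptual = F.mse_loss(feature_extractor(purified), feat_clean)
    return pixel_weight * pixel + perceptual_weight * perceptual
```

In this sketch the parallel patch reconstruction is approximated by a single fully convolutional forward pass over the masked image, and the pixel/perceptual loss weights are arbitrary; the paper's generator, mixing strategy, and loss balancing differ.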
Related papers
- Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information [75.36597470578724]
Adversarial purification is one of the promising approaches to defend neural networks against adversarial attacks.
We propose the gUided Purification (COUP) algorithm, which purifies while keeping away from the classifier decision boundary.
Experimental results show that COUP can achieve better adversarial robustness under strong attack methods.
arXiv Detail & Related papers (2024-08-12T02:48:00Z) - Imperceptible Face Forgery Attack via Adversarial Semantic Mask [59.23247545399068]
We propose an Adversarial Semantic Mask Attack framework (ASMA) which can generate adversarial examples with good transferability and invisibility.
Specifically, we propose a novel adversarial semantic mask generative model, which can constrain generated perturbations in local semantic regions for good stealthiness.
arXiv Detail & Related papers (2024-06-16T10:38:11Z) - Breaking Free: How to Hack Safety Guardrails in Black-Box Diffusion Models! [52.0855711767075]
EvoSeed is an evolutionary strategy-based algorithmic framework for generating photo-realistic natural adversarial samples.
We employ CMA-ES to optimize the search for an initial seed vector, which, when processed by the Conditional Diffusion Model, results in the natural adversarial sample misclassified by the Model.
Experiments show that generated adversarial images are of high image quality, raising concerns about generating harmful content bypassing safety classifiers.
arXiv Detail & Related papers (2024-02-07T09:39:29Z) - LFAA: Crafting Transferable Targeted Adversarial Examples with Low-Frequency Perturbations [25.929492841042666]
We present a novel approach to generate transferable targeted adversarial examples.
We exploit the vulnerability of deep neural networks to perturbations on high-frequency components of images.
Our proposed approach significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-10-31T04:54:55Z) - IRAD: Implicit Representation-driven Image Resampling against Adversarial Attacks [16.577595936609665]
We introduce a novel approach to counter adversarial attacks, namely, image resampling.
Image resampling transforms a discrete image into a new one, simulating the process of scene recapturing or rerendering as specified by a geometrical transformation.
We show that our method significantly enhances the adversarial robustness of diverse deep models against various attacks while maintaining high accuracy on clean images.
arXiv Detail & Related papers (2023-10-18T11:19:32Z) - Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations [54.1807206010136]
Transferable adversarial attacks optimize adversaries from a pretrained surrogate model and known label space to fool the unknown black-box models.
We propose Adversarial Pixel Restoration as a self-supervised alternative to train an effective surrogate model from scratch.
Our training approach is based on a min-max objective which reduces overfitting via an adversarial objective.
arXiv Detail & Related papers (2022-07-18T17:59:58Z) - Guided Diffusion Model for Adversarial Purification [103.4596751105955]
Adversarial attacks disturb deep neural networks (DNNs) in various algorithms and frameworks.
We propose a novel purification approach, referred to as guided diffusion model for purification (GDMP)
On our comprehensive experiments across various datasets, the proposed GDMP is shown to reduce the perturbations raised by adversarial attacks to a shallow range.
arXiv Detail & Related papers (2022-05-30T10:11:15Z) - Diffusion Models for Adversarial Purification [69.1882221038846]
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.
We propose DiffPure that uses diffusion models for adversarial purification.
Our method achieves the state-of-the-art results, outperforming current adversarial training and adversarial purification methods.
arXiv Detail & Related papers (2022-05-16T06:03:00Z) - Robust Face Verification via Disentangled Representations [20.393894616979402]
We introduce a robust algorithm for face verification, deciding whether two images are of the same person or not.
We use the generative model during training as an online augmentation method instead of a test-time purifier that removes adversarial noise.
We experimentally show that, when coupled with adversarial training, the proposed scheme converges with a weak inner solver and has higher clean and robust accuracy than state-of-the-art methods when evaluated against white-box physical attacks.
arXiv Detail & Related papers (2020-06-05T19:17:02Z) - Detecting Patch Adversarial Attacks with Image Residuals [9.169947558498535]
A discriminator is trained to distinguish between clean and adversarial samples.
We show that the obtained residuals act as a digital fingerprint for adversarial attacks.
Results show that the proposed detection method generalizes to previously unseen, stronger attacks.
arXiv Detail & Related papers (2020-02-28T01:28:22Z)