Adversarial purification with Score-based generative models
- URL: http://arxiv.org/abs/2106.06041v1
- Date: Fri, 11 Jun 2021 04:35:36 GMT
- Title: Adversarial purification with Score-based generative models
- Authors: Jongmin Yoon, Sung Ju Hwang, Juho Lee
- Abstract summary: We propose a novel adversarial purification method based on an EBM trained with Denoising Score-Matching (DSM).
We introduce a simple yet effective randomized purification scheme that injects random noise into images before purification.
We show that our purification method is robust against various attacks and demonstrate its state-of-the-art performance.
- Score: 56.88185136509654
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While adversarial training is considered a standard defense method against
adversarial attacks for image classifiers, adversarial purification, which
purifies attacked images into clean images with a standalone purification
model, has shown promise as an alternative defense method. Recently, an
Energy-Based Model (EBM) trained with Markov Chain Monte Carlo (MCMC) has been
highlighted as a purification model, where an attacked image is purified by
running a long Markov chain using the gradients of the EBM. Yet, the
practicality of adversarial purification using an EBM remains questionable
because the number of MCMC steps required for such purification is too large.
In this paper, we propose a novel adversarial purification method based on an
EBM trained with Denoising Score-Matching (DSM). We show that an EBM trained
with DSM can quickly purify attacked images within a few steps. We further
introduce a simple yet effective randomized purification scheme that injects
random noise into images before purification. This process screens out the
adversarial perturbations imposed on images with the injected random noise and
brings the images into the regime where the EBM can denoise well. We show that
our purification method is robust against various attacks and demonstrate its
state-of-the-art performance.
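The procedure the abstract describes is simple enough to sketch. Below is a minimal, illustrative PyTorch sketch of both pieces: a Denoising Score-Matching training loss and the randomized purification loop (noise injection followed by a few score-driven denoising steps). The network `score_net`, the noise scale `sigma`, the step size `alpha`, and the step count `n_steps` are assumptions for illustration, not the paper's exact parameterization or hyperparameters.

```python
import torch

def dsm_loss(score_net, x, sigma=0.25):
    # Denoising Score-Matching: perturb x with Gaussian noise and train
    # score_net to match the score of the perturbation kernel,
    # grad log q_sigma(x_noisy | x) = (x - x_noisy) / sigma**2 = -noise / sigma.
    noise = torch.randn_like(x)
    x_noisy = x + sigma * noise
    target = -noise / sigma
    return ((score_net(x_noisy) - target) ** 2).mean()

def randomized_purify(x, score_net, sigma=0.25, alpha=0.05, n_steps=10):
    # 1) Randomized noise injection: the added noise screens the adversarial
    #    perturbation and moves the image into the regime the DSM-trained
    #    model can denoise well.
    x = x + sigma * torch.randn_like(x)
    # 2) A few denoising steps that follow the learned score (an estimate of
    #    grad_x log p(x)); the paper's point is that a DSM-trained model
    #    needs only a few such steps, unlike long-run MCMC.
    with torch.no_grad():
        for _ in range(n_steps):
            x = x + alpha * score_net(x)
    return x.clamp(0.0, 1.0)
```

A defended classifier would then be run on the purified images, e.g. `logits = classifier(randomized_purify(x_adv, score_net))`; since the noise injection is random, predictions can also be averaged over several independent purification runs.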
Related papers
- Instant Adversarial Purification with Adversarial Consistency Distillation [1.224954637705144]
We propose One Step Control Purification (OSCP), a diffusion-based purification model that can purify an adversarial image in one Neural Function Evaluation (NFE) of a diffusion model.
We achieve a defense success rate of 74.19% on ImageNet, requiring only 0.1 s per purification.
arXiv Detail & Related papers (2024-08-30T07:49:35Z) - Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information [75.36597470578724]
Adversarial purification is one of the promising approaches to defend neural networks against adversarial attacks.
We propose the gUided Purification (COUP) algorithm, which purifies the input while keeping it away from the classifier's decision boundary.
Experimental results show that COUP can achieve better adversarial robustness under strong attack methods.
arXiv Detail & Related papers (2024-08-12T02:48:00Z) - Adversarial Purification of Information Masking [8.253834429336656]
Adversarial attacks generate minuscule, imperceptible perturbations to images to deceive neural networks.
Counteracting these, adversarial purification methods seek to transform adversarial input samples into clean output images to defend against adversarial attacks.
We propose a novel adversarial purification approach named Information Mask Purification (IMPure) to extensively eliminate adversarial perturbations.
arXiv Detail & Related papers (2023-11-26T15:50:19Z) - Carefully Blending Adversarial Training and Purification Improves Adversarial Robustness [1.2289361708127877]
CARSO is able to defend itself against adaptive end-to-end white-box attacks devised for such defences.
Our method improves the state of the art by a significant margin on CIFAR-10, CIFAR-100, and TinyImageNet-200.
arXiv Detail & Related papers (2023-05-25T09:04:31Z) - Guided Diffusion Model for Adversarial Purification [103.4596751105955]
Adversarial attacks disrupt deep neural networks (DNNs) across a variety of algorithms and frameworks.
We propose a novel purification approach, referred to as the guided diffusion model for purification (GDMP).
In comprehensive experiments across various datasets, the proposed GDMP is shown to reduce the perturbations introduced by adversarial attacks to a shallow range.
arXiv Detail & Related papers (2022-05-30T10:11:15Z) - Diffusion Models for Adversarial Purification [69.1882221038846]
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.
We propose DiffPure, which uses diffusion models for adversarial purification.
Our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods.
arXiv Detail & Related papers (2022-05-16T06:03:00Z) - Deblurring via Stochastic Refinement [85.42730934561101]
We present an alternative framework for blind deblurring based on conditional diffusion models.
Our method is competitive in terms of distortion metrics such as PSNR.
arXiv Detail & Related papers (2021-12-05T04:36:09Z) - Stochastic Security: Adversarial Defense Using Long-Run Dynamics of
Energy-Based Models [82.03536496686763]
The vulnerability of deep networks to adversarial attacks is a central problem for deep learning from the perspective of both cognition and security.
We focus on defending naturally-trained classifiers using Markov Chain Monte Carlo (MCMC) sampling with an Energy-Based Model (EBM) for adversarial purification; a minimal sketch of such Langevin-based purification appears after this list.
Our contributions are 1) an improved method for training EBMs with realistic long-run MCMC samples, 2) an Expectation-Over-Transformation (EOT) defense that resolves theoretical ambiguities for stochastic defenses, and 3) a state-of-the-art adversarial defense for naturally-trained classifiers together with a competitive defense relative to adversarially-trained ones.
arXiv Detail & Related papers (2020-05-27T17:53:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.