Robust Evaluation of Diffusion-Based Adversarial Purification
- URL: http://arxiv.org/abs/2303.09051v3
- Date: Sun, 3 Dec 2023 19:26:11 GMT
- Title: Robust Evaluation of Diffusion-Based Adversarial Purification
- Authors: Minjong Lee, Dongwoo Kim
- Abstract summary: Diffusion-based purification methods aim to remove adversarial effects from an input data point at test time.
White-box attacks are often employed to measure the robustness of the purification.
We propose a new purification strategy improving robustness compared to the current diffusion-based purification methods.
- Score: 3.634387981995277
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We question the current evaluation practice on diffusion-based purification
methods. Diffusion-based purification methods aim to remove adversarial effects
from an input data point at test time. The approach has gained increasing
attention as an alternative to adversarial training due to the disentanglement
between training and testing. Well-known white-box attacks are often employed to
measure the robustness of the purification. However, it is unknown whether
these attacks are the most effective for the diffusion-based purification since
the attacks are often tailored for adversarial training. We analyze the current
practices and provide a new guideline for measuring the robustness of
purification methods against adversarial attacks. Based on our analysis, we
further propose a new purification strategy improving robustness compared to
the current diffusion-based purification methods.
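To make the evaluation concern concrete, the snippet below is a minimal sketch (not the paper's code) of a white-box PGD attack computed through the full purify-then-classify pipeline, averaging gradients over the purifier's randomness (expectation over transformation, EOT). The names purify and classifier are hypothetical stand-ins for a diffusion-based purifier and a pretrained classifier, and the sketch assumes gradients can be propagated through the purifier, either exactly or via a surrogate.

```python
# Minimal sketch (not the paper's code): PGD with expectation over the
# purifier's randomness (EOT), attacking the full purify-then-classify
# pipeline. `purify` and `classifier` are hypothetical stand-ins; the sketch
# assumes gradients can flow through `purify` (exactly or via a surrogate).
import torch
import torch.nn.functional as F

def eot_pgd(x, y, purify, classifier, eps=8 / 255, step=2 / 255,
            n_steps=10, eot_samples=4):
    x_adv = x.clone().detach()
    for _ in range(n_steps):
        x_adv.requires_grad_(True)
        loss = 0.0
        for _ in range(eot_samples):
            # Each call re-samples the purifier's noise, so averaging the
            # loss approximates the expected gradient (EOT).
            loss = loss + F.cross_entropy(classifier(purify(x_adv)), y)
        grad, = torch.autograd.grad(loss / eot_samples, x_adv)
        with torch.no_grad():
            x_adv = x_adv + step * grad.sign()
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project onto L_inf ball
            x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()
```

Robust accuracy under this kind of evaluation is the classifier's accuracy on the purified adversarial inputs, averaged over several runs of the stochastic purifier.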
Related papers
- Test-time Adversarial Defense with Opposite Adversarial Path and High Attack Time Cost [5.197034517903854]
We investigate a new test-time adversarial defense method via diffusion-based recovery along opposite adversarial paths (OAPs).
We present a purifier that can be plugged into a pre-trained model to resist adversarial attacks.
arXiv Detail & Related papers (2024-10-22T08:32:17Z) - Unlearnable Examples Detection via Iterative Filtering [84.59070204221366]
Deep neural networks are proven to be vulnerable to data poisoning attacks.
It is quite beneficial and challenging to detect poisoned samples from a mixed dataset.
We propose an Iterative Filtering approach for UEs identification.
arXiv Detail & Related papers (2024-08-15T13:26:13Z) - Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information [75.36597470578724]
Adversarial purification is one of the promising approaches to defend neural networks against adversarial attacks.
We propose the gUided Purification (COUP) algorithm, which purifies inputs while keeping them away from the classifier's decision boundary.
Experimental results show that COUP can achieve better adversarial robustness under strong attack methods.
arXiv Detail & Related papers (2024-08-12T02:48:00Z) - Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders [101.42201747763178]
Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled.
Our work provides a novel disentanglement mechanism to build an efficient pre-training purification method.
arXiv Detail & Related papers (2024-05-02T16:49:25Z) - Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective [65.10019978876863]
Diffusion-Based Purification (DBP) has emerged as an effective defense mechanism against adversarial attacks.
In this paper, we argue that the inherent stochasticity in the DBP process is the primary driver of its robustness.
arXiv Detail & Related papers (2024-04-22T16:10:38Z) - Scalable Ensemble-based Detection Method against Adversarial Attacks for
speaker verification [73.30974350776636]
This paper comprehensively compares mainstream purification techniques in a unified framework.
We propose an easy-to-follow ensemble approach that integrates advanced purification modules for detection.
arXiv Detail & Related papers (2023-12-14T03:04:05Z) - Purify++: Improving Diffusion-Purification with Advanced Diffusion
Models and Control of Randomness [22.87882885963586]
Defense against adversarial attacks is important for AI safety.
Adversarial purification is a family of approaches that defend adversarial attacks with suitable pre-processing.
We propose Purify++, a new diffusion purification algorithm that achieves state-of-the-art purification performance against several adversarial attacks.
arXiv Detail & Related papers (2023-10-28T17:18:38Z) - Language Guided Adversarial Purification [3.9931474959554496]
Adversarial purification using generative models demonstrates strong adversarial defense performance.
We introduce a new framework, Language Guided Adversarial Purification (LGAP), which utilizes pre-trained diffusion models and caption generators.
arXiv Detail & Related papers (2023-09-19T06:17:18Z) - Unsupervised Adversarial Detection without Extra Model: Training Loss
Should Change [24.76524262635603]
Traditional approaches to adversarial training and supervised detection rely on prior knowledge of attack types and access to labeled training data.
We propose new training losses to reduce useless features and the corresponding detection method without prior knowledge of adversarial attacks.
The proposed method performs well across all tested attack types, and its false positive rates are even lower than those of methods specialized for particular attack types.
arXiv Detail & Related papers (2023-08-07T01:41:21Z) - Diffusion Models for Adversarial Purification [69.1882221038846]
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.
We propose DiffPure that uses diffusion models for adversarial purification.
Our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods.
arXiv Detail & Related papers (2022-05-16T06:03:00Z)