Purify++: Improving Diffusion-Purification with Advanced Diffusion
Models and Control of Randomness
- URL: http://arxiv.org/abs/2310.18762v1
- Date: Sat, 28 Oct 2023 17:18:38 GMT
- Title: Purify++: Improving Diffusion-Purification with Advanced Diffusion
Models and Control of Randomness
- Authors: Boya Zhang, Weijian Luo, Zhihua Zhang
- Abstract summary: Defense against adversarial attacks is important for AI safety.
Adversarial purification is a family of approaches that defend adversarial attacks with suitable pre-processing.
We propose Purify++, a new diffusion purification algorithm that is now the state-of-the-art purification method against several adversarial attacks.
- Score: 22.87882885963586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks can mislead neural network classifiers. The defense
against adversarial attacks is important for AI safety. Adversarial
purification is a family of approaches that defend against adversarial attacks with
suitable pre-processing. Diffusion models have been shown to be effective for
adversarial purification. Despite their success, many aspects of diffusion
purification still remain unexplored. In this paper, we investigate and improve
upon three limiting designs of diffusion purification: the use of an improved
diffusion model, advanced numerical simulation techniques, and optimal control
of randomness. Based on our findings, we propose Purify++, a new diffusion
purification algorithm that is now the state-of-the-art purification method
against several adversarial attacks. Our work presents a systematic exploration
of the limits of diffusion purification methods.
Related papers
- LoRID: Low-Rank Iterative Diffusion for Adversarial Purification
This work presents an information-theoretic examination of diffusion-based purification methods.
We introduce LoRID, a novel Low-Rank Iterative Diffusion purification method designed to remove adversarial perturbation with low intrinsic purification errors.
LoRID achieves superior robustness performance in CIFAR-10/100, CelebA-HQ, and ImageNet datasets under both white-box and black-box settings.
arXiv Detail & Related papers (2024-09-12T17:51:25Z)
- Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information
Adversarial purification is one of the promising approaches to defend neural networks against adversarial attacks.
We propose the classifier-cOnfidence gUided Purification (COUP) algorithm, which purifies while keeping away from the classifier decision boundary.
Experimental results show that COUP can achieve better adversarial robustness under strong attack methods.
arXiv Detail & Related papers (2024-08-12T02:48:00Z)
- Diffusion-based Adversarial Purification for Intrusion Detection
Carefully crafted perturbations mislead ML models, enabling attackers to evade detection or trigger false alerts.
Adversarial purification has emerged as a compelling solution, particularly with diffusion models showing promising results.
This paper demonstrates the effectiveness of diffusion models in purifying adversarial examples in network intrusion detection.
arXiv Detail & Related papers (2024-06-25T14:48:28Z)
- Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models
This work investigates the vulnerabilities of security-enhancing diffusion models.
We demonstrate that these models are highly susceptible to DIFF2, a simple yet effective backdoor attack.
Case studies show that DIFF2 can significantly reduce both post-purification and certified accuracy across benchmark datasets and models.
arXiv Detail & Related papers (2024-06-14T02:39:43Z)
- Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective
Diffusion-Based Purification (DBP) has emerged as an effective defense mechanism against adversarial attacks.
In this paper, we argue that the inherent stochasticity in the DBP process is the primary driver of its robustness.
arXiv Detail & Related papers (2024-04-22T16:10:38Z)
- DiffAttack: Evasion Attacks Against Diffusion-Based Adversarial Purification
Diffusion-based purification defenses leverage diffusion models to remove crafted perturbations of adversarial examples.
Recent studies show that even advanced attacks cannot break such defenses effectively.
We propose a unified framework DiffAttack to perform effective and efficient attacks against diffusion-based purification defenses.
arXiv Detail & Related papers (2023-10-27T15:17:50Z)
- Robust Evaluation of Diffusion-Based Adversarial Purification
Diffusion-based purification methods aim to remove adversarial effects from an input data point at test time.
White-box attacks are often employed to measure the robustness of the purification.
We propose a new purification strategy improving robustness compared to the current diffusion-based purification methods.
arXiv Detail & Related papers (2023-03-16T02:47:59Z)
- How to Backdoor Diffusion Models?
This paper presents the first study on the robustness of diffusion models against backdoor attacks.
We propose BadDiffusion, a novel attack framework that engineers compromised diffusion processes during model training for backdoor implantation.
Our results call attention to potential risks and possible misuse of diffusion models.
arXiv Detail & Related papers (2022-12-11T03:44:38Z)
- Diffusion Models for Adversarial Purification
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.
We propose DiffPure that uses diffusion models for adversarial purification.
Our method achieves the state-of-the-art results, outperforming current adversarial training and adversarial purification methods.
arXiv Detail & Related papers (2022-05-16T06:03:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.