Instant Adversarial Purification with Adversarial Consistency Distillation
- URL: http://arxiv.org/abs/2408.17064v3
- Date: Fri, 21 Mar 2025 13:58:47 GMT
- Title: Instant Adversarial Purification with Adversarial Consistency Distillation
- Authors: Chun Tong Lei, Hon Ming Yam, Zhongliang Guo, Yifei Qian, Chun Pong Lau
- Abstract summary: One Step Control Purification (OSCP) is a novel defense framework that achieves robust adversarial purification in a single Neural Function Evaluation. Our experimental results on ImageNet showcase OSCP's superior performance, achieving a 74.19% defense success rate with merely 0.1s per purification.
- Score: 1.3165428727965363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks have revolutionized numerous fields with their exceptional performance, yet they remain susceptible to adversarial attacks through subtle perturbations. While diffusion-based purification methods like DiffPure offer promising defense mechanisms, their computational overhead presents a significant practical limitation. In this paper, we introduce One Step Control Purification (OSCP), a novel defense framework that achieves robust adversarial purification in a single Neural Function Evaluation (NFE) within diffusion models. We propose Gaussian Adversarial Noise Distillation (GAND) as the distillation objective and Controlled Adversarial Purification (CAP) as the inference pipeline, which together give OSCP remarkable efficiency while maintaining defense efficacy. GAND addresses a fundamental tension between consistency distillation and adversarial perturbation, bridging the gap between the natural and adversarial manifolds in the latent space, while remaining computationally efficient through Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA, avoiding the high computational cost of full-parameter fine-tuning. CAP guides the purification process with a non-learnable edge-detection operator computed from the input image as an extra prompt, effectively preventing purified images from deviating from their original appearance when large purification steps are used. Our experimental results on ImageNet showcase OSCP's superior performance, achieving a 74.19% defense success rate with merely 0.1s per purification -- a 100-fold speedup over conventional approaches.
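The single-NFE pipeline is easy to picture in code. Below is a minimal sketch, assuming a consistency-distilled one-step denoiser and a Sobel operator standing in for CAP's non-learnable edge prompt; `ConsistencyDenoiser`, `purify_one_step`, the simplified forward step, and the noise level `t_star` are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative one-NFE purification in the spirit of OSCP. The real method
# distills Stable Diffusion with GAND (via LoRA) and feeds the edge map
# through ControlNet; everything below is a simplified stand-in.
import torch
import torch.nn as nn
import torch.nn.functional as F

def sobel_edges(x: torch.Tensor) -> torch.Tensor:
    """Non-learnable edge prompt: Sobel gradient magnitude of the input."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    weight = torch.stack([kx, kx.t()]).unsqueeze(1)   # (2,1,3,3)
    gray = x.mean(dim=1, keepdim=True)                # (B,1,H,W)
    g = F.conv2d(gray, weight, padding=1)
    return g.pow(2).sum(dim=1, keepdim=True).sqrt()

class ConsistencyDenoiser(nn.Module):
    """Stand-in one-step denoiser; the real model is a distilled UNet."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(4, 3, kernel_size=3, padding=1)  # image + edge map
    def forward(self, x_noisy, edge, t):
        return self.net(torch.cat([x_noisy, edge], dim=1))

@torch.no_grad()
def purify_one_step(model, x_adv, t_star=0.4):
    """Diffuse to a fixed noise level, then denoise with a single NFE."""
    x_t = (1 - t_star) * x_adv + t_star * torch.randn_like(x_adv)
    edge = sobel_edges(x_adv)                 # CAP-style structure prompt
    t = torch.full((x_adv.shape[0],), t_star)
    return model(x_t, edge, t).clamp(0, 1)

model = ConsistencyDenoiser().eval()
x_pure = purify_one_step(model, torch.rand(2, 3, 64, 64))
print(x_pure.shape)  # torch.Size([2, 3, 64, 64])
```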
Related papers
- Divide and Conquer: Heterogeneous Noise Integration for Diffusion-based Adversarial Purification [75.09791002021947]
Existing purification methods aim to disrupt adversarial perturbations by introducing a certain amount of noise through a forward diffusion process, followed by a reverse process to recover clean examples.
This approach is fundamentally flawed as the uniform operation of the forward process compromises normal pixels while attempting to combat adversarial perturbations.
We propose a heterogeneous purification strategy grounded in the interpretability of neural networks.
Our method decisively applies higher-intensity noise to specific pixels that the target model focuses on while the remaining pixels are subjected to only low-intensity noise.
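A hedged sketch of the heterogeneous-noise step, using input-gradient saliency as a stand-in for the interpretability signal that picks out the pixels the target model focuses on; the saliency proxy, quantile threshold, and the two noise levels below are assumptions, not the paper's exact recipe.

```python
# Heterogeneous forward diffusion: salient pixels (approximated by
# input-gradient magnitude) get strong noise, the rest get weak noise.
import torch
import torch.nn as nn

def saliency_mask(classifier, x, quantile=0.8):
    """Binary mask over pixels with the largest input-gradient magnitude."""
    x = x.clone().requires_grad_(True)
    classifier(x).max(dim=1).values.sum().backward()
    sal = x.grad.abs().amax(dim=1, keepdim=True)      # (B,1,H,W)
    thresh = torch.quantile(sal.flatten(1), quantile, dim=1).view(-1, 1, 1, 1)
    return (sal >= thresh).float()

def heterogeneous_diffuse(x, mask, sigma_hi=0.5, sigma_lo=0.1):
    """Higher-intensity noise on salient pixels, low-intensity elsewhere."""
    sigma = sigma_hi * mask + sigma_lo * (1 - mask)
    return x + sigma * torch.randn_like(x)

classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(2, 3, 32, 32)
x_noisy = heterogeneous_diffuse(x, saliency_mask(classifier, x))
```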
arXiv Detail & Related papers (2025-03-03T11:00:25Z) - LoRID: Low-Rank Iterative Diffusion for Adversarial Purification [3.735798190358]
This work presents an information-theoretic examination of diffusion-based purification methods.
We introduce LoRID, a novel Low-Rank Iterative Diffusion purification method designed to remove adversarial perturbation with low intrinsic purification errors.
LoRID achieves superior robustness on the CIFAR-10/100, CelebA-HQ, and ImageNet datasets under both white-box and black-box settings.
arXiv Detail & Related papers (2024-09-12T17:51:25Z) - Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information [75.36597470578724]
Adversarial purification is one of the promising approaches to defend neural networks against adversarial attacks.
We propose the gUided Purification (COUP) algorithm, which purifies while keeping away from the classifier decision boundary.
Experimental results show that COUP can achieve better adversarial robustness under strong attack methods.
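A minimal sketch of one classifier-guided purification step in this spirit: the reverse update is nudged by the gradient of the classifier's log-confidence so samples stay away from the decision boundary. The one-step denoiser stub and the guidance scale are assumptions.

```python
# One guided reverse step: denoise, then push along the gradient of the
# classifier's confidence in its current prediction.
import torch
import torch.nn as nn

def guided_purify_step(denoiser, classifier, x_t, guidance_scale=0.5):
    x_t = x_t.detach().requires_grad_(True)
    log_probs = classifier(x_t).log_softmax(dim=1)
    confidence = log_probs.max(dim=1).values.sum()    # log p(y_hat | x_t)
    grad = torch.autograd.grad(confidence, x_t)[0]
    with torch.no_grad():
        return denoiser(x_t) + guidance_scale * grad  # away from the boundary

denoiser = nn.Conv2d(3, 3, 3, padding=1)              # stand-in reverse step
classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x_prev = guided_purify_step(denoiser, classifier, torch.rand(2, 3, 32, 32))
```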
arXiv Detail & Related papers (2024-08-12T02:48:00Z) - Consistency Purification: Effective and Efficient Diffusion Purification towards Certified Robustness [28.09748997491938]
We introduce Consistency Purification, a purifier superior to previous work in both efficiency and effectiveness.
The consistency model is a one-step generative model distilled from the PF-ODE, and thus can generate on-manifold purified images with a single network evaluation.
Our comprehensive experiments demonstrate that our Consistency Purification framework achieves state-of-the-art certified robustness and efficiency compared to baseline methods.
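The single-evaluation recipe reduces to a few lines, assuming a consistency function distilled from the PF-ODE (stubbed below with a plain convolution) and an assumed noise level.

```python
# Consistency purification: noise the input, then map it back onto the
# data manifold with one evaluation of the (stub) consistency function.
import torch
import torch.nn as nn

f = nn.Conv2d(3, 3, 3, padding=1)             # stand-in for the distilled model
x_adv = torch.rand(1, 3, 32, 32)
x_t = x_adv + 0.3 * torch.randn_like(x_adv)   # assumed noise level
with torch.no_grad():
    x_pure = f(x_t)                           # single network evaluation
```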
arXiv Detail & Related papers (2024-06-30T08:34:35Z) - Distilling Diffusion Models into Conditional GANs [90.76040478677609]
We distill a complex multistep diffusion model into a single-step conditional GAN student model.
For an efficient regression loss, we propose E-LatentLPIPS, a perceptual loss operating directly in the diffusion model's latent space.
We demonstrate that our one-step generator outperforms cutting-edge one-step diffusion distillation models.
arXiv Detail & Related papers (2024-05-09T17:59:40Z) - Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective [65.10019978876863]
Diffusion-Based Purification (DBP) has emerged as an effective defense mechanism against adversarial attacks.
In this paper, we propose that the intrinsic stochasticity of the DBP process is the primary factor driving robustness.
arXiv Detail & Related papers (2024-04-22T16:10:38Z) - MimicDiffusion: Purifying Adversarial Perturbation via Mimicking Clean Diffusion Model [8.695439655048634]
Diffusion-based adversarial purification focuses on using the diffusion model to generate a clean image against adversarial attacks.
We propose MimicDiffusion, a new diffusion-based adversarial purification technique that directly approximates the generative process of the diffusion model with the clean image as input.
Experiments on three image datasets demonstrate that MimicDiffusion performs significantly better than the state-of-the-art baselines.
arXiv Detail & Related papers (2023-12-08T02:32:47Z) - Purify++: Improving Diffusion-Purification with Advanced Diffusion Models and Control of Randomness [22.87882885963586]
Defense against adversarial attacks is important for AI safety.
Adversarial purification is a family of approaches that defend against adversarial attacks with suitable pre-processing.
We propose Purify++, a new diffusion purification algorithm that is now the state-of-the-art purification method against several adversarial attacks.
arXiv Detail & Related papers (2023-10-28T17:18:38Z) - DiffAttack: Evasion Attacks Against Diffusion-Based Adversarial Purification [63.65630243675792]
Diffusion-based purification defenses leverage diffusion models to remove crafted perturbations of adversarial examples.
Recent studies show that even advanced attacks cannot break such defenses effectively.
We propose a unified framework DiffAttack to perform effective and efficient attacks against diffusion-based purification defenses.
arXiv Detail & Related papers (2023-10-27T15:17:50Z) - Noise-Free Score Distillation [78.79226724549456]
The Noise-Free Score Distillation (NFSD) process requires minimal modifications to the original SDS framework.
We achieve more effective distillation of pre-trained text-to-image diffusion models while using a nominal CFG scale.
arXiv Detail & Related papers (2023-10-26T17:12:26Z) - Enhancing Adversarial Robustness via Score-Based Optimization [22.87882885963586]
Adversarial attacks have the potential to mislead deep neural network classifiers by introducing slight perturbations.
We introduce a novel adversarial defense scheme named ScoreOpt, which optimizes adversarial samples at test time.
Our experimental results demonstrate that our approach outperforms existing adversarial defenses in terms of both robustness performance and inference speed.
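A hedged sketch of the test-time optimization: the sample is moved uphill along a score network's estimate of grad_x log p(x), pulling it back toward the data manifold. The stub score net, step size, and step count are assumptions.

```python
# Score-based test-time optimization of an adversarial sample.
import torch
import torch.nn as nn

@torch.no_grad()
def score_opt(score_net, x, steps=10, step_size=0.05):
    for _ in range(steps):
        x = x + step_size * score_net(x)   # ascend the estimated log-density
    return x.clamp(0, 1)

score_net = nn.Conv2d(3, 3, 3, padding=1)  # stand-in for a trained score model
x_opt = score_opt(score_net, torch.rand(2, 3, 32, 32))
```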
arXiv Detail & Related papers (2023-07-10T03:59:42Z) - Guided Diffusion Model for Adversarial Purification [103.4596751105955]
Adversarial attacks disturb deep neural networks (DNNs) in various algorithms and frameworks.
We propose a novel purification approach, referred to as the guided diffusion model for purification (GDMP).
In our comprehensive experiments across various datasets, the proposed GDMP is shown to reduce the perturbations raised by adversarial attacks to a shallow range.
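A minimal sketch of distance-guided denoising in this spirit: each reverse step is steered by the gradient of an MSE distance to the input, so purification removes perturbations without drifting from the original content. The guidance form and scale are assumptions rather than the paper's exact procedure.

```python
# One guided reverse step: denoise while staying close to the input image.
import torch
import torch.nn as nn
import torch.nn.functional as F

def gdmp_step(denoiser, x_t, x_ref, scale=1.0):
    x_t = x_t.detach().requires_grad_(True)
    grad = torch.autograd.grad(F.mse_loss(x_t, x_ref), x_t)[0]
    with torch.no_grad():
        return denoiser(x_t) - scale * grad   # denoise, but track the input

denoiser = nn.Conv2d(3, 3, 3, padding=1)      # stand-in reverse-step model
x_ref = torch.rand(1, 3, 32, 32)              # (adversarial) input image
x_t = x_ref + 0.3 * torch.randn_like(x_ref)
x_prev = gdmp_step(denoiser, x_t, x_ref)
```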
arXiv Detail & Related papers (2022-05-30T10:11:15Z) - Diffusion Models for Adversarial Purification [69.1882221038846]
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.
We propose DiffPure that uses diffusion models for adversarial purification.
Our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods.
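The diffuse-then-denoise recipe is compact enough to sketch, assuming an epsilon-prediction network (stubbed below, without timestep conditioning) and a deterministic DDIM-style reverse pass; the linear schedule and the choice of t* are illustrative.

```python
# DiffPure-style purification: diffuse to an intermediate timestep so the
# adversarial perturbation is drowned out, then denoise back to t = 0.
import torch
import torch.nn as nn

@torch.no_grad()
def diffpure(eps_net, x_adv, t_star=100, n_steps=1000):
    betas = torch.linspace(1e-4, 0.02, n_steps)
    alpha_bar = torch.cumprod(1 - betas, dim=0)
    a = alpha_bar[t_star - 1]
    x = a.sqrt() * x_adv + (1 - a).sqrt() * torch.randn_like(x_adv)  # forward jump
    for t in range(t_star - 1, -1, -1):                 # deterministic reverse
        a_t = alpha_bar[t]
        a_prev = alpha_bar[t - 1] if t > 0 else torch.tensor(1.0)
        eps = eps_net(x)
        x0 = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()  # predicted clean image
        x = a_prev.sqrt() * x0 + (1 - a_prev).sqrt() * eps
    return x.clamp(0, 1)

eps_net = nn.Conv2d(3, 3, 3, padding=1)  # stand-in for a trained UNet
x_pure = diffpure(eps_net, torch.rand(1, 3, 32, 32))
```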
arXiv Detail & Related papers (2022-05-16T06:03:00Z) - Adversarial purification with Score-based generative models [56.88185136509654]
We propose a novel adversarial purification method based on an EBM trained with Denoising Score-Matching (DSM).
We introduce a simple yet effective randomized purification scheme that injects random noises into images before purification.
We show that our purification method is robust against various attacks and demonstrate its state-of-the-art performance.
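A hedged sketch of the randomized scheme: inject random noise first, then take a few Langevin steps toward higher estimated density under the score model. The stub network, step size, and noise scales are assumptions.

```python
# Randomized purification: noise injection followed by Langevin dynamics
# driven by a (stub) DSM-trained score network.
import torch
import torch.nn as nn

@torch.no_grad()
def randomized_langevin_purify(score_net, x, steps=20, eta=0.005, sigma0=0.25):
    x = x + sigma0 * torch.randn_like(x)      # randomized noise injection
    for _ in range(steps):
        z = torch.randn_like(x)
        x = x + eta * score_net(x) + (2 * eta) ** 0.5 * z   # Langevin update
    return x.clamp(0, 1)

score_net = nn.Conv2d(3, 3, 3, padding=1)     # stand-in for a trained model
x_pure = randomized_langevin_purify(score_net, torch.rand(1, 3, 32, 32))
```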
arXiv Detail & Related papers (2021-06-11T04:35:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences.