Related papers: Robust Evaluation of Diffusion-Based Adversarial Purification

URL: http://arxiv.org/abs/2303.09051v3
Date: Sun, 3 Dec 2023 19:26:11 GMT
Title: Robust Evaluation of Diffusion-Based Adversarial Purification
Authors: Minjong Lee, Dongwoo Kim
Abstract summary: Diffusion-based purification methods aim to remove adversarial effects from an input data point at test time. White-box attacks are often employed to measure the robustness of the purification. We propose a new purification strategy improving robustness compared to the current diffusion-based purification methods.
Score: 3.634387981995277
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We question the current evaluation practice on diffusion-based purification methods. Diffusion-based purification methods aim to remove adversarial effects from an input data point at test time. The approach gains increasing attention as an alternative to adversarial training due to the disentangling between training and testing. Well-known white-box attacks are often employed to measure the robustness of the purification. However, it is unknown whether these attacks are the most effective for the diffusion-based purification since the attacks are often tailored for adversarial training. We analyze the current practices and provide a new guideline for measuring the robustness of purification methods against adversarial attacks. Based on our analysis, we further propose a new purification strategy improving robustness compared to the current diffusion-based purification methods.

Related papers

FlowPure: Continuous Normalizing Flows for Adversarial Purification [1.4898667360408233]
Adversarial purification has emerged as a promising defense strategy. We propose FlowPure, a novel purification method based on Continuous Normalizing Flows (CNFs) trained with Conditional Flow Matching (CFM). Our results show that FlowPure is a highly effective purifier, and that it also holds strong potential for adversarial detection.
arXiv Detail & Related papers (2025-05-19T16:04:43Z)
Divide and Conquer: Heterogeneous Noise Integration for Diffusion-based Adversarial Purification [75.09791002021947]
Existing purification methods aim to disrupt adversarial perturbations by introducing a certain amount of noise through a forward diffusion process, followed by a reverse process to recover clean examples. This approach is fundamentally flawed as the uniform operation of the forward process compromises normal pixels while attempting to combat adversarial perturbations. We propose a heterogeneous purification strategy grounded in the interpretability of neural networks. Our method decisively applies higher-intensity noise to specific pixels that the target model focuses on while the remaining pixels are subjected to only low-intensity noise.
arXiv Detail & Related papers (2025-03-03T11:00:25Z)
Test-time Adversarial Defense with Opposite Adversarial Path and High Attack Time Cost [5.197034517903854]
We investigate a new test-time adversarial defense method via diffusion-based recovery along opposite adversarial paths (OAPs). We present a purifier that can be plugged into a pre-trained model to resist adversarial attacks.
arXiv Detail & Related papers (2024-10-22T08:32:17Z)
Unlearnable Examples Detection via Iterative Filtering [84.59070204221366]
Deep neural networks have been shown to be vulnerable to data poisoning attacks. Detecting poisoned samples in a mixed dataset is both beneficial and challenging. We propose an Iterative Filtering approach for identifying unlearnable examples (UEs).
arXiv Detail & Related papers (2024-08-15T13:26:13Z)
Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information [75.36597470578724]
Adversarial purification is one of the promising approaches to defend neural networks against adversarial attacks. We propose the classifier-cOnfidence gUided Purification (COUP) algorithm, which purifies adversarial examples while keeping them away from the classifier decision boundary. Experimental results show that COUP can achieve better adversarial robustness under strong attack methods.
arXiv Detail & Related papers (2024-08-12T02:48:00Z)
Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders [101.42201747763178]
Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled. Our work provides a novel disentanglement mechanism to build an efficient pre-training purification method.
arXiv Detail & Related papers (2024-05-02T16:49:25Z)
Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective [65.10019978876863]
Diffusion-Based Purification (DBP) has emerged as an effective defense mechanism against adversarial attacks. In this paper, we argue that the inherent stochasticity in the DBP process is the primary driver of its robustness.
arXiv Detail & Related papers (2024-04-22T16:10:38Z)
Scalable Ensemble-based Detection Method against Adversarial Attacks for speaker verification [73.30974350776636]
This paper comprehensively compares mainstream purification techniques in a unified framework. We propose an easy-to-follow ensemble approach that integrates advanced purification modules for detection.
arXiv Detail & Related papers (2023-12-14T03:04:05Z)
Purify++: Improving Diffusion-Purification with Advanced Diffusion Models and Control of Randomness [22.87882885963586]
Defense against adversarial attacks is important for AI safety. Adversarial purification is a family of approaches that defend against adversarial attacks with suitable pre-processing. We propose Purify++, a new diffusion purification algorithm that is now the state-of-the-art purification method against several adversarial attacks.
arXiv Detail & Related papers (2023-10-28T17:18:38Z)
Language Guided Adversarial Purification [3.9931474959554496]
Adversarial purification using generative models demonstrates strong adversarial defense performance. We introduce a new framework, Language Guided Adversarial Purification (LGAP), which utilizes pre-trained diffusion models and caption generators.
arXiv Detail & Related papers (2023-09-19T06:17:18Z)
Unsupervised Adversarial Detection without Extra Model: Training Loss Should Change [24.76524262635603]
Traditional approaches to adversarial training and supervised detection rely on prior knowledge of attack types and access to labeled training data. We propose new training losses to reduce useless features, along with a corresponding detection method that requires no prior knowledge of adversarial attacks. The proposed method works well on all tested attack types, and its false positive rates are even better than those of methods that are good at certain attack types.
arXiv Detail & Related papers (2023-08-07T01:41:21Z)
Diffusion Models for Adversarial Purification [69.1882221038846]
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model. We propose DiffPure, which uses diffusion models for adversarial purification. Our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods.
arXiv Detail & Related papers (2022-05-16T06:03:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.