Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective
- URL: http://arxiv.org/abs/2404.14309v2
- Date: Wed, 02 Oct 2024 16:28:38 GMT
- Title: Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective
- Authors: Yiming Liu, Kezhao Liu, Yao Xiao, Ziyi Dong, Xiaogang Xu, Pengxu Wei, Liang Lin
- Abstract summary: Diffusion-Based Purification (DBP) has emerged as an effective defense mechanism against adversarial attacks.
In this paper, we argue that the inherent stochasticity in the DBP process is the primary driver of its robustness.
- Score: 65.10019978876863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion-Based Purification (DBP) has emerged as an effective defense mechanism against adversarial attacks. The efficacy of DBP has been attributed to the forward diffusion process, which narrows the distribution gap between clean and adversarial images through the addition of Gaussian noise. Although this explanation has some theoretical support, the significance of its contribution to robustness remains unclear. In this paper, we argue that the inherent stochasticity in the DBP process is the primary driver of its robustness. To explore this, we introduce a novel Deterministic White-Box (DW-box) evaluation protocol to assess robustness in the absence of stochasticity and to analyze the attack trajectories and loss landscapes. Our findings suggest that DBP models primarily leverage stochasticity to evade effective attack directions, and their ability to purify adversarial perturbations can be weak. To further enhance the robustness of DBP models, we introduce Adversarial Denoising Diffusion Training (ADDT), which incorporates classifier-guided adversarial perturbations into diffusion training, thereby strengthening the DBP models' ability to purify adversarial perturbations. Additionally, we propose Rank-Based Gaussian Mapping (RBGM) to make perturbations more compatible with diffusion models. Experimental results validate the effectiveness of ADDT. In conclusion, our study suggests that future research on DBP can benefit from the perspective of decoupling the stochasticity-based and purification-based robustness.
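The two ingredients the abstract contrasts, the Gaussian forward diffusion step and the stochasticity of that step, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `forward_diffuse`, the random stand-in image, the 8/255 perturbation budget, and the value `alpha_bar_t = 0.9` are all assumptions made for illustration. The sketch only shows that independent noise draws give different diffused inputs, that a fixed seed makes the step reproducible (the deterministic setting), and that under shared noise the clean/adversarial gap is merely rescaled by sqrt(alpha_bar_t).

```python
import numpy as np

def forward_diffuse(x, alpha_bar_t, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I)."""
    eps = rng.standard_normal(x.shape)
    return np.sqrt(alpha_bar_t) * x + np.sqrt(1.0 - alpha_bar_t) * eps

# Hypothetical stand-ins: a random "clean image" and a small sign perturbation.
data_rng = np.random.default_rng(0)
x_clean = data_rng.uniform(0.0, 1.0, size=(3, 32, 32))
delta = (8.0 / 255.0) * np.sign(data_rng.standard_normal(x_clean.shape))
x_adv = x_clean + delta

alpha_bar_t = 0.9  # assumed cumulative noise-schedule value at the chosen timestep

# Stochastic setting: each forward pass draws fresh noise, so outputs differ.
a = forward_diffuse(x_adv, alpha_bar_t, np.random.default_rng(1))
b = forward_diffuse(x_adv, alpha_bar_t, np.random.default_rng(2))
stochastic_differs = not np.allclose(a, b)

# Deterministic setting: reusing the seed reproduces exactly the same noise.
c = forward_diffuse(x_adv, alpha_bar_t, np.random.default_rng(1))
deterministic_matches = np.allclose(a, c)

# With shared noise, the clean/adversarial gap is only scaled by sqrt(alpha_bar_t).
clean_t = forward_diffuse(x_clean, alpha_bar_t, np.random.default_rng(1))
gap_ratio = np.linalg.norm(a - clean_t) / np.linalg.norm(delta)

print(stochastic_differs, deterministic_matches, gap_ratio)
```

In this toy setting the perturbation is not removed by the forward step; it is shrunk by a constant factor while the added noise varies from run to run, which is the distinction between purification-based and stochasticity-based robustness that the abstract draws.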
Related papers
- DBLP: Noise Bridge Consistency Distillation For Efficient And Reliable Adversarial Purification [0.0]
Diffusion Bridge Distillation for Purification (DBLP) is a novel and efficient diffusion-based framework for adversarial purification. DBLP achieves robust accuracy, superior image quality, and around 0.2s inference time, marking a significant step toward real-time adversarial purification.
arXiv Detail & Related papers (2025-08-01T11:47:36Z) - Navigating Sparse Molecular Data with Stein Diffusion Guidance [48.21071466968102]
Stochastic optimal control (SOC) has emerged as a principled framework for fine-tuning diffusion models. A class of training-free approaches has been developed that guides diffusion models using off-the-shelf classifiers on predicted clean samples. We propose a novel training-free guidance framework based on a surrogate optimal control objective.
arXiv Detail & Related papers (2025-07-07T21:14:27Z) - How Do Diffusion Models Improve Adversarial Robustness? [3.729242965449096]
We investigate how and how well diffusion models improve adversarial robustness. We find that the purified images are heavily influenced by the internal randomness of diffusion models. Our findings provide novel insights into the mechanisms underlying diffusion-based purification.
arXiv Detail & Related papers (2025-05-28T20:19:21Z) - Towards more transferable adversarial attack in black-box manner [1.1417805445492082]
Black-box attacks based on transferability have received significant attention due to their practical applicability in real-world scenarios. The recent state-of-the-art approach DiffPGD has demonstrated enhanced transferability by employing diffusion-based adversarial purification models for adaptive attacks. We propose a novel loss function coupled with a unique surrogate model to validate our hypothesis.
arXiv Detail & Related papers (2025-05-23T16:49:20Z) - A Generative Framework for Causal Estimation via Importance-Weighted Diffusion Distillation [55.53426007439564]
Estimating individualized treatment effects from observational data is a central challenge in causal inference. Inverse probability weighting (IPW) is a well-established solution to this problem, but its integration into modern deep learning frameworks remains limited. We propose Importance-Weighted Diffusion Distillation (IWDD), a novel generative framework that combines the pretraining of diffusion models with importance-weighted score distillation.
arXiv Detail & Related papers (2025-05-16T17:00:52Z) - Unlocking The Potential of Adaptive Attacks on Diffusion-Based Purification [20.15955997832192]
Diffusion-based purification (DBP) is a defense against adversarial examples (AEs).
We revisit this claim, focusing on gradient-based strategies that back-propagate the loss gradients through the defense.
We show that such an optimization method invalidates DBP's core foundations and restricts the purified outputs to a distribution over malicious samples instead.
arXiv Detail & Related papers (2024-11-25T17:30:32Z) - Instant Adversarial Purification with Adversarial Consistency Distillation [1.3165428727965363]
One Step Control Purification (OSCP) is a novel defense framework that achieves robust adversarial purification in a single Neural Function Evaluation.
Our experimental results on ImageNet showcase OSCP's superior performance, achieving a 74.19% defense success rate with merely 0.1s per purification.
arXiv Detail & Related papers (2024-08-30T07:49:35Z) - Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information [75.36597470578724]
Adversarial purification is one of the promising approaches to defend neural networks against adversarial attacks.
We propose the classifier-cOnfidence gUided Purification (COUP) algorithm, which purifies while keeping away from the classifier decision boundary.
Experimental results show that COUP can achieve better adversarial robustness under strong attack methods.
arXiv Detail & Related papers (2024-08-12T02:48:00Z) - ADBM: Adversarial diffusion bridge model for reliable adversarial purification [21.2538921336578]
Recently, Diffusion-based Purification (DiffPure) has been recognized as an effective defense method against adversarial examples.
We find DiffPure, which directly employs the original pre-trained diffusion models for adversarial purification, to be suboptimal.
We propose a novel Adversarial Diffusion Bridge Model, termed ADBM, which constructs a reverse bridge from diffused adversarial data back to its original clean examples.
arXiv Detail & Related papers (2024-08-01T06:26:05Z) - Diffusion-based Adversarial Purification for Intrusion Detection [0.6990493129893112]
Crafted perturbations mislead ML models, enabling attackers to evade detection or trigger false alerts.
Adversarial purification has emerged as a compelling solution, particularly with diffusion models showing promising results.
This paper demonstrates the effectiveness of diffusion models in purifying adversarial examples in network intrusion detection.
arXiv Detail & Related papers (2024-06-25T14:48:28Z) - Improving Adversarial Transferability by Stable Diffusion [36.97548018603747]
Deep neural networks (DNNs) are susceptible to adversarial examples, which introduce imperceptible perturbations to benign samples, deceiving predictions.
We introduce a novel attack method called Stable Diffusion Attack Method (SDAM), which incorporates samples generated by Stable Diffusion to augment input images.
arXiv Detail & Related papers (2023-11-18T09:10:07Z) - Enhancing Adversarial Robustness via Score-Based Optimization [22.87882885963586]
Adversarial attacks have the potential to mislead deep neural network classifiers by introducing slight perturbations.
We introduce a novel adversarial defense scheme named ScoreOpt, which optimizes adversarial samples at test time.
Our experimental results demonstrate that our approach outperforms existing adversarial defenses in terms of both robustness performance and inference speed.
arXiv Detail & Related papers (2023-07-10T03:59:42Z) - Reconstructing Graph Diffusion History from a Single Snapshot [87.20550495678907]
We propose a novel barycenter formulation for reconstructing Diffusion history from A single SnapsHot (DASH).
We prove that estimation error of diffusion parameters is unavoidable due to the NP-hardness of diffusion parameter estimation.
We also develop an effective solver named DIffusion hiTting Times with Optimal proposal (DITTO).
arXiv Detail & Related papers (2023-06-01T09:39:32Z) - Guided Diffusion Model for Adversarial Purification [103.4596751105955]
Adversarial attacks disturb deep neural networks (DNNs) in various algorithms and frameworks.
We propose a novel purification approach, referred to as guided diffusion model for purification (GDMP).
In our comprehensive experiments across various datasets, the proposed GDMP is shown to reduce the perturbations raised by adversarial attacks to a shallow range.
arXiv Detail & Related papers (2022-05-30T10:11:15Z) - Balancing detectability and performance of attacks on the control
channel of Markov Decision Processes [77.66954176188426]
We investigate the problem of designing optimal stealthy poisoning attacks on the control channel of Markov decision processes (MDPs).
This research is motivated by the research community's recent interest in adversarial and poisoning attacks applied to MDPs and reinforcement learning (RL) methods.
arXiv Detail & Related papers (2021-09-15T09:13:10Z) - Estimation of Bivariate Structural Causal Models by Variational Gaussian
Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z) - Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training [106.34722726264522]
A range of adversarial defense techniques have been proposed to mitigate the interference of adversarial noise.
Pre-processing methods may suffer from the robustness degradation effect.
A potential cause of this negative effect is that adversarial training examples are static and independent of the pre-processing model.
We propose a method called Joint Adversarial Training based Pre-processing (JATP) defense.
arXiv Detail & Related papers (2021-06-10T01:45:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.