Denoising Diffusion Probabilistic Models as a Defense against
Adversarial Attacks
- URL: http://arxiv.org/abs/2301.06871v1
- Date: Tue, 17 Jan 2023 13:27:53 GMT
- Title: Denoising Diffusion Probabilistic Models as a Defense against
Adversarial Attacks
- Authors: Lars Lien Ankile, Anna Midgley, Sebastian Weisshaar
- Abstract summary: This project evaluates the performance of Denoising Diffusion Probabilistic Models (DDPM) as a purification technique to defend against adversarial attacks.
We evaluate the approach on the PatchCamelyon data set of histopathologic scans of lymph node sections and find an improvement in robust accuracy of up to 88% of the original model's accuracy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Networks are infamously sensitive to small perturbations in their
inputs, making them vulnerable to adversarial attacks. This project evaluates
the performance of Denoising Diffusion Probabilistic Models (DDPM) as a
purification technique to defend against adversarial attacks. The defense works by
adding noise to an adversarial example and then removing that noise through the
reverse process of the diffusion model. We evaluate the approach on the
PatchCamelyon data set of histopathologic scans of lymph node sections and find an
improvement in robust accuracy of up to 88% of the original model's
accuracy, constituting a considerable improvement over the vanilla model and
our baselines. The project code is located at
https://github.com/ankile/Adversarial-Diffusion.
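For intuition, the noise-then-denoise purification step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation (which is in the linked repository): it assumes PyTorch, a pretrained DDPM noise-prediction network passed in as eps_model, and illustrative schedule values; the names purify, make_schedule, and t_star are hypothetical.

```python
import torch

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule and the cumulative products used by DDPMs."""
    betas = torch.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    return betas, alphas, alpha_bars

@torch.no_grad()
def purify(x_adv, eps_model, t_star=150, T=1000):
    """Diffusion purification sketch: noise the (possibly adversarial) input
    up to timestep t_star, then run the DDPM reverse process back to t = 0.

    x_adv     : (B, C, H, W) tensor in the range the diffusion model expects
    eps_model : noise-prediction network, eps_model(x_t, t) -> predicted noise
                (hypothetical interface; a real model's timestep convention
                may differ)
    t_star    : how much forward noise to add; larger values wash out stronger
                perturbations but also more image content
    """
    betas, alphas, alpha_bars = make_schedule(T)

    # Forward process: q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I)
    noise = torch.randn_like(x_adv)
    a_bar = alpha_bars[t_star - 1]
    x_t = a_bar.sqrt() * x_adv + (1.0 - a_bar).sqrt() * noise

    # Reverse process: standard DDPM posterior mean plus sqrt(beta_t) noise
    for t in reversed(range(t_star)):
        t_batch = torch.full((x_t.shape[0],), t, dtype=torch.long)
        eps = eps_model(x_t, t_batch)
        coef = betas[t] / (1.0 - alpha_bars[t]).sqrt()
        mean = (x_t - coef * eps) / alphas[t].sqrt()
        if t > 0:
            x_t = mean + betas[t].sqrt() * torch.randn_like(x_t)
        else:
            x_t = mean
    return x_t
```

A downstream classifier would then be applied to the purified output, e.g. logits = classifier(purify(x_adv, eps_model)). The choice of t_star governs the trade-off between removing the adversarial perturbation and preserving image content, which is the central tuning knob of this style of purification defense.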
Related papers
- TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors [36.07978634674072]
Diffusion models are vulnerable to backdoor attacks that compromise their integrity.
We propose TERD, a backdoor defense framework that provides unified modeling of current attacks.
TERD achieves a 100% True Positive Rate (TPR) and True Negative Rate (TNR) across datasets of varying resolutions.
arXiv Detail & Related papers (2024-09-09T03:02:16Z)
- Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information [75.36597470578724]
Adversarial purification is one of the promising approaches to defend neural networks against adversarial attacks.
We propose the gUided Purification (COUP) algorithm, which purifies inputs while keeping them away from the classifier decision boundary.
Experimental results show that COUP can achieve better adversarial robustness under strong attack methods.
arXiv Detail & Related papers (2024-08-12T02:48:00Z)
- StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation [20.262426487434393]
We present a regeneration approach where an estimate given by a predictive model is provided as a guide for further diffusion.
We show that the proposed approach uses the predictive model to remove vocalizing and breathing artifacts while producing very high-quality samples.
arXiv Detail & Related papers (2022-12-22T16:35:42Z)
- How to Backdoor Diffusion Models? [74.43215520371506]
This paper presents the first study on the robustness of diffusion models against backdoor attacks.
We propose BadDiffusion, a novel attack framework that engineers compromised diffusion processes during model training for backdoor implantation.
Our results call attention to potential risks and possible misuse of diffusion models.
arXiv Detail & Related papers (2022-12-11T03:44:38Z)
- Ada3Diff: Defending against 3D Adversarial Point Clouds via Adaptive Diffusion [70.60038549155485]
Deep 3D point cloud models are sensitive to adversarial attacks, which poses threats to safety-critical applications such as autonomous driving.
This paper introduces a novel distortion-aware defense framework that can rebuild the pristine data distribution with a tailored intensity estimator and a diffusion model.
arXiv Detail & Related papers (2022-11-29T14:32:43Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework for defending against samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariance perturbations in our framework improves defense against both invariance and sensitivity attacks.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Guided Diffusion Model for Adversarial Purification [103.4596751105955]
Adversarial attacks disrupt deep neural networks (DNNs) across a variety of algorithms and frameworks.
We propose a novel purification approach, referred to as the guided diffusion model for purification (GDMP).
In comprehensive experiments across various datasets, the proposed GDMP is shown to reduce the perturbations introduced by adversarial attacks to a negligible level.
arXiv Detail & Related papers (2022-05-30T10:11:15Z)
- Diffusion Models for Adversarial Purification [69.1882221038846]
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.
We propose DiffPure that uses diffusion models for adversarial purification.
Our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods.
arXiv Detail & Related papers (2022-05-16T06:03:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.