Towards Robust Defense against Customization via Protective Perturbation Resistant to Diffusion-based Purification
- URL: http://arxiv.org/abs/2509.13922v2
- Date: Fri, 19 Sep 2025 06:52:50 GMT
- Title: Towards Robust Defense against Customization via Protective Perturbation Resistant to Diffusion-based Purification
- Authors: Wenkui Yang, Jie Cao, Junxian Duan, Ran He
- Abstract summary: Protective perturbations mitigate image misuse by injecting imperceptible adversarial noise. However, purification can remove protective perturbations, thereby exposing images again to the risk of malicious forgery. AntiPure embeds imperceptible perturbations that persist under representative purification settings, achieving effective post-customization distortion.
- Score: 20.862062527487794
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models like Stable Diffusion have become prominent in visual synthesis tasks due to their powerful customization capabilities, which also introduce significant security risks, including deepfakes and copyright infringement. In response, a class of methods known as protective perturbation emerged, which mitigates image misuse by injecting imperceptible adversarial noise. However, purification can remove protective perturbations, thereby exposing images again to the risk of malicious forgery. In this work, we formalize the anti-purification task, highlighting challenges that hinder existing approaches, and propose a simple diagnostic protective perturbation named AntiPure. AntiPure exposes vulnerabilities of purification within the "purification-customization" workflow, owing to two guidance mechanisms: 1) Patch-wise Frequency Guidance, which reduces the model's influence over high-frequency components in the purified image, and 2) Erroneous Timestep Guidance, which disrupts the model's denoising strategy across different timesteps. With additional guidance, AntiPure embeds imperceptible perturbations that persist under representative purification settings, achieving effective post-customization distortion. Experiments show that, as a stress test for purification, AntiPure achieves minimal perceptual discrepancy and maximal distortion, outperforming other protective perturbation methods within the purification-customization workflow.
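As a rough illustration of how a frequency-domain term like Patch-wise Frequency Guidance might be formulated, the sketch below measures patch-wise high-frequency energy with a radial FFT mask. The function names, patch size, and cutoff are illustrative assumptions, not the paper's exact loss:

```python
import numpy as np

def high_freq_energy(patch, cutoff=0.25):
    """Mean squared magnitude of frequency components above a radial cutoff."""
    h, w = patch.shape
    spec = np.fft.fftshift(np.fft.fft2(patch))
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = h / 2.0, w / 2.0
    radius = np.sqrt(((yy - cy) / h) ** 2 + ((xx - cx) / w) ** 2)
    mask = radius > cutoff  # keep only the high-frequency band
    return float(np.mean(np.abs(spec[mask]) ** 2))

def patchwise_frequency_loss(image, patch=8, cutoff=0.25):
    """Average high-frequency energy over non-overlapping patches.

    A guidance term of this shape could be used while crafting the
    perturbation to limit the purifier's control over high-frequency
    components (illustrative sketch, not the authors' implementation).
    """
    h, w = image.shape
    losses = [
        high_freq_energy(image[i:i + patch, j:j + patch], cutoff)
        for i in range(0, h - patch + 1, patch)
        for j in range(0, w - patch + 1, patch)
    ]
    return float(np.mean(losses))
```

A flat image scores zero under this measure, while noise-like high-frequency content scores strictly higher, which is the property such a guidance term would exploit.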
Related papers
- Dual Attention Guided Defense Against Malicious Edits [70.17363183107604]
We propose a Dual Attention-Guided Noise Perturbation (DANP) immunization method that adds imperceptible perturbations to disrupt the model's semantic understanding and generation process. Our method exhibits impressive immunity against malicious edits, and extensive experiments confirm that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-12-16T12:01:28Z) - Fragile by Design: On the Limits of Adversarial Defenses in Personalized Generation [26.890796322896346]
Defense mechanisms like Anti-DreamBooth attempt to mitigate the risk of facial identity leakage. We identify two critical yet overlooked limitations of these methods. Results reveal that none of the current methods maintains its protective effectiveness under such threats.
arXiv Detail & Related papers (2025-11-13T14:56:25Z) - NAPPure: Adversarial Purification for Robust Image Classification under Non-Additive Perturbations [51.835201929946294]
We propose an extended adversarial purification framework named NAPPure, which can handle non-additive perturbations. Experiments on GTSRB and CIFAR-10 datasets show that NAPPure significantly boosts the robustness of image classification models against non-additive perturbations.
arXiv Detail & Related papers (2025-10-15T19:05:59Z) - Disruptive Attacks on Face Swapping via Low-Frequency Perceptual Perturbations [9.303194368381586]
Deepfake technology, driven by Generative Adversarial Networks (GANs), poses significant risks to privacy and societal security. Existing detection methods are predominantly passive, focusing on post-event analysis without preventing attacks. We propose an active defense method based on low-frequency perturbations to disrupt face swapping manipulation.
arXiv Detail & Related papers (2025-08-28T09:34:53Z) - Active Adversarial Noise Suppression for Image Forgery Localization [56.98050814363447]
We introduce an Adversarial Noise Suppression Module (ANSM) that generates a defensive perturbation to suppress the attack effect of adversarial noise. To the best of our knowledge, this is the first report of adversarial defense in image forgery localization tasks.
arXiv Detail & Related papers (2025-06-15T14:53:27Z) - Anti-Inpainting: A Proactive Defense Approach against Malicious Diffusion-based Inpainters under Unknown Conditions [14.34509668877061]
Anti-Inpainting is a proactive defense approach that achieves protection through three novel modules. First, we introduce a multi-level deep feature extractor to obtain intricate features from the diffusion denoising process. Second, we design a multi-scale, semantic-preserving data augmentation technique to enhance the transferability of adversarial perturbations.
arXiv Detail & Related papers (2025-05-19T12:07:29Z) - Divide and Conquer: Heterogeneous Noise Integration for Diffusion-based Adversarial Purification [75.09791002021947]
Existing purification methods aim to disrupt adversarial perturbations by introducing a certain amount of noise through a forward diffusion process, followed by a reverse process to recover clean examples. This approach is fundamentally flawed, as the uniform operation of the forward process compromises normal pixels while attempting to combat adversarial perturbations. We propose a heterogeneous purification strategy grounded in the interpretability of neural networks. Our method decisively applies higher-intensity noise to specific pixels that the target model focuses on, while the remaining pixels are subjected to only low-intensity noise.
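The heterogeneous forward step described above can be sketched per pixel. This is a minimal numpy illustration; the saliency-map interface and the linear schedule `alpha_bar = 1 - t` are assumptions, not the authors' implementation:

```python
import numpy as np

def heterogeneous_forward_diffusion(x, saliency, t_high=0.5, t_low=0.1, seed=0):
    """Add stronger Gaussian diffusion noise where the saliency map is high.

    `saliency` in [0, 1] stands in for the attention/interpretability map
    of the target model (hypothetical interface). Each pixel follows the
    DDPM-style forward step x_t = sqrt(a) * x + sqrt(1 - a) * eps, with a
    per-pixel noise level interpolated between t_low and t_high.
    """
    rng = np.random.default_rng(seed)
    t = t_low + (t_high - t_low) * saliency   # per-pixel noise intensity
    alpha_bar = 1.0 - t                       # crude illustrative schedule
    eps = rng.standard_normal(x.shape)
    return np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps
```

Pixels with saliency 1 receive the full `t_high` noise level, while the rest are perturbed only lightly, which is the asymmetry the strategy relies on.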
arXiv Detail & Related papers (2025-03-03T11:00:25Z) - CopyrightShield: Enhancing Diffusion Model Security against Copyright Infringement Attacks [61.06621533874629]
Diffusion models are vulnerable to copyright infringement attacks, where attackers inject strategically modified non-infringing images into the training set. We first propose a defense framework, CopyrightShield, to defend against the above attack. Experimental results demonstrate that CopyrightShield significantly improves poisoned-sample detection performance across two attack scenarios.
arXiv Detail & Related papers (2024-12-02T14:19:44Z) - Instant Adversarial Purification with Adversarial Consistency Distillation [1.3165428727965363]
One Step Control Purification (OSCP) is a novel defense framework that achieves robust adversarial purification in a single Neural Function Evaluation. Our experimental results on ImageNet showcase OSCP's superior performance, achieving a 74.19% defense success rate with merely 0.1s per purification.
arXiv Detail & Related papers (2024-08-30T07:49:35Z) - Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information [75.36597470578724]
Adversarial purification is one of the promising approaches to defend neural networks against adversarial attacks.
We propose the classifier-gUided Purification (COUP) algorithm, which purifies while keeping away from the classifier decision boundary.
Experimental results show that COUP can achieve better adversarial robustness under strong attack methods.
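The idea of purifying while staying clear of the decision boundary can be sketched as a single guided update. This is a toy illustration; the function names, the finite-difference gradient, and the step sizes are assumptions, not the paper's algorithm:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def guided_purification_step(x, denoise, classifier_logits, y, step=0.1, guide=0.05):
    """One illustrative guided update: move toward the denoiser's estimate
    while nudging x up the classifier's confidence for label y, so the
    purified sample stays away from the decision boundary.

    The gradient of the classifier's log-probability is estimated by
    finite differences for simplicity (fine for tiny toy inputs only).
    """
    purify_dir = denoise(x) - x  # purification direction
    grad = np.zeros_like(x)
    base = np.log(softmax(classifier_logits(x))[y])
    eps = 1e-3
    it = np.nditer(x, flags=["multi_index"])
    for _ in it:
        xp = x.copy()
        xp[it.multi_index] += eps
        grad[it.multi_index] = (np.log(softmax(classifier_logits(xp))[y]) - base) / eps
    return x + step * purify_dir + guide * grad
```

With the denoiser held fixed, the guidance term alone increases the classifier's confidence in the protected label, which is the preserved predictive information the method's title refers to.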
arXiv Detail & Related papers (2024-08-12T02:48:00Z) - MalPurifier: Enhancing Android Malware Detection with Adversarial Purification against Evasion Attacks [18.016148305499865]
MalPurifier is a novel adversarial purification framework specifically engineered for Android malware detection. Experiments on two large-scale datasets demonstrate that MalPurifier significantly outperforms state-of-the-art defenses. As a lightweight, model-agnostic, and plug-and-play module, MalPurifier offers a practical and effective solution to bolster the security of ML-based Android malware detectors.
arXiv Detail & Related papers (2023-12-11T14:48:43Z) - Adversarial Purification of Information Masking [8.253834429336656]
Adversarial attacks generate minuscule, imperceptible perturbations to images to deceive neural networks.
Counteracting these, adversarial purification methods seek to transform adversarial input samples into clean output images to defend against adversarial attacks.
We propose a novel adversarial purification approach named Information Mask Purification (IMPure) to extensively eliminate adversarial perturbations.
arXiv Detail & Related papers (2023-11-26T15:50:19Z) - Guided Diffusion Model for Adversarial Purification [103.4596751105955]
Adversarial attacks disturb deep neural networks (DNNs) in various algorithms and frameworks.
We propose a novel purification approach, referred to as the guided diffusion model for purification (GDMP).
In comprehensive experiments across various datasets, the proposed GDMP is shown to reduce the perturbations raised by adversarial attacks to a shallow range.
arXiv Detail & Related papers (2022-05-30T10:11:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.