Dual Attention Guided Defense Against Malicious Edits
- URL: http://arxiv.org/abs/2512.14333v1
- Date: Tue, 16 Dec 2025 12:01:28 GMT
- Title: Dual Attention Guided Defense Against Malicious Edits
- Authors: Jie Zhang, Shuai Dong, Shiguang Shan, Xilin Chen
- Abstract summary: We propose a Dual Attention-Guided Noise Perturbation (DANP) immunization method that adds imperceptible perturbations to disrupt the model's semantic understanding and generation process. Our method exhibits impressive immunity against malicious edits, and extensive experiments confirm that it achieves state-of-the-art performance.
- Score: 70.17363183107604
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent progress in text-to-image diffusion models has transformed image editing via text prompts, yet it also raises serious ethical concerns about misuse for creating deceptive or harmful content. While current defenses seek to mitigate this risk by embedding imperceptible perturbations, their effectiveness against malicious tampering is limited. To address this, we propose a Dual Attention-Guided Noise Perturbation (DANP) immunization method that adds imperceptible perturbations to disrupt the model's semantic understanding and generation process. DANP operates over multiple timesteps to manipulate both the cross-attention maps and the noise prediction process, using a dynamic threshold to generate masks that separate text-relevant from irrelevant regions. It then reduces attention in relevant areas while increasing it in irrelevant ones, thereby misguiding the edit toward incorrect regions and preserving the intended targets. Additionally, our method maximizes the discrepancy between the injected noise and the model's predicted noise to further interfere with generation. By targeting both the attention and noise prediction mechanisms, DANP exhibits strong immunity against malicious edits, and extensive experiments confirm that it achieves state-of-the-art performance.
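The abstract describes, in effect, a two-term adversarial objective: an attention term that suppresses cross-attention on text-relevant regions while amplifying it on irrelevant ones, and a noise term that maximizes the gap between the injected and predicted noise. Below is a minimal PGD-style sketch of that recipe; the hooks `predict_noise_and_attn` and `add_noise`, and all hyperparameters, are illustrative assumptions rather than the paper's actual interface.

```python
# Minimal PGD-style sketch of a DANP-like immunization loop.
# Assumptions (NOT from the paper's code): `predict_noise_and_attn` is a
# hypothetical hook returning the UNet's noise prediction and an averaged
# cross-attention map for the edit prompt; `add_noise` is the scheduler's
# forward-diffusion step; eps/step/iters/lam are illustrative values.
import torch
import torch.nn.functional as F

def immunize(x0, predict_noise_and_attn, add_noise,
             eps=8 / 255, step=1 / 255, iters=50, lam=1.0):
    """Return an immunized image x0 + delta with ||delta||_inf <= eps."""
    delta = torch.zeros_like(x0, requires_grad=True)
    for _ in range(iters):
        t = torch.randint(0, 1000, (x0.shape[0],), device=x0.device)
        noise = torch.randn_like(x0)
        x_t = add_noise(x0 + delta, noise, t)            # forward diffusion
        eps_pred, attn = predict_noise_and_attn(x_t, t)  # UNet forward pass

        # Dynamic threshold: split the cross-attention map into
        # text-relevant and irrelevant regions around its per-image mean.
        thr = attn.mean(dim=(-2, -1), keepdim=True)
        relevant = (attn > thr).float()

        # Attention term: suppress attention on relevant regions and
        # amplify it on irrelevant ones, steering the edit off-target.
        attn_loss = (attn * relevant).mean() - (attn * (1 - relevant)).mean()
        # Noise term: push the predicted noise away from the injected noise.
        noise_loss = -F.mse_loss(eps_pred, noise)

        loss = attn_loss + lam * noise_loss
        loss.backward()
        with torch.no_grad():
            delta -= step * delta.grad.sign()  # descend on the combined loss
            delta.clamp_(-eps, eps)            # keep perturbation bounded
        delta.grad = None
    return (x0 + delta).clamp(0, 1).detach()
```

Sampling a fresh timestep each iteration is one plausible reading of "functions over multiple timesteps": the perturbation must degrade editing regardless of where in the denoising trajectory the edit operates.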
Related papers
- Towards Transferable Defense Against Malicious Image Edits [70.17363183107604]
Transferable Defense Against Malicious Image Edits (TDAE) is a novel bimodal framework that enhances image immunity against malicious edits. We introduce the FlatGrad Defense Mechanism (FDM), which incorporates gradient regularization into the adversarial objective (see the sketch after this entry). On the textual side, we propose Dynamic Prompt Defense (DPD), which periodically refines text embeddings to align the editing outcomes of immunized images with those of the original images.
arXiv Detail & Related papers (2025-12-16T12:10:16Z)
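The FlatGrad idea above, gradient regularization inside the adversarial objective, admits a compact second-order sketch: penalize the norm of the loss gradient so the perturbation settles in flat regions of the loss surface, which tend to transfer better across editing models. Everything here (`editing_loss`, `beta`, the update rule) is a hypothetical stand-in, not TDAE's released code.

```python
# Hedged sketch of gradient regularization in an immunization objective,
# in the spirit of TDAE's FlatGrad Defense Mechanism. `editing_loss` is a
# hypothetical callable returning a scalar that scores how faithfully an
# editor can still edit x_adv; `beta` and `step` are illustrative.
import torch

def flat_adversarial_step(x_adv, editing_loss, step=1 / 255, beta=0.1):
    # x_adv is assumed batched, shape (B, C, H, W).
    x_adv = x_adv.clone().requires_grad_(True)
    loss = editing_loss(x_adv)
    # Keep the first-order gradient in the graph so its norm is differentiable.
    (grad,) = torch.autograd.grad(loss, x_adv, create_graph=True)
    # Flatness penalty: a small gradient norm means a flat neighborhood,
    # the property FDM associates with transferable perturbations.
    flat_penalty = grad.flatten(1).norm(dim=1).mean()
    total = loss + beta * flat_penalty
    (g_total,) = torch.autograd.grad(total, x_adv)
    with torch.no_grad():
        x_adv = x_adv - step * g_total.sign()  # one signed descent step
    return x_adv.detach()
```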
- NDM: A Noise-driven Detection and Mitigation Framework against Implicit Sexual Intentions in Text-to-Image Generation [41.058425895887616]
Text-to-image (T2I) models are vulnerable to generating inappropriate content. Implicit sexual prompts, often disguised as seemingly benign terms, can unexpectedly trigger sexual content. We propose NDM, the first noise-driven detection and mitigation framework.
arXiv Detail & Related papers (2025-10-17T15:37:02Z)
- Disruptive Attacks on Face Swapping via Low-Frequency Perceptual Perturbations [9.303194368381586]
Deepfake technology, driven by Generative Adversarial Networks (GANs), poses significant risks to privacy and societal security. Existing detection methods are predominantly passive, focusing on post-event analysis without preventing attacks. We propose an active defense method based on low-frequency perturbations to disrupt face swapping manipulation.
arXiv Detail & Related papers (2025-08-28T09:34:53Z)
- Active Adversarial Noise Suppression for Image Forgery Localization [56.98050814363447]
We introduce an Adversarial Noise Suppression Module (ANSM) that generates a defensive perturbation to suppress the attack effect of adversarial noise. To the best of our knowledge, this is the first report of adversarial defense in image forgery localization tasks.
arXiv Detail & Related papers (2025-06-15T14:53:27Z)
- A Knowledge-guided Adversarial Defense for Resisting Malicious Visual Manipulation [93.28532038721816]
Malicious applications of visual manipulation have raised serious threats to the security and reputation of users in many fields. We propose a knowledge-guided adversarial defense (KGAD) to actively force malicious manipulation models to output semantically confusing samples.
arXiv Detail & Related papers (2025-04-11T10:18:13Z)
- Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization [22.225141381422873]
There is growing concern about text-to-image diffusion models creating harmful content. Post-hoc model intervention techniques, such as concept unlearning and safety guidance, have been developed to mitigate these risks. We propose the safe generation framework Detect-and-Guide (DAG) to perform self-diagnosis and fine-grained self-regulation. DAG achieves state-of-the-art safe generation performance, balancing harmfulness mitigation and text-following performance on real-world prompts.
arXiv Detail & Related papers (2025-03-19T13:37:52Z)
- MIGA: Mutual Information-Guided Attack on Denoising Models for Semantic Manipulation [39.12448251986432]
We propose Mutual Information-Guided Attack (MIGA) to directly attack deep denoising models. MIGA strategically disrupts denoising models' ability to preserve semantic content via adversarial perturbations. Our findings suggest that denoising models are not always robust and can introduce security risks in real-world applications.
arXiv Detail & Related papers (2025-03-10T06:26:34Z)
- DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing [103.40147707280585]
DiffusionGuard is a robust and effective defense method against unauthorized edits by diffusion-based image editing models. We introduce a novel objective that generates adversarial noise targeting the early stage of the diffusion process. We also introduce a mask-augmentation technique to enhance robustness against various masks during test time.
arXiv Detail & Related papers (2024-10-08T05:19:19Z)
- Guided Diffusion Model for Adversarial Purification [103.4596751105955]
Adversarial attacks disturb deep neural networks (DNNs) in various algorithms and frameworks.
We propose a novel purification approach, referred to as guided diffusion model for purification (GDMP).
In comprehensive experiments across various datasets, the proposed GDMP is shown to reduce the perturbations introduced by adversarial attacks to a shallow range.
arXiv Detail & Related papers (2022-05-30T10:11:15Z)