Universal Image Immunization against Diffusion-based Image Editing via Semantic Injection
- URL: http://arxiv.org/abs/2602.14679v1
- Date: Mon, 16 Feb 2026 12:08:37 GMT
- Title: Universal Image Immunization against Diffusion-based Image Editing via Semantic Injection
- Authors: Chanhui Lee, Seunghyun Shin, Donggyu Choi, Hae-gon Jeon, Jeany Son
- Abstract summary: We propose the first universal image immunization framework that generates a single, broadly applicable adversarial perturbation. Inspired by universal adversarial perturbation (UAP) techniques used in targeted attacks, our method generates a UAP that embeds a semantic target into images to be protected. Our approach effectively blocks malicious editing attempts by overwriting the original semantic content in the image via the UAP.
- Score: 29.203173410857914
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in diffusion models have enabled powerful image editing capabilities guided by natural language prompts, unlocking new creative possibilities. However, they introduce significant ethical and legal risks, such as deepfakes and unauthorized use of copyrighted visual content. To address these risks, image immunization has emerged as a promising defense against AI-driven semantic manipulation. Yet, most existing approaches rely on image-specific adversarial perturbations that require individual optimization for each image, thereby limiting scalability and practicality. In this paper, we propose the first universal image immunization framework that generates a single, broadly applicable adversarial perturbation specifically designed for diffusion-based editing pipelines. Inspired by universal adversarial perturbation (UAP) techniques used in targeted attacks, our method generates a UAP that embeds a semantic target into images to be protected. Simultaneously, it suppresses original content to effectively misdirect the model's attention during editing. As a result, our approach effectively blocks malicious editing attempts by overwriting the original semantic content in the image via the UAP. Moreover, our method operates effectively even in data-free settings without requiring access to training data or domain knowledge, further enhancing its practicality and broad applicability in real-world scenarios. Extensive experiments show that our method, as the first universal immunization approach, significantly outperforms several baselines in the UAP setting. In addition, despite the inherent difficulty of universal perturbations, our method also achieves performance on par with image-specific methods under a more restricted perturbation budget, while also exhibiting strong black-box transferability across different diffusion models.
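The abstract describes optimizing one shared perturbation that steers a model's semantic representation of every protected image toward a fixed target while staying within a perturbation budget. The following is a minimal, framework-free sketch of that idea, with loud assumptions: a tiny linear map stands in for the diffusion model's conditioning encoder, the loss is a plain squared distance to the target embedding, and the update is signed gradient descent with an L-infinity projection; the paper's actual losses, encoder, and optimizer are not specified here.

```python
# Toy sketch of universal immunization (assumption: a small linear map
# stands in for the diffusion model's semantic encoder).

def mat_vec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def immunize_uap(images, W, target, eps=0.3, steps=100, lr=0.01):
    """Learn ONE perturbation `delta` shared by all images that pushes the
    encoder output toward a semantic target embedding, so edits latch onto
    the injected semantics instead of the original content."""
    WT = transpose(W)
    dim = len(images[0])
    delta = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for x in images:
            f = mat_vec(W, [xi + di for xi, di in zip(x, delta)])
            err = [fi - ti for fi, ti in zip(f, target)]
            g = mat_vec(WT, err)  # gradient of 0.5 * ||f - target||^2 w.r.t. delta
            grad = [ga + gi for ga, gi in zip(grad, g)]
        # Signed gradient step toward the target, then L-inf budget projection.
        delta = [
            max(-eps, min(eps, di - lr * (1.0 if gi > 0 else -1.0 if gi < 0 else 0.0)))
            for di, gi in zip(delta, grad)
        ]
    return delta
```

Because a single `delta` is shared across all images, it can only move the *average* embedding toward the target, which is exactly the difficulty of the universal setting the abstract mentions relative to image-specific perturbations.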
Related papers
- Unpaired Image-to-Image Translation via a Self-Supervised Semantic Bridge [59.247871132422006]
Adversarial diffusion and diffusion-inversion methods have advanced unpaired image-to-image translation, but each faces key limitations. We propose the Self-Supervised Semantic Bridge (SSB), a versatile framework that integrates external semantic priors into diffusion bridge models. Our key idea is to leverage self-supervised visual encoders to learn representations that are invariant to appearance changes but capture geometric structure.
arXiv Detail & Related papers (2026-02-18T18:05:00Z) - Towards Transferable Defense Against Malicious Image Edits [70.17363183107604]
Transferable Defense Against Malicious Image Edits (TDAE) is a novel bimodal framework that enhances image immunity against malicious edits. We introduce the FlatGrad Defense Mechanism (FDM), which incorporates gradient regularization into the adversarial objective. For textual enhancement protection, we propose Dynamic Prompt Defense (DPD), which periodically refines text embeddings to align the editing outcomes of immunized images with those of the original images.
arXiv Detail & Related papers (2025-12-16T12:10:16Z) - Semantic Mismatch and Perceptual Degradation: A New Perspective on Image Editing Immunity [79.10998560865444]
We argue that immunization success should be defined by the edited output either semantically mismatching the prompt or suffering substantial perceptual degradation. We introduce the Immunization Success Rate (ISR), the first metric designed to rigorously quantify true immunization efficacy.
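The ISR definition above is a disjunction: an edit on an immunized image counts as a defense success if the output *either* mismatches the prompt *or* is perceptually degraded. A hedged sketch of that aggregation, assuming the per-image scores are supplied externally (e.g. a CLIP-style image-text similarity and a quality score such as PSNR; the thresholds here are illustrative placeholders, not the paper's values):

```python
def immunization_success_rate(text_sims, quality_scores,
                              sim_thresh=0.25, qual_thresh=20.0):
    """ISR-style aggregation: an edited output counts as an immunization
    success if it mismatches the prompt (low image-text similarity) OR is
    perceptually degraded (low quality score, e.g. PSNR vs. a clean edit).
    Thresholds are illustrative assumptions, not values from the paper."""
    successes = [
        (s < sim_thresh) or (q < qual_thresh)
        for s, q in zip(text_sims, quality_scores)
    ]
    return sum(successes) / len(successes)
```

For example, three edits with similarities [0.1, 0.3, 0.3] and quality scores [25, 15, 25] yield two successes (one semantic mismatch, one degradation), so ISR = 2/3.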
arXiv Detail & Related papers (2025-12-16T11:34:48Z) - Latent Diffusion Unlearning: Protecting Against Unauthorized Personalization Through Trajectory Shifted Perturbations [18.024767641200064]
We propose a model-based perturbation strategy that operates within the latent space of diffusion models. Our method alternates between denoising and inversion while modifying the starting point of the denoising trajectory. We validate our approach on four benchmark datasets to demonstrate robustness against state-of-the-art inversion attacks.
arXiv Detail & Related papers (2025-10-03T15:18:45Z) - A Unified Framework for Stealthy Adversarial Generation via Latent Optimization and Transferability Enhancement [72.3054292908678]
We propose a unified framework that seamlessly incorporates traditional transferability enhancement strategies into diffusion model-based adversarial example generation via image editing. Our method won first place in the "1st Adversarial Attacks on Deepfake Detectors: A Challenge in the Era of AI-Generated Media" competition at ACM MM25.
arXiv Detail & Related papers (2025-06-30T09:59:09Z) - Patch is Enough: Naturalistic Adversarial Patch against Vision-Language Pre-training Models [32.23201683108716]
We propose a novel strategy that exclusively employs image patches for attacks, thus preserving the integrity of the original text.
Our method leverages prior knowledge from diffusion models to enhance the authenticity and naturalness of the perturbations.
Comprehensive experiments conducted in a white-box setting for image-to-text scenarios reveal that our proposed method significantly outperforms existing techniques, achieving a 100% attack success rate.
arXiv Detail & Related papers (2024-10-07T10:06:01Z) - Pixel Is Not a Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models [9.905296922309157]
Diffusion Models have emerged as powerful generative models for high-quality image synthesis, with many subsequent image editing techniques based on them. Previous works have attempted to safeguard images from diffusion-based editing by adding imperceptible perturbations. Our work proposes a novel attack framework, AtkPDM, which exploits vulnerabilities in denoising UNets and a latent optimization strategy to enhance the naturalness of adversarial images.
arXiv Detail & Related papers (2024-08-21T17:56:34Z) - Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z) - IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI [52.90082445349903]
Diffusion-based image generation models can create artistic images that mimic the style of an artist or maliciously edit the original images for fake content.
Several attempts have been made to protect the original images from such unauthorized data usage by adding imperceptible perturbations.
In this work, we introduce a purification perturbation platform, named IMPRESS, to evaluate the effectiveness of imperceptible perturbations as a protective measure.
arXiv Detail & Related papers (2023-10-30T03:33:41Z) - Raising the Cost of Malicious AI-Powered Image Editing [82.71990330465115]
We present an approach to mitigating the risks of malicious image editing posed by large diffusion models.
The key idea is to immunize images so as to make them resistant to manipulation by these models.
arXiv Detail & Related papers (2023-02-13T18:38:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.