Attack-Resistant Watermarking for AIGC Image Forensics via Diffusion-based Semantic Deflection
- URL: http://arxiv.org/abs/2601.06639v1
- Date: Sat, 10 Jan 2026 17:49:08 GMT
- Title: Attack-Resistant Watermarking for AIGC Image Forensics via Diffusion-based Semantic Deflection
- Authors: Qingyu Liu, Yitao Zhang, Zhongjie Ba, Chao Shuai, Peng Cheng, Tianhang Zheng, Zhibo Wang
- Abstract summary: We introduce PAI, a training-free inherent watermarking framework for AIGC copyright protection. We design a novel key-conditioned deflection mechanism that subtly steers the denoising trajectory according to the user key. Experiments show that PAI achieves 98.43% verification accuracy, improving over SOTA methods by 37.25% on average, and retains strong tampering localization performance even against advanced AIGC edits.
- Score: 22.992750993168404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Protecting the copyright of user-generated AI images is an emerging challenge as AIGC becomes pervasive in creative workflows. Existing watermarking methods (1) remain vulnerable to real-world adversarial threats, often forced to trade off between defenses against spoofing and removal attacks; and (2) cannot support semantic-level tamper localization. We introduce PAI, a training-free inherent watermarking framework for AIGC copyright protection, plug-and-play with diffusion-based AIGC services. PAI simultaneously provides three key functionalities: robust ownership verification, attack detection, and semantic-level tampering localization. Unlike existing inherent watermark methods that only embed watermarks at the noise initialization of diffusion models, we design a novel key-conditioned deflection mechanism that subtly steers the denoising trajectory according to the user key. Such trajectory-level coupling strengthens the semantic entanglement of identity and content, thereby enhancing robustness against real-world threats. Moreover, we provide a theoretical analysis proving that only the valid key can pass verification. Experiments across 12 attack methods show that PAI achieves 98.43% verification accuracy, improving over SOTA methods by 37.25% on average, and retains strong tampering localization performance even against advanced AIGC edits. Our code is available at https://github.com/QingyuLiu/PAI.
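The core mechanism is a small, key-conditioned deflection applied at every denoising step. A rough, hypothetical sketch of that idea using a diffusers-style unconditional UNet and scheduler follows; `key_direction`, the additive deflection form, and `scale` are illustrative assumptions, not the paper's implementation.

```python
import torch

def key_direction(user_key: int, shape, device="cpu"):
    # Hypothetical helper: derive a deterministic, key-specific unit
    # direction in latent space, reproducible only with the user key.
    gen = torch.Generator(device=device).manual_seed(user_key)
    d = torch.randn(shape, generator=gen, device=device)
    return d / d.norm()

@torch.no_grad()
def denoise_with_deflection(unet, scheduler, latents, user_key, scale=0.02):
    # DDIM-style loop that nudges each noise prediction along the
    # key's direction, subtly steering the denoising trajectory.
    direction = key_direction(user_key, latents.shape, latents.device)
    for t in scheduler.timesteps:
        eps = unet(latents, t).sample        # model's noise prediction
        eps = eps + scale * direction        # key-conditioned deflection
        latents = scheduler.step(eps, t, latents).prev_sample
    return latents
```

Verification would then invert the generation and test whether the recovered trajectory aligns with the key's direction; by the paper's analysis, only the valid key reproduces the deflection.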
Related papers
- StableGuard: Towards Unified Copyright Protection and Tamper Localization in Latent Diffusion Models [55.05404953041403]
We propose a novel framework that seamlessly integrates a binary watermark into the diffusion generation process. We show that StableGuard consistently outperforms state-of-the-art methods in image fidelity, watermark verification, and tampering localization.
arXiv Detail & Related papers (2025-09-22T16:35:19Z)
- Removal Attack and Defense on AI-generated Content Latent-based Watermarking [26.09708301315328]
Digital watermarks can be embedded into AI-generated content (AIGC) by initializing the generation process with starting points sampled from a secret distribution. When combined with pseudorandom error-correcting codes, such watermarked outputs can remain indistinguishable from unwatermarked objects while maintaining robustness under white noise. We propose a novel attack that exploits boundary information leaked by the locations of watermarked objects. This attack significantly reduces the distortion required to remove watermarks by up to a factor of 15 compared to a baseline white-noise attack under certain settings.
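For context, a minimal sketch of the latent-initialization watermarking this attack targets; the helper names, the cosine test, and the threshold are assumptions (deployed schemes use pseudorandom error-correcting codes and DDIM inversion rather than a raw correlation).

```python
import torch
import torch.nn.functional as F

def watermarked_init(key: int, shape):
    # Sample the starting latent from a key-dependent (secret) distribution.
    gen = torch.Generator().manual_seed(key)
    return torch.randn(shape, generator=gen)

def verify(recovered_latent: torch.Tensor, key: int, threshold: float = 0.1) -> bool:
    # Correlate a latent recovered from the image (e.g., via DDIM inversion)
    # against the key's reference noise; the threshold is illustrative.
    ref = watermarked_init(key, recovered_latent.shape)
    score = F.cosine_similarity(recovered_latent.flatten(), ref.flatten(), dim=0)
    return score.item() > threshold
```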
arXiv Detail & Related papers (2025-09-15T09:56:24Z)
- Character-Level Perturbations Disrupt LLM Watermarks [64.60090923837701]
We formalize the system model for Large Language Model (LLM) watermarking. We characterize two realistic threat models constrained by limited access to the watermark detector. We demonstrate that character-level perturbations are significantly more effective for watermark removal under the most restrictive threat model. Experiments confirm the superiority of character-level perturbations and the effectiveness of the Genetic Algorithm (GA) in removing watermarks under realistic constraints.
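For intuition, a toy character-level perturbation of the kind studied here; the homoglyph map and rate are illustrative, and the paper searches perturbations with a genetic algorithm rather than sampling them uniformly at random.

```python
import random

# Illustrative homoglyph map: visually similar Cyrillic substitutes.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "i": "\u0456"}

def perturb_chars(text: str, rate: float = 0.05, seed: int = 0) -> str:
    # Swap a small fraction of characters for homoglyphs: the text stays
    # visually (near-)identical, but the detector sees different tokens.
    rng = random.Random(seed)
    return "".join(
        HOMOGLYPHS[ch] if ch in HOMOGLYPHS and rng.random() < rate else ch
        for ch in text
    )
```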
arXiv Detail & Related papers (2025-09-11T02:50:07Z)
- Uncovering and Mitigating Destructive Multi-Embedding Attacks in Deepfake Proactive Forensics [17.112388802067425]
Proactive forensics involves embedding imperceptible watermarks to enable reliable source tracking. Existing methods rely on an idealized assumption of single watermark embedding, which proves impractical in real-world scenarios. We propose a general training paradigm named Adversarial Interference Simulation (AIS) to address this vulnerability. Our method enables the model to correctly extract the original watermark even after a second embedding.
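A bare-bones rendering of the interference-simulation idea (my sketch, not the authors' code): during training, occasionally stack a second random watermark on the marked image, and still supervise the extractor to recover the original one.

```python
import torch
import torch.nn.functional as F

def ais_training_step(embedder, extractor, image, wm_bits, opt, p_second=0.5):
    # `embedder` and `extractor` are assumed nn.Modules; `wm_bits` is an
    # integer {0,1} tensor holding the original watermark payload.
    marked = embedder(image, wm_bits)
    if torch.rand(1).item() < p_second:
        # Simulate a second, interfering embedding by another party.
        marked = embedder(marked, torch.randint_like(wm_bits, 0, 2))
    # The extractor must still recover the *original* watermark.
    loss = F.binary_cross_entropy_with_logits(extractor(marked), wm_bits.float())
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```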
arXiv Detail & Related papers (2025-08-24T07:57:32Z)
- IConMark: Robust Interpretable Concept-Based Watermark For AI Images [50.045011844765185]
We propose IConMark, a novel in-generation robust semantic watermarking method. IConMark embeds interpretable concepts into AI-generated images, making it resilient to adversarial manipulation. We demonstrate its superiority in detection accuracy and preservation of image quality.
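One way to picture an interpretable concept watermark is key-dependent prompt conditioning; the sketch below is a guess at the general shape, and the concept vocabulary, hashing scheme, and detection-by-image-text-model idea are all assumptions rather than IConMark's actual design.

```python
import hashlib

# Illustrative concept vocabulary; a real system would use a curated set.
CONCEPTS = ["red kite", "distant lighthouse", "paper crane", "brass key"]

def mark_prompt(prompt: str, user_key: str, k: int = 2):
    # Deterministically pick k concepts from the key and append them to
    # the generation prompt; verification later checks that the image
    # depicts those concepts (e.g., with an off-the-shelf image-text model).
    h = int(hashlib.sha256(user_key.encode()).hexdigest(), 16)
    chosen = [CONCEPTS[(h >> (8 * i)) % len(CONCEPTS)] for i in range(k)]
    return prompt + ", featuring " + " and ".join(chosen), chosen
```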
arXiv Detail & Related papers (2025-07-17T05:38:30Z)
- Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models [52.877452505561706]
We propose the first copyright evasion attack specifically designed to undermine dataset ownership verification (DOV). Our CEAT2I comprises three stages: watermarked sample detection, trigger identification, and efficient watermark mitigation. Our experiments show that CEAT2I effectively evades DOV mechanisms while preserving model performance.
arXiv Detail & Related papers (2025-05-05T17:51:55Z)
- WMCopier: Forging Invisible Image Watermarks on Arbitrary Images [38.59295440296696]
We propose WMCopier, an effective watermark forgery attack that operates without requiring prior knowledge of or access to the target watermarking algorithm. Our approach first models the target watermark distribution using an unconditional diffusion model, and then seamlessly embeds the target watermark into a non-watermarked image. Experimental results demonstrate that WMCopier effectively deceives both open-source and closed-source watermark systems.
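One loose reading of the transfer step is an SDEdit-style pass through a diffusion model fitted only to watermarked images, so a clean image drifts into the watermarked distribution while keeping its content; `wm_diffusion`, the diffusers-style scheduler API, and `t_start` are assumptions, not WMCopier's published procedure.

```python
import torch

@torch.no_grad()
def forge_watermark(wm_diffusion, scheduler, image, t_start=0.3):
    # Noise the clean image partway along the forward process, then run
    # only the final denoising steps with a model trained on watermarked
    # images, pulling the output toward the watermarked distribution.
    n_steps = len(scheduler.timesteps)
    start = int(n_steps * (1 - t_start))   # skip the early, high-noise steps
    t0 = scheduler.timesteps[start]
    x = scheduler.add_noise(image, torch.randn_like(image), t0)
    for t in scheduler.timesteps[start:]:
        eps = wm_diffusion(x, t).sample
        x = scheduler.step(eps, t, x).prev_sample
    return x
```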
arXiv Detail & Related papers (2025-03-28T11:11:19Z)
- Robustness of Watermarking on Text-to-Image Diffusion Models [9.277492743469235]
We investigate the robustness of generative watermarking, which integrates watermark embedding into the text-to-image generation process.
We find that generative watermarking methods are robust to direct evasion attacks, such as discriminator-based attacks or manipulations of edge information in edge-prediction-based attacks, but are vulnerable to malicious fine-tuning.
arXiv Detail & Related papers (2024-08-04T13:59:09Z)
- Safe and Robust Watermark Injection with a Single OoD Image [90.71804273115585]
Training a high-performance deep neural network requires large amounts of data and computational resources.
We propose a safe and robust backdoor-based watermark injection technique.
We induce random perturbation of model parameters during watermark injection to defend against common watermark removal attacks.
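A bare-bones sketch of the perturb-while-injecting idea described above; the loss, optimizer, and `noise_std` are illustrative choices, not the paper's settings.

```python
import torch

def inject_with_perturbation(model, trigger_batch, target, steps=100,
                             lr=1e-4, noise_std=1e-3):
    # Fine-tune on backdoor trigger samples while randomly jittering the
    # weights each step, so the injected watermark survives the small
    # parameter changes typical of removal attacks (fine-tuning, pruning).
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(steps):
        with torch.no_grad():
            for p in model.parameters():
                p.add_(noise_std * torch.randn_like(p))  # simulate removal
        opt.zero_grad()
        loss = loss_fn(model(trigger_batch), target)
        loss.backward()
        opt.step()
    return model
```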
arXiv Detail & Related papers (2023-09-04T19:58:35Z)
- Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily "stolen" by surrogate model attacks.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z)