Attack-Resistant Watermarking for AIGC Image Forensics via Diffusion-based Semantic Deflection
- URL: http://arxiv.org/abs/2601.06639v1
- Date: Sat, 10 Jan 2026 17:49:08 GMT
- Title: Attack-Resistant Watermarking for AIGC Image Forensics via Diffusion-based Semantic Deflection
- Authors: Qingyu Liu, Yitao Zhang, Zhongjie Ba, Chao Shuai, Peng Cheng, Tianhang Zheng, Zhibo Wang
- Abstract summary: We introduce PAI, a training-free inherent watermarking framework for AIGC copyright protection. We design a novel key-conditioned deflection mechanism that subtly steers the denoising trajectory according to the user key. Experiments show that PAI achieves 98.43% verification accuracy, improving over SOTA methods by 37.25% on average, and retains strong tampering localization performance even against advanced AIGC edits.
- Score: 22.992750993168404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Protecting the copyright of user-generated AI images is an emerging challenge as AIGC becomes pervasive in creative workflows. Existing watermarking methods (1) remain vulnerable to real-world adversarial threats, often forced to trade off between defenses against spoofing and removal attacks; and (2) cannot support semantic-level tamper localization. We introduce PAI, a training-free inherent watermarking framework for AIGC copyright protection, plug-and-play with diffusion-based AIGC services. PAI simultaneously provides three key functionalities: robust ownership verification, attack detection, and semantic-level tampering localization. Unlike existing inherent watermark methods that only embed watermarks at the noise initialization of diffusion models, we design a novel key-conditioned deflection mechanism that subtly steers the denoising trajectory according to the user key. Such trajectory-level coupling strengthens the semantic entanglement of identity and content, thereby enhancing robustness against real-world threats. Moreover, we provide a theoretical analysis proving that only the valid key can pass verification. Experiments across 12 attack methods show that PAI achieves 98.43% verification accuracy, improving over SOTA methods by 37.25% on average, and retains strong tampering localization performance even against advanced AIGC edits. Our code is available at https://github.com/QingyuLiu/PAI.
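The core mechanism is a small, key-conditioned deflection applied at every denoising step. A rough, hypothetical sketch of that idea using a diffusers-style unconditional UNet and scheduler follows; `key_direction`, the additive deflection form, and `scale` are illustrative assumptions, not the paper's implementation.

```python
import torch

def key_direction(user_key: int, shape, device="cpu"):
    # Hypothetical helper: derive a deterministic, key-specific unit
    # direction in latent space, reproducible only with the user key.
    gen = torch.Generator(device=device).manual_seed(user_key)
    d = torch.randn(shape, generator=gen, device=device)
    return d / d.norm()

@torch.no_grad()
def denoise_with_deflection(unet, scheduler, latents, user_key, scale=0.02):
    # DDIM-style loop that nudges each noise prediction along the
    # key's direction, subtly steering the denoising trajectory.
    direction = key_direction(user_key, latents.shape, latents.device)
    for t in scheduler.timesteps:
        eps = unet(latents, t).sample        # model's noise prediction
        eps = eps + scale * direction        # key-conditioned deflection
        latents = scheduler.step(eps, t, latents).prev_sample
    return latents
```

Verification would then invert the generation and test whether the recovered trajectory aligns with the key's direction; by the paper's analysis, only the valid key reproduces the deflection.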
Related papers
- StableGuard: Towards Unified Copyright Protection and Tamper Localization in Latent Diffusion Models [55.05404953041403]
We propose a novel framework that seamlessly integrates a binary watermark into the diffusion generation process. We show that StableGuard consistently outperforms state-of-the-art methods in image fidelity, watermark verification, and tampering localization.
arXiv Detail & Related papers (2025-09-22T16:35:19Z)
- Removal Attack and Defense on AI-generated Content Latent-based Watermarking [26.09708301315328]
Digital watermarks can be embedded into AI-generated content (AIGC) by initializing the generation process with starting points sampled from a secret distribution. When combined with pseudorandom error-correcting codes, such watermarked outputs can remain indistinguishable from unwatermarked objects while maintaining robustness under white noise. We propose a novel attack that exploits boundary information leaked by the locations of watermarked objects. This attack significantly reduces the distortion required to remove watermarks by up to a factor of 15 compared to a baseline white-noise attack under certain settings.
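For context, a minimal sketch of the latent-initialization watermarking this attack targets; the helper names, the cosine test, and the threshold are assumptions (deployed schemes use pseudorandom error-correcting codes and DDIM inversion rather than a raw correlation).

```python
import torch
import torch.nn.functional as F

def watermarked_init(key: int, shape):
    # Sample the starting latent from a key-dependent (secret) distribution.
    gen = torch.Generator().manual_seed(key)
    return torch.randn(shape, generator=gen)

def verify(recovered_latent: torch.Tensor, key: int, threshold: float = 0.1) -> bool:
    # Correlate a latent recovered from the image (e.g., via DDIM inversion)
    # against the key's reference noise; the threshold is illustrative.
    ref = watermarked_init(key, recovered_latent.shape)
    score = F.cosine_similarity(recovered_latent.flatten(), ref.flatten(), dim=0)
    return score.item() > threshold
```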
arXiv Detail & Related papers (2025-09-15T09:56:24Z)
- Character-Level Perturbations Disrupt LLM Watermarks [64.60090923837701]
We formalize the system model for Large Language Model (LLM) watermarking. We characterize two realistic threat models constrained by limited access to the watermark detector. We demonstrate that character-level perturbations are significantly more effective for watermark removal under the most restrictive threat model. Experiments confirm the superiority of character-level perturbations and the effectiveness of the Genetic Algorithm (GA) in removing watermarks under realistic constraints.
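For intuition, a toy character-level perturbation of the kind studied here; the homoglyph map and rate are illustrative, and the paper searches perturbations with a genetic algorithm rather than sampling them uniformly at random.

```python
import random

# Illustrative homoglyph map: visually similar Cyrillic substitutes.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "i": "\u0456"}

def perturb_chars(text: str, rate: float = 0.05, seed: int = 0) -> str:
    # Swap a small fraction of characters for homoglyphs: the text stays
    # visually (near-)identical, but the detector sees different tokens.
    rng = random.Random(seed)
    return "".join(
        HOMOGLYPHS[ch] if ch in HOMOGLYPHS and rng.random() < rate else ch
        for ch in text
    )
```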
arXiv Detail & Related papers (2025-09-11T02:50:07Z)
- Uncovering and Mitigating Destructive Multi-Embedding Attacks in Deepfake Proactive Forensics [17.112388802067425]
Proactive forensics involves embedding imperceptible watermarks to enable reliable source tracking. Existing methods rely on an idealized assumption of single watermark embedding, which proves impractical in real-world scenarios. We propose a general training paradigm named Adversarial Interference Simulation (AIS) to address this vulnerability. Our method enables the model to correctly extract the original watermark even after a second embedding.
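A bare-bones rendering of the interference-simulation idea (my sketch, not the authors' code): during training, occasionally stack a second random watermark on the marked image, and still supervise the extractor to recover the original one.

```python
import torch
import torch.nn.functional as F

def ais_training_step(embedder, extractor, image, wm_bits, opt, p_second=0.5):
    # `embedder` and `extractor` are assumed nn.Modules; `wm_bits` is an
    # integer {0,1} tensor holding the original watermark payload.
    marked = embedder(image, wm_bits)
    if torch.rand(1).item() < p_second:
        # Simulate a second, interfering embedding by another party.
        marked = embedder(marked, torch.randint_like(wm_bits, 0, 2))
    # The extractor must still recover the *original* watermark.
    loss = F.binary_cross_entropy_with_logits(extractor(marked), wm_bits.float())
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```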
arXiv Detail & Related papers (2025-08-24T07:57:32Z)
- IConMark: Robust Interpretable Concept-Based Watermark For AI Images [50.045011844765185]
We propose IConMark, a novel in-generation robust semantic watermarking method. IConMark embeds interpretable concepts into AI-generated images, making it resilient to adversarial manipulation. We demonstrate its superiority in detection accuracy and preservation of image quality.
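One way to picture an interpretable concept watermark is key-dependent prompt conditioning; the sketch below is a guess at the general shape, and the concept vocabulary, hashing scheme, and detection-by-image-text-model idea are all assumptions rather than IConMark's actual design.

```python
import hashlib

# Illustrative concept vocabulary; a real system would use a curated set.
CONCEPTS = ["red kite", "distant lighthouse", "paper crane", "brass key"]

def mark_prompt(prompt: str, user_key: str, k: int = 2):
    # Deterministically pick k concepts from the key and append them to
    # the generation prompt; verification later checks that the image
    # depicts those concepts (e.g., with an off-the-shelf image-text model).
    h = int(hashlib.sha256(user_key.encode()).hexdigest(), 16)
    chosen = [CONCEPTS[(h >> (8 * i)) % len(CONCEPTS)] for i in range(k)]
    return prompt + ", featuring " + " and ".join(chosen), chosen
```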
arXiv Detail & Related papers (2025-07-17T05:38:30Z)
- Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models [52.877452505561706]
We propose the first copyright evasion attack specifically designed to undermine dataset ownership verification (DOV). Our CEAT2I comprises three stages: watermarked sample detection, trigger identification, and efficient watermark mitigation. Our experiments show that CEAT2I effectively evades DOV mechanisms while preserving model performance.
arXiv Detail & Related papers (2025-05-05T17:51:55Z)
- WMCopier: Forging Invisible Image Watermarks on Arbitrary Images [38.59295440296696]
We propose WMCopier, an effective watermark forgery attack that operates without requiring prior knowledge of or access to the target watermarking algorithm. Our approach first models the target watermark distribution using an unconditional diffusion model, and then seamlessly embeds the target watermark into a non-watermarked image. Experimental results demonstrate that WMCopier effectively deceives both open-source and closed-source watermark systems.
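One loose reading of the transfer step is an SDEdit-style pass through a diffusion model fitted only to watermarked images, so a clean image drifts into the watermarked distribution while keeping its content; `wm_diffusion`, the diffusers-style scheduler API, and `t_start` are assumptions, not WMCopier's published procedure.

```python
import torch

@torch.no_grad()
def forge_watermark(wm_diffusion, scheduler, image, t_start=0.3):
    # Noise the clean image partway along the forward process, then run
    # only the final denoising steps with a model trained on watermarked
    # images, pulling the output toward the watermarked distribution.
    n_steps = len(scheduler.timesteps)
    start = int(n_steps * (1 - t_start))   # skip the early, high-noise steps
    t0 = scheduler.timesteps[start]
    x = scheduler.add_noise(image, torch.randn_like(image), t0)
    for t in scheduler.timesteps[start:]:
        eps = wm_diffusion(x, t).sample
        x = scheduler.step(eps, t, x).prev_sample
    return x
```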
arXiv Detail & Related papers (2025-03-28T11:11:19Z)
- Robustness of Watermarking on Text-to-Image Diffusion Models [9.277492743469235]
We investigate the robustness of generative watermarking, which integrates watermark embedding into the text-to-image generation process.
We find that generative watermarking methods are robust to direct evasion attacks, such as discriminator-based attacks or manipulations of edge information in edge-prediction-based attacks, but are vulnerable to malicious fine-tuning.
arXiv Detail & Related papers (2024-08-04T13:59:09Z)
- Safe and Robust Watermark Injection with a Single OoD Image [90.71804273115585]
Training a high-performance deep neural network requires large amounts of data and computational resources.
We propose a safe and robust backdoor-based watermark injection technique.
We induce random perturbation of model parameters during watermark injection to defend against common watermark removal attacks.
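A bare-bones sketch of the perturb-while-injecting idea described above; the loss, optimizer, and `noise_std` are illustrative choices, not the paper's settings.

```python
import torch

def inject_with_perturbation(model, trigger_batch, target, steps=100,
                             lr=1e-4, noise_std=1e-3):
    # Fine-tune on backdoor trigger samples while randomly jittering the
    # weights each step, so the injected watermark survives the small
    # parameter changes typical of removal attacks (fine-tuning, pruning).
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(steps):
        with torch.no_grad():
            for p in model.parameters():
                p.add_(noise_std * torch.randn_like(p))  # simulate removal
        opt.zero_grad()
        loss = loss_fn(model(trigger_batch), target)
        loss.backward()
        opt.step()
    return model
```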
arXiv Detail & Related papers (2023-09-04T19:58:35Z)
- Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily "stolen" by surrogate model attacks.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z)