Related papers: Robust Watermarks Leak: Channel-Aware Feature Extraction Enables Adversarial Watermark Manipulation

Robust Watermarks Leak: Channel-Aware Feature Extraction Enables Adversarial Watermark Manipulation

URL: http://arxiv.org/abs/2502.06418v1
Date: Mon, 10 Feb 2025 12:55:08 GMT
Title: Robust Watermarks Leak: Channel-Aware Feature Extraction Enables Adversarial Watermark Manipulation
Authors: Zhongjie Ba, Yitao Zhang, Peng Cheng, Bin Gong, Xinyu Zhang, Qinglong Wang, Kui Ren,
Abstract summary: We propose an attack framework that extracts leakage of watermark patterns using a pre-trained vision model.<n>Unlike prior works requiring massive data or detector access, our method achieves both forgery and detection evasion with a single watermarked image.<n>Our work exposes the robustness-stealthiness paradox: current "robust" watermarks sacrifice security for distortion resistance, providing insights for future watermark design.
Score: 21.41643665626451
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Watermarking plays a key role in the provenance and detection of AI-generated content. While existing methods prioritize robustness against real-world distortions (e.g., JPEG compression and noise addition), we reveal a fundamental tradeoff: such robust watermarks inherently improve the redundancy of detectable patterns encoded into images, creating exploitable information leakage. To leverage this, we propose an attack framework that extracts leakage of watermark patterns through multi-channel feature learning using a pre-trained vision model. Unlike prior works requiring massive data or detector access, our method achieves both forgery and detection evasion with a single watermarked image. Extensive experiments demonstrate that our method achieves a 60\% success rate gain in detection evasion and 51\% improvement in forgery accuracy compared to state-of-the-art methods while maintaining visual fidelity. Our work exposes the robustness-stealthiness paradox: current "robust" watermarks sacrifice security for distortion resistance, providing insights for future watermark design.

Related papers

When There Is No Decoder: Removing Watermarks from Stable Diffusion Models in a No-box Setting [37.85082375268253]
We study the robustness of model-specific watermarking, where watermark embedding is integrated with text-to-image generation.<n>We introduce three attack strategies: edge prediction-based, box blurring, and fine-tuning-based attacks in a no-box setting.<n>Our best-performing attack achieves a reduction in watermark detection accuracy to approximately 47.92%.
arXiv Detail & Related papers (2025-07-04T15:22:20Z)
Optimization-Free Universal Watermark Forgery with Regenerative Diffusion Models [50.73220224678009]
Watermarking can be used to verify the origin of synthetic images generated by artificial intelligence models.<n>Recent studies demonstrate the capability to forge watermarks from a target image onto cover images via adversarial techniques.<n>In this paper, we uncover a greater risk of an optimization-free and universal watermark forgery.<n>Our approach significantly broadens the scope of attacks, presenting a greater challenge to the security of current watermarking techniques.
arXiv Detail & Related papers (2025-06-06T12:08:02Z)
Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models [52.877452505561706]
We propose the first copyright evasion attack specifically designed to undermine dataset ownership verification (DOV)<n>Our CEAT2I comprises three stages: watermarked sample detection, trigger identification, and efficient watermark mitigation.<n>Our experiments show that our CEAT2I effectively evades DOV mechanisms while preserving model performance.
arXiv Detail & Related papers (2025-05-05T17:51:55Z)
Bridging Knowledge Gap Between Image Inpainting and Large-Area Visible Watermark Removal [57.84348166457113]
We introduce a novel feature adapting framework that leverages the representation capacity of a pre-trained image inpainting model. Our approach bridges the knowledge gap between image inpainting and watermark removal by fusing information of the residual background content beneath watermarks into the inpainting backbone model. For relieving the dependence on high-quality watermark masks, we introduce a new training paradigm by utilizing coarse watermark masks to guide the inference process.
arXiv Detail & Related papers (2025-04-07T02:37:14Z)
Imperceptible but Forgeable: Practical Invisible Watermark Forgery via Diffusion Models [21.17890218813236]
This paper proposes DiffForge, the first watermark forgery framework capable of forging imperceptible watermarks under a no-box setting. We estimate the watermark distribution using an unconditional diffusion model and introduce shallow inversion to inject the watermark into a non-watermarked image seamlessly. Comprehensive evaluations show that DiffForge deceives open-source watermark detectors with a 96.38% success rate and misleads a commercial watermark system with over 97% success rate.
arXiv Detail & Related papers (2025-03-28T11:11:19Z)
SEAL: Semantic Aware Image Watermarking [26.606008778795193]
We propose a novel watermarking method that embeds semantic information about the generated image directly into the watermark. The key pattern can be inferred from the semantic embedding of the image using locality-sensitive hashing. Our results suggest that content-aware watermarks can mitigate risks arising from image-generative models.
arXiv Detail & Related papers (2025-03-15T15:29:05Z)
LampMark: Proactive Deepfake Detection via Training-Free Landmark Perceptual Watermarks [7.965986856780787]
This paper introduces a novel training-free landmark perceptual watermark, LampMark for short. We first analyze the structure-sensitive characteristics of Deepfake manipulations and devise a secure and confidential transformation pipeline. We present an end-to-end watermarking framework that imperceptibly embeds and extracts watermarks concerning the images to be protected.
arXiv Detail & Related papers (2024-11-26T08:24:56Z)
Trigger-Based Fragile Model Watermarking for Image Transformation Networks [2.38776871944507]
In fragile watermarking, a sensitive watermark is embedded in an object in a manner such that the watermark breaks upon tampering. We introduce a novel, trigger-based fragile model watermarking system for image transformation/generation networks. Our approach, distinct from robust watermarking, effectively verifies the model's source and integrity across various datasets and attacks.
arXiv Detail & Related papers (2024-09-28T19:34:55Z)
Certifiably Robust Image Watermark [57.546016845801134]
Generative AI raises many societal concerns such as boosting disinformation and propaganda campaigns. Watermarking AI-generated content is a key technology to address these concerns. We propose the first image watermarks with certified robustness guarantees against removal and forgery attacks.
arXiv Detail & Related papers (2024-07-04T17:56:04Z)
RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees [33.61946642460661]
This paper introduces a robust and agile watermark detection framework, dubbed as RAW. We employ a classifier that is jointly trained with the watermark to detect the presence of the watermark. We show that the framework provides provable guarantees regarding the false positive rate for misclassifying a watermarked image.
arXiv Detail & Related papers (2024-01-23T22:00:49Z)
WAVES: Benchmarking the Robustness of Image Watermarks [67.955140223443]
WAVES (Watermark Analysis Via Enhanced Stress-testing) is a benchmark for assessing image watermark robustness. We integrate detection and identification tasks and establish a standardized evaluation protocol comprised of a diverse range of stress tests. We envision WAVES as a toolkit for the future development of robust watermarks.
arXiv Detail & Related papers (2024-01-16T18:58:36Z)
Robustness of AI-Image Detectors: Fundamental Limits and Practical Attacks [47.04650443491879]
We analyze the robustness of various AI-image detectors including watermarking and deepfake detectors. We show that watermarking methods are vulnerable to spoofing attacks where the attacker aims to have real images identified as watermarked ones.
arXiv Detail & Related papers (2023-09-29T18:30:29Z)
Safe and Robust Watermark Injection with a Single OoD Image [90.71804273115585]
Training a high-performance deep neural network requires large amounts of data and computational resources. We propose a safe and robust backdoor-based watermark injection technique. We induce random perturbation of model parameters during watermark injection to defend against common watermark removal attacks.
arXiv Detail & Related papers (2023-09-04T19:58:35Z)
Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal Attack for DNN Models [72.9364216776529]
We propose a novel watermark removal attack from a different perspective. We design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations. Our attack can bypass state-of-the-art watermarking solutions with very high success rates.
arXiv Detail & Related papers (2020-09-18T09:14:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.