Imperceptible but Forgeable: Practical Invisible Watermark Forgery via Diffusion Models
- URL: http://arxiv.org/abs/2503.22330v1
- Date: Fri, 28 Mar 2025 11:11:19 GMT
- Title: Imperceptible but Forgeable: Practical Invisible Watermark Forgery via Diffusion Models
- Authors: Ziping Dong, Chao Shuai, Zhongjie Ba, Peng Cheng, Zhan Qin, Qinglong Wang, Kui Ren,
- Abstract summary: This paper proposes DiffForge, the first watermark forgery framework capable of forging imperceptible watermarks under a no-box setting.<n>We estimate the watermark distribution using an unconditional diffusion model and introduce shallow inversion to inject the watermark into a non-watermarked image seamlessly.<n> Comprehensive evaluations show that DiffForge deceives open-source watermark detectors with a 96.38% success rate and misleads a commercial watermark system with over 97% success rate.
- Score: 21.17890218813236
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Invisible watermarking is critical for content provenance and accountability in Generative AI. Although commercial companies have increasingly committed to using watermarks, the robustness of existing watermarking schemes against forgery attacks is understudied. This paper proposes DiffForge, the first watermark forgery framework capable of forging imperceptible watermarks under a no-box setting. We estimate the watermark distribution using an unconditional diffusion model and introduce shallow inversion to inject the watermark into a non-watermarked image seamlessly. This approach facilitates watermark injection while preserving image quality by adaptively selecting the depth of inversion steps, leveraging our key insight that watermarks degrade with added noise during the early diffusion phases. Comprehensive evaluations show that DiffForge deceives open-source watermark detectors with a 96.38% success rate and misleads a commercial watermark system with over 97% success rate, achieving high confidence.1 This work reveals fundamental security limitations in current watermarking paradigms.
Related papers
- Robust Watermarks Leak: Channel-Aware Feature Extraction Enables Adversarial Watermark Manipulation [21.41643665626451]
We propose an attack framework that extracts leakage of watermark patterns using a pre-trained vision model.
Unlike prior works requiring massive data or detector access, our method achieves both forgery and detection evasion with a single watermarked image.
Our work exposes the robustness-stealthiness paradox: current "robust" watermarks sacrifice security for distortion resistance, providing insights for future watermark design.
arXiv Detail & Related papers (2025-02-10T12:55:08Z) - ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization [15.570148419846175]
Existing watermarking methods face the challenge of balancing robustness and concealment.
This paper introduces a watermark hiding process to actively achieve concealment, thus allowing the embedding of stronger watermarks.
Experiments on various diffusion models demonstrate the watermark remains verifiable even under significant image tampering.
arXiv Detail & Related papers (2024-11-06T12:14:23Z) - An Undetectable Watermark for Generative Image Models [65.31658824274894]
We present the first undetectable watermarking scheme for generative image models.<n>In particular, an undetectable watermark does not degrade image quality under any efficiently computable metric.<n>Our scheme works by selecting the initial latents of a diffusion model using a pseudorandom error-correcting code.
arXiv Detail & Related papers (2024-10-09T18:33:06Z) - Certifiably Robust Image Watermark [57.546016845801134]
Generative AI raises many societal concerns such as boosting disinformation and propaganda campaigns.
Watermarking AI-generated content is a key technology to address these concerns.
We propose the first image watermarks with certified robustness guarantees against removal and forgery attacks.
arXiv Detail & Related papers (2024-07-04T17:56:04Z) - Latent Watermark: Inject and Detect Watermarks in Latent Diffusion Space [7.082806239644562]
Existing methods face the dilemma of image quality and watermark robustness.
Watermarks with superior image quality usually have inferior robustness against attacks such as blurring and JPEG compression.
We propose Latent Watermark, which injects and detects watermarks in the latent diffusion space.
arXiv Detail & Related papers (2024-03-30T03:19:50Z) - RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees [33.61946642460661]
This paper introduces a robust and agile watermark detection framework, dubbed as RAW.
We employ a classifier that is jointly trained with the watermark to detect the presence of the watermark.
We show that the framework provides provable guarantees regarding the false positive rate for misclassifying a watermarked image.
arXiv Detail & Related papers (2024-01-23T22:00:49Z) - Robustness of AI-Image Detectors: Fundamental Limits and Practical
Attacks [47.04650443491879]
We analyze the robustness of various AI-image detectors including watermarking and deepfake detectors.
We show that watermarking methods are vulnerable to spoofing attacks where the attacker aims to have real images identified as watermarked ones.
arXiv Detail & Related papers (2023-09-29T18:30:29Z) - Unbiased Watermark for Large Language Models [67.43415395591221]
This study examines how significantly watermarks impact the quality of model-generated outputs.
It is possible to integrate watermarks without affecting the output probability distribution.
The presence of watermarks does not compromise the performance of the model in downstream tasks.
arXiv Detail & Related papers (2023-09-22T12:46:38Z) - Certified Neural Network Watermarks with Randomized Smoothing [64.86178395240469]
We propose a certifiable watermarking method for deep learning models.
We show that our watermark is guaranteed to be unremovable unless the model parameters are changed by more than a certain l2 threshold.
Our watermark is also empirically more robust compared to previous watermarking methods.
arXiv Detail & Related papers (2022-07-16T16:06:59Z) - Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal
Attack for DNN Models [72.9364216776529]
We propose a novel watermark removal attack from a different perspective.
We design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations.
Our attack can bypass state-of-the-art watermarking solutions with very high success rates.
arXiv Detail & Related papers (2020-09-18T09:14:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.