Decoder Gradient Shield: Provable and High-Fidelity Prevention of Gradient-Based Box-Free Watermark Removal
- URL: http://arxiv.org/abs/2502.20924v1
- Date: Fri, 28 Feb 2025 10:27:14 GMT
- Title: Decoder Gradient Shield: Provable and High-Fidelity Prevention of Gradient-Based Box-Free Watermark Removal
- Authors: Haonan An, Guang Hua, Zhengru Fang, Guowen Xu, Susanto Rahardja, Yuguang Fang
- Abstract summary: Box-free watermarking can protect deep image-to-image models. It uses an encoder and a decoder, respectively, to embed into and extract from the model's output images invisible copyright marks. In this paper, we reveal an overlooked vulnerability of the unprotected watermark decoder. We propose the decoder gradient shield (DGS) as a protection layer in the decoder API to prevent gradient-based watermark removal.
- Score: 21.876772763530383
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The intellectual property of deep image-to-image models can be protected by so-called box-free watermarking. It uses an encoder and a decoder, respectively, to embed into and extract from the model's output images invisible copyright marks. Prior works have improved watermark robustness, focusing on the design of better watermark encoders. In this paper, we reveal an overlooked vulnerability of the unprotected watermark decoder, which is jointly trained with the encoder and can be exploited to train a watermark removal network. To defend against such an attack, we propose the decoder gradient shield (DGS) as a protection layer in the decoder API to prevent gradient-based watermark removal with a closed-form solution. The fundamental idea is inspired by the classical adversarial attack, but is utilized for the first time as a defensive mechanism in box-free model watermarking. We then demonstrate that DGS can reorient and rescale the gradient directions of watermarked queries and stop the watermark remover's training loss from converging to the level reached without DGS, while retaining decoder output image quality. Experimental results verify the effectiveness of the proposed method. The code will be made available upon acceptance.
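The shield's placement is easy to picture in code. The sketch below is a hypothetical PyTorch rendering, not the paper's closed-form perturbation: the layer is an identity in the forward pass (so extraction fidelity is untouched) and returns a reoriented, rescaled gradient in the backward pass; the random blend and the 0.2/0.8 mixing weights are invented stand-ins.
```python
import torch

class DecoderGradientShield(torch.autograd.Function):
    # Forward: identity, so the decoder's output (and thus the extracted
    # watermark) is unchanged. Backward: perturb the gradient so that a
    # removal network trained through the decoder API cannot converge.

    @staticmethod
    def forward(ctx, x):
        return x

    @staticmethod
    def backward(ctx, grad_out):
        # Reorient: blend the true gradient with a random direction.
        mixed = 0.2 * grad_out + 0.8 * torch.randn_like(grad_out)
        # Rescale: restore the original gradient norm so the perturbation
        # is hard to spot from magnitudes alone.
        return mixed * grad_out.norm() / (mixed.norm() + 1e-12)

def shielded_decode(decoder, image):
    # Wrap the decoder API so any caller backpropagating through it
    # receives shielded gradients.
    return DecoderGradientShield.apply(decoder(image))
```
A remover trained through `shielded_decode` then descends along misleading directions, which is what keeps its loss from reaching the unshielded level.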
Related papers
- Adversarial Shallow Watermarking [33.580351668272215]
We propose a novel watermarking framework to resist unknown distortions, namely Adversarial Shallow Watermarking (ASW).
ASW utilizes only a shallow decoder that is randomly parameterized and designed to be insensitive to distortions for watermark extraction.
ASW achieves comparable results on known distortions and better robustness on unknown distortions.
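A minimal sketch of such a decoder, assuming a PyTorch setup; the layer sizes, 32-bit mark, and seed handling are illustrative, not ASW's actual architecture.
```python
import torch
import torch.nn as nn

def make_shallow_decoder(bits: int = 32, seed: int = 0) -> nn.Module:
    # Randomly parameterised shallow decoder: built once from a seed and
    # frozen, so only the watermark encoder is ever trained against it.
    torch.manual_seed(seed)
    decoder = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, bits),  # one logit per watermark bit
    )
    for p in decoder.parameters():
        p.requires_grad_(False)
    return decoder
```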
arXiv Detail & Related papers (2025-04-28T07:12:20Z)
- SEAL: Semantic Aware Image Watermarking [26.606008778795193]
We propose a novel watermarking method that embeds semantic information about the generated image directly into the watermark.
The key pattern can be inferred from the semantic embedding of the image using locality-sensitive hashing.
Our results suggest that content-aware watermarks can mitigate risks arising from image-generative models.
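That hashing step can be sketched with random hyperplanes (SimHash), the standard locality-sensitive hash for real-valued embeddings; the bit count and seeding below are assumptions, not SEAL's exact construction.
```python
import numpy as np

def lsh_key_pattern(embedding: np.ndarray, n_bits: int = 128,
                    seed: int = 0) -> np.ndarray:
    # Project the semantic embedding onto random hyperplanes and keep the
    # signs: semantically close images yield key patterns with small
    # Hamming distance, so the pattern can be re-derived at verification.
    rng = np.random.default_rng(seed)
    hyperplanes = rng.standard_normal((n_bits, embedding.shape[0]))
    return (hyperplanes @ embedding > 0).astype(np.uint8)
```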
arXiv Detail & Related papers (2025-03-15T15:29:05Z)
- Invisible Watermarks: Attacks and Robustness [0.3495246564946556]
We introduce novel improvements to watermarking robustness and minimize image-quality degradation under attack. We propose a custom watermark remover network which preserves one watermarking modality while completely removing the other during decoding. Our evaluation suggests that 1) implementing the watermark remover model to preserve one watermark modality when decoding the other slightly improves on the baseline performance, and 2) LBA degrades the image significantly less than uniform blurring of the entire image.
arXiv Detail & Related papers (2024-12-17T03:50:13Z)
- ESpeW: Robust Copyright Protection for LLM-based EaaS via Embedding-Specific Watermark [50.08021440235581]
Embeddings as a Service (EaaS) is emerging to play a crucial role in AI applications.
EaaS is vulnerable to model extraction attacks, highlighting the urgent need for copyright protection.
We propose a novel embedding-specific watermarking (ESpeW) mechanism to offer robust copyright protection for EaaS.
arXiv Detail & Related papers (2024-10-23T04:34:49Z)
- An undetectable watermark for generative image models [65.31658824274894]
We present the first undetectable watermarking scheme for generative image models.
In particular, an undetectable watermark does not degrade image quality under any efficiently computable metric.
Our scheme works by selecting the initial latents of a diffusion model using a pseudorandom error-correcting code.
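A toy version of the latent-selection step, with the pseudorandom error-correcting code replaced by a plain keyed PRG; this stand-in forfeits the scheme's undetectability and robustness guarantees and only shows where the secret key enters.
```python
import hashlib
import numpy as np

def keyed_initial_latents(key: bytes, shape=(4, 64, 64)) -> np.ndarray:
    # Derive the diffusion model's initial Gaussian latents from a secret
    # key (the paper uses a pseudorandom error-correcting code instead).
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return np.random.default_rng(seed).standard_normal(shape)

def detect(recovered: np.ndarray, key: bytes, threshold: float = 0.1) -> bool:
    # Correlate latents recovered from a suspect image (e.g. by inverting
    # the diffusion process) against the keyed reference stream.
    ref = keyed_initial_latents(key, recovered.shape)
    corr = float(np.vdot(ref, recovered)) / (
        np.linalg.norm(ref) * np.linalg.norm(recovered))
    return corr > threshold
```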
arXiv Detail & Related papers (2024-10-09T18:33:06Z)
- DeepEclipse: How to Break White-Box DNN-Watermarking Schemes [60.472676088146436]
We present obfuscation techniques that significantly differ from the existing white-box watermarking removal schemes.
DeepEclipse can evade watermark detection without prior knowledge of the underlying watermarking scheme.
Our evaluation reveals that DeepEclipse excels in breaking multiple white-box watermarking schemes.
arXiv Detail & Related papers (2024-03-06T10:24:47Z)
- Invisible Image Watermarks Are Provably Removable Using Generative AI [47.25747266531665]
Invisible watermarks safeguard images' copyrights by embedding hidden messages only detectable by owners.
We propose a family of regeneration attacks to remove these invisible watermarks.
The proposed attack method first adds random noise to an image to destroy the watermark and then reconstructs the image.
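The attack itself fits in a few lines. The sketch below assumes float images in [0, 1] and a pretrained reconstructor (a VAE or diffusion denoiser would do); both the noise level and the `autoencoder` argument are placeholders.
```python
import torch

@torch.no_grad()
def regeneration_attack(image: torch.Tensor, autoencoder,
                        sigma: float = 0.1) -> torch.Tensor:
    # Step 1: additive Gaussian noise destroys the embedded watermark.
    noisy = image + sigma * torch.randn_like(image)
    # Step 2: a generative model reconstructs a clean-looking image that
    # no longer carries the mark.
    return autoencoder(noisy).clamp(0.0, 1.0)
```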
arXiv Detail & Related papers (2023-06-02T23:29:28Z)
- Supervised GAN Watermarking for Intellectual Property Protection [33.827150843939094]
We propose a watermarking method for Generative Adversarial Networks (GANs).
The aim is to watermark the GAN model so that any image generated by the GAN contains an invisible watermark (signature).
Results show that our method can effectively embed an invisible watermark inside the generated images.
arXiv Detail & Related papers (2022-09-07T20:52:05Z)
- Certified Neural Network Watermarks with Randomized Smoothing [64.86178395240469]
We propose a certifiable watermarking method for deep learning models.
We show that our watermark is guaranteed to be unremovable unless the model parameters are changed by more than a certain l2 threshold.
Our watermark is also empirically more robust compared to previous watermarking methods.
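The certificate reduces verification to a parameter-space distance check; a sketch, assuming the suspect and watermarked models share a state-dict layout (names here are illustrative).
```python
import torch

def within_certified_radius(model: torch.nn.Module,
                            watermarked_state: dict,
                            radius: float) -> bool:
    # The guarantee says the watermark survives any attack that moves the
    # parameters by less than `radius` in l2, so compare a suspect model
    # against the stored watermarked parameters.
    sq_dist = 0.0
    for name, p in model.state_dict().items():
        diff = p.float() - watermarked_state[name].float()
        sq_dist += diff.pow(2).sum().item()
    return sq_dist ** 0.5 < radius
```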
arXiv Detail & Related papers (2022-07-16T16:06:59Z)
- Piracy-Resistant DNN Watermarking by Block-Wise Image Transformation with Secret Key [15.483078145498085]
The proposed method embeds a watermark pattern in a model by using learnable transformed images.
It is piracy-resistant, so the original watermark cannot be overwritten by a pirated watermark.
The results show that it was resilient against fine-tuning and pruning attacks while maintaining a high watermark-detection accuracy.
arXiv Detail & Related papers (2021-04-09T08:21:53Z)
- Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal Attack for DNN Models [72.9364216776529]
We propose a novel watermark removal attack from a different perspective.
We design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations.
Our attack can bypass state-of-the-art watermarking solutions with very high success rates.
arXiv Detail & Related papers (2020-09-18T09:14:54Z)