Forging and Removing Latent-Noise Diffusion Watermarks Using a Single Image
- URL: http://arxiv.org/abs/2504.20111v1
- Date: Sun, 27 Apr 2025 18:29:26 GMT
- Title: Forging and Removing Latent-Noise Diffusion Watermarks Using a Single Image
- Authors: Anubhav Jain, Yuya Kobayashi, Naoki Murata, Yuhta Takida, Takashi Shibuya, Yuki Mitsufuji, Niv Cohen, Nasir Memon, Julian Togelius,
- Abstract summary: Previous watermarking schemes embed a secret key in the initial noise.<n>We propose a black-box adversarial attack without presuming access to the diffusion model weights.<n>We show that we can also apply a similar approach for watermark removal by learning perturbations to exit this region.
- Score: 24.513881574366046
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Watermarking techniques are vital for protecting intellectual property and preventing fraudulent use of media. Most previous watermarking schemes designed for diffusion models embed a secret key in the initial noise. The resulting pattern is often considered hard to remove and forge into unrelated images. In this paper, we propose a black-box adversarial attack without presuming access to the diffusion model weights. Our attack uses only a single watermarked example and is based on a simple observation: there is a many-to-one mapping between images and initial noises. There are regions in the clean image latent space pertaining to each watermark that get mapped to the same initial noise when inverted. Based on this intuition, we propose an adversarial attack to forge the watermark by introducing perturbations to the images such that we can enter the region of watermarked images. We show that we can also apply a similar approach for watermark removal by learning perturbations to exit this region. We report results on multiple watermarking schemes (Tree-Ring, RingID, WIND, and Gaussian Shading) across two diffusion models (SDv1.4 and SDv2.0). Our results demonstrate the effectiveness of the attack and expose vulnerabilities in the watermarking methods, motivating future research on improving them.
Related papers
- SEAL: Semantic Aware Image Watermarking [26.606008778795193]
We propose a novel watermarking method that embeds semantic information about the generated image directly into the watermark.<n>The key pattern can be inferred from the semantic embedding of the image using locality-sensitive hashing.<n>Our results suggest that content-aware watermarks can mitigate risks arising from image-generative models.
arXiv Detail & Related papers (2025-03-15T15:29:05Z) - Hidden in the Noise: Two-Stage Robust Watermarking for Images [25.731533630250798]
We present a distortion-free watermarking method for images based on a diffusion model's initial noise.
detecting the watermark requires comparing the initial noise reconstructed for an image to all previously used initial noises.
We propose a two-stage watermarking framework for efficient detection.
arXiv Detail & Related papers (2024-12-05T22:50:42Z) - Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models [16.57738116313139]
We show that attackers can leverage unrelated models, even with different latent spaces and architectures, to perform powerful and realistic forgery attacks.<n>The first imprints a targeted watermark into real images by manipulating the latent representation of an arbitrary image in an unrelated LDM.<n>The second attack generates new images with the target watermark by inverting a watermarked image and re-generating it with an arbitrary prompt.
arXiv Detail & Related papers (2024-12-04T12:57:17Z) - Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
backdoor-based ownership verification becomes popular recently, in which the model owner can watermark the model.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z) - Invisible Image Watermarks Are Provably Removable Using Generative AI [47.25747266531665]
Invisible watermarks safeguard images' copyrights by embedding hidden messages only detectable by owners.
We propose a family of regeneration attacks to remove these invisible watermarks.
The proposed attack method first adds random noise to an image to destroy the watermark and then reconstructs the image.
arXiv Detail & Related papers (2023-06-02T23:29:28Z) - Tree-Ring Watermarks: Fingerprints for Diffusion Images that are
Invisible and Robust [55.91987293510401]
Watermarking the outputs of generative models is a crucial technique for tracing copyright and preventing potential harm from AI-generated content.
We introduce a novel technique called Tree-Ring Watermarking that robustly fingerprints diffusion model outputs.
Our watermark is semantically hidden in the image space and is far more robust than watermarking alternatives that are currently deployed.
arXiv Detail & Related papers (2023-05-31T17:00:31Z) - Certified Neural Network Watermarks with Randomized Smoothing [64.86178395240469]
We propose a certifiable watermarking method for deep learning models.
We show that our watermark is guaranteed to be unremovable unless the model parameters are changed by more than a certain l2 threshold.
Our watermark is also empirically more robust compared to previous watermarking methods.
arXiv Detail & Related papers (2022-07-16T16:06:59Z) - Split then Refine: Stacked Attention-guided ResUNets for Blind Single
Image Visible Watermark Removal [69.92767260794628]
Previous watermark removal methods require to gain the watermark location from users or train a multi-task network to recover the background indiscriminately.
We propose a novel two-stage framework with a stacked attention-guided ResUNets to simulate the process of detection, removal and refinement.
We extensively evaluate our algorithm over four different datasets under various settings and the experiments show that our approach outperforms other state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-12-13T09:05:37Z) - Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal
Attack for DNN Models [72.9364216776529]
We propose a novel watermark removal attack from a different perspective.
We design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations.
Our attack can bypass state-of-the-art watermarking solutions with very high success rates.
arXiv Detail & Related papers (2020-09-18T09:14:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.