The Coding Limits of Robust Watermarking for Generative Models
- URL: http://arxiv.org/abs/2509.10577v1
- Date: Thu, 11 Sep 2025 18:08:32 GMT
- Title: The Coding Limits of Robust Watermarking for Generative Models
- Authors: Danilo Francati, Yevin Nikhel Goonatilake, Shubham Pawar, Daniele Venturi, Giuseppe Ateniese,
- Abstract summary: We prove a sharp threshold for the robustness of cryptographic watermarking for generative models.<n>We show that a simple crop and resize operation reliably flipped about half of the latent signs and consistently prevented belief-propagation decoding.
- Score: 8.828526790179673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We prove a sharp threshold for the robustness of cryptographic watermarking for generative models. This is achieved by introducing a coding abstraction, which we call messageless secret-key codes, that formalizes sufficient and necessary requirements of robust watermarking: soundness, tamper detection, and pseudorandomness. Thus, we establish that robustness has a precise limit: For binary outputs no scheme can survive if more than half of the encoded bits are modified, and for an alphabet of size q the corresponding threshold is $(1-1/q)$ of the symbols. Complementing this impossibility, we give explicit constructions that meet the bound up to a constant slack. For every ${\delta} > 0$, assuming pseudorandom functions and access to a public counter, we build linear-time codes that tolerate up to $(1/2)(1-{\delta})$ errors in the binary case and $(1-1/q)(1-{\delta})$ errors in the $q$-ary case. Together with the lower bound, these yield the maximum robustness achievable under standard cryptographic assumptions. We then test experimentally whether this limit appears in practice by looking at the recent watermarking for images of Gunn, Zhao, and Song (ICLR 2025). We show that a simple crop and resize operation reliably flipped about half of the latent signs and consistently prevented belief-propagation decoding from recovering the codeword, erasing the watermark while leaving the image visually intact. These results provide a complete characterization of robust watermarking, identifying the threshold at which robustness fails, constructions that achieve it, and an experimental confirmation that the threshold is already reached in practice.
Related papers
- Disappearing Ink: Obfuscation Breaks N-gram Code Watermarks in Theory and Practice [23.788321123219244]
Distinguishing AI-generated code from human-written code is crucial for authorship attribution, content tracking, and misuse detection.<n>N-gram-based watermarking schemes have emerged as prominent, which inject secret watermarks to be detected during the generation.<n>Most claims rely solely on defenses against simple code transformations or code optimizations as a simulation of attack, creating a questionable sense of robustness.
arXiv Detail & Related papers (2025-07-07T22:18:19Z) - Foundations of Top-$k$ Decoding For Language Models [19.73575905188064]
We develop a theoretical framework that both explains and generalizes top-$k$ decoding.<n>We show how to optimize it efficiently for a large class of divergences.
arXiv Detail & Related papers (2025-05-25T23:46:34Z) - Gaussian Shading++: Rethinking the Realistic Deployment Challenge of Performance-Lossless Image Watermark for Diffusion Models [66.54457339638004]
Copyright protection and inappropriate content generation pose challenges for the practical implementation of diffusion models.<n>We propose a diffusion model watermarking method tailored for real-world deployment.<n>Gaussian Shading++ not only maintains performance losslessness but also outperforms existing methods in terms of robustness.
arXiv Detail & Related papers (2025-04-21T11:18:16Z) - Robust Conformal Prediction with a Single Binary Certificate [58.450154976190795]
Conformal prediction (CP) converts any model's output to prediction sets with a guarantee to cover the true label with (adjustable) high probability.<n>We propose a robust conformal prediction that produces smaller sets even with significantly lower MC samples.
arXiv Detail & Related papers (2025-03-07T08:41:53Z) - Cloning Games, Black Holes and Cryptography [50.022147589030304]
We introduce a new toolkit for analyzing cloning games.<n>This framework allows us to analyze a new cloning game based on binary phase states.<n>We show that the binary phase variantally optimal bound offers quantitative insights into information scrambling in idealized models of black holes.
arXiv Detail & Related papers (2024-11-07T14:09:32Z) - Edit Distance Robust Watermarks via Indexing Pseudorandom Codes [29.69428894587431]
Motivated by the problem of detecting AI-generated text, we consider the problem of watermarking the output of language models with provable guarantees.<n>We aim for watermarks which satisfy: (a) undetectability, a cryptographic notion introduced by Christ, Gunn & Zamir (2024) and (b) robustness to channels which introduce a constant fraction of adversarial insertions, substitutions, and deletions.
arXiv Detail & Related papers (2024-06-04T04:03:17Z) - Pseudorandom Error-Correcting Codes [0.716879432974126]
We build pseudorandom codes that are robust to cryptographic substitution and deletion errors.
We present an undetectable watermarking scheme for outputs of random substitutions and deletions.
Our second application is to steganography, where a secret message is hidden in innocent-looking content.
arXiv Detail & Related papers (2024-02-14T18:17:45Z) - Estimating the Decoding Failure Rate of Binary Regular Codes Using Iterative Decoding [84.0257274213152]
We propose a new technique to provide accurate estimates of the DFR of a two-iterations (parallel) bit flipping decoder.<n>We validate our results, providing comparisons of the modeled and simulated weight of the syndrome, incorrectly-guessed error bit distribution at the end of the first iteration, and two-itcrypteration Decoding Failure Rates (DFR)
arXiv Detail & Related papers (2024-01-30T11:40:24Z) - Towards Optimal Statistical Watermarking [95.46650092476372]
We study statistical watermarking by formulating it as a hypothesis testing problem.
Key to our formulation is a coupling of the output tokens and the rejection region.
We characterize the Uniformly Most Powerful (UMP) watermark in the general hypothesis testing setting.
arXiv Detail & Related papers (2023-12-13T06:57:00Z) - Who Wrote this Code? Watermarking for Code Generation [53.24895162874416]
We propose Selective WatErmarking via Entropy Thresholding (SWEET) to detect machine-generated text.
Our experiments show that SWEET significantly improves code quality preservation while outperforming all baselines.
arXiv Detail & Related papers (2023-05-24T11:49:52Z) - The Stable Signature: Rooting Watermarks in Latent Diffusion Models [29.209892051477194]
This paper introduces an active strategy combining image watermarking and Latent Diffusion Models.
The goal is for all generated images to conceal an invisible watermark allowing for future detection and/or identification.
A pre-trained watermark extractor recovers the hidden signature from any generated image and a statistical test then determines whether it comes from the generative model.
arXiv Detail & Related papers (2023-03-27T17:57:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.